There is a Department of Defense (DoD) operational need for cyber defense capabilities to defend critical infrastructure from cyber attack. Critical infrastructure systems, such as power, water and wastewater, and safety controls, affect the physical environment.
These systems traditionally relied on physical security such as physical access control. The introduction of the Industrial Internet of Things (IIoT) to traditional Operational Technology (OT) systems evolved critical infrastructure systems into cyber-physical systems, making them susceptible to cyber attacks such as ransomware. Extended technology refresh cycles of 20 years or more undermine the ability to address vulnerabilities with engineering upgrades. Further, OT and Information Technology (IT) experts have varying contextual approaches to their respective domains. Systems engineering principles, when applied in the engineering and post-development phases, are a mechanism for integrating contextual information from cyber-physical systems into a model for cyber defense capabilities for highly context-sensitive critical infrastructure dynamic classes. More Situational Awareness for Industrial Control Systems (MOSAICS) is piloting an initial capability to address cyber defense of critical infrastructure. The MOSAICS capability concept was to automate the existing manual procedures to detect, mitigate, and recover from a cyberattack using effective system baselining and segmentation of the ICS network to help prevent malware breach and proliferation, combined with best-of-breed technologies related to analytics, visualization, decision support, and information sharing. This paper examines the engineering and post-deployment of a demonstration for transition and integration into fielded systems.
In early 2016, two Combatant Commands identified an operational need to defend DoD mission-critical infrastructure. Sandia National Laboratories (SNL) and the Naval Facilities Engineering Command (NAVFAC) responded with a concept to address the operational need by bringing the best of breed tools to the DoD and named the initiative “MOSAICS,” or More Situational Awareness for Industrial Control Systems. The MOSAICS capability concept was to automate selected procedures to detect, mitigate and recover from a cyberattack, combined with the best of breed technologies related to analytics, visualization, decision support, and information sharing [1].
System studies identified three initial MOSAICS capabilities: 1) an operational capability to enable defense of control systems; 2) an Industrial Control Systems (ICS) baselining tool for Programmable Logic Controller (PLC) sensors; and 3) tailored visualizations, analytics, and automated cybersecurity orchestration for improved remediation strategies. Systems engineering principles were applied during the concept development phase to convert operational needs into an engineering-oriented view in several modes.
The MOSAICS development proposed a proof-of-concept prototype for an OT threat surface, which includes ICS and Supervisory Control and Data Acquisition (SCADA) systems to the subsystem component level of PLCs or Discrete Process Control Systems (DPCs). ICS is an operational segment within OT used to monitor and control industrial processes (e.g., power consumption on electrical grids). ICS is often managed by a SCADA system that provides Graphical User Interfaces (GUI) for operators (e.g., out-of-band operation alarm indicators). ICSs are typically either a continuous process control system managed by PLCs, or DPCs used as batch control devices.
The Department of Homeland Security (DHS) identifies 16 critical infrastructure sectors: chemical; commercial facilities; communications; critical manufacturing; dams; defense industrial base; emergency services; energy; financial services; food and agriculture; government facilities; healthcare and public health; information technology; nuclear reactors, materials, and waste; transportation; and water and wastewater systems [2]. The initial MOSAICS Joint Capability Technology Demonstration (JCTD) prototype development is for an energy system. MOSAICS will later be applied to water and other sectors. Machine Learning (ML) and Artificial Intelligence (AI) capabilities will be incorporated to minimize human actions where possible.
Systems engineering principles provide a mechanism to integrate cyber defense capabilities into context-sensitive critical infrastructure dynamic classes. Context-sensitive critical infrastructure dynamic classes are systems that are 1) interpreted from the view of the OT or IT operator, 2) interpreted in terms of the critical infrastructure sector, and 3) classified dynamically at the time of operation rather than assigned to a static set of classes.
The OT operator manages physical processes and machinery while the IT operator manages information flows of digital data. There is a substantial distinction between static and dynamic classes of critical infrastructure systems. Each critical infrastructure sector is dynamic, and within each sector, every cyber-physical system is dynamic. The potential risk introduced by the context-sensitive critical infrastructure dynamic classes must be addressed as early as possible and revisited throughout the systems engineering lifecycle. “As a system’s diversity, connectivity, interactivity, or adaptivity increases, the risk associated with using simpler methods and simplifying assumptions also increases, and more advanced techniques may be needed. Tools and techniques apply differently to systems on a spectrum of increasing complexity” (INCOSE, 2015) [3]. Unlike many applications in machine learning, where acute consideration of training data can lead to overfitting, in cybersecurity all training data generally must be taken seriously. This may be an important branching point for the future of ML/AI in cybersecurity compared to normal machine intelligence. An automated test harness is being explored to test the consistency of the MOSAICS system.
Software is a significant driver of system design. MOSAICS uses the expanding role of orchestration to implement the requirements, functionality, and behaviors of the system. Software trade-offs determine whether the right quality attributes are promoted in the design. Software constraints also limit the options available when making design decisions. An operating system is an example of such a constraint: requiring that a solution run on Windows or Linux influences the design of the system. Other examples would be the selection of an algorithm or a specific interface protocol. MOSAICS uses open Application Programming Interfaces (APIs) to address this constraint and to avoid vendor lock-in. APIs are sets of protocols used for building software applications that specify how components interact.
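The role of an open interface in avoiding vendor lock-in can be illustrated with a minimal Python sketch. Everything named here (`Detector`, `AllowlistDetector`, `WriteCommandDetector`, `run_detectors`, and the event fields) is hypothetical and chosen only for illustration; it is not the MOSAICS API. The point is that the orchestrating code depends only on the shared interface, so one vendor's component can be swapped for another's without changing it.

```python
from typing import Dict, Iterable, List, Optional, Protocol, Set


class Detector(Protocol):
    """Hypothetical open interface: any component with this method plugs in."""

    def inspect(self, event: Dict[str, str]) -> Optional[str]:
        """Return a finding, or None if the event looks unremarkable."""
        ...


class AllowlistDetector:
    """Illustrative component A: flags events from sources not on a list."""

    def __init__(self, allowed_sources: Set[str]) -> None:
        self.allowed_sources = allowed_sources

    def inspect(self, event: Dict[str, str]) -> Optional[str]:
        source = event.get("source")
        if source not in self.allowed_sources:
            return f"unexpected source: {source}"
        return None


class WriteCommandDetector:
    """Illustrative component B: flags any write command."""

    def inspect(self, event: Dict[str, str]) -> Optional[str]:
        if event.get("command") == "write":
            return "write command observed"
        return None


def run_detectors(detectors: Iterable[Detector],
                  event: Dict[str, str]) -> List[str]:
    """Orchestration step: relies only on the interface, not on any vendor."""
    findings = []
    for detector in detectors:
        finding = detector.inspect(event)
        if finding is not None:
            findings.append(finding)
    return findings
```

In this sketch, replacing `WriteCommandDetector` with a different product only requires that the replacement expose the same `inspect` method.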
Testing after each development spiral can reveal necessary corrections to the software code. The goal is to fail fast and fail early to avoid an expensive cycle of debugging code later. Tests that should be performed include functionality testing to ensure that the software does not crash; code review to uncover any problems; static code analysis; unit testing to make sure each unit works as expected across a range of both valid and invalid inputs; and user performance testing in a real-world environment [4].
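Unit testing over both valid and invalid inputs can be sketched in Python as follows. The function `parse_setpoint` and its 0 to 100 range are assumptions made up for this example, not part of MOSAICS; the sketch only shows the pattern of exercising a unit with inputs it should accept and inputs it should reject.

```python
import unittest


def parse_setpoint(raw, low=0.0, high=100.0):
    """Hypothetical unit under test: convert operator input to a setpoint.

    Raises ValueError for non-numeric, NaN, infinite, or out-of-range input.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"not a number: {raw!r}") from None
    # NaN is the only float that is not equal to itself.
    if value != value or not (low <= value <= high):
        raise ValueError(f"setpoint out of range: {raw!r}")
    return value


class ParseSetpointTest(unittest.TestCase):
    def test_valid_inputs(self):
        # Boundary values are included on purpose.
        self.assertEqual(parse_setpoint("0"), 0.0)
        self.assertEqual(parse_setpoint("42.5"), 42.5)
        self.assertEqual(parse_setpoint("100"), 100.0)

    def test_invalid_inputs(self):
        for bad in ["", "abc", "-1", "100.1", "nan", "inf", None]:
            with self.subTest(bad=bad):
                with self.assertRaises(ValueError):
                    parse_setpoint(bad)
```

Saved to a file, the tests can be run with `python -m unittest <file>`. A failure here surfaces during the spiral that introduced it, which is the fail-fast behavior described above.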
The reasons complex system developments incur risks include incomplete specifications until late in the development lifecycle, unclear requirements definitions, unaddressed risks, and a lack of required expertise or inadequate expertise in the new technology. At the time of this publication, MOSAICS is scheduled for a test during Trident Warrior 2020, an annual large-scale Navy field experiment. The Trident Warrior experiment series selects and evaluates initiatives to address capacity gaps in an operational environment. During the advanced development phase, uncertainties are resolved. Small sets of requirements are developed using spiral development, allowing for incremental releases and refinement through each iteration. The principal purpose of this approach is to reduce risk. This phase is especially critical as MOSAICS concepts significantly depart from traditional OT system security approaches. Requirements analysis reexamines the validity of the functional specifications and identifies components that require further development.
Many new complex system developments incur significant risks because they choose immature technology. In these cases there are often insufficient laboratory tests to measure the performance parameters needed to make analytical performance predictions. MOSAICS buys down risk in a laboratory by using more mature Commercial Off-the-Shelf (COTS) technology. Selection and use of COTS technology helps drive innovation and competition in the commercial sector, as it offers opportunity not only for initial implementation but also for follow-on work, since the government is not in the business of lifecycle product support. This COTS approach further lowers risk because field deployment of these products has already fed design flaws back into version-based incremental improvements over time. COTS is essential to speeding both innovation and incremental improvements to the warfighter. Technology Readiness Level (TRL) is a standard for evaluating the maturity of a technology to determine if it is a usable choice for complex system development. TRL 1 is the lowest level of maturity, and TRL 9 is the highest. Utilizing COTS, MOSAICS is TRL 7 and higher.
Extended technology refresh cycles of 20 to 30 years or more in critical infrastructure systems undermine the ability to address vulnerabilities. Because of the extended refresh cycle, MOSAICS enhancements will have to meet requirements well beyond those expected of similar IT systems, as there is no predecessor system for OT. The extended refresh cycle of critical infrastructure systems frequently results in the use of older technologies designed for functionality requirements rather than cybersecurity requirements. The convergence of OT and IT makes cyber-physical systems equally susceptible to cyber attacks, yet OT is contextually and dynamically distinct from IT. Components that use new technology can be attractive options for consideration of new system development to meet performance requirements for many years beyond the original design. Component expertise has varying contextual technological approaches to respective domains and varying behavioral approaches and responses, particularly to Human-Introduced Cyber Vulnerabilities (HICV) [5].
The resilience of these systems becomes a potentially valuable metric for this diverse group of systems that may be used to complement risk frameworks such as the DoD risk-based “Cybersecurity Framework,” designed for IT systems. Quantitative assessment of the resilience of networked cyber-physical systems might be based on critical functionality evaluated over a control time Tc (the time over which system performance is evaluated), derived from the operational input [6]. Complex adaptive systems are a challenge to discuss without a model. While a particular model may represent conditions within one system, variables and user states may carry different meanings from one system of systems to another. Several candidate approaches are used to address complexity, such as seeking to understand the big picture, observing how elements within the system change, identifying the relationship between system structure and system behavior, and understanding test assumptions. The recommended solution architecture is designed to “provide robustness and timely recovery to a minimally functional state” (INCOSE, 2015).
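One way to make the resilience measurement above concrete is sketched below in Python. It assumes a formulation that appears in the resilience literature: the ratio of the area under the observed critical-functionality curve to the area under the nominal curve over the control time Tc. This formulation, the function names, and the sample values are assumptions for illustration and are not necessarily the metric used in MOSAICS.

```python
from typing import Sequence


def trapezoid_area(times: Sequence[float], values: Sequence[float]) -> float:
    """Area under a sampled curve using the trapezoidal rule."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


def resilience(times: Sequence[float],
               observed: Sequence[float],
               nominal: Sequence[float]) -> float:
    """Observed critical functionality relative to nominal over [0, Tc].

    The last entry of `times` plays the role of the control time Tc.
    A value of 1.0 means no functionality was lost during the window.
    """
    nominal_area = trapezoid_area(times, nominal)
    if nominal_area <= 0:
        raise ValueError("nominal functionality must have positive area")
    return trapezoid_area(times, observed) / nominal_area


# Illustrative data only: functionality drops to 50% and then recovers.
times = [0.0, 1.0, 2.0, 3.0, 4.0]          # Tc = 4.0 (arbitrary units)
nominal = [1.0, 1.0, 1.0, 1.0, 1.0]
observed = [1.0, 0.5, 0.5, 1.0, 1.0]
print(resilience(times, observed, nominal))  # 0.75
```

Because the result depends on the chosen Tc, the same disruption scores differently over a short window than over a long one, which is why the control time must come from the operational input.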
An example of the “design for resilience” principle may be found in the Integrated Adaptive Cyber Defense (IACD) component of MOSAICS [7]. IACD is an extensible, adaptive framework to improve the effectiveness of the system defenses. While the framework was created to address IT environments, it is being applied to an OT environment using systems engineering principles. The assumption is that if the approach were applicable for IT complexity, then the same approach would also apply to OT complexity. This application of systems engineering reuse is a benefit in installing, maintaining, and upgrading the system throughout the lifecycle of the system.
Unknown unknowns can be expected to appear during engineering design. One way to estimate the number and scope of unknown unknowns is to thoroughly examine a given fraction p of the code base for them, and then scale the number found by 1/p. Potential deficiencies are addressed in MOSAICS by employing experienced designers and testers “in combination with disciplined software design procedures” (Kossiakoff, 2011) [8]. This approach is relevant to hardware as well. Potential unknown unknowns lurk in untrusted components, can come from insider threats, and may result from externally introduced malware that can penetrate OT previously considered to be “air-gapped” in an increasingly networked computer world. Detection may be difficult, hence the need for experienced Red-Team testing, which has proven to be a critical part of security testing and evaluation. Viruses, worms, and spyware may be embedded in a system before the implementation of a defensive solution. A challenge is understanding what “normal” or “known good” would look like in the absence of a virus when a virus may already be present. Solutions today can detect only what is known, that is, known malware. Modeling and simulation technologies are prohibitively expensive for one-off (or “snowflake”) systems. Without the demonstration of a “smoking gun” (i.e., existing malware in the system), few system owners will accept the high cost of new development. Since there are few rules or signatures in cyberattacks on critical infrastructure systems, assigned personnel must have both cybersecurity and OT knowledge. Engineering design (requirements analysis, functional analysis and design, component design, and design validation) culminates in the realization of a final MOSAICS design. At this point, all the modular components have to fit together to meet the operational requirements.
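The sampling estimate described at the start of this section, examining a fraction p of the code base and scaling by 1/p, can be written as a short Python sketch. The function name and the numbers are illustrative assumptions. The estimate also assumes that defects are spread roughly evenly across the code base and that the sampled portion is representative, which may not hold in practice.

```python
def estimate_total_unknowns(found_in_sample: int,
                            sampled_fraction: float) -> float:
    """Scale the count found in a sample of the code base by 1/p.

    `sampled_fraction` is p expressed as a fraction in (0, 1],
    for example 0.25 when a quarter of the code base was examined.
    """
    if found_in_sample < 0:
        raise ValueError("found_in_sample must be non-negative")
    if not 0.0 < sampled_fraction <= 1.0:
        raise ValueError("sampled_fraction must be in (0, 1]")
    return found_in_sample / sampled_fraction


# Illustrative numbers only: 5 issues found after examining 25% of the code.
print(estimate_total_unknowns(5, 0.25))  # 20.0
```

In this illustration, roughly 15 issues would be expected to remain in the unexamined 75% of the code.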
Part 1 of this article has described the engineering of the MOSAICS JCTD prototype pilot. In Part 2 the authors will describe the MOSAICS development of this initial cyber defensive capability for industrial control systems.
References
- Aleksandra Scalco, M. J., Steve Simske (2019). “More Situational Awareness for Industrial Control Systems (MOSAICS) Joint Capability Technology Demonstration (JCTD): A Concept Development for the Defense of Mission Critical Infrastructure.” Homeland Defense & Security Information Analysis Center.
- Department of Homeland Security, Cybersecurity and Infrastructure Security Agency (CISA) (2019). “Critical Infrastructure Sectors.” Retrieved December 6, 2019, from https://www.dhs.gov/cisa/critical-infrastructure-sectors.
- INCOSE, 17.
- Steve Simske. (2019). ENGR 501 Guest Lecture. Colorado State University (CSU).
- Terry Merz, C. F., Aleksandra Scalco (2019). “A Context-Centered Research Approach to Phishing and Operational Technology in Industrial Control Systems.” The Journal of Information Warfare (JIW).
- Zachary A. Collier, M. P., Alexander A. Ganin, Alex Kott, Igor Linkov (2016). Security Metrics in Industrial Control Systems.
- JHU APL, (2019). “Integrated Adaptive Cyber Defense (IACD).” Integrated Adaptive Cyber Defense (IACD). Retrieved December 9, 2019.
- Alexander Kossiakoff, W. N. S., Samuel J. Seymour, and Steven M. Biemer (2011). Systems Engineering Principles and Practice, John Wiley & Sons, Inc. Publication.