Recently, I was talking with my daughter, the engineer, about testing. She is the lead engineer for payload integration and test for a large NASA space telescope. Our discussion got me to thinking about cyber testing and test metrics. From her space telescope perspective, it is very expensive to conduct tests, with some tests requiring build-out of very large and expensive facilities to simulate the effects of space. At the same time, what to test and how to test are critically important since you have no good options to correct mistakes once you put the hardware in production millions of miles away. She also mentioned how they design and test the systems to be resilient in the face of failure – with back-up systems to back-up systems. This reminded me of the plight, in many cases, of operational technology (OT) whose asset owners have severe restrictions on updating systems and they set a very high bar on maintaining uptime. But do we really design and test OT systems to be cyber resilient? What are the cyber properties of a system we really care about and try to measure? Can we determine if a software component, or a new version, or a system is more resilient than another?

In Presidential Policy Directive 21: Critical Infrastructure Security and Resilience (PPD-21), cyber resilience was defined as the “ability to prepare for and adapt to changing conditions and withstand and recover rapidly from disruptions.” Being resilient therefore means the enterprise can sense and respond to change as it is happening, not after it has happened. To this end, cyber-resilient organizations should use predictive measures, or threat intelligence, to inform defenses. These are measures from threat intelligence providers of the likelihood that a type of attack will exploit a type of vulnerability, in a type of software component and configuration combination, by a type of attacker, in order to acquire a capability on your network, gain control over a specific asset, or disrupt a particular operation on the network.

For services that are mission-essential, or that require high or uninterrupted availability, cyber resiliency should be built into the design of systems that provide or support those services. Cyber resiliency is particularly important for a subset of critical infrastructures known as lifeline sectors or strategic infrastructures. A 2015 NIAC Report “identified [….] five sectors or subsectors to be core members of the [Strategic Infrastructure Executive] Council because of their centrality to the resilience of most of the other sectors and their national security implications when disrupted:”

  • Electricity
  • Water
  • Transportation
  • Communications
  • Financial Services

Ideally, each strategic sector or sub-sector would be fully resilient, but not every IT and OT system within them will be, because making them so would be cost-prohibitive.

Before we go on in our discussion of measuring cyber resiliency properties of a system, we need to clearly separate what is a “metric” and what is a “measure.” The term metric is often used to refer to the measurement of performance, but it is clearer to define metrics and measures separately. A measure is a concrete, objective attribute, such as the percentage of systems within an organization that are fully patched, the length of time between the release of a patch and its installation on a system, or the level of access to a system that a vulnerability in the system could provide. A metric is an abstract, somewhat subjective attribute, such as how well an organization’s systems are secured against external threats or how effective the organization’s incident response team is. An analyst can approximate the value of a metric by collecting and analyzing groups of measures.
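The distinction between a metric and a measure can be made concrete in code. The sketch below is purely illustrative – the metric name, weights, and threshold are invented assumptions, not part of any standard – but it shows how an analyst might approximate an abstract metric from a group of concrete measures:

```python
# Hypothetical sketch: approximating an abstract metric ("patch hygiene")
# from two concrete measures described in the text. All names, weights,
# and thresholds here are illustrative assumptions.

def patch_hygiene_metric(pct_fully_patched, avg_days_to_patch, max_acceptable_days=30):
    """Combine two measures into a rough 0-100 metric.

    pct_fully_patched   -- measure: percentage of systems fully patched
    avg_days_to_patch   -- measure: mean days between patch release and install
    max_acceptable_days -- assumption: patch latency that scores zero
    """
    timeliness = max(0.0, 1.0 - avg_days_to_patch / max_acceptable_days)
    coverage = pct_fully_patched / 100.0
    # Equal weighting of the two measures is itself a subjective choice,
    # which is exactly why the result is a metric rather than a measure.
    return round(100 * (0.5 * coverage + 0.5 * timeliness), 1)

print(patch_hygiene_metric(pct_fully_patched=85, avg_days_to_patch=12))
```

The subjectivity lives in the weights and thresholds; the inputs remain objective measures that can be audited independently.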

There are many examples of cyber metrics, such as risk score, attack surface, criticality of assets, cybersecurity management maturity level, trustworthiness, PCI/HIPAA/GDPR compliance, and even the “genomic” history of malware. Some cyber measures include indicators of compromise, indicators of behavior, vulnerability density per endpoint, the number and severity of unpatched vulnerabilities, incident response time, and the number of packets that did not reach their destination or the number (or percentage) of legitimate packets discarded by the defense system. Another important measure is dwell time, the amount of time a threat actor has undetected access within a network before being completely removed. This is relevant because the longer it takes a company to contain an attack, the more resilient it must be (and the more the attack will cost).
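Dwell time is one of the simplest measures to compute once forensics has established the timeline. A minimal sketch, with made-up timestamps:

```python
from datetime import datetime

# Illustrative dwell-time calculation: elapsed time between the attacker's
# first access (established after the fact via forensics) and their
# complete removal from the network. Dates are invented for the example.

first_access = datetime(2021, 3, 1, 8, 0)     # initial compromise
fully_removed = datetime(2021, 5, 15, 17, 30)  # eradication confirmed

dwell = fully_removed - first_access
print(dwell.days)  # dwell time in whole days
```

The hard part, of course, is not the subtraction but establishing `first_access` accurately, which is why dwell time is usually only knowable in hindsight.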

Each year we also have several reports that evaluate the previous year’s attack trends and forecast trends for the coming year. Our cyber tools also provide dashboards where we can drill down on a specific measure of a specific artifact related to a specific cyber occurrence at a particular location of a particular system, or roll these specific occurrences up into trends on how well our defenses are doing. But do a risk score and these other metrics provide good indicators of how resilient my enterprise is to cyber attacks? Can I take a body blow and keep on fighting? Can I operate and still prosecute my mission in a contested, degraded cyber battlespace? Can I recover from a ransomware attack? How long can I operate my business when I am the target of a distributed denial-of-service attack or a pandemic? How can I measure my cyber resiliency, and what metric reflects “good” cyber resiliency?

One methodology for measuring cyber resiliency has been developed by MITRE. Called the Situated Scoring Methodology for Cyber Resiliency (SSM-CR), this tailorable scoring methodology is intended to provide system managers with a simple relative measure of how cyber resilient a given system is, and of whether and how much different alternatives change that measure. SSM-CR is situated, or context-adjusted, in two ways. First, it reflects stakeholder priorities (i.e., which objectives, sub-objectives, and capabilities are important). Second, performance assessments (i.e., how well prioritized capabilities are provided or how well prioritized activities are actually performed) are made with respect to stated assumptions about the operational and threat environments. An underlying threat model is also an essential input to a cyber resiliency assessment using this scoring method. MITRE has also published the Cyber Resiliency Engineering Aid – The Updated Cyber Resiliency Engineering Framework and Guidance on Applying Cyber Resiliency Techniques, which describes potential interactions (e.g., dependencies, synergies, conflicts) between cyber resiliency techniques, depending on the implementation approach. It also identifies potential effects that implementations of cyber resiliency techniques could have on adversary activities throughout different stages of the cyber attack lifecycle. Finally, it includes provisional information on relative maturity and ease of adoption for representative approaches to implementing cyber resiliency techniques.
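The full SSM-CR mechanics are defined in MITRE’s publication, but the flavor of “situated” scoring – stakeholder priorities weighting performance assessments – can be sketched as a priority-weighted average. The objectives, priority values, and assessments below are invented for illustration and are not MITRE’s actual worksheets:

```python
# Rough sketch of priority-weighted scoring in the spirit of SSM-CR.
# The real methodology is MITRE's; the objective names, priorities,
# and performance assessments below are made up for illustration.

objectives = {
    # objective: (stakeholder priority 0-1, assessed performance 0-1)
    "Prevent/Avoid": (0.8, 0.6),
    "Continue":      (1.0, 0.7),
    "Reconstitute":  (0.9, 0.5),
    "Understand":    (0.5, 0.8),
}

# Weight each performance assessment by how much stakeholders care,
# then normalize so the score is comparable across alternatives.
weighted = sum(p * perf for p, perf in objectives.values())
total_priority = sum(p for p, _ in objectives.values())
score = round(100 * weighted / total_priority, 1)
print(score)
```

Because the score is relative, its main use is comparing alternatives (e.g., re-scoring after a proposed design change) rather than as an absolute verdict.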

NIST has published NIST SP 800-160 vol 2 (final) – Developing Cyber Resilient Systems: A Systems Security Engineering Approach – which provides design criteria and measures for verifying and validating the cyber resilience of a system using system engineering approaches. You can find out more about this approach in my interview with NIST’s Ron Ross – the lead author for this publication.

There are other sources from NIST that provide methods for cyber testing and measurement as well.

Cyber resilience is also being measured at the human level as described in a study by the National Institutes of Health. The Susceptibility and Resilience to Cyber Threat (SRCT) is an immersive scenario decision program which measures susceptibility to cyber threat and malicious behavior as well as protective resilience actions via participant responses/decisions to emails, interactions with security dialogs, and computer actions in a real-world simulation. Data was collected from a sample of 190 adults (76.3% female), between the ages of 18–61 (mean age = 26.12). Personality, behavioral tendencies, and cognitive preferences were measured with previously validated protocols and self-report measures.

Organizations are using multiple technologies to identify and track security measures and compile metrics, including security analytics platforms; threat intelligence systems, vulnerability management tools; governance, risk, and compliance platforms; and vendor risk management platforms. But multiple tools can compound the issues around measurement and reporting, creating different versions of the truth and uncertainty regarding the true cyber resiliency posture of an organization.

So what makes a measure useful in aiding our desire to determine a rubric for the cyber resiliency of our enterprise? According to NIST, there are several practical principles we can use to make measures more focused, accurate, and useful. The first principle is that you can’t reliably measure what you haven’t clearly defined. So as a first step, it is imperative to determine what outcomes or goals you are trying to achieve and why. From a cyber resilience perspective, some key goals according to NIST include:

  • Anticipate – maintain a state of informed preparedness for adversity;
  • Withstand – continue essential mission or business functions despite adversity;
  • Recover – restore mission or business functions during and after adversity;
  • Adapt – modify mission or business functions and/or supporting capabilities in response to predicted changes in the technical, operational, or threat environments.

The “why” behind a measurement should reflect a benefit significant enough to offset the cost of measuring it. The second principle is that “measures must yield quantifiable information (percentages, averages, and numbers).” Measures need to be quantifiable so that comparisons can be made to track performance and so that resources can be directed.

The third principle is that “data that supports the measures needs to be readily obtainable” – i.e., can the measure actually be collected and verified given the tools and methods at your disposal? It also means knowing the accuracy, timeliness, and limitations of the measurements. A good illustration is patching time on our servers. We need to know the percentage of servers that aren’t covered by our scanner. After all, “90% of server vulnerabilities fixed within the service-level agreement” becomes decidedly less impressive if we know that only 50% of servers are being scanned.

This third principle is closely aligned with the fourth: “only repeatable processes should be considered for measurement.” The fourth principle reflects the notion that if there are too many variables in what you are measuring, you really can’t trust the measurements you get. The third and fourth principles also mean that you must have confidence in the measurement methods used and that you can actually make an impact on the activity or resources you are measuring, which relates to the fifth principle: make sure you know what to do when you get an answer to the original “why” question – i.e., what does the resulting measure compel you to do, and do you have the resources to take that action? Essentially, the purpose of measuring is to reduce uncertainty regarding a belief you have (e.g., I am / am not being attacked) and/or to drive a workflow (e.g., a threshold has been reached that triggers an action).
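The scanner-coverage caveat above is worth putting in arithmetic form, since it is an easy trap in patching dashboards. Using the numbers from the example in the text:

```python
# The scanner-coverage caveat, in arithmetic form. Numbers are from the
# example in the text: 90% of *detected* server vulnerabilities are fixed
# within the SLA, but only 50% of servers are scanned at all.

sla_fix_rate = 0.90      # share of detected vulnerabilities fixed within SLA
scanner_coverage = 0.50  # share of servers actually scanned

# Upper bound on the share of the whole estate demonstrably handled
# within SLA -- the unscanned half contributes nothing we can verify.
effective_rate = sla_fix_rate * scanner_coverage
print(f"{effective_rate:.0%}")
```

The headline “90%” silently assumes full coverage; the defensible claim is 45% at best, with the remainder simply unknown.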

So what cyber resiliency properties should I measure, and how do I know I am “resilient enough?” An effective approach is to align metrics to industry-accepted frameworks such as the guidance listed above. Aligning to a framework gives an indication of how well a metrics program covers the breadth of cyber resiliency measures contained in the framework and whether there are any gaps that need filling. Frameworks can provide a familiar structure for a metrics program and naturally provide higher levels at which you can summarize analysis and present an effective overview for business stakeholders. For example, the NIST Cyber Resiliency Framework covered in NIST SP 800-160, Vol. 2 defines implementation approaches and techniques, as depicted in the following figure, for engineering cyber resiliency into the design of a system. The concept here is that having multiple, cohesive resiliency techniques incorporated appropriately in the design of a system will improve its overall cyber resiliency.

A proprietary framework is Lockheed Martin’s Cyber Resiliency Level (CRL™). The CRL employs common risk- and engineering-based approaches and leverages common assessment tools – such as the National Institute of Standards and Technology’s (NIST) Risk Management Process and Risk Management Framework, Lockheed Martin’s threat-driven methodologies, and the Department of Defense’s Cyber Table Top (CTT) Guidebook – to produce a resiliency level score similar to a maturity model like the DHS Cybersecurity Capability Maturity Model Version 2.0.

So once you have figured out what to measure for cyber resiliency, the next question is how to measure it. There are many different approaches you can use to measure cyber resiliency properties. You can test in your own lab or in an independent evaluation lab, such as a FedRAMP or Common Criteria lab. Government system owners can also choose from a variety of cyber ranges, like the Cyber Battle Lab and Cybertropolis, or a synthetic environment. Or, if you want to build your own cyber range, you can go to this site at the Software Engineering Institute and download some open source tools.

One example of a commercial cyber range is Cloud Range. Cloud Range can replicate your network infrastructure, offering multi-segment virtual replicas of enterprise IT and OT networks that include application servers, database servers, email servers, switches, routers and PLCs. The range environment can be tailored to include the exact cybersecurity tools your team uses every day, including SIEM, Firewall, endpoint security, or forensic tools, to achieve realistic evaluations and training. Cloud Range includes an ICS (Industrial Control System) security package for critical infrastructure organizations, which emulates an industrial control network and the most recent ICS/OT attack scenarios.

Underwriters Lab also offers cyber assessments on products. The UL Cybersecurity Assurance Program (UL CAP) aims to minimize risks by creating standardized, testable criteria for assessing software vulnerabilities and weaknesses in IoT products and systems. This helps reduce exploitation, address known malware, enhance security controls, and expand security awareness. This certification service allows companies to build trust, credibility and value across industries and ecosystems such as smart buildings, medical devices, industrial control systems, critical infrastructure and smart homes. Based on the UL 2900 Series of Standards and other industry standards, UL CAP’s full suite of advisory, testing and certification services is designed to help organizations manage their cybersecurity risks and validate their cybersecurity capabilities to the marketplace.

DHS also provides a Cyber Resilience Review (CRR) that enables a comparative assessment for cyber resiliency. The CRR is a no-cost, voluntary, non-technical assessment to evaluate an organization’s operational resilience and cybersecurity practices. The CRR assesses enterprise programs and practices across a range of ten domains including risk management, incident management, service continuity, and others. The assessment is designed to measure existing organizational resilience as well as provide a gap analysis for improvement based on recognized best practices.

As another approach to measuring resiliency, you can perform a simulation of the system under test or some of its critical components by applying a tool like Scalable Network’s EXata Cyber or EXata CPS to evaluate a network design or OT system for resiliency to cyber attack. More about Scalable’s offerings can be found in this interview with Active Cyber™ here.

Measuring cyber resiliency of a system should lead to further adaptation of the system to meet evolving needs. For example, OT systems, since their computation interacts with the physical world, must explicitly deal with events distributed in space and time. Therefore, timing and spatial information, along with other physical and logical properties such as physical laws, safety, or power constraints, resources, and security characteristics should be captured in a composable manner in the design of cyber-resilient systems. Such design abstractions, however, may necessitate a dramatic rethinking of the traditional split between programming languages and operating systems, as well as the allocation of cyber resilient properties at the software/hardware level given performance, flexibility, and power tradeoffs.

Also, to tolerate intermittent failures, we will likely need to apply AI/ML algorithms that do not rely on the accuracy of a single computation. Resiliency of operation, given the potential for imprecise computations, will gain more relevance; developing algorithms using those principles will be extremely valuable on the road to resilient systems. However, it is important to remember that OT systems are real-time systems, and the complexity of verifying temporal logic specifications is exponential. That is, like its physical counterpart, a perfect software component is rare and will remain that way. This has profound implications. We need to invent a resilient OT system architecture in which the safety- and security-critical services of large and complex systems can be guaranteed by a small subset of resilient modules and their interactions; the design of this subset will have to be formally specified and verified. Resilient system models have to incorporate fault models and recovery policies that reflect the scale, lifetime, distributed control, and replaceability/reparability of components. Safety requires attention in a larger context as well.

Finally, OT systems are deeply embedded, and their resiliency will evolve in place. The verification and validation of an OT system is not a one-time event; it should be a life-cycle process that produces an explicit body of evidence for the certification of critical services. Large OT systems will have many different resiliency properties, encompassing stability, robustness, schedulability, and security, each of which will employ a different set of protocols and will need to be analyzed using a different theory.

The resiliency of an OT system is highly dependent on the resiliency of the supply chain of components – both hardware and software – that go into it. The criticality of this dependency has become quite evident in these pandemic times. On the software supply chain side, most software includes other software. Software changes and evolves over time due to optimization, new features, security fixes, and so forth. As a result, software producers throughout the supply chain have to continually evaluate how changes might impact their software. This includes changes to 3rd-party components used to compose software. Because of the complex web of dependencies in software supply chains, any change can have far-reaching effects. How can organizations make confident, informed decisions? How can they manage the complexity of their software supply chain in a sustainable manner?

Recently, there has been some standards action, known as the Software Bill of Materials (SBOM) led by NTIA to get back some control and transparency on what is inside a software package. Automated tracking of the provenance of software components will improve the trust and security of software by helping identify potentially vulnerable components (even prior to purchase).  Organizations can also use a SBOM to decide what code components might raise red flags (e.g., a cryptographic library that offers substandard protection) and which components might have already been vetted by a trusted source. If the SBOM includes signature hashes of the components, an organization can verify the sourcing of third-party components to limit the risk that counterfeit or backdoored components slipped into the supply chain of the supplier. An organization can be made aware of end-of-life components for which future support will not be available. While these components may not be a risk when the software is first acquired, any problem found later in the component will most likely not be fixed. An organization can also use the SBOM to track the evolution of the cyber resiliency of a software package over time by understanding the resiliency of each component. Resiliency scores could be assigned to each component by software producers and/or users based on tests and measurements using some of the criteria listed in this article.
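Verifying sourcing via SBOM signature hashes, as described above, amounts to recomputing a digest of each received component and comparing it against the value the SBOM asserts. The sketch below uses a simplified, made-up JSON shape for the SBOM rather than a specific SPDX or CycloneDX schema:

```python
import hashlib
import json

# Illustrative check of SBOM-listed component hashes against artifacts
# actually received. The SBOM layout is a simplified, hypothetical JSON
# shape, not a real SPDX/CycloneDX document.

def verify_components(sbom_json, artifact_bytes_by_name):
    """Return names of components whose SHA-256 doesn't match the SBOM."""
    mismatches = []
    for comp in json.loads(sbom_json)["components"]:
        data = artifact_bytes_by_name.get(comp["name"])
        digest = hashlib.sha256(data).hexdigest() if data is not None else None
        if digest != comp["sha256"]:
            mismatches.append(comp["name"])
    return mismatches

# Example: one genuine component and one whose bytes were tampered with.
lib = b"example library contents"
sbom = json.dumps({"components": [
    {"name": "libexample", "sha256": hashlib.sha256(lib).hexdigest()},
    {"name": "libother",   "sha256": "0" * 64},  # deliberately wrong
]})
print(verify_components(sbom, {"libexample": lib, "libother": b"tampered"}))
```

A mismatch flags a component for investigation – it may be counterfeit, backdoored, or simply a different build than the one the SBOM describes.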

The effort by NTIA is focused on developing the SBOM standard, which encompasses several other initiatives for software tracking, including SWID, SPDX, CycloneDX, Package URL (purl), Common Platform Enumeration (CPE), and others.

Next time I will be looking at some new methods to assess the “resiliency” and “trustworthiness” of OT systems.


And thanks to my subscribers and visitors to my site for checking out ActiveCyber.net! Please give us your feedback because we’d love to know some topics you’d like to hear about in the area of active cyber defenses, PQ cryptography, risk assessment and modeling, autonomous security, digital forensics, securing OT / IIoT and IoT systems, Augmented Reality, or other emerging technology topics. Also, email chrisdaly@activecyber.net if you’re interested in interviewing or advertising with us at Active Cyber™.