My most recent article discussed the first 5 of my top 10 recommended security capabilities for OT and IIoT systems. Here they are again for your reference.

Capability 1: Real-time visibility and compliance tracking of assets that may have limited function and power
Capability 2: Real-time anomaly detection including increased use of AI/ML technology and big data analytics
Capability 3: Strong, comprehensive authentication 
Capability 4: Trusted systems and trusted data
Capability 5: Threat-Informed defenses

In this article, I finish off the top 10 capabilities for securing OT and IIoT systems. So check out the article below and let me know your top 10. Or let me know if you have a product or capability that should be part of the top 10.


Capability 6: Cyber Hygiene Best Practices

The Fortinet 2019 OT Security Trends report of May 2019 noted … “structural problems are exacerbated by security hygiene practices within many OT environments that may be unintended due to digital transformation efforts.” These structural problems lead to increased risk due to gaps in protection and vulnerabilities in the OT and IIoT environments. Security frameworks are meant to provide guidelines for OT asset owners large and small to use to reduce their cyber security risks. Of the many guidance and standards documents that are out there, the NIST Cybersecurity Framework for Manufacturing, IEC 62443, the CERT Resilience Management Model (CERT-RMM), and the Center for Internet Security’s Controls ICS Companion Guide are a few guidance documents I recommend for getting started on good hygiene:

  • NIST Cybersecurity Framework Manufacturing Profile: The Cybersecurity Framework from the U.S. government’s National Institute of Standards and Technology (NIST) offers a roadmap for identifying opportunities to improve a manufacturer’s current cyber security posture and for evaluating its ability to manage risk in its OT environment. It also presents a standard approach for developing an ongoing cyber security plan. The chart below is an extract from the Profile showing a mapping between framework controls and activities for the RESPOND objective. In addition, the National Cybersecurity Center of Excellence (NCCoE), which is hosted by NIST, is embarking on a new project titled Detecting and Protecting Against Data Integrity Attacks in Industrial Control System Environments that is a practical application of the Framework’s Manufacturing Profile. The NCCoE project team will leverage the NIST Engineering Laboratory to provide a comprehensive approach that manufacturing organizations can use to address the challenge of protecting OT systems against data integrity attacks by leveraging the following cybersecurity capabilities: behavioral anomaly detection, security incident and event monitoring, OT application white-listing, malware detection and mitigation, change control management, user authentication and authorization, least-privilege access control, and file-integrity checking mechanisms. This project will result in a publicly available NIST Cybersecurity Practice Guide, a detailed implementation guide of the practical steps needed to implement the cybersecurity reference design that addresses this challenge.

  • IEC 62443: IEC 62443 – Security for Industrial Automation and Control Systems: Technical Security Requirements for IACS Components provides a framework for addressing and mitigating security vulnerabilities in OT systems. It outlines technical standards for the components used in industrial control systems, including embedded devices, network assets and software.
  • CERT-Resilience Management Model (RMM) and its resilience management methodologies help organizations consider resilience to be a foundational property of all policies, plans, processes, and procedures. CERT-RMM has more than 200 resilience management practices spread across 26 process areas, ranging from Asset Definition and Management, to External Dependencies Management, to Vulnerability Analysis and Resolution.
  • Center for Internet Security’s (CIS) Controls ICS Companion Guide provides guidance on how to apply the security best practices found in CIS Controls Version 7 to Industrial Control System environments. For each top-level CIS Control, there is a brief discussion of how to interpret and apply the CIS Control in such environments, along with any unique considerations or differences from common IT environments.

If you need technical help in setting up and assessing your cyber hygiene practices, you should also consider these services from CISA at DHS – National Cybersecurity Assessments and Technical Services (NCATS). These services are available at no cost to federal agencies, state and local governments, critical infrastructure, and private organizations generally. The offered services include:

  • Cyber Hygiene: Vulnerability Scanning: helps secure your internet-facing systems from weak configuration and known vulnerabilities, and encourages the adoption of modern security best practices. DHS performs regular network and vulnerability scans and delivers a weekly report for your action. Once initiated, this service is mostly automated and requires little direct interaction.
  • Phishing Campaign Assessment (PCA): measures your team’s propensity to click on email phishing lures. Phishing is commonly used as a means to breach an organization’s network. The assessment occurs over a 6-week period, and the results can be used to provide guidance for anti-phishing training and awareness.
  • Risk and Vulnerability Assessment (RVA): allows you to select from a menu of several network security services, including network mapping and vulnerability scanning, phishing engagements, web application or database evaluations, and a full penetration test. The assessment period differs by the number and type of services requested, but a typical RVA will take place over a two-week period.
  • Validated Architecture Design Review (VADR): evaluates your systems, networks, and security services to determine if they are designed, built, and operated in a reliable and resilient manner. VADRs are based on standards, guidelines, and best practices and are designed for Operational Technology (OT) and Information Technology (IT) environments. A VADR includes Architecture Design Review, System Configuration and Log Review, and Network Traffic Analysis.

For do-it-yourselfers and as a supplement to the NCATS services, I recommend employing the Cyber Security Evaluation Tool (CSET®). CSET provides a systematic, disciplined, and repeatable approach for evaluating an organization’s security posture. CSET is a desktop software tool that guides asset owners and operators through a step-by-step process to evaluate industrial control system (ICS) and information technology (IT) network security practices. Users can evaluate their own cybersecurity stance using many recognized government and industry standards and recommendations. The CSET download has moved to GitHub: https://github.com/cisagov/cset#cset-921.

Good hygiene also requires cyber threat awareness, so be sure to track alerts and advisories from ICS CERT and ICS ISAC.

DHS CISA is also looking to establish a vulnerability disclosure program (VDP) and has issued Binding Operational Directive 20-01 (draft) Develop and Publish a Vulnerability Disclosure Policy. More information about the directive can be found at the Assistant Director’s blog post. A binding operational directive (BOD) is a compulsory direction to federal, executive branch departments and agencies for purposes of safeguarding federal information and information systems. Federal agencies are required to comply with DHS-developed directives. According to the draft BOD, “By putting a vulnerability disclosure policy in place, agencies make it easier for the public to know where to send a report, what types of testing are authorized for which systems, and what communication to expect. When agencies integrate vulnerability reporting into their existing cybersecurity risk management activities, they can weigh and fix a wider array of concerns.” It is similar to a bug bounty program but there is no payment required in this VDP.

Reporting vulnerabilities is only half the battle. It can be tricky for security teams unfamiliar with the ins and outs of specific OT and IIoT devices to identify which vulnerabilities represent major problems and which don’t. If your teams don’t understand the context in which a device operates, it can lead to drastic steps such as unnecessarily isolating a seemingly vulnerable device from the network. Most devices are application-specific and have limited memory and computing power. They also rarely have the full OS loaded, so many of the usual security controls are not available for mitigation. It’s important for end users to develop a network security baseline specific to IIoT devices, rather than trying to fit the IIoT device into their current network security guidelines.

Patch management is also a cyber hygiene activity that often has different rules and processes than those typically followed for IT systems. OT systems often have “always on” requirements, which makes rebooting and patching non-viable strategies for many systems. Furthermore, the software that executes processes in many of these systems is often old and requires extensive analysis and testing to meet safety requirements; it cannot be easily changed because the “downtime” cost of implementing changes is prohibitive.

ISA-TR62443-2-3, Security for industrial automation and control systems Part 2-3: Patch management in the IACS environment, describes requirements for asset owners and industrial automation and control system (IACS, also known as OT) product suppliers that have established and are now maintaining an IACS patch management program. It is built on a strong underlying theme: applying patches is a risk management decision. It serves to stress the point that unpatched or otherwise out-of-date control systems can present risks that are just as serious as more traditional process safety issues. Given the geographically distributed nature of some OT and IIoT environments, software over-the-air (SOTA) updates are becoming a preferred method for distributing patches. For example, the introduction of SOTA updates in the automotive industry offers both the Original Equipment Manufacturer and the driver many advantages, such as cost savings through inexpensive over-the-air bug fixes. Furthermore, it enables enhancing the capabilities of future vehicles throughout their life-cycle. However, before making SOTA a reality for any kind of safety-critical OT or IIoT function, major challenges must be studied in depth and resolved: namely the related security risks and the required high level of system safety. The security concerns are primarily related to attack and manipulation threats against wirelessly connected, update-capable OT and IIoT systems. There are also concerns for all OT and IIoT systems due to the complexity of the update process. Many OT / IIoT environments are composed as a system of systems, so many questions remain about which system is to be patched, in what order to patch, and how to back out a patch. The functional safety requirements must be fulfilled despite the agility offered by software updates. In consideration of these concerns, most asset owners use SOTA today primarily as a means of transporting updates, with human intervention for applying the patch and little to no automated patching.
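The questions of patch order and back-out can be made concrete with a small sketch. It assumes, purely for illustration, that a system's dependencies should be patched before the system itself and that a failed patch triggers a back-out of everything applied so far, in reverse order. The system names, the DEPENDS_ON map, and the apply_patch / rollback_patch callbacks are all hypothetical; a real program would follow the asset owner's own risk analysis and keep a human in the loop.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical system of systems: each key depends on the systems in its set.
DEPENDS_ON = {
    "hmi": {"scada_server"},
    "scada_server": {"plc_gateway", "historian"},
    "historian": {"plc_gateway"},
    "plc_gateway": set(),
}


def patch_order(depends_on):
    """Order systems so that every dependency comes before its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


def apply_with_backout(order, apply_patch, rollback_patch):
    """Apply patches in order; on the first failure, back out in reverse order.

    apply_patch(system) returns True on success. rollback_patch(system) undoes
    a patch that was applied earlier. Returns True only if every patch applied.
    """
    applied = []
    for system in order:
        if apply_patch(system):
            applied.append(system)
            continue
        # Back out the most recently patched system first.
        for done in reversed(applied):
            rollback_patch(done)
        return False
    return True
```

With the map above, patch_order(DEPENDS_ON) places plc_gateway first and hmi last. Whether dependencies-first is the right policy for a given plant is itself a risk management decision.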

Configuration management is also becoming a bigger cyber hygiene concern due to the proliferation and greater capacity of IIoT devices / sensors, sensitivity over supply chain weaknesses, and the growing prevalence of baseline updates in OT system environments. There are many aspects to proper configuration management. Among the most critical capabilities are purchasing devices that are secure by default (no unnecessary services or unneeded functionality, no default passwords, etc.), hardening systems prior to deployment (see CIS benchmarks for details), maintaining a secure supply chain, being able to automatically detect drift from prescribed baselines, and building security into the development process. Each of these concerns will be a topic of future articles and interviews here on ActiveCyber.net.
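As a rough illustration of automated drift detection, the sketch below compares an observed device configuration against its prescribed baseline and reports every setting that differs, including settings that appear on the device but not in the baseline. The setting names and values are made up for the example; real baselines would come from your hardening guides and the observed values from a scan or agent.

```python
MISSING = "<absent>"  # marker for a setting present on only one side

# Hypothetical baseline and observed configuration for one IIoT device.
BASELINE = {"telnet_enabled": False, "default_password_set": False, "firmware": "2.4.1"}
OBSERVED = {
    "telnet_enabled": True,
    "default_password_set": False,
    "firmware": "2.4.1",
    "ftp_enabled": True,
}


def detect_drift(baseline, observed):
    """Return {setting: (expected, actual)} for every setting that differs."""
    drift = {}
    for key in sorted(baseline.keys() | observed.keys()):
        expected = baseline.get(key, MISSING)
        actual = observed.get(key, MISSING)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


if __name__ == "__main__":
    for setting, (expected, actual) in detect_drift(BASELINE, OBSERVED).items():
        print(f"DRIFT {setting}: expected {expected!r}, found {actual!r}")
```

Here the report would flag the re-enabled telnet service and the unexpected FTP service, while the unchanged firmware version and password setting pass silently.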

As the risk to OT / IIoT systems rises, developing and testing a recovery plan/escalation process become critical cyber hygiene activities. A recent Ponemon/Siemens survey report, Assessing Operational Readiness Of The Global Utilities Sector – Caught in the Crosshairs: Are utilities keeping up with the industrial cyber threat?, noted many poor hygiene factors that contribute to higher risk for OT systems. The report noted that 35% of the respondents have no response plan for cyber attacks. Respondents also reported that, on average, responses to malware attacks took 72 days after an outage. The slow response times and lack of preparation indicate major opportunities to improve preparedness. A well-documented system architecture that outlines assets, users, dependencies, controls, data types and flows is critical to understanding business impacts due to an incident, while also prioritizing recovery objectives and helping to perform risk analyses of the system environment.

Capability 7: Distributed and Assured Boundary Defense – Zero Trust Architecture

IIoT-based application architectures are being designed for inherent scalability, agility of deployment (distributed, centralized, hybrid), and availability of devices. At the same time, the characteristics of IIoT-based applications bring with them modified/enhanced security requirements. A few examples of these characteristics and their security impacts are:

  • the sheer number of IIoT devices commonly found in deployments results in more interconnections and more types of communication links to be protected,
  • devices can come and go dynamically; however, the “always up” mandate for operating in critical infrastructure environments demands secure service discovery and recovery capabilities,
  • there is no concept of a network perimeter so all devices must be treated as non-trustworthy,
  • the fine-grained functionality of each device requires fine-grained authorizations; however, this may require security policies to be centrally defined and the configurations reflecting them to be defined in each device to enable consistent enforcement across the environment.

Implementing an IIoT infrastructure inherently expands organizations’ attack surfaces with new attack vectors. As shown in the end-to-end chart below, with IIoT, there are multiple points of possible compromise – on the sensor itself, during pre-processing at the edge, as data is in transit, as data is stored and manipulated by middleware, as data is being analyzed within an application, and so on. Security requirements and tools must address IIoT security challenges from end to end.

For example, IIoT devices with actuators have the ability to make changes to physical systems and thus affect the physical world. The potential impact of this needs to be explicitly recognized and addressed from cybersecurity and privacy perspectives. In a worst-case scenario, a compromise could allow an attacker to use an IIoT device to endanger human safety, damage or destroy equipment and facilities, or cause major operational disruptions.

Additionally, every IIoT device has at least one enabled network interface capability and may have more than one. IIoT network interfaces often enable remote access to physical systems that previously could only be accessed locally. Manufacturers, vendors, and other third parties may be able to use remote access to IIoT devices for management, monitoring, maintenance, and troubleshooting purposes. This may put the physical systems accessible through the IIoT devices at much greater risk of compromise.

A zero trust architecture provides the bedrock for cyber defenses at each interface boundary of an IIoT or OT environment.  A zero trust architecture is about avoiding assumed trust and making explicit policies about how systems and users communicate. It entails network segmentation utilizing firewalls, command registers, and data diodes along with defense in depth principles. Network boundaries are monitored for intrusion and all interfaces are authenticated prior to allowing a user or device onto the network. The boundary between the OT and IT environment must also apply zero trust principles. Many attackers are looking to find paths into IT environments via the security gaps created by the implicit trust relationships often present between OT and IT environments.
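To illustrate what "avoiding assumed trust and making explicit policies" can look like in practice, here is a minimal default-deny sketch. A flow is permitted only if the endpoint has authenticated and the exact source, destination, and protocol combination appears in an explicit allow list. The zone and device names and the rule set are hypothetical, and a real deployment would enforce this in firewalls or gateways rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    protocol: str


# Hypothetical explicit allow rules; anything not listed here is denied.
ALLOWED_FLOWS = frozenset({
    Flow("historian", "it_reporting", "https"),
    Flow("engineering_ws", "plc_01", "modbus"),
})


def is_permitted(flow, authenticated, allowed=ALLOWED_FLOWS):
    """Default deny: require authentication AND an explicit allow rule."""
    return bool(authenticated) and flow in allowed


if __name__ == "__main__":
    # An IT host reaching directly into the OT side has no rule, so it is denied.
    print(is_permitted(Flow("it_reporting", "plc_01", "modbus"), authenticated=True))
```

Note that location on the network grants nothing here: an authenticated IT host still cannot reach the PLC unless a rule names that specific flow.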

Physical boundaries need to be monitored for OT and IIoT systems as well. Physical boundaries are often monitored using optical sensors, which require no power or battery, can be ruggedized for harsh conditions, are immune to electromagnetic interference (EMI), and are virtually impossible to circumvent.

Capability 8: “Cyber” Adaptive Relay -> Automated Orchestration

Because OT / IIoT systems interact with the physical world, they are subject to the time constraints of the physical process they are executing. These processes are generally time-aware and deadline-sensitive. As a result, security processes must fit within the time constraints of the application. Current IT cybersecurity controls may need to be modified significantly, or be completely replaced, because those IT security solutions cannot meet the timing criteria required by these OT / IIoT systems. Further, the tight time constraints on addressing attacks largely rule out human-in-the-loop solutions. This drives the need for continuous, autonomous, real-time monitoring, detection, and response.

Adaptive relay is a protection philosophy that has been around for a while on the physical side of OT environments. It permits and seeks to make adjustments to various protection functions in order to make them more attuned to prevailing system conditions (e.g., power loads). A detailed knowledge of interactions between the system components is required to execute adaptive relay. Automated orchestration is the corollary to adaptive relay on the cyber side.

Managing responses on a timely basis to any cyber attack requires close interaction between cyber sensors, physical sensors, cyber and physical controls, and cyber / physical “adaptive relays.” Therefore, the supporting cyber services (e.g., authentication/authorization, security monitoring, load balancing, secure discovery, control enforcement, system recovery, etc.) for OT/IIoT-based applications should be tightly coordinated with controls for physical systems through a dedicated infrastructure, such as a Service Mesh or message fabric.

A service mesh is a configurable, low‑latency infrastructure layer designed to handle a high volume of network‑based interprocess communication among application infrastructure services using application programming interfaces (APIs). A service mesh ensures that communication among containerized and often ephemeral application infrastructure services is fast, reliable, and secure. The mesh provides critical capabilities including service discovery, load balancing, encryption, observability, traceability, authentication and authorization, and support for the circuit breaker pattern. A service mesh allows you to integrate into any logging platform, or telemetry or policy system to run a distributed architecture, and provides a uniform way to secure, connect, and monitor services. With better visibility into traffic, and out-of-box failure recovery features, asset owners can catch issues before they cause problems, making the OT/IIoT environment more robust.

As depicted in the figure of a full stack orchestration capability, the Service Mesh / Message Fabric provides the underlying infrastructure for running automated playbooks by an orchestrator or controller. Some other key features to look for in orchestrators include:

  • No-Code Playbooks: Codeless playbooks discard the hurdle of programming languages and enable users to build and modify playbooks without Python or other coding. Orchestrators that support codeless playbooks should allow easy add/delete to the portfolio of integrated tools, change threat intelligence sources, and upgrade software versions without editing code.
  • MITRE ATT&CK Automation: Orchestrators should support the MITRE ATT&CK model for threat attack vectors. That means kill chain playbooks, and guided investigations/threat hunting. Plus, the orchestrator should operationalize the framework by mapping and correlating events, identifying risks, and focusing resources on suspicious behavior and critical threats.
  • Full-Lifecycle IR: Effective incident response requires a comprehensive case management system. The orchestrator should provide users with link analysis, forensics, evidence and custody tracking, plus collaboration tools.

The Service Mesh must also be designed to be resilient and safeguard against service failures by shedding services and failing fast when the underlying systems approach their limits. A resilient mesh can focus on situations where the environmental conditions have been deliberately manipulated by malefactors. Therefore, it will help to address uncertainty: situations where the distribution of possible outcomes produced by the interaction of the system with its environment is not known, often because the environmental conditions that produce the impacts are unknown or not well understood. One example of a resiliency technique is circuit breaking. Circuit breaking is the idea of setting a threshold for failed responses from an instance of a service and cutting off the forwarding of requests to that instance when the failure count is above the threshold (that is, when the circuit breaker trips). This mitigates the possibility of a cascaded failure and allows time to analyze logs, implement the necessary fix, and push an update for the failing instance.
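For readers who have not seen circuit breaking in code, here is a minimal sketch of the pattern. It is not how any particular service mesh implements it. It assumes one breaker per service instance, counts consecutive failures, trips once the count reaches a configurable threshold, and lets a trial request through after a cool-down period. The class name, parameter names, and default values are all illustrative.

```python
import time


class CircuitBreaker:
    """Stop forwarding requests to an instance after repeated failures."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so the breaker can be tested
        self.failures = 0           # consecutive failures seen so far
        self.opened_at = None       # None means the circuit is closed

    def allow_request(self):
        """Return True if a request may be forwarded to the instance."""
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through ("half-open").
        return self.clock() - self.opened_at >= self.reset_after_s

    def record_success(self):
        """A successful response closes the circuit and clears the count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure; trip (or re-trip) once the threshold is reached."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

A caller would check allow_request() before forwarding each request and report the outcome with record_success() or record_failure(). If the trial request after the cool-down fails, the breaker re-opens and the cool-down starts again.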

Bio-inspired resiliency, of which CyPhyMASC is an example, promises advances towards smarter infrastructure systems and services, significantly enhancing their reliability, security, performance and safety. CyPhyMASC intelligently mixes and matches heterogeneous tools and control logic from various sources towards continually evolving cyber defense capabilities. CyPhyMASC is also elastic: situation-driven monitoring, analysis, sharing and control (MASC) solutions can be dispatched through a dynamic set of sensor and actuator software capsules (Service Mesh) to the protected assets rather than using pre-deployed MASC components.

Most current technologies did not consider that cyber and physical convergence needs a new paradigm to treat cyber and physical components seamlessly. The control loops must be closed across both the cyber world and physical world. The system must have the capability to effectively counter-act uncertainties, faults, failures and attacks. The DARPA research and development of formal specification based automatic generation of system behavior monitoring, the steering of computation trajectories, and the use of analytically redundant modules based on different principles, while still in infancy, is an encouraging development. We also need theory and tools to design and ensure well-formed dependency relations between components with different criticality as they share resources and interact.

Capability 9: Digital Twin and Model-Based Systems Engineering

Large-scale distributed IIoT / OT systems, no matter how they’re architected, have defining characteristics: they are complex systems with an open loop architecture that may allow a minor error to cascade into system failure. These characteristics provide many opportunities for small, localized failures to escalate into system-wide catastrophic failures. As more of the components become IP-enabled and connected, and as concerns over cybersecurity take center stage, continuous real-time simulations of the system need to be done to identify attack vectors and vulnerabilities [such as through applying the MITRE ATT&CK model], determine impacts, and reconfigure the physical and cyber protection systems. Thus, as and when there is a change in the OT / IIoT system caused by a physical problem or a cyber attack, continuous real-time simulations can help digital protection devices as well as cyber controls to get updated automatically, or alternatively the simulation could recommend a Course of Action (CoA) to a human user. System administrators will also be able to respond more intelligently to a cyber attack if they are able to anticipate how disruptions in the ongoing workflow will affect the mission and can further evaluate the effectiveness of CoAs undertaken to mitigate and to circumvent the effects of cyberattacks. This simulation of the OT system is called a digital twin.

Building a digital twin starts with model-based systems engineering (MBSE). MBSE addresses the complexity of systems with a model-centric, frontloaded engineering methodology. The MBSE approach was popularized by INCOSE when it kicked off its MBSE Initiative in January 2007. Goals included increased productivity, by minimizing unnecessary manual transcription of concepts when coordinating the work of large teams. The MBSE approach is outlined in INCOSE’s “MBSE 2020 Vision”, with a methodology focusing on distributed but integrated model management.

The effectiveness of digital twin simulation models is enhanced with the accuracy of the model. Therefore, the MBSE approach should be co-engineered with interacting physical / cyber components connected over diverse networks. For example, multi-domain sensor/actuator integration should be modeled, including interacting cyber and physical nodes, as a way to validate what is happening in the OT / IIoT environment.

The digital twin should also help cyber analysts to understand risk posture and how vulnerabilities can be designed out or mitigated through additional controls. One product that helps to bring this MBSE approach to light is the KDM Analytics Blade Risk Manager tool. It applies MBSE in a top-down risk analysis, and then performs a bottom-up vulnerability / compliance analysis of the environment and controls based on scans and using a controls framework such as RMF. As the figure below shows, as part of top-down risk analysis, the KDM Blade ingests and models information about the system using an architecture model (e.g., UML), or lists of assets and information flows, or other descriptions of the system under consideration. Next, it applies cybersecurity knowledge such as attack trees and vulnerability data to assess the varying degrees of exploitability and impact of cyber-attacks, diversity of network topology, configurations and vendor products; and to assess the influence of critical nodes, attack timing, stepping stones, pivot points, and attack launch locations. Finally, it develops a quantitative risk posture based on valuations of assets at risk. The bottom-up vulnerability analysis identifies the most critical and risky components of a cyber system for further focused security analysis and risk mitigation.

The modeling of exploitability and impact of cyber attacks needs to be integrated in models that reflect the physical resilience of a system. The cyber / resilience metrics should be based on the four elements (robustness, rapidity, resourcefulness and redundancy) of an R4 framework. A dynamic integrated network model is needed to simulate the performance of interdependent infrastructure systems over time following disruptive events such as a cyber attack or physical disaster. Models for each of these four properties should be developed for the networks interconnecting sensors, actuators and controllers.

Capability 10: Identity-Based Networking

The expected M2M traffic explosion for IIoT will require smart control of network access on a service-by-service basis, as will the requirement to integrate multiple access technologies in a single solution, where access technologies could include Wi-Fi®, LTE, and new 5G and millimeter-wave radios.

Identity-based networking is similar to placing a security guard at each switch port. It allows only authorized users to get network access and places unauthorized users into guest VLANs. It also prevents unauthorized access points. Access is based on identity, and it allows assignment by group or individual at the time of authentication.

Identity-based networking also supports tunneling between domains. An example of such a system may contain a WAN, two VLANs and a network database. The two VLANs are coupled to the WAN, and the network database contains their information. When a client who is authorized to work on the second VLAN attempts to connect to the first VLAN, a switch in the WAN looks up the client in the database to determine whether the client is authorized on the second VLAN. If so, the client is connected to the second VLAN through VLAN tunneling.
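The lookup-and-assign logic described above can be sketched in a few lines. This is only a toy model of the decision a switch might make, not a vendor implementation. The identities, VLAN numbers, guest VLAN ID, and the NETWORK_DB dictionary standing in for the network database are all hypothetical.

```python
GUEST_VLAN = 999  # hypothetical VLAN for unauthenticated or unknown clients

# Hypothetical network database mapping an identity to its authorized VLAN.
NETWORK_DB = {
    "alice@plant": 20,
    "sensor-17": 30,
}


def assign_vlan(identity, authenticated, db=NETWORK_DB):
    """Pick a VLAN from identity alone; unknown clients go to the guest VLAN."""
    if authenticated and identity in db:
        return db[identity]
    return GUEST_VLAN


def connect(identity, authenticated, port_vlan, db=NETWORK_DB):
    """Return (vlan, tunneled) for a client plugging into a port on port_vlan.

    tunneled is True when an authorized client's VLAN differs from the VLAN of
    the port it connected to, so its traffic must be tunneled to its own VLAN.
    """
    vlan = assign_vlan(identity, authenticated, db)
    tunneled = vlan != GUEST_VLAN and vlan != port_vlan
    return vlan, tunneled
```

For example, connect("alice@plant", True, port_vlan=10) yields VLAN 20 with tunneling, while an unknown or unauthenticated client on the same port lands in the guest VLAN.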

One identity-based networking approach growing in popularity is called Software-Defined Perimeter (SDP). The SDP framework was developed by the Cloud Security Alliance (CSA) to control access to resources. SDP was derived from a previous capability developed for the DoD called First Packet Authentication (FPA) and is based on RFC 4226 (HOTP). Under SDP, the device posture and identity are verified before access to application infrastructure is granted. Application infrastructure is effectively “black” (a DoD term meaning the infrastructure cannot be detected), without visible DNS information or IP addresses. One way this is accomplished, as per the SDP specification, is by single packet authorization (SPA). With SPA the first packet, which usually contains a random initial sequence number, is transmitted by the client with a cryptographic token instead. The receiving gateway drops any packets until it receives one with the cryptographic token that is registered at the gateway.

The inventors of these systems claim that an SDP mitigates the most common network-based attacks, including: server scanning, denial of service, SQL injection, operating system and application vulnerability exploits, man-in-the-middle, cross-site scripting (XSS), cross-site request forgery (CSRF), pass-the-hash, pass-the-ticket, and other attacks by unauthorized users. An open source implementation, funded by DHS, was highlighted by ActiveCyber.net here.
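To give a feel for the token check behind single packet authorization, the sketch below computes an HOTP value as defined in RFC 4226 and uses it in a toy gateway that stays silent until it sees the expected token. The hotp function follows the RFC. The Gateway class is a simplification of my own and not the SDP specification: a real SPA packet carries more than a bare token, and shared secrets and counters must be provisioned and synchronized securely.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an RFC 4226 HOTP value for a shared secret and counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


class Gateway:
    """Toy gateway: drops every packet that lacks the currently expected token."""

    def __init__(self, secret: bytes, counter: int = 0):
        self.secret = secret
        self.counter = counter

    def accept(self, token: str) -> bool:
        expected = hotp(self.secret, self.counter)
        if hmac.compare_digest(token.encode(), expected.encode()):
            self.counter += 1   # each token is single-use, so replays fail
            return True
        return False            # silently drop; send no response to the sender


if __name__ == "__main__":
    # Secret taken from the test vectors in RFC 4226, Appendix D.
    print(hotp(b"12345678901234567890", 0))  # 755224
```

Because the counter advances after each accepted token, a captured token cannot simply be replayed, and a scanner that never presents a valid token gets no response at all.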

Another twist on the identity-based networking approach that I also like is the Host Identity Protocol (HIP), which is based on a set of RFCs, with an implementation provided by Tempered Networks. Their solution leverages HIP, which separates the identity of a host from its location by replacing IP addresses with cryptographic identity addresses, as shown in the figure below.

It introduces a Host Identity (HI) name space, based on a public key security infrastructure. HIP effectively decouples the transport layer from the network layer, and allows the upper layers of the stack to use a Host Identity (HI) in their socket APIs instead of an IP address. HIP establishes secure end-to-end communications between cryptographic identities and binds local and remote application interfaces to these identities.

I recommend any of these identity-based networking approaches as important zero trust mechanisms for your OT enclaves. They offer a way to segment your network, separate your IT and OT environments, and they can also be used to protect your cloud environment.


So this concludes my “top 10” recommended security capabilities for OT and IIoT systems. Here they are all listed together one more time:

Capability 1: Real-time visibility and compliance tracking of assets that may have limited function and power

Capability 2: Real-time anomaly detection including increased use of AI/ML technology and big data analytics

Capability 3: Strong, comprehensive authentication 

Capability 4: Trusted systems and trusted data

Capability 5: Threat-Informed defenses

Capability 6: Cyber hygiene best practices

Capability 7: Distributed and assured boundary defense – Zero Trust Architecture

Capability 8: “Cyber” adaptive relay -> Automated orchestration

Capability 9: Digital Twin and Model-Based Systems Engineering

Capability 10: Identity-Based Networking

I would love to hear your opinions about my top 10 or what your top 10 looks like.

And thanks to my subscribers and visitors to my site for checking out ActiveCyber.net! Please give us your feedback because we’d love to know some topics you’d like to hear about in the area of active cyber defenses, PQ cryptography, risk assessment and modeling, autonomous security, digital forensics, securing OT / IIoT and IoT systems, Augmented Reality, or other emerging technology topics. Also, email chrisdaly@activecyber.net if you’re interested in interviewing or advertising with us at Active Cyber™.