Mr. Scott Musman discusses how game theory provides important contributions to cyber risk analysis modeling and to understanding the mission impacts that result from cyber attacks. Learn the background and details of how the MITRE Cyber Security Game works in this interview with ActiveCyber.
Recently I have been conducting some research into analytical approaches to risk assessment. My investigation led me to some technical papers published a few years back by MITRE on modeling cyber risk using game-theoretic techniques, which I found quite interesting. So I jumped at the chance last April to attend a model-based engineering conference at NIST where the MITRE author would be presenting. I heard a variety of innovative presentations on approaches to model-based systems engineering in a manufacturing setting; however, Mr. Musman’s presentation struck the biggest chord with me due to its cyber connection. He also kindly accepted my invitation for an interview with ActiveCyber regarding his Cyber Security Game approach to cyber risk assessment. So read the interview below to learn about his multi-year journey to develop a game theory-based risk assessment model and how modeling is sure to make its way into your security life in this cyber risk-driven world.
Spotlight on Mr. Scott Musman, MITRE
» Title: Principal Engineer
» Website: https://www.mitre.org/publications/technical-papers/evaluating-the-impact-of-cyber-attacks-on-missions
» Email: smusman@mitre.org
» Linkedin: linkedin.com/in/scott-musman-5b7124
Read his bio below.
June 12, 2017
Chris Daly, ActiveCyber: Could you provide an overview of your cyber risk assessment method and how it is different from other cyber risk assessment tools?
Scott Musman, MITRE: Pretty much any type of risk assessment is difficult, and fraught with complexity. Cyber risk assessments are no exception. Today they are mainly derived from methods that require a multi-disciplinary team to work an assessment top-to-bottom, and you are often at the mercy of the quality of the assessment team. There are several common places where traditional methods go wrong, such as failing to consider some critical hazard combination, or not understanding dependencies between resources. The Cyber Security Game (CSG) method tries to avoid these pitfalls by not making limiting assumptions. CSG models might still be incorrect, but if so they’re incorrect because of a modeling error – not because the method forces you to assume something about your system that isn’t true.
There’s also a limitation in the traceable artifacts that risk assessments produce to identify which risk factors contribute to the assessment results. A lot of information often persists only in the heads of the assessment team. All risk methods have a consequence (impact) part and a probability-of-occurrence part. If you can’t clearly identify where those contributions come from, it’s hard to have confidence that the assessment is usefully correct, and it’s harder to understand how risk changes when aspects of the system, mission, or threats change.
CSG is model-based, and works by applying theoretically defensible algorithms to computational models of the system and the mission. The algorithms that run against the models ensure that the assessments are consistent and comprehensive. This focus on modeling the system doesn’t solve every problem, since modeling is still hard, but what you need to capture in the models are the essential details of the system that make it capable of securely accomplishing its function. If you don’t understand those aspects of your system, then no useful risk assessment is possible. An important model in CSG is the mission consequence model, which captures how cyber incidents manifest themselves as mission impacts. This clearly focuses its assessment on improving mission outcomes, rather than only making cyber assets more resilient.
With CSG, once you have developed models of the mission impacts, the IT network, and your defense measures (a default model of the attacker is included), you can run the program and have it complete an end-to-end cyber risk and mitigation assessment. The program basically plays out an attacker/defender game where the defender lays out defensive measures and then the attacker assesses multi-step attack paths to cause impacts. As the defender employs defensive measures, the program identifies how the defensive measures have reduced the risks to the mission. CSG includes a portfolio analysis engine to allow it to identify the best set of defensive measures to employ for any given investment level.
ActiveCyber: Could you explain the genesis of your model and the journey you have taken to get the model to its present state?
Musman: It’s certainly been a journey. Sometimes it takes persistence to realize a vision. I was originally drawn to this problem for a different purpose, by work I was involved in before coming (back) to MITRE. While doing DARPA-funded work on responding automatically to cyber attacks, it became very clear that effective response to an incident depends on knowing what the system being attacked is supposed to be trying to accomplish. Sometimes the impact of a response action might be worse than the impacts of the attack you’re trying to stop. Yet at the time no one was explicitly modeling the purpose of the system to capture these needs. This was around the time that cybersecurity started shifting from “information assurance” to “mission assurance”, and focusing more on risk management rather than compliance.
In 2009, MITRE funded research on Cyber Mission Impact Assessment (CMIA) and Response. We developed a workable solution to the mission impact assessment problem, using process modeling. We never got to the “and response” part in that project, but we demonstrated the impact assessment using a COTS process modeling tool, which wasn’t intended to do what we were doing with it. While we technically showed that the CMIA approach works well, the COTS process modeling tool wasn’t a good delivery platform to support doing these cyber impact assessments for our sponsors. Nor did we ever succeed in embedding the COTS process modeling tool in a larger program that would let us use it as the impact part of a risk assessment, or for cyber response assessment.
To finally get this capability, I bit the bullet and prototyped a process modeling tool that specifically does cyber mission impact assessment. I wasn’t being funded to develop the CMIA tool. I was mainly being paid to assess cyber risks, and developing the CMIA tool and then CSG would make it possible to do that in a systematic way, where computing cyber risk requires considering how the attacker adapts to deployed defensive measures. From my background in Artificial Intelligence, the process of considering attacker and defender moves essentially means game playing, so the rest of the CSG solution was just understanding what was needed to program the security risk assessment as an attacker/defender game.
CSG is just a traditional game program, like a chess or backgammon program. Unfortunately, unlike traditional games where the game board and player moves are fixed, in CSG each system and mission we want to analyze must be modeled separately. For cyber risk assessment this manifests itself as needing a system topology model, a defender model, and an attacker model, in addition to the mission impact model. These models combine to produce a risk score for the system. Everything else that CSG does is just traditional game playing, using the risk score as the metric by which the defender assesses the state of the game.
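To make the game-playing idea concrete, here is a minimal sketch of an attacker/defender game-tree search that uses a risk score as its evaluation metric. This is not the CSG implementation; the move generators and the risk() scoring function are hypothetical placeholders.

```python
# A minimal sketch of an attacker/defender game-tree search, not the actual CSG
# code. The move generators and the risk() scoring function are hypothetical.
def minimax_risk(state, depth, defender_turn, moves_for, apply_move, risk):
    """Return the mission risk score reachable from `state` with optimal play.

    The defender tries to minimize risk and the attacker tries to maximize it,
    mirroring the zero-sum formulation described in the interview.
    """
    moves = moves_for(state, defender_turn)
    if depth == 0 or not moves:
        return risk(state)  # evaluate the leaf with the risk metric
    scores = (minimax_risk(apply_move(state, move), depth - 1,
                           not defender_turn, moves_for, apply_move, risk)
              for move in moves)
    return min(scores) if defender_turn else max(scores)
```

In this framing, the “board” is the modeled system and mission, and the evaluation function is the mission risk score rather than material or position.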
CSG is another increment in capability. It codifies a number of tasks that cyber risk assessors have been doing manually, making it more likely to produce consistent results. There are still things I’d like to add to it, so it can be used to answer additional questions. Remember the “automated cyber response” problem I was working on some (what feels like) 200 years ago? Give or take a few needed modifications, CSG finally has all the pieces needed to solve that problem.
ActiveCyber: Your model for assessing cyber risk is quite comprehensive in that it entails a variety of analytic formulations and methods as well as linkages to mission impact. Could you describe the flow of analysis through the different model formulations to perform a risk assessment?
Musman: Despite its apparent complexity, CSG simply incorporates the same characteristics of a system that we want humans to consider when analyzing the cybersecurity risks associated with it. Starting with incident consequences, Cyber Mission Impact Assessment (CMIA) models capture the operational mission context and relate IT-dependent activities to the cyber assets that support them. CMIA explicitly looks across the entire set of incident types to consider incidents such as interception of assets, modification of assets, fabrication of assets, etc., since in cyberspace impacts come from more than just disruption or interruption events.
Unlike other risk formalisms that assess impacts using qualitative labels such as severe, medium, or minor, CMIA uses quantitative assessments of impact, such as dollar amounts in business-oriented missions. This is important since the impact severity of some incidents can hugely outweigh others, and the risk model needs to capture that. CMIA models also incorporate purely security-related impacts in addition to mission activity outcome impacts, since most systems also have security requirements that will result in impacts if they are not met. An example of this might be the use of IT in medicine. Even though you might be using the IT to fill medical prescriptions or make patient referrals, it is a security requirement to ensure that patient PHI/PII is not intercepted by non-authorized parties, or that patient information cannot be modified in an unauthorized way. So, while not being able to fill a prescription or make a patient referral using the IT might be annoying (since that’s what the system is supposed to let you do), that impact is minor compared to endangering the life of a patient, enabling the IT to be misused to illegally obtain access to drugs, or enabling widespread insurance fraud. Basically, CMIA provides a formal model-based approach to capture the consequence part of the risk equation.
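As a purely illustrative example of quantitative impact scoring in the spirit of CMIA, a consequence table might map (asset, incident type) pairs to dollar values. The assets, incident types, and figures below are invented for the medical scenario above.

```python
# Illustrative only: a toy consequence table with made-up assets and dollar values.
IMPACT_USD = {
    ("prescription_service", "interruption"): 5_000,       # delayed prescriptions
    ("patient_records_db",   "interception"): 2_000_000,   # PHI/PII breach exposure
    ("patient_records_db",   "modification"): 10_000_000,  # patient safety exposure
}

def mission_impact(incidents):
    """Sum the dollar impact of a set of (asset, incident_type) events."""
    return sum(IMPACT_USD.get(incident, 0) for incident in incidents)

print(mission_impact([("prescription_service", "interruption"),
                      ("patient_records_db", "interception")]))  # -> 2005000
```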
The other part of computing risk is to understand the probability that the impacts will occur. Part of that is based on the threat and the existence of vulnerabilities. But when a system is engineered, you can’t predict the vulnerabilities that will emerge a year after it goes operational. You must design the system to be as resistant as possible to vulnerabilities that get discovered, or even more importantly to an attacker who might specifically want to target your system and who is willing to devote the effort to identify the vulnerabilities needed to compromise it. This is where the employment of security and resilience design principles comes in. CSG explicitly incorporates the analytical linkage between aspects of the system’s cyber architecture and the probability that attackers can create impacts. These include system aspects such as least privilege, segmentation, diversity, and security and resilience measures. CSG has a mathematical model of how changing these system characteristics affects the attack probabilities. Surprisingly, very few cyber risk methods incorporate explicit knowledge of the system’s network topology, or bother to represent component type, other than in the head of the assessor. Without this information, a method can’t assess whether you’re diversifying the right things, or segmenting the system to limit the damage an incident can cause, etc.
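The interview does not spell out CSG’s mathematical model, so the sketch below is only one plausible way to illustrate the idea: an attacker must succeed at every step along a path, and hardening or diversifying a component scales its per-step compromise probability down. The probabilities and scaling factors are assumptions.

```python
# A hedged illustration of how architecture choices could change attack
# probabilities. The multiplicative form and scaling factors are assumptions,
# not CSG's actual model.
def path_success_probability(steps, hardened=frozenset(), diversified=frozenset()):
    """Estimate the chance an attacker traverses every step of an attack path.

    `steps` maps component name -> baseline per-step compromise probability.
    """
    p = 1.0
    for component, base in steps.items():
        if component in hardened:
            base *= 0.5   # assumed effect of extra protection
        if component in diversified:
            base *= 0.7   # assumed effect of component diversity
        p *= base         # the attacker must succeed at every step
    return p

path = {"dmz_web_server": 0.6, "app_server": 0.4, "patient_records_db": 0.3}
print(path_success_probability(path, hardened={"patient_records_db"}))  # ~0.036
```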
Since CSG has algorithmized the computation of mission risk for a system, we have focused on the models needed to describe the system. Since the system details found in the models are needed anyway, the models capture your knowledge and assumptions about the system. In addition to the CMIA model described above, CSG needs a network topology model that also includes knowledge of access rules and component type information. It also needs models of the defender methods, and we provide a default model of the attacker. Someone using CSG just has to develop these models and can then leave the actual risk assessment to the program that applies the risk algorithms to those models. If anything ever changes, updating the risk assessment now just means updating any affected models and rerunning the program.
The risk algorithms do the same things that a human analyst would do. For example, some risk assessments will leverage an attack tree without specifying where that attack tree comes from. Since CSG has a model of the IT network, it uses a program to develop an attack tree so humans don’t have to do it. The attack trees consider all possible impact targets and find the best pathways to achieve those impacts. They also consider the compromise of multiple components, so that multi-step attacks are assessed. Doing this programmatically is a great way to ensure that it’s done correctly and comprehensively every time.
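As a rough picture of what deriving attack pathways from a topology model can look like, here is a small breadth-first search over a hypothetical network graph. The topology, entry points, and target are invented, and CSG’s own path planning is certainly more sophisticated.

```python
# A minimal sketch: enumerate attacker paths from entry points to impact targets
# by searching a topology graph. The example network is invented.
from collections import deque

def attack_paths(topology, entry_points, targets):
    """Return simple paths an attacker could take to reach each impact target."""
    paths = []
    for start in entry_points:
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in targets:
                paths.append(path)
                continue
            for neighbor in topology.get(node, []):
                if neighbor not in path:  # avoid revisiting components
                    queue.append(path + [neighbor])
    return paths

topology = {"internet": ["dmz_web_server"],
            "dmz_web_server": ["app_server"],
            "app_server": ["patient_records_db"]}
print(attack_paths(topology, ["internet"], {"patient_records_db"}))
```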
Where things get tricky is when you start considering the defender actions. Defenders can change the network topology itself to isolate components, change access controls in the network, diversify components, protect assets to make them harder to compromise, and include failover and backup activities in the mission process that can minimize impacts when they do occur. Every time a defender applies one of these measures, the attacker will try to circumvent it. If the attacker can’t, the measure will lower the risk assessment. Conversely, employing these methods involves spending money, so the problem is to identify the best set of defenses, given what you can afford to spend. CSG also includes a portfolio analysis engine, since this is another example of a problem that computers can solve much more effectively than humans can.
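To illustrate the portfolio idea, the sketch below brute-forces every affordable combination of defensive measures and keeps the one with the lowest residual risk. The measure names, costs, and residual_risk() function are hypothetical, and a real portfolio engine would need something far more scalable than exhaustive enumeration.

```python
# A hedged sketch of budget-constrained portfolio selection. Exhaustive search is
# only workable for small sets of measures; it is shown for clarity, not scale.
from itertools import combinations

def best_portfolio(measures, costs, budget, residual_risk):
    """Return the affordable set of measures that minimizes residual mission risk."""
    best, best_risk = frozenset(), residual_risk(frozenset())  # baseline: do nothing
    for k in range(1, len(measures) + 1):
        for combo in combinations(measures, k):
            if sum(costs[m] for m in combo) <= budget:
                r = residual_risk(frozenset(combo))
                if r < best_risk:
                    best, best_risk = frozenset(combo), r
    return best, best_risk
```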
ActiveCyber: What types of system properties are important to be captured for the model to work effectively and can these properties be collected and imported through automated means or do they need to be managed through manual efforts?
Musman: If you want to assess the cyber risks of a system during the design phase, there’s little in the way of automation that can help you. In that case, you need to build models of the system that represent a logical depiction of the activities and assets involved. If you want to assess an operational system, then there is plenty that automation can do to help make the building of system models easier.
We are currently looking into methods that help to semi-automate the gathering of the information needed to build the models. There are aspects of building the required system models that are ripe for automation. For example, there are commercial products that can instrument your computing network, produce diagrams of it, and keep them up to date. Such a diagram can be combined with other information sources, such as firewall rules or asset inventories, to capture the trust and component type information needed for the CSG topology model. I fully expect automation to largely create the topology-oriented models and keep those models up to date as a system changes.
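A semi-automated pipeline of that kind might merge a discovered host list, an asset inventory, and firewall rules into one topology structure. The field names and data formats below are assumptions for illustration, not a CSG import format.

```python
# Illustrative sketch of combining discovery data, an asset inventory, and
# firewall rules into a topology model. The formats here are invented.
def build_topology(discovered_hosts, asset_inventory, firewall_rules):
    topology = {}
    for host in discovered_hosts:
        topology[host] = {
            "component_type": asset_inventory.get(host, {}).get("type", "unknown"),
            "allowed_connections": [rule["dst"] for rule in firewall_rules
                                    if rule["src"] == host and rule["action"] == "allow"],
        }
    return topology

hosts = ["dmz_web_server", "app_server"]
inventory = {"dmz_web_server": {"type": "nginx"}, "app_server": {"type": "tomcat"}}
rules = [{"src": "dmz_web_server", "dst": "app_server", "action": "allow"}]
print(build_topology(hosts, inventory, rules))
```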
However, automation will only take you so far. It’s important to consider that we’re talking about systems that are designed to serve some purpose. It’s almost impossible to back-solve design intent from an operational system. How the performance aspects of a system are valued, or how its security requirements are valued, are aspects of the system that must be elicited and managed manually with input from system, security, and mission experts. So, while I expect automation to help with developing and managing parts of the CSG models, I think we’re realistically looking at some level of semi-automation.
ActiveCyber: What have you experienced as the biggest obstacle to the formulation of the different sub-models that are created and used to perform the cyber risk assessment? How much effort is typically expended to conduct a cyber risk assessment using your model? How reusable are the model artifacts that are produced? Are default values provided?
Musman: Everything is a trade-off. Model formulation is no exception. If we include more details in a model, it requires more effort to gather the information, build, and maintain those models. So “how much is enough” is a constant question, and one which doesn’t have a one-size-fits-all answer. CSG, like every risk method, makes trade-offs that affect how much information is needed, which correlates with effort. The models do define a minimum set of information, but at the same time try not to restrict people from including more detail in their system models when they need to. So, typically we model the cyber dependencies to include network components, hosts, applications/services, and information assets, but if you need to include cyber details such as hypervisors or CPU registers (which you might if you’re modeling a cloud, a cyber-physical system, or some embedded devices) you can still do so.
Getting the information for the models is the other challenge. There is no shortage of game-theoretic models that exist only as academic curiosities because we cannot obtain useful parameters to put into them. We want modeling to be as objective as possible. For example, a mission subject matter expert can tell you whether a CMIA process diagram accurately describes the mission process they perform. Sometimes, one can find actuarial sources for estimating impact. Other times, one must rely on getting stakeholders to agree on impact values. But because there is still uncertainty, CSG does support model sensitivity analysis, and there are calibration methods for human judgments.
Characterizing the performance of security and resilience measures is particularly challenging. When you install anti-virus, do you know what percentage of attacks it will detect over the next three years? How about a network intrusion detection system? What if you think you’re the target of a nation-state actor; how effective is it then? If it detects an attack but does not stop it, how fast will you be able to react to reduce any impacts? Assessing individual security measures is challenging enough, but assessing them in combination is also a challenge. Security measures don’t always exactly complement each other, so if you install a host-based intrusion detection system (HIDS) that you expect stops 45% of attacks, and combine it with a network-based intrusion detection system (NIDS) that you expect stops 50% of attacks, that doesn’t imply that in combination they stop 95% of the attacks. This is an area we have little real data on, and our method doesn’t solve the problem of knowing what happens when you combine measures. It does, however, force us to expose our expectations about what we expect security and resilience techniques to be good for. Having that traceability is ultimately a good thing.
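To see why the numbers don’t simply add, consider the most optimistic case, where the two sensors miss attacks independently: the combined stop rate is 1 - (1 - 0.45)(1 - 0.50) ≈ 0.725, well short of 95%, and overlap between the measures would push it lower still. The independence assumption here is purely illustrative.

```python
# Worked example: combined stop rate under an (optimistic, assumed) independence
# model. Real measures often overlap, so the true figure may be lower.
def combined_stop_rate(rates):
    missed = 1.0
    for r in rates:
        missed *= (1.0 - r)  # an attack must slip past every measure
    return 1.0 - missed

print(combined_stop_rate([0.45, 0.50]))  # ~0.725, not 0.95
```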
At MITRE, I can reach out to our experts for knowledge about employing different security measures and how well they are expected to work. But if you’re in a small company wanting to mitigate risks in your system, you don’t have that ability. There needs to be a standards body to develop the security measure performance and combination models, but one doesn’t exist now. Similarly, there should be standard models of attacker capability. This is another area where few small to medium-sized companies will have the expertise to know what they’re up against, so it would be nice to have experts in that field developing the attacker models that CSG uses, for different levels of attacker sophistication.
ActiveCyber: What analytic or simulation software do you employ in the formulation and running of your sub-models? What types of advancements in modeling tools would you like to take advantage of for your cyber risk model?
Musman: CSG is entirely custom code. It’s an eclectic combination of several capabilities. It simulates a process model to compute mission impacts using a discrete event simulation. It performs a combinatoric assessment of impact for all possible incidents against all resources, across different start and duration times. It performs path planning on a network model to identify the attacker pathways needed to develop attack trees. It computes a game-tree to assess attacker and defender move pairs, and it has a portfolio engine to assess combinations of defense deployments given cost constraints. While there are solutions for each of these activities, I believe we’re the first people to assemble them end-to-end into an integrated capability – where diversifying modeled components, changing the network topology model, or updating the mission process model results in changes in risk scores, which in turn result in changes to the optimal defensive portfolio.
Rolling our own solution was not ideal, but a nice byproduct is that people don’t have to buy a third-party product just to run our solution. CSG is a prototype. Hopefully industry will eventually adopt, and even standardize, some of its features so that it can be more widely applied. I just presented at the Model-Based Enterprise Summit, which focuses on model-based engineering (MBE). The hope there was that exposing the CSG methods to people who develop and use model-based engineering tools might be an impetus to start incorporating some CSG concepts, so security assessments become part of those tools. We have MBE people at MITRE and are exploring exactly this type of integration with MBE tools, so you can build security into the design rather than tack it on afterwards.
An area where work is needed is model standardization. Standardized models for network topologies and for capturing attacker behavior are also problems that need addressing. As an example, the CMIA process modeling simulates a business process. While the Business Process Modeling Notation (BPMN) process diagram part is an industry standard, the discrete event simulation underneath it is only now starting to become standardized. We originally demonstrated CMIA in a commercial process modeling package, but ultimately we found it to be unsuitable for use in our solution. We also identified scenarios where the default behavior of the commercial process modeling tools didn’t produce the right answers for what we’re doing in CMIA. Lacking access to the tools’ internals, it wasn’t possible to turn them into a solution.
ActiveCyber: One of the leading thoughts for active cyber defense is to enable a variety of preventive or mitigative measures that will disrupt the attacker kill chain. How does your model incorporate this concept and how do you relate this kill chain disruption concept to risk assessment?
Musman: It’s important not to mix apples and oranges here, so let me try to describe what CSG does and does not do. CSG’s goal is to reduce risk. It doesn’t care whether you do this by avoiding some hazard, disrupting the kill chain, preventing impacts, or effectively mitigating impacts by recovering mission functionality every time it’s compromised. CSG only cares about reducing risk. To some extent, where in the kill chain that can be accomplished is irrelevant. CSG only considers combining measures across the kill chain when it comes to reducing risks that are not adequately reduced by a single defensive measure.
At its very simplest, it’s fair to say that everything you do to improve security or resilience is intended to reduce the risk of something bad occurring. CSG is “mission focused,” so it is trying to reduce the risks that cyber incidents will have on the mission outcomes. It won’t give a defender any credit for protecting something that doesn’t need protection in the context of the mission. An example might be encrypting the signals from an emergency distress radio. If your use-case is military, there can be significant impacts if the enemy can intercept the radio signal. If your use-case is civilian, there’s probably little impact if other people can intercept the signal. In fact, in the civilian use, usually the more people who can receive a distress signal the better. So, employing a mission model for the military use case in CSG will compute risk reduction when the signals are encrypted, whereas when using a civilian use case model it won’t. Because of its mission focus CSG doesn’t care how you reduce a risk. Only that you do it, and as best you can within your budget. If there are reasons you want to exclude certain types of security and resilience measures, you can merely leave them out of the set of tools that CSG considers.
Lacking any kind of quantitative assessment, many risk reduction recommendations will include ideas like “let’s include one prevent and one mitigate measure” for each high-value asset. This is offered as a panacea for not measuring things quantitatively. In answering one of the earlier questions, I discussed the challenge of knowing how well multiple measures combine with each other. In that context, knowing how two preventative measures combine usually isn’t as clear as knowing how well a prevent and a mitigate measure combine. In most cases, the mitigate measure will complement the prevent measure, simply because you only need to mitigate what you can’t prevent. There’s actually a waterfall effect going across the set of resilience goals, from avoid, to withstand, to recover and evolve, in that you only have to withstand attacks you can’t avoid, and you only need to recover from attacks you can’t withstand.
ActiveCyber: One of the many novel aspects of your model is the “cyber security game” between attacker and defender. What type of insight into risk does the game provide and how might security automation that accelerates incident responses impact the game result?
Musman: The game theory is CSG’s strength. Because of the game theory, every time a defender employs some defensive measure, the attacker player will look for ways to circumvent the defense or find the next weakest link. The game theory, combined with the portfolio analysis, ensures that its allocation of defenses is balanced to reduce the overall risks. Unlike humans, it won’t overly fixate on defending one part of the system at the expense of ignoring others. The game theory makes it possible to answer important questions like: “how much replication or how much diversity is enough?” In a game-theoretic sense, you have added enough diversity when the best remaining payoff for the attacker is something that cannot be reduced by adding any more diversity.
Around the same time I started working on this problem, some important realizations were being made by the risk analysis community. They identified the limitations of traditional risk assessment methods when dealing with an intelligent adversary. Work by Tony Cox (and others) highlights this by demonstrating situations where applying traditional probabilistic risk analysis approaches produces the wrong recommendation. The key insight is that at each move in an attacker/defender game, the players are influenced by what they learned from the other player’s last move. Even if you didn’t realize that the attacker had a zero-day exploit, once you detect the existence of one, it completely changes the probabilities of what you know is now possible, and you can project that forward. CSG’s attacker player can project forward how further attack steps that succeed can be used to gain the access needed to cause the impacts.
ActiveCyber: How do you translate the cyber game results into quantifiable risk-based formulations that lead to cyber investment decisions?
Musman: There are a couple of important questions here. One should wonder whether the CSG models are calibrated to reality. If they are, then since the algorithms that compute the risk scores from the models are defensible, you could use CSG to answer questions in an absolute sense. If that’s the case, and you can cast impacts in terms of dollars, one could use CSG to show how your security investments pay for themselves in terms of their ability to reduce your losses from security incidents.
Unfortunately, I don’t think current CSG models are calibrated to reality, for some of the reasons I mentioned earlier when talking about obtaining tool performance numbers, and other challenges in understanding attackers’ capabilities. Lack of absolute calibration doesn’t mean they’re not useful, though. Running CSG gives you the ability to assess the relative risk reduction from employing security measures, given the starting point of where you are without them. Even if the performance numbers for the security measures are not absolutely calibrated, as long as they are correct relative to each other, the recommendations of how to best compose a security portfolio for any given investment level will be the same. Knowing what measures you should select to best reduce your risk is usually what people need to know from the assessment.
ActiveCyber: What is your customer base and what level of user sophistication is needed to employ the method – a coder, domain specialist, an analyst? What changes are you considering for the model in response to customer needs? Is the model going to be available as an open source property?
Musman: CSG is currently a prototype, and we at MITRE have been the people employing it. The CSG software works robustly once the models are debugged, but there are still lots of places where an error in a model can cause the software to fail. That’s unacceptable in a commercial product. There’s a big step needed to take the prototype and turn it into a production product. Bridging that gap is not a MITRE type of activity. I’m not sure if CSG is something MITRE would open source or not, but we can certainly make it available as-is to our government customers.
In its current form, CSG can be viewed as a model-based cybersecurity systems engineering assessment tool. From that perspective, having it integrated into a commercial MBE product would be great. It would bring the benefit of leveraging all the work people already do developing existing engineering artifacts, and then add a few new artifacts specific to cybersecurity. As such, I tend to think of CSG as something a systems engineer would use to build security into a system. It can also be used to assess operational systems, so you can identify targeted improvements.
Part of my personal goal for CSG, though, is that it should reduce the level of expertise needed to perform these cyber risk assessments. Cyber risk assessment is hard. It takes experience in addition to technical knowledge, and if it’s done after the fact on operational systems (which it is now), it requires a lot of effort just to obtain the information you need about the mission, the system, the threats, etc. The current emphasis on the upfront MBE applications of CSG, however, is because the elephant in the room when it comes to identifying crown jewels, key terrain in cyberspace, or assessing cyber risk is the need to rediscover all the details about system security and operational performance that were never adequately captured by the system artifacts when the system was developed. There is no “security view” in standards such as DoDAF, and as a result security-related information is scattered in many places. If one were to use something like CSG up front during system design, its model artifacts go some way to filling that security view gap and provide the basis of the information that is needed during operations. With CSG, you just need to build the models and then let the algorithmic code do the work. The idea is that building models is easier, and requires less expertise, than doing the entire assessment. The expertise associated with building the models can come from different people with different areas of expertise, so eventually you may not need a single team of experts to work through the entire assessment.
So far, requests for changing the CSG models have come in a few forms. Some people have asked for a more sophisticated attacker model, and others are asking to get away from a zero-sum game, where additional factors are added to reflect that the attacker values a system differently than the defender does. A more sophisticated attacker model is good, but has ramifications. Including extra details in the attacker model probably also means you will need more details in the topology model. Since we’re currently building the models manually, that will require more effort. However, I would also like to move away from a zero-sum game.
Thank you Scott for describing your journey into cyber risk modeling and assessment and the important contributions you and your team have made to bringing model-based engineering to the cyber world. I look forward to seeing whether CSG makes it to a full-fledged commercial capability.
And thanks for checking out ActiveCyber.net! Please give us your feedback because we’d love to know some topics you’d like to hear about in the area of active cyber defenses, PQ cryptography, risk assessment and modeling, or other security topics. Also, email marketing@activecyber.net if you’re interested in interviewing or advertising with us at ActiveCyber.
About Mr. Scott Musman
Scott Musman is a principal engineer at the MITRE Corporation, a not-for-profit organization that operates research and development centers sponsored by the federal government. He holds a BSc (Hon) in Engineering from the University of Sussex in the U.K. and an MSc in Computer Science from Johns Hopkins University. In addition to working in other domains, such as image understanding, decision support, data-planning and reasoning under uncertainty, and data-mining, Scott has been working on information security related problems since the mid 1990s, acting as director of R&D for Integrated Management Services Inc., head of the enterprise security research group for Alphatech/BAE Systems-AIT, and as a principal engineer for MITRE. Most recently he has been focusing on mission assurance by estimating and mitigating cyber-related mission risks.