Adversary emulation is the practice of simulating real-world threat actor behavior to test and validate an organization's defenses. Unlike generic penetration testing, which seeks to find any path to compromise, adversary emulation follows a specific threat actor's known playbook — their tactics, techniques, and procedures — to answer the question: "Can we detect and respond to this specific threat?" This lesson covers emulation planning, tooling, purple teaming, and how CTI drives the entire process.
Learning Objectives
- Distinguish adversary emulation from penetration testing and red teaming
- Understand how MITRE ATT&CK and CTI reporting inform emulation planning
- Identify key adversary emulation tools and frameworks
- Describe the purple teaming methodology and its benefits
- Measure defensive coverage and conduct gap analysis using emulation results
What Is Adversary Emulation?
Definition: Adversary emulation is a type of offensive security assessment where the red team replicates the specific tactics, techniques, and procedures of a known threat actor, based on cyber threat intelligence, to evaluate an organization's ability to detect and respond to that particular threat.
The key distinction from traditional offensive security activities:
| Activity | Objective | Approach | CTI Role |
|---|---|---|---|
| Vulnerability Assessment | Find vulnerabilities | Automated scanning | Minimal |
| Penetration Testing | Prove exploitability | Any path to objective | Limited |
| Red Teaming | Test overall security posture | Stealthy, objective-based | Moderate — scenario-driven |
| Adversary Emulation | Validate defense against specific actor | Follow actor's known playbook | Central — drives the entire plan |
Adversary emulation is inherently CTI-driven. Without quality intelligence about how a specific threat actor operates, there is nothing to emulate.
MITRE ATT&CK's Role in Emulation Planning
MITRE ATT&CK serves as the common language and structural framework for emulation planning. When a CTI team builds a threat actor profile mapped to ATT&CK techniques (as discussed in Lesson 24), that profile becomes the blueprint for an emulation plan.
From CTI to Emulation Plan
1. Identify the priority threat actor based on your organization's threat landscape, sector, and intelligence requirements
2. Compile the actor's TTP profile from published reporting, ATT&CK entries, and internal intelligence
3. Map the attack flow — determine the sequence of techniques the actor typically uses, from initial access through actions on objectives
4. Identify procedures — determine the specific tools, commands, and methods the actor employs at each step
5. Build the emulation plan — create a step-by-step playbook that replicates the actor's operational sequence
6. Define success criteria — what detection and response outcomes are you measuring?
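The pipeline above can be sketched as a lightweight data structure. This is an illustrative Python sketch, not any real tool's schema; all class and field names here are our own.

```python
from dataclasses import dataclass, field

@dataclass
class EmulationStep:
    """One step in the actor's operational sequence."""
    phase: str             # kill-chain phase, e.g. "Initial Access"
    technique_id: str      # ATT&CK technique ID, e.g. "T1566.001"
    procedure: str         # how the actor performs it
    success_criteria: str  # the detection/response outcome being measured

@dataclass
class EmulationPlan:
    actor: str
    steps: list[EmulationStep] = field(default_factory=list)

    def technique_ids(self) -> list[str]:
        """Flatten the plan into the ATT&CK IDs it exercises."""
        return [s.technique_id for s in self.steps]

plan = EmulationPlan(actor="APT29")
plan.steps.append(EmulationStep(
    phase="Initial Access",
    technique_id="T1566.001",
    procedure="Spear-phishing attachment with macro-enabled document",
    success_criteria="Email gateway alert within minutes of delivery",
))
plan.steps.append(EmulationStep(
    phase="Execution",
    technique_id="T1059.001",
    procedure="Macro launches encoded PowerShell",
    success_criteria="Script block logging captures the decoded command",
))
print(plan.technique_ids())  # ['T1566.001', 'T1059.001']
```

Keeping the plan as structured data (rather than prose only) makes it easy to diff against existing detection coverage later in the exercise.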
MITRE CTID Adversary Emulation Library
The MITRE Center for Threat-Informed Defense (CTID) maintains the Adversary Emulation Library, a collection of publicly available emulation plans based on real threat actors. As of 2025, the library includes plans for:
- APT29 (Cozy Bear) — SVR-linked espionage operations
- FIN6 — Financially motivated group targeting payment card data
- menuPass (APT10) — Chinese espionage targeting managed service providers
- Sandworm — GRU Unit 74455, destructive operations
- Turla — Russian FSB-linked espionage group
Each emulation plan in the library includes a detailed operational flow, the specific ATT&CK techniques at each step, the tools and commands to execute, and the expected detection opportunities. These plans are freely available on the CTID GitHub repository.
Adversary Emulation Tools and Frameworks
MITRE Caldera
Caldera is an open-source adversary emulation platform developed by MITRE. It provides an automated framework for running adversary emulation operations using "abilities" (individual ATT&CK technique implementations that agents can execute) organized into "adversary profiles" (ordered sequences of abilities that model a specific actor).
Key features:
- Agent-based execution on target systems
- Ability to chain techniques into multi-step operations
- Built-in ATT&CK technique coverage
- Automated reporting of which techniques succeeded or were detected
- Extensible plugin architecture
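Caldera's building blocks can be modeled in miniature. The field names below (an ability's tactic, its technique attack_id, an adversary's atomic_ordering) mirror Caldera's YAML schema, but the ability IDs and commands are invented for illustration — this is not a working operation.

```python
# Hedged sketch: abilities keyed by an (invented) ability ID, each
# carrying the ATT&CK technique it implements and a per-platform command.
abilities = {
    "spearphish-exec": {
        "tactic": "execution",
        "technique": {"attack_id": "T1059.001", "name": "PowerShell"},
        "platforms": {"windows": {"psh": {"command": "whoami"}}},
    },
    "persist-task": {
        "tactic": "persistence",
        "technique": {"attack_id": "T1053.005", "name": "Scheduled Task"},
        "platforms": {"windows": {"psh": {"command": "schtasks /query"}}},
    },
}

# An adversary profile is essentially an ordered chain of ability IDs.
adversary = {
    "name": "demo-actor",
    "atomic_ordering": ["spearphish-exec", "persist-task"],
}

# Resolve the chain into the ATT&CK techniques the operation exercises.
chain = [abilities[a]["technique"]["attack_id"]
         for a in adversary["atomic_ordering"]]
print(chain)  # ['T1059.001', 'T1053.005']
```

The point of the model: an operation's technique coverage is fully determined by the profile's ordering, which is what makes Caldera runs repeatable and comparable across exercises.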
Atomic Red Team
Atomic Red Team, developed by Red Canary, is a library of small, focused tests (called "atomics") mapped to individual ATT&CK techniques. Each atomic test is a self-contained procedure that executes a single technique on a target system.
Atomic Red Team differs from Caldera in philosophy:
- Atomics are granular — each test covers one technique, executed independently
- No agent required — tests are typically executed via command line (PowerShell, bash, cmd)
- Low barrier to entry — tests can be run by any team member, not just red team specialists
- Testing, not operations — atomics validate detection, they do not simulate full attack chains
The Atomic Red Team repository on GitHub contains tests for hundreds of ATT&CK techniques across Windows, macOS, and Linux.
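Invoke-AtomicTest is the documented entry point of the Invoke-AtomicRedTeam module. As a sketch of how a team might script atomic runs, the Python helper below only assembles the PowerShell one-liner (a dry run) — it executes nothing; the wrapper itself is our own, while the -TestNumbers and -Cleanup parameters are part of the module's documented usage.

```python
def invoke_atomic_cmd(technique: str, test_numbers=None, cleanup=False) -> str:
    """Build (but do not run) an Invoke-AtomicTest one-liner."""
    cmd = f"Invoke-AtomicTest {technique}"
    if test_numbers:
        # Run only specific atomics for the technique
        cmd += " -TestNumbers " + ",".join(str(n) for n in test_numbers)
    if cleanup:
        # Revert changes the test made to the target system
        cmd += " -Cleanup"
    return cmd

# Validate detection for encoded PowerShell execution, then clean up.
print(invoke_atomic_cmd("T1059.001", test_numbers=[1]))
print(invoke_atomic_cmd("T1059.001", test_numbers=[1], cleanup=True))
```

Building the command string separately from execution makes it easy to log exactly what was run — which matters for the deconfliction procedures discussed under Safety Controls.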
SCYTHE
SCYTHE is a commercial adversary emulation platform that enables security teams to build and execute realistic attack scenarios. SCYTHE provides:
- A visual campaign builder for designing multi-step emulation plans
- Pre-built threat actor campaigns based on published intelligence
- Integration with ATT&CK for technique mapping
- Automated detection validation and reporting
Other Notable Tools
- Infection Monkey (Guardicore/Akamai): Open-source breach and attack simulation focusing on network propagation
- Invoke-AtomicRedTeam: PowerShell module for executing Atomic Red Team tests programmatically
- Prelude Operator: Lightweight adversary emulation agent with community and commercial versions
Building an Emulation Plan from CTI
A practical emulation plan translates intelligence into action. Here is the structure of an emulation plan document:
1. Threat Actor Overview
Summarize the actor: name/aliases, attributed sponsor, known targets, motivation, and operational history. Cite the intelligence sources used.
2. Scope and Objectives
Define what will be emulated (which phases of the kill chain), what systems are in scope, and what specific questions the emulation aims to answer.
3. Attack Flow
Document the step-by-step operational sequence:
| Phase | ATT&CK Technique | Procedure | Tool/Command | Expected Detection |
|---|---|---|---|---|
| Initial Access | T1566.001 Spear-phishing Attachment | Macro-enabled Word doc delivered via email | Crafted .docm with VBA macro | Email gateway alert, Endpoint macro execution |
| Execution | T1059.001 PowerShell | Macro launches encoded PowerShell | powershell -enc [base64] | Process creation (Sysmon EID 1), Script block logging (EID 4104) |
| Persistence | T1053.005 Scheduled Task | Create scheduled task for callback | schtasks /create ... | Scheduled task creation (EID 4698), Sysmon EID 1 |
| C2 | T1071.001 Web Protocols | HTTPS beacon to C2 server | Custom C2 over HTTPS | Network anomaly detection, DNS query logging |
4. Detection Expectations
For each step, document what detection should fire, in which tool (SIEM, EDR, NDR), and what the expected alert looks like. This becomes the scorecard.
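The scorecard can be as simple as a table of expected detections per step, marked off as the exercise runs. A minimal sketch, with hypothetical tool and alert names:

```python
# Each row pairs an emulated technique with the detection expected to
# fire; "fired" is filled in live during the exercise.
scorecard = [
    {"technique": "T1566.001", "tool": "email-gateway",
     "expected_alert": "Malicious attachment quarantined", "fired": True},
    {"technique": "T1059.001", "tool": "EDR",
     "expected_alert": "Encoded PowerShell execution", "fired": True},
    {"technique": "T1053.005", "tool": "SIEM",
     "expected_alert": "Scheduled task created (EID 4698)", "fired": False},
]

missed = [row["technique"] for row in scorecard if not row["fired"]]
print(f"{len(scorecard) - len(missed)}/{len(scorecard)} expected detections fired")
print("gaps:", missed)  # gaps: ['T1053.005']
```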
5. Safety Controls
Define abort criteria, deconfliction procedures (how to distinguish emulation from a real incident), and rollback procedures for any changes made to target systems.
Purple Teaming
Definition: Purple teaming is a collaborative security exercise where red team (offensive) and blue team (defensive) personnel work together in real-time, with the explicit goal of improving detection and response capabilities.
Purple Team vs. Traditional Red Team
In a traditional red team engagement, the red team operates covertly and the blue team is unaware. Success is measured by whether the red team achieves its objective undetected.
In a purple team exercise, both teams work together:
- The red team executes a technique and announces it
- The blue team checks whether the activity was detected
- If detected: validate the alert quality and response procedures
- If not detected: collaboratively investigate why and build the detection
- Move to the next technique and repeat
Purple Team Workflow
1. Plan: CTI team identifies the priority threat actor and builds the TTP profile
2. Prepare: Red team builds the emulation plan; blue team identifies existing detections for each technique
3. Execute: Technique by technique, the red team executes while the blue team monitors
4. Assess: For each technique, record: Was it detected? By what tool? How long until detection? Was the alert actionable?
5. Remediate: For gaps, collaboratively build new detections, tune existing ones, or identify telemetry collection needs
6. Report: Document findings, new detections created, gaps remaining, and prioritized recommendations
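The execute/assess cycle reduces to a loop over techniques. A toy sketch, where blue_detections stands in for a real SIEM/EDR query and the technique IDs are illustrative:

```python
# Techniques scheduled for the exercise, in execution order.
techniques = ["T1566.001", "T1059.001", "T1053.005", "T1071.001"]

# Stand-in for the blue team's current rule coverage.
blue_detections = {"T1566.001", "T1059.001"}

results = []
for t in techniques:
    detected = t in blue_detections       # blue team checks telemetry
    results.append({"technique": t, "detected": detected})
    if not detected:
        # In a live exercise, both teams stop here and build the
        # detection together before moving to the next technique.
        print(f"{t}: no detection -> build one now")

print(sum(r["detected"] for r in results), "of", len(results), "detected")
```

The key property of the loop is the immediate stop-and-fix on a miss, which is exactly what distinguishes purple teaming from a covert red team engagement.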
Benefits of Purple Teaming
- Immediate feedback loop: Detections are validated or created in real-time, not weeks after a report
- Knowledge transfer: Blue team learns attack techniques; red team learns defensive blind spots
- Measurable outcomes: Clear metrics on detection coverage before and after the exercise
- Prioritized by threat: Focus is on the actors and techniques most relevant to the organization
Measuring Defensive Coverage
Emulation exercises produce measurable data about defensive capability:
| Metric | Description |
|---|---|
| Detection Rate | Percentage of emulated techniques that triggered an alert |
| Mean Time to Detect | Average time between technique execution and alert generation |
| Alert Quality | Was the alert specific enough for an analyst to understand and act on? |
| Response Effectiveness | Did the SOC respond correctly when alerted? |
| Coverage Delta | Change in ATT&CK coverage before vs. after the exercise |
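Detection Rate and Mean Time to Detect fall directly out of timestamped exercise results. A minimal sketch with invented timestamps:

```python
from datetime import datetime

# Illustrative results: execution time and alert time (None = missed).
results = [
    {"technique": "T1566.001", "executed": datetime(2025, 6, 1, 10, 0),
     "alerted": datetime(2025, 6, 1, 10, 4)},
    {"technique": "T1059.001", "executed": datetime(2025, 6, 1, 10, 30),
     "alerted": datetime(2025, 6, 1, 10, 31)},
    {"technique": "T1053.005", "executed": datetime(2025, 6, 1, 11, 0),
     "alerted": None},  # missed -> feeds gap analysis
]

detected = [r for r in results if r["alerted"]]
detection_rate = len(detected) / len(results)
mttd = sum(((r["alerted"] - r["executed"]).total_seconds()
            for r in detected), 0.0) / len(detected)

print(f"detection rate: {detection_rate:.0%}")    # 67%
print(f"mean time to detect: {mttd/60:.1f} min")  # 2.5 min
```

Note that MTTD is computed only over detected techniques; misses are reported through the detection rate instead, so a single missed technique does not distort the timing metric.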
Gap Analysis
After an emulation exercise, gaps fall into three categories:
- Telemetry gaps: The necessary data is not being collected (e.g., no PowerShell script block logging enabled)
- Detection logic gaps: The data is available but no rule exists to detect the behavior
- Response gaps: The detection fired but the SOC did not respond effectively (training, process, or tooling issue)
Each gap type requires a different remediation approach: telemetry gaps need engineering changes, detection gaps need rule development, and response gaps need process improvement or training.
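This three-way triage can be captured as a small decision helper. The boolean inputs are what a post-exercise review establishes for each missed or mishandled technique; the remediation strings paraphrase the categories above.

```python
def classify_gap(telemetry_collected: bool, rule_exists: bool,
                 response_effective: bool) -> str:
    """Map review findings to a gap category and remediation path."""
    if not telemetry_collected:
        # No data means no rule can ever fire; fix collection first.
        return "telemetry gap -> engineering change (enable logging)"
    if not rule_exists:
        return "detection logic gap -> write or tune a rule"
    if not response_effective:
        return "response gap -> process improvement or training"
    return "no gap"

# e.g. PowerShell script block logging disabled entirely:
print(classify_gap(False, False, False))
```

The ordering of the checks matters: telemetry gaps mask detection gaps, which in turn mask response gaps, so each must be closed before the next layer can be assessed.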
How CTI Drives the Entire Process
CTI is the engine of adversary emulation and purple teaming:
- Before: CTI identifies which actors matter, what their TTPs are, and what the emulation should replicate
- During: CTI provides context during the exercise — "this is how APT29 actually does this in the wild"
- After: CTI validates whether the exercise was realistic, identifies gaps in the actor profile, and updates intelligence based on defensive findings
Organizations without a CTI function driving emulation default to generic penetration testing. With CTI integration, every emulation exercise directly improves resilience against the specific threats that matter most.
Key Takeaways
- Adversary emulation replicates specific threat actor behavior, making it fundamentally different from generic penetration testing
- MITRE ATT&CK provides the structural framework; CTI provides the content that drives emulation planning
- Tools like Caldera, Atomic Red Team, and SCYTHE enable automated and repeatable emulation execution
- Purple teaming maximizes value by combining offensive execution with real-time defensive validation
- Emulation results produce measurable metrics on detection coverage, response capability, and defensive gaps
- CTI is the central driver throughout the process — without quality intelligence, emulation becomes untargeted testing
Practical Exercise
Using the MITRE CTID Adversary Emulation Library (available at https://github.com/center-for-threat-informed-defense/adversary_emulation_library):
- Select one emulation plan (APT29 is recommended as it is the most detailed)
- Review the operational flow and identify the ATT&CK techniques used at each phase
- For five techniques in the plan, document:
- What telemetry source would be needed to detect it (e.g., Sysmon, Windows Security logs, network capture)
- Whether a Sigma rule exists in the SigmaHQ repository for this technique
- What a realistic false positive scenario would look like for that detection
- Create a detection scorecard (table format): Technique | Detection Exists (Y/N) | Data Source Required | Gap Type (if N)
- Prioritize the gaps: If you could only close three detection gaps, which three would provide the most defensive value and why?
Further Reading
- MITRE Center for Threat-Informed Defense. Adversary Emulation Library. Available at: https://github.com/center-for-threat-informed-defense/adversary_emulation_library
- Red Canary. Atomic Red Team. Available at: https://github.com/redcanaryco/atomic-red-team
- MITRE. Caldera Documentation. Available at: https://caldera.readthedocs.io/
- Applebaum, Andy et al. (2016). "Intelligent, Automated Red Team Emulation." Proceedings of ACSAC '16, ACM.