RedSheep Security
Advanced — Lesson 26

Adversary Emulation & Purple Teaming

10 min read

Adversary emulation is the practice of simulating real-world threat actor behavior to test and validate an organization's defenses. Unlike generic penetration testing, which seeks to find any path to compromise, adversary emulation follows a specific threat actor's known playbook — their tactics, techniques, and procedures (TTPs) — to answer the question: "Can we detect and respond to this specific threat?" This lesson covers emulation planning, tooling, purple teaming, and how CTI drives the entire process.

Learning Objectives

  • Distinguish adversary emulation from penetration testing and red teaming
  • Understand how MITRE ATT&CK and CTI reporting inform emulation planning
  • Identify key adversary emulation tools and frameworks
  • Describe the purple teaming methodology and its benefits
  • Measure defensive coverage and conduct gap analysis using emulation results

What Is Adversary Emulation?

Definition: Adversary emulation is a type of offensive security assessment where the red team replicates the specific tactics, techniques, and procedures of a known threat actor, based on cyber threat intelligence, to evaluate an organization's ability to detect and respond to that particular threat.

The key distinction from traditional offensive security activities:

| Activity | Objective | Approach | CTI Role |
|---|---|---|---|
| Vulnerability Assessment | Find vulnerabilities | Automated scanning | Minimal |
| Penetration Testing | Prove exploitability | Any path to objective | Limited |
| Red Teaming | Test overall security posture | Stealthy, objective-based | Moderate — scenario-driven |
| Adversary Emulation | Validate defense against specific actor | Follow actor's known playbook | Central — drives the entire plan |

Adversary emulation is inherently CTI-driven. Without quality intelligence about how a specific threat actor operates, there is nothing to emulate.

MITRE ATT&CK's Role in Emulation Planning

MITRE ATT&CK serves as the common language and structural framework for emulation planning. When a CTI team builds a threat actor profile mapped to ATT&CK techniques (as discussed in Lesson 24), that profile becomes the blueprint for an emulation plan.

From CTI to Emulation Plan

  1. Identify the priority threat actor based on your organization's threat landscape, sector, and intelligence requirements
  2. Compile the actor's TTP profile from published reporting, ATT&CK entries, and internal intelligence
  3. Map the attack flow — determine the sequence of techniques the actor typically uses, from initial access through actions on objectives
  4. Identify procedures — determine the specific tools, commands, and methods the actor employs at each step
  5. Build the emulation plan — create a step-by-step playbook that replicates the actor's operational sequence
  6. Define success criteria — what detection and response outcomes are you measuring?
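The six planning steps above can be sketched as a minimal data structure. This is an illustrative sketch only — the actor name, sources, techniques, and detection criteria below are placeholders, not a real plan.

```python
from dataclasses import dataclass, field

@dataclass
class EmulationStep:
    """One step in the actor's operational sequence (steps 3-5 of the plan)."""
    phase: str               # e.g. "Initial Access"
    technique_id: str        # ATT&CK technique, e.g. "T1566.001"
    procedure: str           # the specific method the actor uses (step 4)
    expected_detection: str  # success criterion for this step (step 6)

@dataclass
class EmulationPlan:
    actor: str                        # priority actor identified in step 1
    intel_sources: list[str]          # reporting compiled in step 2
    steps: list[EmulationStep] = field(default_factory=list)

# Illustrative plan fragment -- all values are placeholders
plan = EmulationPlan(
    actor="Example APT",
    intel_sources=["vendor report 2024-01", "ATT&CK group entry"],
    steps=[
        EmulationStep("Initial Access", "T1566.001",
                      "macro-enabled document via email",
                      "email gateway alert"),
        EmulationStep("Execution", "T1059.001",
                      "encoded PowerShell launched by macro",
                      "script block logging (EID 4104)"),
    ],
)
print(len(plan.steps))  # 2
```

Capturing the plan as structured data (rather than prose alone) makes steps 5 and 6 directly executable and scoreable later in the exercise.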

MITRE CTID Adversary Emulation Library

The MITRE Center for Threat-Informed Defense (CTID) maintains the Adversary Emulation Library, a collection of publicly available emulation plans based on real threat actors. As of 2025, the library includes plans for:

  • APT29 (Cozy Bear) — SVR-linked espionage operations
  • FIN6 — Financially motivated group targeting payment card data
  • menuPass (APT10) — Chinese espionage targeting managed service providers
  • Sandworm — GRU Unit 74455, destructive operations
  • Turla — Russian FSB-linked espionage group

Each emulation plan in the library includes a detailed operational flow, the specific ATT&CK techniques at each step, the tools and commands to execute, and the expected detection opportunities. These plans are freely available on the CTID GitHub repository.

Adversary Emulation Tools and Frameworks

MITRE Caldera

Caldera is an open-source adversary emulation platform developed by MITRE. It provides an automated framework for running adversary emulation operations using "abilities" (individual ATT&CK techniques implemented as executable plugins) organized into "adversary profiles" (sequences of abilities that model a specific actor).

Key features:

  • Agent-based execution on target systems
  • Ability to chain techniques into multi-step operations
  • Built-in ATT&CK technique coverage
  • Automated reporting of which techniques succeeded or were detected
  • Extensible plugin architecture
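Caldera operations can be driven through its REST API. The following is a hedged sketch of preparing an operation-start request: the server address, the /api/v2/operations endpoint, the "KEY" auth header, and the body fields are assumptions based on Caldera v4 and should be verified against your instance's API documentation.

```python
import json

CALDERA_URL = "http://localhost:8888/api/v2/operations"  # assumed endpoint
API_KEY = "ADMIN123"                                     # placeholder key

def build_operation_request(name: str, adversary_id: str) -> tuple[dict, str]:
    """Return (headers, body) for a request that starts an operation
    running a given adversary profile. Field names are assumptions."""
    headers = {"KEY": API_KEY, "Content-Type": "application/json"}
    body = json.dumps({
        "name": name,                                 # operation name
        "adversary": {"adversary_id": adversary_id},  # profile to run
        "auto_close": True,                           # close when abilities finish
    })
    return headers, body

headers, body = build_operation_request("apt29-validation", "adversary-uuid")
# Send with e.g. requests.post(CALDERA_URL, headers=headers, data=body)
print(json.loads(body)["name"])  # apt29-validation
```

Driving operations through the API (rather than the UI) is what makes emulation runs repeatable across exercises.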

Atomic Red Team

Atomic Red Team, developed by Red Canary, is a library of small, focused tests (called "atomics") mapped to individual ATT&CK techniques. Each atomic test is a self-contained procedure that executes a single technique on a target system.

Atomic Red Team differs from Caldera in philosophy:

  • Atomics are granular — each test covers one technique, executed independently
  • No agent required — tests are typically executed via command line (PowerShell, bash, cmd)
  • Low barrier to entry — tests can be run by any team member, not just red team specialists
  • Testing, not operations — atomics validate detection, they do not simulate full attack chains

The Atomic Red Team repository on GitHub contains tests for hundreds of ATT&CK techniques across Windows, macOS, and Linux.
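Because atomics run from the command line, they are easy to script. A sketch of wrapping the Invoke-AtomicTest cmdlet (from the Invoke-AtomicRedTeam module) for programmatic execution — this assumes a lab host with PowerShell and the module installed, and only builds the command here rather than running it:

```python
import subprocess

def atomic_test_command(technique: str, test_numbers: str = "1") -> list[str]:
    """Build a PowerShell invocation of Invoke-AtomicTest for one technique.

    -TestNumbers selects which atomic(s) under the technique to execute;
    consult the module's documentation for prerequisite and cleanup flags.
    """
    ps = f"Invoke-AtomicTest {technique} -TestNumbers {test_numbers}"
    return ["powershell", "-NoProfile", "-Command", ps]

cmd = atomic_test_command("T1059.001")
print(cmd[-1])  # Invoke-AtomicTest T1059.001 -TestNumbers 1
# On a lab host with the module installed: subprocess.run(cmd, check=True)
```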

SCYTHE

SCYTHE is a commercial adversary emulation platform that enables security teams to build and execute realistic attack scenarios. SCYTHE provides:

  • A visual campaign builder for designing multi-step emulation plans
  • Pre-built threat actor campaigns based on published intelligence
  • Integration with ATT&CK for technique mapping
  • Automated detection validation and reporting

Other Notable Tools

  • Infection Monkey (Guardicore/Akamai): Open-source breach and attack simulation focusing on network propagation
  • Invoke-AtomicRedTeam: PowerShell module for executing Atomic Red Team tests programmatically
  • Prelude Operator: Lightweight adversary emulation agent with community and commercial versions

Building an Emulation Plan from CTI

A practical emulation plan translates intelligence into action. Here is the structure of an emulation plan document:

1. Threat Actor Overview

Summarize the actor: name/aliases, attributed sponsor, known targets, motivation, and operational history. Cite the intelligence sources used.

2. Scope and Objectives

Define what will be emulated (which phases of the kill chain), what systems are in scope, and what specific questions the emulation aims to answer.

3. Attack Flow

Document the step-by-step operational sequence:

| Phase | ATT&CK Technique | Procedure | Tool/Command | Expected Detection |
|---|---|---|---|---|
| Initial Access | T1566.001 Spear-phishing Attachment | Macro-enabled Word doc delivered via email | Crafted .docm with VBA macro | Email gateway alert, endpoint macro execution |
| Execution | T1059.001 PowerShell | Macro launches encoded PowerShell | powershell -enc [base64] | Process creation (Sysmon EID 1), script block logging (EID 4104) |
| Persistence | T1053.005 Scheduled Task | Create scheduled task for callback | schtasks /create ... | Scheduled task creation (EID 4698), Sysmon EID 1 |
| C2 | T1071.001 Web Protocols | HTTPS beacon to C2 server | Custom C2 over HTTPS | Network anomaly detection, DNS query logging |
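The execution step's powershell -enc payload has a specific format worth knowing when building the emulation: the argument is Base64 over the UTF-16LE bytes of the command, not plain UTF-8. A short sketch (the command string is an arbitrary benign example):

```python
import base64

def powershell_encoded(command: str) -> str:
    """Reproduce the payload format expected by powershell -enc:
    Base64 over the UTF-16LE encoding of the command."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")

encoded = powershell_encoded("Write-Output hello")
# The emulated macro would then launch: powershell -enc <encoded>
# Detection opportunity: script block logging (EID 4104) records the
# decoded command even when the process command line shows only Base64.
decoded = base64.b64decode(encoded).decode("utf-16-le")
print(decoded)  # Write-Output hello
```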

4. Detection Expectations

For each step, document what detection should fire, in which tool (SIEM, EDR, NDR), and what the expected alert looks like. This becomes the scorecard.

5. Safety Controls

Define abort criteria, deconfliction procedures (how to distinguish emulation from a real incident), and rollback procedures for any changes made to target systems.

Purple Teaming

Definition: Purple teaming is a collaborative security exercise where red team (offensive) and blue team (defensive) personnel work together in real-time, with the explicit goal of improving detection and response capabilities.

Purple Team vs. Traditional Red Team

In a traditional red team engagement, the red team operates covertly and the blue team is unaware. Success is measured by whether the red team achieves its objective undetected.

In a purple team exercise, both teams work together:

  • The red team executes a technique and announces it
  • The blue team checks whether the activity was detected
  • If detected: validate the alert quality and response procedures
  • If not detected: collaboratively investigate why and build the detection
  • Move to the next technique and repeat

Purple Team Workflow

  1. Plan: CTI team identifies the priority threat actor and builds the TTP profile
  2. Prepare: Red team builds the emulation plan; blue team identifies existing detections for each technique
  3. Execute: Technique by technique, the red team executes while the blue team monitors
  4. Assess: For each technique, record: Was it detected? By what tool? How long until detection? Was the alert actionable?
  5. Remediate: For gaps, collaboratively build new detections, tune existing ones, or identify telemetry collection needs
  6. Report: Document findings, new detections created, gaps remaining, and prioritized recommendations

Benefits of Purple Teaming

  • Immediate feedback loop: Detections are validated or created in real-time, not weeks after a report
  • Knowledge transfer: Blue team learns attack techniques; red team learns defensive blind spots
  • Measurable outcomes: Clear metrics on detection coverage before and after the exercise
  • Prioritized by threat: Focus is on the actors and techniques most relevant to the organization

Measuring Defensive Coverage

Emulation exercises produce measurable data about defensive capability:

| Metric | Description |
|---|---|
| Detection Rate | Percentage of emulated techniques that triggered an alert |
| Mean Time to Detect | Average time between technique execution and alert generation |
| Alert Quality | Was the alert specific enough for an analyst to understand and act on? |
| Response Effectiveness | Did the SOC respond correctly when alerted? |
| Coverage Delta | Change in ATT&CK coverage before vs. after the exercise |
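The first two metrics fall straight out of per-technique exercise records. A sketch of the computation — the records below are made-up values for illustration, not real results:

```python
from statistics import mean

# Illustrative records: (technique, detected?, seconds until alert or None)
results = [
    ("T1566.001", True, 45.0),
    ("T1059.001", True, 12.0),
    ("T1053.005", False, None),
    ("T1071.001", True, 300.0),
]

detected = [r for r in results if r[1]]
detection_rate = len(detected) / len(results)  # Detection Rate
mttd = mean(r[2] for r in detected)            # Mean Time to Detect

print(f"detection rate: {detection_rate:.0%}")  # 75%
print(f"MTTD: {mttd:.1f}s")                     # 119.0s
```

Note that MTTD is computed only over detected techniques; the undetected ones are what the gap analysis below addresses.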

Gap Analysis

After an emulation exercise, gaps fall into three categories:

  1. Telemetry gaps: The necessary data is not being collected (e.g., no PowerShell script block logging enabled)
  2. Detection logic gaps: The data is available but no rule exists to detect the behavior
  3. Response gaps: The detection fired but the SOC did not respond effectively (training, process, or tooling issue)

Each gap type requires a different remediation approach: telemetry gaps need engineering changes, detection gaps need rule development, and response gaps need process improvement or training.
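Because each category implies a different owner and fix, it helps to classify gaps consistently during the assess step. A minimal sketch using the three categories above (the category names are the lesson's; the function itself is illustrative):

```python
def classify_gap(telemetry_collected: bool, rule_exists: bool,
                 response_effective: bool) -> str:
    """Map an emulated technique's outcome to one of the three gap types.

    Checks are ordered: without telemetry, no rule can fire; without a
    rule, response effectiveness cannot be evaluated.
    """
    if not telemetry_collected:
        return "telemetry gap"        # remediation: engineering changes
    if not rule_exists:
        return "detection logic gap"  # remediation: rule development
    if not response_effective:
        return "response gap"         # remediation: process/training
    return "no gap"

print(classify_gap(False, False, False))  # telemetry gap
print(classify_gap(True, False, False))   # detection logic gap
print(classify_gap(True, True, False))    # response gap
```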

How CTI Drives the Entire Process

CTI is the engine of adversary emulation and purple teaming:

  • Before: CTI identifies which actors matter, what their TTPs are, and what the emulation should replicate
  • During: CTI provides context during the exercise — "this is how APT29 actually does this in the wild"
  • After: CTI validates whether the exercise was realistic, identifies gaps in the actor profile, and updates intelligence based on defensive findings

Organizations without a CTI function driving emulation default to generic penetration testing. With CTI integration, every emulation exercise directly improves resilience against the specific threats that matter most.

Key Takeaways

  • Adversary emulation replicates specific threat actor behavior, making it fundamentally different from generic penetration testing
  • MITRE ATT&CK provides the structural framework; CTI provides the content that drives emulation planning
  • Tools like Caldera, Atomic Red Team, and SCYTHE enable automated and repeatable emulation execution
  • Purple teaming maximizes value by combining offensive execution with real-time defensive validation
  • Emulation results produce measurable metrics on detection coverage, response capability, and defensive gaps
  • CTI is the central driver throughout the process — without quality intelligence, emulation becomes untargeted testing

Practical Exercise

Using the MITRE CTID Adversary Emulation Library (available at https://github.com/center-for-threat-informed-defense/adversary_emulation_library):

  1. Select one emulation plan (APT29 is recommended as it is the most detailed)
  2. Review the operational flow and identify the ATT&CK techniques used at each phase
  3. For five techniques in the plan, document:
    • What telemetry source would be needed to detect it (e.g., Sysmon, Windows Security logs, network capture)
    • Whether a Sigma rule exists in the SigmaHQ repository for this technique
    • What a realistic false positive scenario would look like for that detection
  4. Create a detection scorecard (table format): Technique | Detection Exists (Y/N) | Data Source Required | Gap Type (if N)
  5. Prioritize the gaps: If you could only close three detection gaps, which three would provide the most defensive value and why?

Further Reading