Malware analysis is a critical skill in the CTI analyst's toolkit, but the way intelligence analysts approach malware differs fundamentally from how reverse engineers do. As a CTI analyst, your goal is not to understand every assembly instruction or reconstruct the complete logic of a binary. Instead, you need to extract actionable intelligence: what does this malware do, how does it communicate, what infrastructure does it use, and how does it connect to broader threat actor activity? This lesson covers the non-reversing approaches to malware analysis that every CTI analyst should master.
Learning Objectives
- Distinguish between behavioral analysis and deep reverse engineering, and understand when each is appropriate for CTI work
- Extract IOCs and behavioral indicators from sandbox reports without performing manual reverse engineering
- Perform basic static analysis using file properties, strings, PE headers, and fuzzy hashing
- Write and understand basic YARA rules for malware detection and classification
- Navigate malware naming conventions and use VirusTotal for intelligence pivoting
Behavioral Analysis vs. Reverse Engineering
Malware analysis exists on a spectrum of depth. At one end is basic triage — checking file hashes, scanning with antivirus, and reviewing metadata. At the other end is deep reverse engineering — disassembling binaries in IDA Pro or Ghidra, debugging execution flow, and reconstructing algorithms. CTI analysts typically operate in the middle, focusing on behavioral analysis.
Behavioral analysis is the process of observing what malware does when it executes — what files it creates, what registry keys it modifies, what network connections it makes, and what processes it spawns — without necessarily understanding the underlying code that produces those behaviors.
How CTI Analysts Use Malware Analysis Differently
| Aspect | Malware Researcher | CTI Analyst |
|---|---|---|
| Goal | Understand how the malware works internally | Extract intelligence to link, track, and defend |
| Depth | Full code-level reverse engineering | Behavioral observation + static triage |
| Output | Technical report on malware internals | IOCs, MITRE ATT&CK mappings, threat actor linkage |
| Time spent | Days to weeks per sample | Minutes to hours per sample |
| Tools | IDA Pro, Ghidra, x64dbg, WinDbg | Sandboxes, VirusTotal, YARA, string extractors |
This does not mean CTI analysts should avoid understanding malware internals — the more you understand, the better your analysis. But your time is better spent connecting malware to campaigns and actors than reconstructing encryption routines.
Sandbox Analysis
Sandboxes are automated environments that execute malware samples and record their behavior. They provide the behavioral analysis that CTI analysts rely on most heavily.
Major Sandbox Platforms
ANY.RUN is an interactive sandbox that lets you observe malware execution in real time through a browser interface. You can interact with the virtual machine during execution — clicking through installer prompts, opening email attachments, or providing input the malware expects. ANY.RUN provides process trees, network traffic, file system changes, and registry modifications. Its interactive nature makes it particularly useful for samples that require user interaction to detonate.
Joe Sandbox performs deep automated analysis across multiple operating systems (Windows, macOS, Linux, Android, iOS). It produces detailed reports with behavior graphs, MITRE ATT&CK mappings, and classification scores. Joe Sandbox is known for its evasion detection — it can identify when malware is checking for virtualized environments and attempting to avoid analysis.
Hybrid Analysis (powered by Falcon Sandbox, a CrowdStrike product) is a free community service that provides automated analysis reports. It extracts indicators, maps behaviors to MITRE ATT&CK, and allows searching across previously submitted samples. Its public database makes it valuable for finding related samples.
Cuckoo Sandbox is the best-known open-source automated malware analysis system, although the original project is no longer actively developed and maintained forks such as CAPE are now commonly used in its place. Organizations can deploy it internally, which is critical for handling sensitive samples that cannot be uploaded to public services. Cuckoo monitors API calls, network traffic, file operations, and memory, producing JSON-formatted reports that can be integrated into automated pipelines.
What Sandboxes Provide for CTI
Sandbox reports typically include:
- Network indicators: DNS queries, HTTP/HTTPS requests, IP connections, command-and-control (C2) communication patterns
- Host indicators: Files created/modified/deleted, registry changes, processes spawned, mutexes created
- Process behavior: Process trees showing parent-child relationships, injected processes, privilege escalation attempts
- MITRE ATT&CK mapping: Automated classification of observed behaviors to ATT&CK techniques
- Screenshots: Visual timeline of what appeared on screen during execution
- Dropped files: Secondary payloads or configuration files extracted during execution
- PCAP files: Full packet captures of network traffic for further analysis
Extracting IOCs from Sandbox Reports
When reviewing a sandbox report, systematically extract the following:
- Network IOCs: C2 domains, IP addresses, URLs, URI patterns, User-Agent strings, JA3/JA3S fingerprints
- File IOCs: Hashes (MD5, SHA-1, SHA-256) of the original sample and any dropped files, file names, file paths used
- Host IOCs: Registry keys created or modified, scheduled tasks, services installed, mutex names
- Behavioral signatures: Specific sequences of API calls, process injection techniques, persistence mechanisms
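The extraction checklist above can be sketched as a small script over a JSON-style report. This is a minimal illustration, not any vendor's schema: the field names (`target`, `network`, `dropped`, `behavior`) and every value in `EXAMPLE_REPORT` are hypothetical placeholders, and a real ANY.RUN, Joe Sandbox, Hybrid Analysis, or Cuckoo report would need its own field mapping.

```python
# Hypothetical report layout for illustration only; each sandbox uses its own schema.
EXAMPLE_REPORT = {
    "target": {"sha256": "a" * 64, "name": "invoice.exe"},
    "network": {
        "dns": [{"request": "c2.example.com"}, {"request": "c2.example.com"}],
        "http": [{"url": "http://c2.example.com/gate.php", "user_agent": "Mozilla/4.0"}],
        "hosts": ["203.0.113.10"],
    },
    "dropped": [{"sha256": "b" * 64, "path": r"C:\Users\Public\svc.exe"}],
    "behavior": {
        "mutexes": [r"Global\ExampleMutex"],
        "registry_keys": [r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\svc"],
    },
}


def extract_iocs(report: dict) -> dict:
    """Collect de-duplicated, sorted IOCs grouped by type."""
    network = report.get("network", {})
    behavior = report.get("behavior", {})
    http = network.get("http", [])
    dropped = report.get("dropped", [])

    # Hash of the submitted sample plus hashes of anything it dropped.
    hashes = {d["sha256"] for d in dropped if d.get("sha256")}
    primary = report.get("target", {}).get("sha256")
    if primary:
        hashes.add(primary)

    return {
        "domains": sorted({q["request"] for q in network.get("dns", [])}),
        "ips": sorted(set(network.get("hosts", []))),
        "urls": sorted({r["url"] for r in http if r.get("url")}),
        "user_agents": sorted({r["user_agent"] for r in http if r.get("user_agent")}),
        "sha256": sorted(hashes),
        "file_paths": sorted({d["path"] for d in dropped if d.get("path")}),
        "mutexes": sorted(set(behavior.get("mutexes", []))),
        "registry_keys": sorted(set(behavior.get("registry_keys", []))),
    }
```

Calling `extract_iocs(EXAMPLE_REPORT)` returns one dictionary keyed by indicator type, which is a convenient shape for feeding a threat intelligence platform or a report appendix.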
Basic Static Analysis Without Reversing
Static analysis examines a file without executing it. You do not need a disassembler to extract substantial intelligence from a binary.
File Properties and Metadata
Start with the basics. Check file size, type, and compilation timestamp. PE (Portable Executable) files contain a compilation timestamp in the COFF header — while this can be forged, inconsistencies between the timestamp and other evidence can themselves be indicators. Tools like file (on Linux), PEStudio, or CFF Explorer display these properties.
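Examine the PE header for additional metadata: the linker version can suggest the development environment, the subsystem field indicates GUI vs. console application, and a checksum that does not match the file's contents can suggest post-compilation modification (although many legitimate builds leave the field set to zero, so a missing checksum proves little on its own).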
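To make the header layout concrete, the following sketch reads the compilation timestamp directly from the raw bytes using only the standard library. It relies on the documented PE layout (the 4-byte offset at 0x3C pointing to the `PE\0\0` signature, followed by the COFF header); the function name `pe_compile_time` is an illustrative choice, and tools such as PEStudio do the same job with far more validation.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")

    # e_lfanew: 4-byte little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")

    # COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

A typical call would be `pe_compile_time(open(path, "rb").read())`. Because the field is trivially forged, treat the result as one data point to compare against first-seen dates and infrastructure registration times.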
Strings Analysis
Running the strings utility (or FLOSS, the FLARE Obfuscated String Solver, originally released as the FireEye Labs Obfuscated String Solver, for obfuscated strings) against a binary can reveal:
- URLs, domains, and IP addresses used for C2
- File paths that the malware reads or writes
- Registry keys for persistence
- Error messages and debug strings left by the developer
- Encryption keys or passwords embedded in the binary
- PDB (Program Database) paths that reveal the developer's build environment
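As a rough illustration of what a strings pass looks like, the sketch below pulls printable ASCII runs out of raw bytes and sorts them into a few of the categories listed above. The regular expressions are deliberately simple assumptions, not production-grade patterns, and the sketch ignores UTF-16 ("wide") strings and obfuscated strings, which the real strings utility and FLOSS handle.

```python
import re

# Simplified patterns for illustration; expect false positives on real samples.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PDB_RE = re.compile(r"[A-Za-z]:\\[^\r\n]*?\.pdb", re.IGNORECASE)


def extract_strings(data: bytes, min_len: int = 6) -> list:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]


def triage_strings(strings) -> dict:
    """Group interesting strings by indicator type."""
    found = {"urls": set(), "ipv4": set(), "pdb_paths": set()}
    for s in strings:
        found["urls"].update(URL_RE.findall(s))
        found["ipv4"].update(IPV4_RE.findall(s))
        found["pdb_paths"].update(PDB_RE.findall(s))
    return {kind: sorted(values) for kind, values in found.items()}
```

In practice you would run `triage_strings(extract_strings(open(path, "rb").read()))` and then review the full string list by hand, since error messages and developer artifacts rarely fit a fixed pattern.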
Import Table Analysis
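The import table of a PE file lists the Windows API functions the binary links against. Certain combinations of imports suggest specific capabilities, although legitimate software uses the same APIs, so treat them as leads to confirm in a sandbox rather than as proof of malicious behavior: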
| Imports | Likely Behavior |
|---|---|
| VirtualAllocEx, WriteProcessMemory, CreateRemoteThread | Process injection |
| InternetOpen, HttpSendRequest | HTTP-based C2 communication |
| CryptEncrypt, CryptDecrypt | Data encryption (possibly ransomware) |
| RegSetValueEx, CreateService | Persistence mechanisms |
| AdjustTokenPrivileges, LookupPrivilegeValue | Privilege escalation |
Tools like PEStudio, CFF Explorer, and PE-bear display imports in an easily reviewable format.
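Once a tool has produced the import list, the lookup described by the table can be automated. The sketch below is one possible heuristic, not an established detection method: the capability map mirrors the table (using VirtualAllocEx, the cross-process allocation call, for the injection entry), a capability is flagged only when every API in its set is imported, and the trailing A/W suffix that Windows adds to ANSI and Unicode variants is stripped before comparison.

```python
# Capability hints taken from the table above; a hint is a lead, not a verdict.
CAPABILITY_HINTS = {
    "Process injection": {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"},
    "HTTP-based C2 communication": {"InternetOpen", "HttpSendRequest"},
    "Data encryption (possibly ransomware)": {"CryptEncrypt", "CryptDecrypt"},
    "Persistence mechanisms": {"RegSetValueEx", "CreateService"},
    "Privilege escalation": {"AdjustTokenPrivileges", "LookupPrivilegeValue"},
}

KNOWN_APIS = set().union(*CAPABILITY_HINTS.values())


def normalize_import(name: str) -> str:
    """Map ANSI/Unicode variants such as InternetOpenA to their base name."""
    if name and name[-1] in "AW" and name[:-1] in KNOWN_APIS:
        return name[:-1]
    return name


def likely_capabilities(imports) -> list:
    """Return the capabilities whose full API set appears in the import list."""
    present = {normalize_import(name) for name in imports}
    return [
        capability
        for capability, required in CAPABILITY_HINTS.items()
        if required <= present
    ]
```

Requiring the full set keeps noise down but will miss malware that resolves APIs at runtime or is packed, in which case the import table may show little more than LoadLibrary and GetProcAddress.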
Fuzzy Hashing with ssdeep
ssdeep implements Context Triggered Piecewise Hashing (CTPH), which produces a hash that can be compared to other hashes to determine similarity. Unlike cryptographic hashes where a single byte change produces a completely different hash, ssdeep hashes of similar files will have a high match score.
This is invaluable for CTI because malware authors frequently recompile their tools with minor changes. Cryptographic hashes will not match, but ssdeep can identify samples that share large runs of identical content. Match scores range from 0 to 100; as a rule of thumb, a score above 50 suggests substantial similarity, but validate any threshold against samples you already know to be related.
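The contrast between the two kinds of hash can be demonstrated with the standard library alone. Note the substitution: the sketch below does not implement ssdeep or CTPH. It uses `difflib.SequenceMatcher` on the raw bytes as a stand-in similarity measure, scaled to 0-100 to resemble a match score, purely to show why a similarity measure survives a small edit while a cryptographic hash does not. The sample byte strings are invented.

```python
import hashlib
from difflib import SequenceMatcher


def sha256_hex(data: bytes) -> str:
    """Cryptographic hash: any change yields an unrelated digest."""
    return hashlib.sha256(data).hexdigest()


def similarity_score(a: bytes, b: bytes) -> int:
    """Stand-in similarity measure (NOT ssdeep), scaled to 0-100."""
    ratio = SequenceMatcher(None, a, b, autojunk=False).ratio()
    return round(ratio * 100)


# Two invented "builds" that differ only in a beacon interval.
original = b"connect to c2.example.com and beacon every 60 seconds " * 4
variant = original.replace(b"60", b"90")

print(sha256_hex(original) == sha256_hex(variant))  # digests differ
print(similarity_score(original, variant))          # score stays high
```

Unlike this stand-in, ssdeep compares compact hash strings rather than full file contents, which is what makes it practical across large sample collections.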
YARA Rules Basics
YARA is a pattern-matching tool used to identify and classify malware. CTI analysts write YARA rules to detect known malware families, track variations of a sample, and scan large file repositories.
A basic YARA rule structure:
rule APT_Backdoor_Example {
    meta:
        description = "Detects Example backdoor used by Threat Group"
        author = "Analyst Name"
        date = "2026-01-15"
        reference = "https://example.com/report"
    strings:
        $mutex = "Global\\ExampleMutex" ascii
        $c2_pattern = /https?:\/\/[a-z]{5,10}\.example\.(com|net)/
        $pdb = "C:\\Users\\dev\\malware\\Release\\backdoor.pdb"
        $magic = { 4D 5A 90 00 } // "MZ" magic plus the bytes that commonly follow it
    condition:
        // Require the executable header at offset 0 and any two of the three indicators
        $magic at 0 and (2 of ($mutex, $c2_pattern, $pdb))
}
YARA rules consist of three main sections: meta (descriptive information), strings (patterns to match — text, hex, or regex), and condition (logic for when the rule fires). The condition section uses Boolean logic to define what combination of strings must match.
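To see how the condition combines the strings, the example rule's logic can be emulated in plain Python. This is only an illustration of the Boolean logic and is not how YARA is implemented or invoked; real scanning should use YARA itself. The function name `example_rule_matches` is hypothetical.

```python
import re

MAGIC = bytes.fromhex("4D5A9000")
MUTEX = b"Global\\ExampleMutex"
C2_PATTERN = re.compile(rb"https?://[a-z]{5,10}\.example\.(com|net)")
PDB = b"C:\\Users\\dev\\malware\\Release\\backdoor.pdb"


def example_rule_matches(data: bytes) -> bool:
    """Emulate: $magic at 0 and (2 of ($mutex, $c2_pattern, $pdb))."""
    magic_at_0 = data.startswith(MAGIC)
    indicator_hits = [
        MUTEX in data,                        # $mutex
        C2_PATTERN.search(data) is not None,  # $c2_pattern
        PDB in data,                          # $pdb
    ]
    # Booleans sum as 0/1, so this counts how many indicators matched.
    return magic_at_0 and sum(indicator_hits) >= 2
```

Requiring two of three indicators is a common trade-off: a single string such as a mutex name may be reused or changed between builds, while demanding all three would miss variants.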
VirusTotal Intelligence for Pivoting
VirusTotal (VT) aggregates scan results from 70+ antivirus engines, but its intelligence value for CTI extends far beyond detection names. With a VT Enterprise or VT Intelligence account, analysts can:
- Search by content: Find samples containing specific strings, imports, or file properties using VT's search modifiers (e.g., content:"specific_mutex_name")
- Pivot on infrastructure: Search for all samples that communicated with a specific domain or IP address
- Graph relationships: Use VT Graph to visualize connections between files, domains, IPs, and URLs
- Track submissions: Monitor when and from where samples were uploaded, which can indicate targeting geography
- Retrohunt: Run YARA rules against VT's historical corpus to find samples you may have missed
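The pivoting idea itself is simple set logic, shown here on local data. The sketch makes no VirusTotal API calls; it assumes you have already exported (sample, indicator) pairs from whatever source you use, and the sample names and indicators in the usage note are invented.

```python
from collections import defaultdict


def pivot_on_infrastructure(observations, seed_sample: str) -> dict:
    """Find samples sharing at least one network indicator with the seed.

    observations: iterable of (sample_id, indicator) pairs.
    Returns {related_sample_id: sorted list of shared indicators}.
    """
    samples_by_indicator = defaultdict(set)
    indicators_by_sample = defaultdict(set)
    for sample_id, indicator in observations:
        samples_by_indicator[indicator].add(sample_id)
        indicators_by_sample[sample_id].add(indicator)

    related = defaultdict(set)
    for indicator in indicators_by_sample.get(seed_sample, set()):
        for other in samples_by_indicator[indicator]:
            if other != seed_sample:
                related[other].add(indicator)
    return {sample_id: sorted(shared) for sample_id, shared in related.items()}
```

For example, given pairs such as `("sample_a", "c2.example.com")` and `("sample_b", "c2.example.com")`, pivoting from `sample_a` returns `sample_b` together with the shared domain. Samples that overlap on several indicators are usually stronger candidates for the same campaign than those sharing one widely used IP address.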
Malware Naming Conventions
One persistent challenge in CTI is the lack of standardized malware naming. The same malware family may have different names across vendors:
| Vendor A | Vendor B | Vendor C | Common Name |
|---|---|---|---|
| Trojan.Neurevt | Win32/Betabot | Backdoor.Neurevt | Betabot |
| Ransom.WannaCrypt | Trojan-Ransom.Win32.Wanna | W32/WannaCry | WannaCry |
MITRE ATT&CK's Software entries (under the Malware and Tool categories) provide a reference point, listing known aliases for each malware family. When documenting malware in intelligence reports, include the hash values alongside any name to ensure unambiguous identification.
AV naming generally follows a loose convention such as Type:Platform/Family.Variant (the order Microsoft uses), but each vendor orders and punctuates the components differently. Focus on the family name component and cross-reference using hash lookups rather than relying on name consistency.
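One lightweight way to cope with inconsistent names is to keep an alias table and reduce each vendor's detection string to a common family name. The sketch below is a minimal example built from the two families in the table above; the alias table, the `Generic.Malware` label in the test data, and the function names are illustrative assumptions, and a real deployment would draw aliases from a maintained source such as the MITRE ATT&CK Software entries.

```python
import re
from collections import Counter
from typing import Optional

# Alias tokens (lowercase) mapped to a common family name; illustrative only.
FAMILY_ALIASES = {
    "neurevt": "Betabot",
    "betabot": "Betabot",
    "wannacrypt": "WannaCry",
    "wannacry": "WannaCry",
    "wanna": "WannaCry",
}


def common_family(detection_name: str) -> Optional[str]:
    """Return the common family name for one vendor detection, if known."""
    tokens = re.split(r"[^a-z0-9]+", detection_name.lower())
    for token in tokens:
        if token in FAMILY_ALIASES:
            return FAMILY_ALIASES[token]
    return None


def consensus_family(detections: dict) -> Optional[str]:
    """Pick the family most vendors agree on; detections maps vendor to name."""
    families = [common_family(name) for name in detections.values()]
    counts = Counter(f for f in families if f is not None)
    return counts.most_common(1)[0][0] if counts else None
```

Whatever name comes out, store it alongside the sample's hash so that the record stays unambiguous if the alias table later changes.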
Key Takeaways
- CTI analysts focus on behavioral analysis and indicator extraction rather than deep reverse engineering — your goal is actionable intelligence, not code reconstruction
- Sandboxes (ANY.RUN, Joe Sandbox, Hybrid Analysis, Cuckoo) automate behavioral analysis and provide network indicators, host artifacts, dropped files, and MITRE ATT&CK mappings
- Basic static analysis — strings, imports, PE headers, and ssdeep fuzzy hashing — yields substantial intelligence without disassembly
- YARA rules are essential for detecting, classifying, and hunting for malware across your environment and file repositories
- VirusTotal is a pivoting platform, not just a scanner — use it to find related samples, track infrastructure, and run retrohunts
- Malware naming is inconsistent across vendors; always include hash values and cross-reference with MITRE ATT&CK Software entries
Practical Exercise
- Go to Hybrid Analysis (hybrid-analysis.com) and search for a recently analyzed sample (choose something with a high threat score).
- Review the full sandbox report. Extract and document the following in a structured format:
  - All network IOCs (domains, IPs, URLs)
  - All host IOCs (files created, registry keys, mutexes)
  - MITRE ATT&CK techniques observed
  - Any dropped or downloaded secondary payloads
- Search the primary sample hash on VirusTotal. Note the different names assigned by at least five AV vendors. Identify the common family name.
- Using the network IOCs you extracted, pivot in VirusTotal to find other samples that communicated with the same infrastructure. Document how many related samples you find and any patterns you observe.
- Write a basic YARA rule using at least two indicators from your analysis (e.g., a mutex name and a C2 URL pattern) that could detect this malware family.
Further Reading
- Sikorski, M. & Honig, A. (2012). Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software. No Starch Press.
- MITRE ATT&CK Software List: https://attack.mitre.org/software/
- YARA Documentation: https://yara.readthedocs.io/
- VirusTotal Documentation: https://docs.virustotal.com/