SIEM for CTI Analysts — CTI Academy

A Security Information and Event Management (SIEM) system is one of the most important tools in a CTI analyst's operational toolkit. While SIEMs are primarily associated with security operations and incident detection, CTI analysts use them daily to validate intelligence, hunt for threats, assess organizational exposure, and translate threat knowledge into actionable detections. Understanding how to query a SIEM effectively transforms a CTI analyst from someone who produces reports into someone who directly defends the network.

Learning Objectives

Understand what a SIEM is and how it fits into security operations
Recognize the major SIEM platforms and their query languages
Write basic queries in SPL (Splunk), Kibana Query Language (Elastic), and Kusto Query Language (Microsoft Sentinel)
Use a SIEM for CTI-specific tasks: IOC lookups, hunting, and detection validation
Translate intelligence into SIEM searches and manage false positives

What Is a SIEM?

A SIEM collects, normalizes, indexes, and correlates log data from across an organization's IT infrastructure. It provides a centralized platform for searching through security-relevant events, building detection rules, and investigating incidents.

Core SIEM capabilities include:

Log aggregation: Collecting logs from endpoints, servers, network devices, cloud services, and applications into a single searchable repository
Normalization: Converting logs from different formats into a common schema so they can be queried consistently
Search and query: Providing a query language to search through collected data
Alerting: Triggering notifications when predefined conditions are met (correlation rules, saved searches)
Dashboards and visualization: Presenting data in charts, tables, and graphs for operational awareness
Retention: Storing historical data for investigation and compliance
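Normalization is easiest to see with a small example. The following Python sketch is illustrative only: the vendor names, the raw field names, and the mapping table are assumptions made for this example, and the target field names simply mirror the dotted style used in the Elastic queries later in this lesson. It shows how two differently shaped log records can be searched with one filter once they share a schema.

```python
# Hypothetical field mappings for two made-up log sources (assumptions for illustration).
FIELD_MAPS = {
    "vendor_a_firewall": {"src": "source.ip", "dst": "destination.ip", "act": "event.action"},
    "vendor_b_proxy": {"c-ip": "source.ip", "s-ip": "destination.ip", "sc-filter-result": "event.action"},
}


def normalize(event: dict, source_type: str) -> dict:
    """Rename vendor-specific fields to a common schema.

    Fields without a mapping are kept under a 'raw.' prefix so nothing is lost.
    """
    mapping = FIELD_MAPS[source_type]
    normalized = {"event.source_type": source_type}
    for key, value in event.items():
        normalized[mapping.get(key, f"raw.{key}")] = value
    return normalized


firewall_event = normalize(
    {"src": "10.1.2.3", "dst": "203.0.113.45", "act": "allow"}, "vendor_a_firewall"
)
proxy_event = normalize(
    {"c-ip": "10.1.2.4", "s-ip": "203.0.113.45", "sc-filter-result": "OBSERVED", "cs-method": "GET"},
    "vendor_b_proxy",
)

# After normalization, a single filter works across both sources.
hits = [e for e in (firewall_event, proxy_event) if e["destination.ip"] == "203.0.113.45"]
print(len(hits))
```

Real SIEMs perform this step with vendor-maintained parsers and data models rather than hand-written dictionaries, but the principle is the same: queries are written against the common field names, not the raw ones.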

Key Concept: A SIEM is only as good as the data it ingests. If critical log sources are not feeding into the SIEM, those blind spots cannot be searched, hunted, or alerted on. Understanding what data is — and is not — available is essential before writing queries.

Common SIEM Platforms

Splunk

Splunk is one of the most widely deployed SIEMs in enterprise and government environments. It uses the Search Processing Language (SPL) for queries. Splunk is known for its powerful search capabilities, extensive app ecosystem, and scalability, though it can be expensive at high data volumes.

Elastic (ELK Stack) / Elastic Security

The Elastic Stack (Elasticsearch, Logstash, Kibana — commonly called ELK) is an open-source platform frequently used as a SIEM. Elastic Security adds detection rules, case management, and endpoint capabilities. It uses Kibana Query Language (KQL) for simple searches and the Elasticsearch Query DSL for advanced queries. Elastic also supports Event Query Language (EQL) for sequence-based detections.

Microsoft Sentinel

Microsoft Sentinel is a cloud-native SIEM built on Azure. It uses Kusto Query Language (KQL), which shares an abbreviation with Elastic's KQL but is a distinct language with different syntax. Sentinel integrates tightly with Microsoft 365, Microsoft Entra ID (formerly Azure Active Directory), and Defender products.

IBM QRadar

QRadar is an enterprise SIEM that uses Ariel Query Language (AQL) for searches. It is commonly deployed in large enterprises and government environments. QRadar emphasizes offense-based analytics and asset profiling.

SIEM	Query Language	Deployment	Notable Strength
Splunk	SPL	On-prem / Cloud	Flexible search, massive app ecosystem
Elastic	KQL, EQL, Query DSL	On-prem / Cloud	Open-source core, strong for log analytics
Microsoft Sentinel	KQL (Kusto)	Cloud (Azure)	Deep Microsoft ecosystem integration
IBM QRadar	AQL	On-prem / SaaS	Asset-centric correlation, offense tracking

Basic Query Concepts

SPL (Search Processing Language)

SPL queries start with search terms (the search command is implied at the beginning of a query) and use pipes to chain processing steps, similar to Unix command-line pipelines.

Basic search for an IP address:

index=firewall sourcetype=pan:traffic dest_ip="203.0.113.45"

Search with time range and field filtering:

index=proxy sourcetype=bluecoat earliest=-7d latest=now url="*malicious-domain.com*"
| table _time src_ip dest_ip url http_method status

Aggregate and count events:

index=windows sourcetype=WinEventLog:Security EventCode=4625
| stats count by src_ip Account_Name
| sort -count

Look for process execution with command-line details:

index=sysmon EventCode=1 ParentImage="*\\winword.exe"
| table _time ComputerName User Image CommandLine ParentImage

Key SPL concepts:

index and sourcetype narrow the search to specific data sets
Pipe (|) chains commands together
stats performs aggregations (count, sum, avg, dc for distinct count)
table formats output as columns
where filters results with conditions
eval creates calculated fields
Wildcards (*) match any characters

KQL (Kibana Query Language — Elastic)

Elastic's KQL is simpler than SPL and is used in the Kibana search bar. It only filters documents by field values; it does not aggregate, sort, or transform results.

Basic search for an IP address:

destination.ip: "203.0.113.45"

Combining conditions:

event.category: "network" and destination.ip: "203.0.113.45" and not source.ip: "10.0.0.0/8"

Wildcard search:

url.full: *malicious-domain.com*

For more complex analysis in Elastic, analysts typically use Elasticsearch Query DSL (JSON-based) or EQL for sequence detection.

KQL (Kusto Query Language — Microsoft Sentinel)

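Sentinel's KQL is more expressive. A query starts from a table and pipes rows through operators such as where, project, and summarize, giving it SQL-like filtering and aggregation capabilities.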

Basic search for an IP address:

CommonSecurityLog
| where DestinationIP == "203.0.113.45"
| project TimeGenerated, SourceIP, DestinationIP, Activity

Search with time range:

SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType != "0"
| summarize FailureCount=count() by UserPrincipalName, IPAddress
| sort by FailureCount desc

How CTI Analysts Use SIEMs

IOC Lookups

The most basic CTI-SIEM interaction is searching for known indicators of compromise. When a new threat report is published with IOCs, the CTI analyst searches historical logs to determine whether the organization has been exposed.

SPL example — searching for multiple IPs:

index=firewall OR index=proxy
(dest_ip="203.0.113.45" OR dest_ip="198.51.100.23" OR dest_ip="192.0.2.100")
| stats count by index dest_ip src_ip
| sort -count

SPL example — searching for a file hash:

index=sysmon EventCode=1 (SHA256="a1b2c3d4e5f6..." OR MD5="abc123...")
| table _time ComputerName User Image CommandLine SHA256

For large-scale IOC lookups, most SIEMs support lookup tables or watchlists where IOC lists can be loaded and automatically matched against incoming data.
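For short indicator lists where a lookup table has not been set up, the OR clause used in the multi-IP search above can be generated rather than typed by hand. The Python sketch below is one possible approach, not a Splunk feature: the function name build_ioc_search and its defaults are hypothetical, and the field and index names are the ones assumed in the earlier examples. It validates each indicator, removes duplicates, and emits a search string.

```python
import ipaddress


def build_ioc_search(ips, field="dest_ip", indexes=("firewall", "proxy")):
    """Return an SPL search string matching any valid IP in `ips` against `field`.

    Hypothetical helper: index and field names are assumptions and must match
    the local Splunk deployment.
    """
    valid = []
    for ip in ips:
        try:
            valid.append(str(ipaddress.ip_address(ip.strip())))
        except ValueError:
            continue  # skip malformed indicators instead of producing a broken query
    if not valid:
        raise ValueError("no valid IP indicators supplied")
    unique = sorted(set(valid))
    index_clause = " OR ".join(f"index={name}" for name in indexes)
    ioc_clause = " OR ".join(f'{field}="{ip}"' for ip in unique)
    return f"({index_clause}) ({ioc_clause}) | stats count by index {field} src_ip | sort -count"


query = build_ioc_search(["203.0.113.45", "198.51.100.23", "not-an-ip", "203.0.113.45"])
print(query)
```

Generated OR clauses become unwieldy beyond a few dozen indicators, which is where the lookup tables and watchlists mentioned above are the better fit.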

Threat Hunting Queries

CTI analysts use SIEMs to execute TTP-based hunting queries. These searches look for behavioral patterns rather than specific indicators.

Hunt for potential DNS tunneling (SPL):

index=dns
| eval subdomain_length=len(mvindex(split(query, "."), 0))
| where subdomain_length > 50
| stats count by query src_ip
| sort -count
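The logic of the DNS query above can be checked offline before it is run against production data. This Python sketch reproduces the same heuristic under stated assumptions: events are plain dictionaries with query and src_ip keys, the sample domains are invented, and the threshold of 50 is taken from the SPL example rather than from any standard. DNS limits a single label to 63 characters, so the threshold flags labels that are close to that ceiling.

```python
from collections import Counter


def first_label_length(query: str) -> int:
    """Length of the leftmost DNS label, mirroring len(mvindex(split(query, "."), 0))."""
    return len(query.split(".")[0])


def long_label_queries(events, threshold=50):
    """Count (query, src_ip) pairs whose first label exceeds the threshold, most frequent first."""
    counts = Counter(
        (event["query"], event["src_ip"])
        for event in events
        if first_label_length(event["query"]) > threshold
    )
    return counts.most_common()


# Invented sample events; the long label stands in for encoded tunnel data.
suspicious = "a" * 60 + ".tunnel.example.net"
events = [
    {"query": "www.example.com", "src_ip": "10.0.0.5"},
    {"query": suspicious, "src_ip": "10.0.0.9"},
    {"query": suspicious, "src_ip": "10.0.0.9"},
]

results = long_label_queries(events)
print(results)
```

Length alone is a coarse signal. Some content delivery and telemetry services legitimately use long labels, so results from a hunt like this usually need review before any conclusion is drawn.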

Hunt for unusual scheduled task creation (SPL):

index=windows sourcetype=WinEventLog:Security EventCode=4698
| search NOT TaskName IN ("\\Microsoft\\*", "\\Adobe\\*", "\\Google\\*")
| table _time ComputerName SubjectUserName TaskName TaskContent

Hunt for PowerShell encoded commands (SPL):

index=sysmon EventCode=1 Image="*\\powershell.exe" (CommandLine="*-enc*" OR CommandLine="*-EncodedCommand*")
| table _time ComputerName User CommandLine ParentImage

Detection Validation

After creating a new detection rule, CTI analysts validate that it fires correctly and does not produce excessive false positives. This involves:

Running the detection query against historical data
Reviewing the results for true positives and false positives
Tuning the query to reduce noise while maintaining detection coverage
Testing against known-malicious samples or simulated attack activity, if available, to confirm that the rule fires
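The validation steps above amount to counting how a rule behaves on events whose ground truth is already known. The Python sketch below is a simplified illustration, assuming a small hand-labeled sample: the event fields follow the Sysmon names used in this lesson, the command-line payloads are placeholder strings, and the rule function is a rough stand-in for the encoded PowerShell hunt, not an export of a real detection.

```python
def evaluate_rule(rule, labeled_events):
    """Count true positives, false positives, and false negatives.

    `labeled_events` is a list of (event, is_malicious) pairs; `rule` returns
    True when the detection would fire on an event.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0}
    for event, is_malicious in labeled_events:
        fired = rule(event)
        if fired and is_malicious:
            counts["tp"] += 1
        elif fired:
            counts["fp"] += 1
        elif is_malicious:
            counts["fn"] += 1
    return counts


def encoded_powershell(event):
    """Stand-in rule: powershell.exe launched with an encoded command switch."""
    image = event.get("Image", "").lower()
    command_line = event.get("CommandLine", "").lower()
    return image.endswith("\\powershell.exe") and "-enc" in command_line


POWERSHELL = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
# Hand-labeled sample (invented): payload strings are placeholders.
labeled_events = [
    ({"Image": POWERSHELL, "CommandLine": "powershell.exe -enc PLACEHOLDER"}, True),
    ({"Image": POWERSHELL, "CommandLine": "powershell.exe -EncodedCommand PLACEHOLDER"}, False),
    ({"Image": r"C:\Program Files\PowerShell\7\pwsh.exe", "CommandLine": "pwsh.exe -enc PLACEHOLDER"}, True),
    ({"Image": r"C:\Windows\System32\notepad.exe", "CommandLine": "notepad.exe"}, False),
]

counts = evaluate_rule(encoded_powershell, labeled_events)
print(counts)
```

In this sample the benign administrative script is a false positive to tune out, while the pwsh.exe event is a false negative that shows a coverage gap. Both kinds of result matter when deciding whether a rule is ready to deploy.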

Translating Intelligence into SIEM Searches

One of the most valuable skills for a CTI analyst is translating threat intelligence into effective SIEM queries. The process follows these steps:

Read the intelligence report and identify TTPs, indicators, and targeted infrastructure
Map TTPs to data sources: Which logs would show evidence of each technique? (Use MITRE ATT&CK data source mappings as a reference)
Check data availability: Verify that the required log sources are ingested into the SIEM with sufficient detail
Write the query: Build a search that detects the behavioral pattern, not just the specific indicator
Test and tune: Run against historical data, assess false positive rate, refine as needed
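Steps 2 and 3 of the process above can be sketched as a simple set comparison. The mapping in this Python example is illustrative only: the technique IDs are real ATT&CK identifiers, but the data source labels are hypothetical shorthand invented for this sketch, and the authoritative data sources should always be taken from the ATT&CK technique pages themselves.

```python
# Illustrative mapping (assumption): labels are shorthand, not official ATT&CK data source names.
TECHNIQUE_DATA_SOURCES = {
    "T1059.001": {"process_creation", "script_block_logging"},
    "T1053.005": {"process_creation", "scheduled_task_events"},
    "T1003.001": {"process_access"},
}


def coverage_gaps(techniques, ingested_sources):
    """Return {technique: [missing data sources]} for techniques lacking required logs.

    Raises KeyError for a technique that has not been mapped yet, so an
    unmapped technique is never silently reported as covered.
    """
    gaps = {}
    for technique in techniques:
        missing = TECHNIQUE_DATA_SOURCES[technique] - set(ingested_sources)
        if missing:
            gaps[technique] = sorted(missing)
    return gaps


# Hypothetical inventory of what the SIEM currently ingests.
ingested = {"process_creation", "scheduled_task_events"}
gaps = coverage_gaps(["T1059.001", "T1053.005", "T1003.001"], ingested)
print(gaps)
```

A gap list like this is useful before any query is written: a technique with missing data sources needs a logging change, not a better search.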

Example translation:

An intelligence report states: "The threat actor uses renamed instances of PsExec for lateral movement."

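TTP: Lateral movement using a renamed PsExec binary (MITRE ATT&CK T1570, T1021.002; the renaming itself maps to Masquerading, T1036.003, and PsExec's remote execution to Service Execution, T1569.002)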
Data source needed: Sysmon Event ID 1 (process creation) with command-line logging
Behavioral indicator: A process with a non-standard name but the original description or hash of PsExec, or a process creating the PSEXESVC service

SPL query:

index=sysmon EventCode=1
(OriginalFileName="psexec.c" OR OriginalFileName="PsExec.exe")
NOT (Image="*\\PsExec.exe" OR Image="*\\PsExec64.exe")
| table _time ComputerName User Image OriginalFileName CommandLine

This query detects renamed PsExec by comparing the binary's OriginalFileName metadata against the actual file name on disk — catching the TTP regardless of what the adversary renamed the file to.
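The comparison the query performs can be expressed in a few lines of Python, which is handy for testing the logic against exported events. This is a sketch under assumptions: events are dictionaries keyed by the Sysmon field names used above, and the sample paths are invented.

```python
import ntpath  # Windows path handling that also works on non-Windows hosts

# Values taken from the SPL query above.
PSEXEC_ORIGINAL_NAMES = {"psexec.c", "psexec.exe"}
EXPECTED_IMAGE_NAMES = {"psexec.exe", "psexec64.exe"}


def is_renamed_psexec(event: dict) -> bool:
    """True when the embedded OriginalFileName says PsExec but the file on disk is named otherwise."""
    original = event.get("OriginalFileName", "").lower()
    image_name = ntpath.basename(event.get("Image", "")).lower()
    return original in PSEXEC_ORIGINAL_NAMES and image_name not in EXPECTED_IMAGE_NAMES


# Invented sample events.
renamed = {"Image": r"C:\Users\Public\svchost_update.exe", "OriginalFileName": "psexec.c"}
legitimate = {"Image": r"C:\Tools\PsExec64.exe", "OriginalFileName": "psexec.c"}
unrelated = {"Image": r"C:\Windows\System32\notepad.exe", "OriginalFileName": "NOTEPAD.EXE"}

print(is_renamed_psexec(renamed), is_renamed_psexec(legitimate), is_renamed_psexec(unrelated))
```

The check relies on version-resource metadata that an adversary can strip or alter, so it complements rather than replaces service-creation and hash-based detections.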

Log Sources That Matter

Not all log sources are equally valuable for CTI-driven analysis. The following are the most important:

Log Source	Key Value	Example Events
Endpoint (Sysmon/EDR)	Process execution, file creation, network connections per process	Process creation with command line, file writes, registry modifications
Network flow/firewall	Connection metadata, blocked traffic	Source/destination IPs, ports, bytes, allow/deny
DNS	Domain resolution, tunneling detection	Queried domains, response codes, query types
Proxy/web	URL-level visibility, user agent strings	Full URLs, HTTP methods, response codes, bytes
Authentication	Account usage, lateral movement	Logon success/failure, logon type, source workstation
Email	Phishing delivery, social engineering	Sender, recipient, subject, attachments, URLs, verdicts

Correlation Rules vs. Saved Searches

SIEMs offer two primary mechanisms for ongoing detection:

Correlation rules fire when multiple conditions are met within a defined time window. They detect patterns that span multiple events.

Example: "Alert when the same source IP fails authentication 10 times within 5 minutes and then successfully authenticates" (potential brute force followed by compromise)
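The brute-force example can be sketched as a sliding-window check per source IP. The Python below is a simplified model, not how any particular SIEM implements correlation: it assumes events arrive sorted by time as (timestamp in seconds, source IP, outcome) tuples, and it requires the success to occur while at least 10 failures are still inside the 5-minute window.

```python
from collections import defaultdict, deque


def brute_force_then_success(events, threshold=10, window_seconds=300):
    """Return (src_ip, timestamp) for each success preceded by >= threshold recent failures.

    `events` must be sorted by timestamp; outcome is "failure" or "success".
    """
    failures = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for timestamp, src_ip, outcome in events:
        recent = failures[src_ip]
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > window_seconds:
            recent.popleft()
        if outcome == "failure":
            recent.append(timestamp)
        elif outcome == "success":
            if len(recent) >= threshold:
                alerts.append((src_ip, timestamp))
            recent.clear()  # start fresh after a successful logon
    return alerts


# Invented sample: one IP fails 10 times then succeeds; another fails 3 times then succeeds.
events = sorted(
    [(i * 10, "198.51.100.23", "failure") for i in range(10)]
    + [(100, "198.51.100.23", "success")]
    + [(i * 10, "192.0.2.100", "failure") for i in range(3)]
    + [(40, "192.0.2.100", "success")]
)

alerts = brute_force_then_success(events)
print(alerts)
```

The per-IP state and time window are what distinguish a correlation rule from a single-event match: the alert depends on the sequence, not on any one log line.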

Saved searches / scheduled searches run a query on a defined schedule and generate alerts when results are found.

Example: "Every 15 minutes, search for any DNS query to domains on the IOC watchlist"

CTI analysts typically work more with saved searches (for IOC matching and TTP hunting) while SOC engineers build correlation rules for complex behavioral detection.

False Positive Management

False positives are the persistent enemy of effective detection. CTI analysts must actively manage them:

Baseline first: Before deploying a detection, understand what normal activity looks like for that data source
Whitelist carefully: Exclude known-good activity using specific criteria (exact process paths, known service accounts, expected scheduled tasks) rather than broad exclusions
Document exceptions: Every whitelist entry should include who added it, when, why, and a review date
Monitor false positive rates: Track the ratio of true positives to false positives for each detection rule
Iterate: Continuously refine queries based on operational experience
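The documentation practice described above (who, when, why, and a review date) maps naturally onto a small record type. This Python sketch is one hypothetical way to structure it: the class name, field names, and sample entries are all invented for illustration and do not correspond to any SIEM's built-in exception format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DetectionException:
    """One documented exclusion for a detection rule (hypothetical structure)."""
    rule_name: str
    criteria: str      # the specific condition excluded, e.g. an exact process path
    added_by: str
    added_on: date
    reason: str
    review_on: date


def overdue_reviews(exceptions, today):
    """Return exceptions whose review date has arrived or passed."""
    return [entry for entry in exceptions if entry.review_on <= today]


# Invented sample entries.
exceptions = [
    DetectionException(
        rule_name="Encoded PowerShell",
        criteria=r'ParentImage="C:\Program Files\ExampleAgent\agent.exe"',
        added_by="analyst_a",
        added_on=date(2024, 1, 15),
        reason="Management agent runs encoded inventory scripts",
        review_on=date(2024, 7, 15),
    ),
    DetectionException(
        rule_name="Scheduled Task Creation",
        criteria=r'TaskName="\ExampleVendor\Updater"',
        added_by="analyst_b",
        added_on=date(2024, 6, 1),
        reason="Vendor updater recreates its task on every patch",
        review_on=date(2025, 6, 1),
    ),
]

due = overdue_reviews(exceptions, today=date(2024, 9, 1))
print([entry.rule_name for entry in due])
```

Keeping exclusions in a reviewable structure like this makes stale entries visible, which helps prevent an old exception from quietly hiding real activity.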

Warning: Over-tuning a detection to eliminate all false positives often creates false negatives — real threats that the rule no longer catches. The goal is an acceptable balance, not zero false positives.

Key Takeaways

A SIEM is the primary tool for operationalizing threat intelligence — it turns knowledge into action
The major platforms (Splunk, Elastic, Sentinel, QRadar) have different query languages but similar core concepts
CTI analysts use SIEMs for IOC lookups, TTP-based hunting, detection creation, and validation
Translating intelligence into SIEM queries requires mapping TTPs to data sources and writing behavioral searches
Endpoint, network, DNS, proxy, authentication, and email logs are the most valuable data sources for CTI-driven analysis
False positive management is an ongoing discipline that requires baselining, careful whitelisting, and continuous iteration

Practical Exercise

Practice translating intelligence into SIEM queries:

Select a MITRE ATT&CK technique: Go to https://attack.mitre.org/ and choose a technique that interests you (e.g., T1059.001 — PowerShell, T1053.005 — Scheduled Task, or T1003.001 — LSASS Memory)
Review the ATT&CK page: Note the description, procedure examples, data sources, and detections listed
Identify the log source: What log source would capture evidence of this technique? Is it endpoint (Sysmon, EDR), network, authentication, or another source?
Write 2-3 queries: Using SPL or KQL (whichever you have access to, or pseudocode if neither), write queries that would detect the technique behaviorally
Consider false positives: For each query, list 2-3 legitimate activities that might trigger a false positive and how you would filter them out
Write a detection rule description: Document the query as a detection rule with a name, description, severity, MITRE mapping, and recommended response action
