A Security Information and Event Management (SIEM) system is one of the most important tools in a CTI analyst's operational toolkit. While SIEMs are primarily associated with security operations and incident detection, CTI analysts use them daily to validate intelligence, hunt for threats, assess organizational exposure, and translate threat knowledge into actionable detections. Understanding how to query a SIEM effectively transforms a CTI analyst from someone who produces reports into someone who directly defends the network.
Learning Objectives
- Understand what a SIEM is and how it fits into security operations
- Recognize the major SIEM platforms and their query languages
- Write basic queries in SPL (Splunk), Kibana Query Language (Elastic), and Kusto Query Language (Microsoft Sentinel)
- Use a SIEM for CTI-specific tasks: IOC lookups, hunting, and detection validation
- Translate intelligence into SIEM searches and manage false positives
What Is a SIEM?
A SIEM collects, normalizes, indexes, and correlates log data from across an organization's IT infrastructure. It provides a centralized platform for searching through security-relevant events, building detection rules, and investigating incidents.
Core SIEM capabilities include:
- Log aggregation: Collecting logs from endpoints, servers, network devices, cloud services, and applications into a single searchable repository
- Normalization: Converting logs from different formats into a common schema so they can be queried consistently
- Search and query: Providing a query language to search through collected data
- Alerting: Triggering notifications when predefined conditions are met (correlation rules, saved searches)
- Dashboards and visualization: Presenting data in charts, tables, and graphs for operational awareness
- Retention: Storing historical data for investigation and compliance
Key Concept: A SIEM is only as good as the data it ingests. If critical log sources are not feeding into the SIEM, those blind spots cannot be searched, hunted, or alerted on. Understanding what data is — and is not — available is essential before writing queries.
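To make normalization and the blind-spot check concrete, the following is a minimal Python sketch rather than any vendor's implementation. The field maps, source names, and the helpers normalize and missing_sources are hypothetical and exist only for illustration.

```python
# Hypothetical per-source field maps: raw field name -> common schema name.
FIELD_MAPS = {
    "firewall": {"src": "source_ip", "dst": "dest_ip", "ts": "timestamp"},
    "proxy": {"c-ip": "source_ip", "s-ip": "dest_ip", "time": "timestamp"},
}


def normalize(source_type, raw):
    """Rename source-specific fields to the common schema; keep unmapped fields."""
    mapping = FIELD_MAPS[source_type]
    event = {mapping.get(key, key): value for key, value in raw.items()}
    event["source_type"] = source_type
    return event


def missing_sources(required, ingested):
    """Return the log sources a query needs that the SIEM does not ingest."""
    return sorted(set(required) - set(ingested))


fw = normalize("firewall", {"src": "10.0.0.5", "dst": "203.0.113.45", "ts": "2024-01-01T00:00:00Z"})
px = normalize("proxy", {"c-ip": "10.0.0.7", "s-ip": "203.0.113.45", "time": "2024-01-01T00:01:00Z"})

# After normalization both events can be filtered on the same field name.
print([e["source_ip"] for e in (fw, px) if e["dest_ip"] == "203.0.113.45"])

# Blind spots: sources a hunt requires that are not being collected.
print(missing_sources(["dns", "proxy", "sysmon"], ["proxy", "firewall"]))
```

Running the coverage check before writing a query shows quickly whether an empty result means "no activity" or "no data".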
Common SIEM Platforms
Splunk
Splunk is one of the most widely deployed SIEMs in enterprise and government environments. It uses the Search Processing Language (SPL) for queries. Splunk is known for its powerful search capabilities, extensive app ecosystem, and scalability, though it can be expensive at high data volumes.
Elastic (ELK Stack) / Elastic Security
The Elastic Stack (Elasticsearch, Logstash, Kibana — commonly called ELK) is an open-source platform frequently used as a SIEM. Elastic Security adds detection rules, case management, and endpoint capabilities. It uses Kibana Query Language (KQL) for simple searches and the Elasticsearch Query DSL for advanced queries. Elastic also supports Event Query Language (EQL) for sequence-based detections.
Microsoft Sentinel
Microsoft Sentinel is a cloud-native SIEM built on Azure. It uses Kusto Query Language (KQL), which shares an acronym with Elastic's Kibana Query Language but is a distinct language with different syntax. Sentinel integrates tightly with Microsoft 365, Microsoft Entra ID (formerly Azure Active Directory), and Defender products.
IBM QRadar
QRadar is an enterprise SIEM that uses Ariel Query Language (AQL) for searches. It is commonly deployed in large enterprises and government environments. QRadar emphasizes offense-based analytics and asset profiling.
| SIEM | Query Language | Deployment | Notable Strength |
|---|---|---|---|
| Splunk | SPL | On-prem / Cloud | Flexible search, massive app ecosystem |
| Elastic | KQL, EQL, Query DSL | On-prem / Cloud | Open-source core, strong for log analytics |
| Microsoft Sentinel | KQL (Kusto) | Cloud (Azure) | Deep Microsoft ecosystem integration |
| IBM QRadar | AQL | On-prem / SaaS | Asset-centric correlation, offense tracking |
Basic Query Concepts
SPL (Search Processing Language)
SPL queries start with a search command and use pipes to chain processing steps, similar to Unix command-line pipelines.
Basic search for an IP address:
index=firewall sourcetype=pan:traffic dest_ip="203.0.113.45"
Search with time range and field filtering:
index=proxy sourcetype=bluecoat earliest=-7d latest=now url="*malicious-domain.com*"
| table _time src_ip dest_ip url http_method status
Aggregate and count events:
index=windows sourcetype=WinEventLog:Security EventCode=4625
| stats count by src_ip Account_Name
| sort -count
Look for process execution with command-line details:
index=sysmon EventCode=1 ParentImage="*\\winword.exe"
| table _time ComputerName User Image CommandLine ParentImage
Key SPL concepts:
- index and sourcetype narrow the search to specific data sets
- Pipe (|) chains commands together
- stats performs aggregations (count, sum, avg, dc for distinct count)
- table formats output as columns
- where filters results with conditions
- eval creates calculated fields
- Wildcards (*) match any characters
KQL (Kibana Query Language — Elastic)
Elastic's KQL is simpler than SPL and is used in the Kibana search bar. It only filters documents: it has no pipes and cannot aggregate, transform, or sort results.
Basic search for an IP address:
destination.ip: "203.0.113.45"
Combining conditions:
event.category: "network" and destination.ip: "203.0.113.45" and not source.ip: "10.0.0.0/8"
Wildcard search:
url.full: *malicious-domain.com*
For more complex analysis in Elastic, analysts typically use Elasticsearch Query DSL (JSON-based) or EQL for sequence detection.
KQL (Kusto Query Language — Microsoft Sentinel)
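Sentinel's KQL is more expressive. A query names a table and pipes its rows through operators such as where, project, and summarize, giving it SQL-like filtering, projection, and aggregation.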
Basic search for an IP address:
CommonSecurityLog
| where DestinationIP == "203.0.113.45"
| project TimeGenerated, SourceIP, DestinationIP, Activity
Search with time range:
SigninLogs
| where TimeGenerated > ago(7d)
| where ResultType != "0"
| summarize FailureCount=count() by UserPrincipalName, IPAddress
| sort by FailureCount desc
How CTI Analysts Use SIEMs
IOC Lookups
The most basic CTI-SIEM interaction is searching for known indicators of compromise. When a new threat report is published with IOCs, the CTI analyst searches historical logs to determine whether the organization has been exposed.
SPL example — searching for multiple IPs:
index=firewall OR index=proxy
(dest_ip="203.0.113.45" OR dest_ip="198.51.100.23" OR dest_ip="192.0.2.100")
| stats count by index dest_ip src_ip
| sort -count
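Typing long OR clauses by hand is error-prone once a report lists dozens of indicators. As a rough sketch, a small helper can generate the clause from a list. The function build_ioc_clause is hypothetical, and its quote escaping is a simplification that should be checked against your Splunk version before use.

```python
def build_ioc_clause(field, values):
    """Return a parenthesized SPL OR clause, e.g. (dest_ip="a" OR dest_ip="b")."""
    if not values:
        raise ValueError("at least one IOC value is required")
    terms = []
    for value in values:
        # Simplified escaping of backslashes and double quotes (an assumption).
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        terms.append(f'{field}="{escaped}"')
    return "(" + " OR ".join(terms) + ")"


ips = ["203.0.113.45", "198.51.100.23", "192.0.2.100"]

# Parenthesize the index filter as well so the grouping is explicit.
query = "(index=firewall OR index=proxy) " + build_ioc_clause("dest_ip", ips)
print(query)
```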
SPL example — searching for a file hash:
index=sysmon EventCode=1 (SHA256="a1b2c3d4e5f6..." OR MD5="abc123...")
| table _time ComputerName User Image CommandLine SHA256
For large-scale IOC lookups, most SIEMs support lookup tables or watchlists where IOC lists can be loaded and automatically matched against incoming data.
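Conceptually, a lookup table or watchlist is a set-membership test applied to selected fields of every event. The sketch below illustrates that idea in plain Python. It is not how any particular SIEM implements the feature, and the field names and the match_watchlist helper are assumptions made for the example.

```python
def match_watchlist(events, watchlist, fields=("dest_ip", "query", "sha256")):
    """Return one hit per event field whose value appears in the watchlist.

    The watchlist is expected to hold lowercase strings.
    """
    hits = []
    for event in events:
        for field in fields:
            value = event.get(field)
            if value is not None and str(value).lower() in watchlist:
                hits.append({"field": field, "indicator": str(value).lower(), "event": event})
    return hits


watchlist = {"203.0.113.45", "malicious-domain.com"}

events = [
    {"dest_ip": "203.0.113.45", "src_ip": "10.1.2.3"},
    {"query": "Malicious-Domain.com", "src_ip": "10.1.2.4"},
    {"dest_ip": "192.0.2.1", "src_ip": "10.1.2.5"},
]

for hit in match_watchlist(events, watchlist):
    print(hit["indicator"], "seen from", hit["event"]["src_ip"])
```

Storing indicators in a set keeps each lookup constant-time, which is why list-based matching scales better than a growing OR clause.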
Threat Hunting Queries
CTI analysts use SIEMs to execute TTP-based hunting queries. These searches look for behavioral patterns rather than specific indicators.
Hunt for potential DNS tunneling (SPL):
index=dns
| eval subdomain_length=len(mvindex(split(query, "."), 0))
| where subdomain_length > 50
| stats count by query src_ip
| sort -count
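The same heuristic can be prototyped offline against exported DNS logs. This Python sketch mirrors the SPL above: it takes the first label of each query, keeps those longer than the threshold, and counts by query and source. The event field names and the function long_label_queries are assumed for illustration.

```python
from collections import Counter


def long_label_queries(dns_events, threshold=50):
    """Count (query, src_ip) pairs whose first DNS label exceeds the threshold."""
    counts = Counter()
    for event in dns_events:
        first_label = event["query"].split(".")[0]
        if len(first_label) > threshold:
            counts[(event["query"], event["src_ip"])] += 1
    # most_common() sorts by count descending, like "sort -count" in SPL.
    return counts.most_common()


suspicious = "a" * 60 + ".tunnel.example"
dns_events = [
    {"query": suspicious, "src_ip": "10.0.0.8"},
    {"query": suspicious, "src_ip": "10.0.0.8"},
    {"query": "www.example.com", "src_ip": "10.0.0.9"},
]

for (query, src_ip), count in long_label_queries(dns_events):
    print(count, src_ip, query[:20] + "...")
```

A single DNS label cannot exceed 63 characters, so a threshold of 50 targets labels near that limit. Some legitimate services also generate long labels, so results need review before they become an alert.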
Hunt for unusual scheduled task creation (SPL):
index=windows sourcetype=WinEventLog:Security EventCode=4698
| table _time ComputerName SubjectUserName TaskName TaskContent
| search NOT TaskName IN ("\\Microsoft\\*", "\\Adobe\\*", "\\Google\\*")
Hunt for PowerShell encoded commands (SPL):
index=sysmon EventCode=1 Image="*\\powershell.exe" (CommandLine="*-enc*" OR CommandLine="*-EncodedCommand*")
| table _time ComputerName User CommandLine ParentImage
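Once the hunt returns results, the encoded payload usually needs decoding before it can be triaged. PowerShell expects the argument to -EncodedCommand to be Base64 over UTF-16LE text, so a short script can recover the original command. This is a sketch under assumptions: decode_encoded_command is a hypothetical helper, and its handling of abbreviated flags (any prefix of "EncodedCommand", plus "-ec") is a simplification.

```python
import base64
import re

# Matches "-flag value" pairs where the value looks like Base64.
FLAG_AND_VALUE = re.compile(r"[-/](\w+)\s+([A-Za-z0-9+/=]+)")


def decode_encoded_command(command_line):
    """Return the decoded script text, or None if nothing decodable is found."""
    for match in FLAG_AND_VALUE.finditer(command_line):
        flag, payload = match.group(1).lower(), match.group(2)
        if flag == "ec" or "encodedcommand".startswith(flag):
            try:
                raw = base64.b64decode(payload, validate=True)
                return raw.decode("utf-16-le")
            except ValueError:
                # Covers invalid Base64 and invalid UTF-16 alike.
                return None
    return None


# Build a sample command line the same way PowerShell would expect it.
sample_payload = base64.b64encode("Write-Host hi".encode("utf-16-le")).decode("ascii")
sample = f"powershell.exe -NoP -W Hidden -enc {sample_payload}"
print(decode_encoded_command(sample))
```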
Detection Validation
After creating a new detection rule, CTI analysts validate that it fires correctly and does not produce excessive false positives. This involves:
- Running the detection query against historical data
- Reviewing the results for true positives and false positives
- Tuning the query to reduce noise while maintaining detection coverage
- Testing with known-good samples if available
Translating Intelligence into SIEM Searches
One of the most valuable skills for a CTI analyst is translating threat intelligence into effective SIEM queries. The process follows these steps:
- Read the intelligence report and identify TTPs, indicators, and targeted infrastructure
- Map TTPs to data sources: Which logs would show evidence of each technique? (Use MITRE ATT&CK data source mappings as a reference)
- Check data availability: Verify that the required log sources are ingested into the SIEM with sufficient detail
- Write the query: Build a search that detects the behavioral pattern, not just the specific indicator
- Test and tune: Run against historical data, assess false positive rate, refine as needed
Example translation:
An intelligence report states: "The threat actor uses renamed instances of PsExec for lateral movement."
- TTP: Lateral movement using PsExec (MITRE ATT&CK T1570, T1021.002)
- Data source needed: Sysmon Event ID 1 (process creation) with command-line logging
- Behavioral indicator: A process with a non-standard name but the original description or hash of PsExec, or a process creating the PSEXESVC service
SPL query:
index=sysmon EventCode=1
(OriginalFileName="psexec.c" OR OriginalFileName="PsExec.exe")
NOT (Image="*\\PsExec.exe" OR Image="*\\PsExec64.exe")
| table _time ComputerName User Image OriginalFileName CommandLine
This query detects renamed PsExec by comparing the binary's OriginalFileName metadata against the actual file name on disk — catching the TTP regardless of what the adversary renamed the file to.
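The comparison the query performs can be expressed as a small predicate, which is handy for unit-testing the logic against sample events before deploying it. The sketch below assumes events are dictionaries using Sysmon's Image and OriginalFileName field names. The function is_renamed_psexec is hypothetical.

```python
import ntpath

# OriginalFileName values from the query above, lowercased.
PSEXEC_ORIGINAL_NAMES = {"psexec.c", "psexec.exe"}
# File names under which PsExec normally runs.
EXPECTED_IMAGE_NAMES = {"psexec.exe", "psexec64.exe"}


def is_renamed_psexec(event):
    """True when the embedded original name is PsExec but the file on disk is not."""
    original = event.get("OriginalFileName", "").lower()
    # ntpath parses Windows-style paths regardless of the analyst's own OS.
    image_name = ntpath.basename(event.get("Image", "")).lower()
    return original in PSEXEC_ORIGINAL_NAMES and image_name not in EXPECTED_IMAGE_NAMES


renamed = {"Image": "C:\\Users\\Public\\svchost_upd.exe", "OriginalFileName": "psexec.c"}
normal = {"Image": "C:\\Tools\\PsExec64.exe", "OriginalFileName": "psexec.c"}
unrelated = {"Image": "C:\\Windows\\System32\\cmd.exe", "OriginalFileName": "Cmd.Exe"}

print(is_renamed_psexec(renamed), is_renamed_psexec(normal), is_renamed_psexec(unrelated))
```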
Log Sources That Matter
Not all log sources are equally valuable for CTI-driven analysis. The following are the most important:
| Log Source | Key Value | Example Events |
|---|---|---|
| Endpoint (Sysmon/EDR) | Process execution, file creation, network connections per process | Process creation with command line, file writes, registry modifications |
| Network flow/firewall | Connection metadata, blocked traffic | Source/destination IPs, ports, bytes, allow/deny |
| DNS | Domain resolution, tunneling detection | Queried domains, response codes, query types |
| Proxy/web | URL-level visibility, user agent strings | Full URLs, HTTP methods, response codes, bytes |
| Authentication | Account usage, lateral movement | Logon success/failure, logon type, source workstation |
| Email | Phishing delivery, social engineering | Sender, recipient, subject, attachments, URLs, verdicts |
Correlation Rules vs. Saved Searches
SIEMs offer two primary mechanisms for ongoing detection:
Correlation rules fire when multiple conditions are met within a defined time window. They detect patterns that span multiple events.
- Example: "Alert when the same source IP fails authentication 10 times within 5 minutes and then successfully authenticates" (potential brute force followed by compromise)
Saved searches / scheduled searches run a query on a defined schedule and generate alerts when results are found.
- Example: "Every 15 minutes, search for any DNS query to domains on the IOC watchlist"
CTI analysts typically work more with saved searches (for IOC matching and TTP hunting) while SOC engineers build correlation rules for complex behavioral detection.
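To show what distinguishes a correlation rule from a simple scheduled search, here is a minimal Python sketch of the brute-force example: it keeps a sliding window of failures per source IP and alerts when a success follows enough of them. The event shape, the thresholds, and the name brute_force_then_success are assumptions. Production SIEM correlation engines work differently under the hood.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def brute_force_then_success(events, threshold=10, window=timedelta(minutes=5)):
    """Alert when a success follows >= threshold failures from one IP within the window.

    Each event is a dict with 'time' (datetime), 'src_ip', and 'outcome'
    ('failure' or 'success').
    """
    failures = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        recent = failures[event["src_ip"]]
        # Drop failures that have aged out of the window.
        while recent and event["time"] - recent[0] > window:
            recent.popleft()
        if event["outcome"] == "failure":
            recent.append(event["time"])
        elif event["outcome"] == "success" and len(recent) >= threshold:
            alerts.append({"src_ip": event["src_ip"], "time": event["time"], "failures": len(recent)})
            recent.clear()
    return alerts


start = datetime(2024, 1, 1, 9, 0, 0)
sample_events = [
    {"time": start + timedelta(seconds=10 * i), "src_ip": "198.51.100.23", "outcome": "failure"}
    for i in range(10)
]
sample_events.append({"time": start + timedelta(seconds=120), "src_ip": "198.51.100.23", "outcome": "success"})

print(brute_force_then_success(sample_events))
```

The rule needs state across events (the per-IP window), which is the part a single stateless search cannot express.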
False Positive Management
False positives are the persistent enemy of effective detection. CTI analysts must actively manage them:
- Baseline first: Before deploying a detection, understand what normal activity looks like for that data source
- Whitelist carefully: Exclude known-good activity using specific criteria (exact process paths, known service accounts, expected scheduled tasks) rather than broad exclusions
- Document exceptions: Every whitelist entry should include who added it, when, why, and a review date
- Monitor false positive rates: Track the ratio of true positives to false positives for each detection rule
- Iterate: Continuously refine queries based on operational experience
Warning: Over-tuning a detection to eliminate all false positives often creates false negatives — real threats that the rule no longer catches. The goal is an acceptable balance, not zero false positives.
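Two of the practices above, documenting exceptions and monitoring false positive rates, lend themselves to simple record-keeping. The sketch below shows one possible shape for that data. The ExceptionEntry fields follow the list above (who, when, why, review date), while the class, the precision helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ExceptionEntry:
    """One documented exclusion attached to a detection rule."""
    rule_name: str
    criteria: str      # the specific condition being excluded
    added_by: str
    added_on: date
    reason: str
    review_on: date

    def is_due_for_review(self, today):
        return today >= self.review_on


def precision(true_positives, false_positives):
    """Share of alerts that were real; None when the rule has not fired."""
    total = true_positives + false_positives
    return true_positives / total if total else None


entry = ExceptionEntry(
    rule_name="Renamed PsExec",
    criteria="Image=C:\\AdminTools\\deploy.exe",
    added_by="analyst1",
    added_on=date(2024, 1, 15),
    reason="Sanctioned deployment wrapper",
    review_on=date(2024, 7, 15),
)

print(entry.is_due_for_review(date(2024, 8, 1)))
print(precision(true_positives=3, false_positives=9))
```

Tracking precision per rule over time makes it visible when tuning has helped, and a precision that climbs while total alerts fall to zero is a cue to check for over-tuning.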
Key Takeaways
- A SIEM is the primary tool for operationalizing threat intelligence — it turns knowledge into action
- The major platforms (Splunk, Elastic, Sentinel, QRadar) have different query languages but similar core concepts
- CTI analysts use SIEMs for IOC lookups, TTP-based hunting, detection creation, and validation
- Translating intelligence into SIEM queries requires mapping TTPs to data sources and writing behavioral searches
- Endpoint, network, DNS, proxy, authentication, and email logs are the most valuable data sources for CTI-driven analysis
- False positive management is an ongoing discipline that requires baselining, careful whitelisting, and continuous iteration
Practical Exercise
Practice translating intelligence into SIEM queries:
- Select a MITRE ATT&CK technique: Go to https://attack.mitre.org/ and choose a technique that interests you (e.g., T1059.001 — PowerShell, T1053.005 — Scheduled Task, or T1003.001 — LSASS Memory)
- Review the ATT&CK page: Note the description, procedure examples, data sources, and detections listed
- Identify the log source: What log source would capture evidence of this technique? Is it endpoint (Sysmon, EDR), network, authentication, or another source?
- Write 2-3 queries: Using SPL or KQL (whichever you have access to, or pseudocode if neither), write queries that would detect the technique behaviorally
- Consider false positives: For each query, list 2-3 legitimate activities that might trigger a false positive and how you would filter them out
- Write a detection rule description: Document the query as a detection rule with a name, description, severity, MITRE mapping, and recommended response action
Further Reading
- Splunk Documentation: Search Reference. https://docs.splunk.com/Documentation/Splunk/latest/SearchReference
- Elastic Documentation: Kibana Query Language. https://www.elastic.co/guide/en/kibana/current/kuery-query.html
- Microsoft Learn: Kusto Query Language (KQL) Overview. https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/
- Murdoch, D. (2018). Blue Team Handbook: SOC, SIEM, and Threat Hunting Use Cases. CreateSpace. (Practical SIEM use cases and query patterns)