Live Splunk Threat Hunt: EventID 4688
Rex-based field extraction built from scratch, parent-process exclusion logic, and 375 pwsh.exe spawns by the Codex AI coding tool, correctly classified as tool behavior rather than LOLBin abuse.
Problem / Hypothesis
Event ID 4688 (Process Creation) is one of the highest-volume Windows Security events and one of the most valuable for behavioral detection. In a constrained Splunk environment — single sourcetype, no Windows TA, no CIM fields — the question was whether meaningful threat hunting could be conducted using only manual field extraction against raw XML.
The hypothesis: with rex-based extraction and parent-process chain analysis built from scratch, I could surface anomalous process behavior in live endpoint telemetry and distinguish genuine threats from tool behavior.
Environment
Splunk Enterprise, REST API on port 8089. Single Windows 11 workstation (HO-WE-01). 7-day window, ~283k security events. Sourcetype: XmlWinEventLog:Security. All field extraction via manual rex. CommandLine field empty (command-line capture for process creation events not yet enabled at time of hunt).
Methodology
Step 1 — Build the field extraction layer
Every field extracted from raw XML via rex:
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='CommandLine'>(?<CommandLine>[^<]+)</Data>"
| eval process=mvindex(split(NewProcessName,"\\"),-1)
| eval parent=mvindex(split(ParentProcessName,"\\"),-1)
Step 2 — Baseline parent-child relationships
Full parent → child frequency analysis across the 7-day window. Top entries: svchost.exe spawning services, explorer.exe spawning applications, RuntimeBroker.exe lifecycle events.
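A minimal sketch of what the baseline query could look like, reusing the extraction layer from Step 1. The index and sourcetype come from this write-up. The pct column and the sort order are illustrative additions, not necessarily the query that was run, and the time range is assumed to be set in the time picker.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| eval process=mvindex(split(NewProcessName,"\\"),-1)
| eval parent=mvindex(split(ParentProcessName,"\\"),-1)
| stats count by parent, process
| eventstats sum(count) as total
| eval pct=round(100*count/total,2)
| sort - count
```

Sorting by count puts the pairs that dominate the baseline at the top. The pct column shows how much of the total volume each pair accounts for before any exclusion is written.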
Step 3 — Build parent-process exclusion logic
Exclusions were additive — each added only after verifying the parent-child pair was legitimate baseline behavior. Every exclusion documented with the volume it suppressed.
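One way to keep exclusions additive and auditable is to label each excluded pair with a case() expression and count what each label suppresses. The two pairs below are hypothetical placeholders, not the exclusion list actually used in this hunt.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| eval process=mvindex(split(NewProcessName,"\\"),-1)
| eval parent=mvindex(split(ParentProcessName,"\\"),-1)
| eval exclusion=case(
    parent=="services.exe" AND process=="svchost.exe", "services.exe -> svchost.exe",
    parent=="svchost.exe" AND process=="RuntimeBroker.exe", "svchost.exe -> RuntimeBroker.exe",
    true(), "none")
| stats count by exclusion
| sort - count
```

Swapping the last two lines for `| where exclusion=="none"` applies the filter instead of reporting on it. The == comparison in eval is case-sensitive, so process names may need lower() on both sides if casing varies in the raw events.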
Step 4 — Hunt for anomalies
With baseline noise suppressed, the remaining events surfaced high-volume automated process chains and low-volume unusual parent-child pairs.
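A sketch of the low-volume side of that hunt: parent-child pairs seen only a handful of times, with first and last sighting. The threshold of 5 is an arbitrary illustration, and the Step 3 exclusion filter would normally sit just before the stats command.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| eval process=mvindex(split(NewProcessName,"\\"),-1)
| eval parent=mvindex(split(ParentProcessName,"\\"),-1)
| stats count, min(_time) as first_seen, max(_time) as last_seen by parent, process
| where count <= 5
| convert ctime(first_seen) ctime(last_seen)
| sort count
```

Replacing the where clause with `| sort - count | head 20` gives the high-volume view of the same table.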
Step 5 — Deep-dive the top anomaly
The anomaly selected for deep-dive: pwsh.exe spawned 375 times by a process chain originating from Codex, an AI coding tool.
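A hedged sketch of how the pwsh.exe spawns could be attributed to their parents. It groups on the full ParentProcessName path rather than assuming an executable name for Codex, since that name is not stated in this write-up.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| eval process=mvindex(split(NewProcessName,"\\"),-1)
| search process="pwsh.exe"
| stats count, min(_time) as first_seen, max(_time) as last_seen by ParentProcessName
| convert ctime(first_seen) ctime(last_seen)
| sort - count
```

The full path matters here: a parent running from a per-user application directory reads very differently from one running out of a temp folder, even when the file name is identical.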
Findings
| Finding | Volume | Classification | Action |
|---|---|---|---|
| Codex → pwsh.exe | 375 | Tool behavior | Baselined |
| bash.exe → base64.exe | 30,855 | Triage-worthy | Enable CommandLine |
| Browser → cmd.exe | 59 | Triage-worthy | Enable CommandLine |
| Rex extraction (CommandLine) | 100% null | Audit gap | Enable CommandLine |
Rex Extraction Validation
Extraction success rate >99.9% for NewProcessName and ParentProcessName. CommandLine returned null in 100% of events — confirmed as audit configuration gap, not extraction failure.
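The success rates above could be measured with a query along these lines. The cmd_tag_present check is an assumption about how to separate an audit gap from an extraction failure: if the CommandLine element exists in the raw XML but the rex capture is still null, the element is empty and the regex is not at fault.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='ParentProcessName'>(?<ParentProcessName>[^<]+)</Data>"
| rex field=_raw "<Data Name='CommandLine'>(?<CommandLine>[^<]+)</Data>"
| eval cmd_tag=if(like(_raw, "%Name='CommandLine'%"), 1, 0)
| stats count as total,
        count(NewProcessName) as new_ok,
        count(ParentProcessName) as parent_ok,
        count(CommandLine) as cmd_ok,
        sum(cmd_tag) as cmd_tag_present
| eval new_pct=round(100*new_ok/total,2),
       parent_pct=round(100*parent_ok/total,2),
       cmd_pct=round(100*cmd_ok/total,2)
```

count(field) only counts events where the field is non-null, so each *_ok value divided by total is the extraction success rate for that field.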
Operational Impact
Verification
- All SPL queries reproducible against index=wazuh with EventCode=4688
- Codex temporal profile: | where match(NewProcessName, "pwsh\\.exe$") | timechart span=1h count
- Cross-reference: Detection Rule Audit and Security Hardening document downstream actions
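For reproducibility, the temporal-profile fragment in the list above can be run as a complete search by prepending the base search and the NewProcessName extraction. This is a sketch that assumes the same index and sourcetype named elsewhere in this write-up.

```spl
index=wazuh sourcetype="XmlWinEventLog:Security" EventCode=4688
| rex field=_raw "<Data Name='NewProcessName'>(?<NewProcessName>[^<]+)</Data>"
| where match(NewProcessName, "pwsh\\.exe$")
| timechart span=1h count
```

match() is case-sensitive by default. Prefixing the pattern with (?i) would also catch variants such as PWSH.EXE if the casing in the raw events is inconsistent.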
What This Demonstrates
Threat hunting on constrained telemetry is not an excuse for shallow analysis. The rex extraction layer achieved a >99.9% extraction success rate for NewProcessName and ParentProcessName. The parent-process exclusion logic was additive and documented. And the Codex classification required actual analytical work: temporal profiling, volume characterization, and parent-chain tracing to correctly classify 375 PowerShell spawns as tool behavior, not adversary behavior.
The easy answer was to flag pwsh.exe spawned by a non-standard parent as suspicious. The correct answer required understanding what was actually happening.