Case Study — Detection Engineering

Splunk Detection Rule Audit

Four ways my own rules would flood a real analyst with noise. Audited every SPL detection against 283,976 events, classified the failure modes, and fixed the gaps.

2026-03-24 · Self-audit · Splunk SPL · 283,976 events · 4 noise sources

Problem / Hypothesis

I had written detection rules in Splunk targeting a constrained Windows telemetry environment — single sourcetype, no Sysmon, no Windows TA, manual rex-based field extraction. The rules worked. They fired. They matched MITRE ATT&CK techniques.

The hypothesis: working rules are not the same as deployable rules. If a real analyst inherited these rules in a production SOC, what would their first week look like?

Environment

Splunk Enterprise, REST API on port 8089. Single Windows 11 workstation (HO-WE-01). Wazuh agent forwarding Security Event Log as XmlWinEventLog:Security. 7-day window, ~283k events. All field extraction via manual rex against raw XML. No CIM normalization.

Methodology

Step 1 — Inventory

Cataloged every SPL detection query. Each mapped to a MITRE ATT&CK technique with defined thresholds and target EventIDs.
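A minimal sketch of what that inventory can look like as a structure. The rule names, techniques, thresholds, and truncated SPL strings below are illustrative stand-ins, not the actual rule set:

```python
# Hypothetical inventory entries; names, techniques, and thresholds are
# illustrative, not the production catalog.
RULE_INVENTORY = [
    {
        "name": "encoded_command_spawn",
        "technique": "T1059.001",   # MITRE ATT&CK technique the rule maps to
        "event_ids": [4688],        # Windows Security EventIDs it matches
        "threshold": 1,             # hits before an alert fires
        "spl": "index=wazuh data.win.system.eventID=4688 ...",
    },
    {
        "name": "failed_logon_burst",
        "technique": "T1110",
        "event_ids": [4625],
        "threshold": 10,
        "spl": "index=wazuh data.win.system.eventID=4625 ...",
    },
]

def techniques_covered(inventory):
    """Return the set of ATT&CK technique IDs the catalog claims to cover."""
    return {rule["technique"] for rule in inventory}
```

Having the catalog in one structure makes the later audit mechanical: iterate, execute, record.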

Step 2 — Run against production data

Every rule executed against the full 283,976-event dataset via Splunk REST API. Hit count, volume distribution, and sampled matches recorded.
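The execution loop can be sketched against Splunk's search-job REST endpoint. This is a hedged illustration, not the audit harness itself; the host, credentials handling, and the oneshot execution mode are assumptions about a typical setup on port 8089:

```python
from urllib.parse import urlencode

SPLUNK_HOST = "https://localhost:8089"  # management port from the environment above

def build_search_job(spl, earliest="-7d", latest="now"):
    """Build the endpoint and form body for a Splunk REST search job.

    Splunk's API expects the query to begin with the 'search' command
    unless it starts with a pipe (e.g. | tstats).
    """
    query = spl if spl.lstrip().startswith(("search", "|")) else f"search {spl}"
    endpoint = f"{SPLUNK_HOST}/services/search/jobs"
    body = urlencode({
        "search": query,
        "earliest_time": earliest,
        "latest_time": latest,
        "exec_mode": "oneshot",   # run synchronously, return results directly
        "output_mode": "json",
    })
    return endpoint, body

# In the audit loop, each rule's SPL would then be POSTed, e.g. with requests:
#   requests.post(endpoint, data=body, auth=(user, pw), verify=False)
```

Recording hit count and a sample of matches per rule is what makes the next step (the analyst workload test) possible.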

Step 3 — Analyst workload test

For each rule: if this alert fired in a SOC queue, could an analyst triage it to a conclusion with the information available? Or would they need to pivot to data that doesn’t exist?

Step 4 — Classify the noise

Every rule that failed the analyst-workload test was categorized by the specific reason it would generate untriageable alerts.
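Steps 3 and 4 together reduce to a classification decision per rule. A minimal sketch, with failure-mode labels that are my shorthand rather than a formal taxonomy:

```python
def classify_rule(hits, sample_fields, required_fields, sourcetype_present=True):
    """Classify a rule by the audit's failure modes.

    hits            -- alert count over the 7-day window
    sample_fields   -- dict of field -> extracted value from a sampled match
    required_fields -- fields an analyst needs to triage to a conclusion
    """
    if not sourcetype_present:
        return "missing-sourcetype"   # can never fire: false coverage
    if hits == 0:
        return "no-baseline"          # silent failure, nothing to compare against
    empty = [f for f in required_fields if not sample_fields.get(f)]
    if empty:
        # Fires, but the analyst cannot reach a conclusion from what's there.
        return "untriageable-empty:" + ",".join(empty)
    return "actionable"
```

Applied to the 4688 rules below, every sampled match classifies as untriageable because CommandLine is empty.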

Findings

Noise Source 1: Empty CommandLine Fields. Event ID 4688 fired on every process creation, but CommandLine was empty in 100% of events. bash.exe → base64.exe: 30,855 hits. Browser → cmd.exe: 59 hits. Every one untriageable without arguments.

Noise Source 2: No Failed Logon Baseline. Zero EventID 4625 events in 7 days. Brute-force rules would either never fire (silent failure) or fire on the first event with no baseline for comparison.

Noise Source 3: Missing Sourcetype Coverage. No Sysmon — no network connections, DNS, file creation, or registry events. Lateral movement and C2 rules had zero data to match. Dead weight in the rule set creating false coverage.

Noise Source 4: Rex Extraction Fragility. Manual regex against XML structure. If Wazuh version changes XML nesting, extraction silently returns null. Analyst sees blank fields with no indication the extraction failed.
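The fragility is easy to demonstrate with a stand-in pattern. The regex below mirrors the kind of rex used against raw Security-log XML (it is not the production pattern), and the "upgraded" event is a hypothetical formatting change:

```python
import re

# Stand-in for a rex-style extraction that assumes single-quoted attributes.
CMDLINE_RX = re.compile(r"<Data Name='CommandLine'>([^<]*)</Data>")

expected = "<EventData><Data Name='CommandLine'>cmd /c whoami</Data></EventData>"
# Hypothetical agent upgrade that switches to double-quoted attributes:
changed = '<EventData><Data Name="CommandLine">cmd /c whoami</Data></EventData>'

def extract_cmdline(raw):
    m = CMDLINE_RX.search(raw)
    return m.group(1) if m else None   # None, not an error: the failure is silent

print(extract_cmdline(expected))  # cmd /c whoami
print(extract_cmdline(changed))   # None -- the analyst just sees a blank field
```

No exception, no log line, no alert: the extraction simply stops producing values, which is exactly why blank fields need to be treated as a monitored condition rather than an inconvenience.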

# | Noise Source | Rules Affected | Analyst Impact
--|--------------|----------------|---------------
1 | Empty CommandLine | 3 rules | Alerts fire but can’t be triaged
2 | No failed logon baseline | 1 rule | Silent failure or no-context fire
3 | Missing sourcetype | 2 rules | Rules never fire — false coverage
4 | Rex fragility | All rules | Silent extraction failure

Of 8 detection rules: 3 immediately actionable, 3 require command-line auditing, 1 requires failed logon auditing, 1 requires Sysmon or network telemetry.

Operational Impact

This audit directly triggered two follow-on actions: Phase 1 Audit Policy Hardening (command-line logging, failed logon auditing, 25 additional subcategories) and detection rule dependency flags on every SPL rule.

Every new detection now includes a “deployment prerequisites” section before the SPL, not after. Rules are not marked stable until their dependencies are confirmed present.
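A sketch of what a machine-checkable prerequisites block can look like. The field names and schema here are illustrative, not the repository's actual format:

```python
# Hypothetical "deployment prerequisites" attached to a rule; the schema
# and rule name are illustrative.
PREREQS = {
    "encoded_command_spawn": {
        "audit_policy": ["Process Creation", "Include command line in events"],
        "sourcetypes": ["XmlWinEventLog:Security"],
        "fields": ["CommandLine", "NewProcessName"],
    },
}

def deployable(rule, present_sourcetypes, populated_fields):
    """A rule is marked stable only when every dependency is confirmed present."""
    req = PREREQS[rule]
    missing = [st for st in req["sourcetypes"] if st not in present_sourcetypes]
    missing += [f for f in req["fields"] if f not in populated_fields]
    return (len(missing) == 0, missing)
```

Run against the environment audited above, this check would have flagged CommandLine as missing before the rule ever reached an analyst's queue.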

Verification

  • Detection rules with dependency flags: content/detection-rules/splunk/
  • Full analysis: content/case-studies/signalfoundry-splunk-detection-engineering.md
  • Phase 1 remediation: Enterprise Security Hardening case study
  • CommandLine gap verification: index=wazuh data.win.system.eventID=4688 | eval has_cmdline=if(len(CommandLine)>0,"yes","no") | stats count by has_cmdline

What This Demonstrates

Writing detection rules is the easy part. Knowing whether your rules are deployable — whether they’ll help an analyst or bury them — requires running them against real data and asking uncomfortable questions about what happens when the alert fires.

I found four ways my own rules would make a real analyst’s life worse. I documented them, flagged the dependencies, and fixed the underlying gaps. That sequence — build, audit, document the gaps honestly, fix them — is what separates a detection library from a noise generator.