Chapter 3 — How Not to Design Detection Use-Cases (and What to Do Instead)

1) Title + Hook

  • Building detections without the right logs is like writing movie reviews without watching the films.
  • Marking everything “High” severity is a smoke alarm that screams for toast and wildfires alike.
  • Ten sloppy rules beat one attacker—once. One precise rule beats ten attackers—daily.

This guide spotlights the anti-patterns that quietly wreck detection programs—and the fixes that make them resilient.


2) Why It’s Needed (Context)

Detection use-cases are your SIEM/SOAR’s north star. When they’re vague, noisy, or unmoored from telemetry, you pay in three currencies: alert fatigue, missed intrusions, and lost credibility with engineering and leadership. We’ll decode the classic mistakes and give you a playbook to align detections with MITRE ATT&CK (Adversarial Tactics, Techniques & Common Knowledge) and real attacker paths.


3) Core Concepts Explained Simply

A) Use-Cases Enabled but Logs Missing

  • Technical definition: Analytics rules exist, but prerequisite telemetry (tables/fields) is absent, late, or malformed.
  • Everyday example: Setting up a coffee machine with no water line.
  • Technical example: A credential-stuffing rule depends on risk-state fields in SigninLogs, but the Identity Protection connector that populates them isn’t enabled, so the rule never fires.
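A cheap guard against this anti-pattern is a telemetry health check that runs before you trust the rule. A minimal sketch in KQL, assuming the standard Sentinel SigninLogs table (tune the staleness threshold to your ingestion latency):

```kql
// Health check: is the telemetry this rule depends on present and fresh?
SigninLogs
| where TimeGenerated > ago(24h)
| summarize LastRecord = max(TimeGenerated), Rows24h = count()
| extend TelemetryMissing = Rows24h == 0,
         TelemetryStale   = LastRecord < ago(2h)   // adjust to expected latency
```

Scheduling a check like this per critical table turns “coffee machine with no water” into an alert of its own instead of a silent gap.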

B) Everything Marked “High” Severity

  • Technical definition: Flat severity model (all High/Critical) that ignores confidence, impact, and enrichment.
  • Everyday example: All emails marked “urgent”—soon, none are.
  • Technical example: Port scan, failed logins, and confirmed egress beaconing all assigned “High,” drowning triage.

C) No Incident / Alert Grouping

  • Technical definition: Alerts remain atomic; no correlation by entity/time/TTP (Tactics, Techniques, and Procedures).
  • Everyday example: Treating 20 notifications from the same delivery as 20 separate packages.
  • Technical example: Multiple SecurityEvent 4625 failures from one host generate 50 incidents instead of one grouped brute-force case.
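The fix is to aggregate at query time so one burst becomes one signal. A sketch against the standard SecurityEvent table (the one-hour window and threshold of 20 are illustrative):

```kql
// Collapse many 4625 (failed logon) events into one grouped brute-force
// signal per host/source IP, instead of one incident per event.
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4625
| summarize FailedLogons   = count(),
            TargetAccounts = dcount(TargetUserName),
            FirstSeen      = min(TimeGenerated),
            LastSeen       = max(TimeGenerated)
          by Computer, IpAddress
| where FailedLogons >= 20   // illustrative threshold
```

One resulting alert carries the attacker IP, the spread of targeted accounts, and the timeframe, exactly the story an incident should tell.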

D) Alerts with Zero Context

  • Technical definition: Alerts lack entity resolution, enrichment, or links to playbooks and knowledge.
  • Everyday example: A fire alarm with no floor or room number.
  • Technical example: “Suspicious PowerShell” with no command line, user SID, parent process, or MITRE technique tag.

E) No Standard Parsing / Field Mismatch

  • Technical definition: Inconsistent schemas; fields named differently across sources; missing ASIM (Advanced Security Information Model) normalization.
  • Everyday example: Mixing metric and imperial tools in the same toolbox.
  • Technical example: src_ip vs SourceIP vs ClientIP break joins; URL field sometimes base64, sometimes plain.
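While sources are being migrated to one schema, you can tolerate the drift at query time. A sketch using KQL’s coalesce() and column_ifexists(); the custom table name MyFirewallLogs_CL is hypothetical, and SrcIpAddr follows ASIM naming:

```kql
// Normalize whichever source-IP column this table happens to have.
// coalesce() returns the first non-null, non-empty string expression.
MyFirewallLogs_CL   // hypothetical custom table
| extend SrcIpAddr = coalesce(
    column_ifexists("src_ip", ""),
    column_ifexists("SourceIP", ""),
    column_ifexists("ClientIP", ""))
| where isnotempty(SrcIpAddr)
```

This is a bridge, not a destination: the durable fix is enforcing one schema at ingest, as the Action Framework below recommends.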

F) Poor KQL Hygiene

  • Technical definition: Inefficient or brittle KQL (Kusto Query Language): wildcard scans, no summarization windows, time drift, or unbounded joins.
  • Everyday example: Searching a library by reading every page of every book.
  • Technical example: | where tostring(CommandLine) contains "mimikatz" across * tables without time or table scoping.
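A hardened rewrite of that example might look like the following sketch, assuming Defender for Endpoint’s DeviceProcessEvents table is onboarded to Sentinel: one table, a bounded window, a term-indexed operator, and only the columns triage needs.

```kql
// Hardened version: scoped table, scoped time, `has` (term-indexed,
// case-insensitive) instead of `contains`, minimal output columns.
DeviceProcessEvents
| where TimeGenerated > ago(1d)
| where ProcessCommandLine has "mimikatz"
| project TimeGenerated, DeviceName, AccountName,
          ProcessCommandLine, InitiatingProcessFileName
```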

G) Quantity Over Quality (Rule Count Vanity)

  • Technical definition: Optimizing for number of rules, not precision, recall, or mean time to detect (MTTD).
  • Everyday example: Owning 50 kitchen knives but still using a butter knife.
  • Technical example: 300 rules with <0.5% true-positive rate; no retirement of deadweight rules.

H) No MITRE ATT&CK Coverage Mapping

  • Technical definition: Detections aren’t mapped to techniques/sub-techniques; gaps unknown.
  • Everyday example: Playing chess without knowing pieces or the board.
  • Technical example: Great coverage for execution (T1059) but nothing for discovery (T1087) or privilege escalation (T1068).

I) No Log-Source → Use-Case Coverage Mapping

  • Technical definition: No matrix that shows which use-cases rely on which sources/fields.
  • Everyday example: Not knowing which ingredient makes which dish.
  • Technical example: Disabling DNS logs breaks exfiltration detections—but no one realizes until after an incident.

J) Detections Not Mapped to Attacker Paths

  • Technical definition: Rules exist in isolation, not aligned to attack chains/kill-chains or common adversary playbooks.
  • Everyday example: Locking your front door but leaving the windows wide open.
  • Technical example: Excellent ransomware encryption alerts, but zero coverage for initial access (phish), lateral movement (RDP), or data staging.

4) Real-World Case Study

Failure — The “Everything High” Breach

  • Situation: A healthcare provider had 240 Sentinel rules. 80% were “High.” No grouping, weak enrichment.
  • Impact: Analysts ignored 30+ failed-login bursts tied to a compromised VPN account; beaconing went unnoticed for 5 days.
  • Lesson: Severity discipline + grouping + enrichment would have collapsed 120 noisy alerts into 3 actionable incidents.

Success — Use-Case Contracts & ATT&CK Map

  • Situation: A fintech created “Detection Contracts”: each rule listed required fields, data sources, ATT&CK technique, severity rubric, and sample incidents. Built a source↔use-case matrix and an ATT&CK heatmap.
  • Impact: -42% alert volume, +31% true-positive rate, MTTD down from 6h to 90m.
  • Lesson: Treat detections as products with inputs/outputs and SLOs.

5) Action Framework — Prevent → Detect → Respond

Prevent (Design Right)

  • Define detection contracts:
    • Intent: threat, ATT&CK T#
    • Inputs: tables + fields (with ASIM names)
    • Logic: KQL with test cases
    • Severity rubric: impact × confidence
    • Ops: owner, SLOs (latency, FP rate), links to runbooks
  • Build the Coverage Matrix: Use-case (rows) × Sources/Fields (columns). Color by criticality.
  • Normalize early: Enforce ASIM (or your schema) at ingest; ban ad-hoc field names.
  • Set a severity policy: E.g., Critical = confirmed malicious + material impact; High = high confidence + privileged entity; Medium/Low with clear auto-closure criteria.
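A rubric like this is most durable when encoded in the rule logic rather than in analysts’ heads. A sketch in KQL: the table name AlertsEnriched_CL and the Confidence, AssetCriticality, and IsPrivilegedAccount fields are illustrative enrichment outputs, not standard columns.

```kql
// Severity as impact × confidence instead of a hard-coded "High".
// All field names below are illustrative enrichment fields.
AlertsEnriched_CL
| extend Severity = case(
    Confidence == "Confirmed" and AssetCriticality == "CrownJewel", "Critical",
    Confidence == "High" and IsPrivilegedAccount == true,           "High",
    Confidence == "High",                                           "Medium",
    "Low")   // default: auto-closure candidates
```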

Detect (Run Well)

  • Group intelligently: Entity-based (user, host, IP), time-windowed (e.g., 30–60 min), TTP-aware correlation.
  • Enrich alerts: Entity resolution (UEBA), asset tags, geolocation, exposure (internet-facing), vuln context (CVSS).
  • Harden KQL:
    • Scope tables & time (project before join).
    • Use make-series, summarize with bins, toscalar for thresholds.
    • Add null/format checks and time-zone normalization.
  • Measure quality: Track precision, recall, FP rate, FNR, and rule runtime. Retire or refactor rules quarterly.
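The KQL-hardening tips above combine into patterns like this baseline sketch, assuming the ASIM DNS parser (_Im_Dns) is deployed; the bin size and threshold are illustrative:

```kql
// make-series + threshold pattern: outbound DNS volume per source IP in
// 10-minute bins over 24h, flagging hosts whose peak exceeds a threshold.
let Threshold = 500;   // illustrative; derive from your baseline
_Im_Dns
| where TimeGenerated > ago(24h)
| make-series Queries = count() default = 0
    on TimeGenerated from ago(24h) to now() step 10m by SrcIpAddr
| extend Stats = series_stats_dynamic(Queries)
| where toint(Stats.max) > Threshold
| project SrcIpAddr, PeakQueries = toint(Stats.max)
```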

Respond (Improve Fast)

  • Playbooks (SOAR): Map each severity to a minimal response checklist; automate enrichment and ticketing.
  • Drill with simulations: Use Atomic Red Team/ATT&CK emulations; confirm end-to-end (log present → rule fires → grouped → playbook runs).
  • Feedback loop: Every false positive updates the contract (logic or enrichment). Every miss creates a backlog item with ATT&CK mapping.

6) Key Differences to Keep in Mind

  1. Severity vs Priority — Severity = inherent risk; Priority = queue order (contextual).
    • Scenario: A “Medium” alert on a domain admin in production becomes top priority.
  2. Alert vs Incident — Alerts are signals; incidents are stories (grouped evidence).
    • Scenario: 15 brute-force alerts across users → 1 incident with attacker IP, timeframe, and impact.
  3. Rule Count vs Coverage Quality — More rules ≠ better defense.
    • Scenario: 60 well-mapped detections covering ATT&CK tactics beat 300 shallow ones.
  4. Detection Logic vs Enrichment — Logic finds; enrichment explains.
    • Scenario: A hash match (logic) + EDR verdict + VT score + asset criticality (enrichment) drives faster action.
  5. Schema Normalization vs Parser Sprawl — One language, fewer bugs.
    • Scenario: ASIM fields (SrcIp, DstIp, User) enable reusable joins and content packs.

7) Summary Table

| Concept | Definition | Everyday Example | Technical Example |
| --- | --- | --- | --- |
| Logs missing | Rule needs data that isn’t there | Coffee machine w/o water | SigninLogs dependencies not connected |
| All High severity | Flat model; no nuance | Everything marked “urgent” | Port scan = High, same as C2 beacon |
| No alert grouping | No correlation into incidents | 20 packages treated separately | 50 4625s = 50 incidents, not 1 |
| Zero context | No enrichment/links | Fire alarm w/o floor | No command line, no parent PID |
| Field mismatch | Inconsistent schemas | Metric vs imperial mix | src_ip vs SourceIP breaks joins |
| Poor KQL hygiene | Inefficient/brittle queries | Reading every page to search | Unbounded contains across * |
| Rule vanity | Optimize for count, not quality | 50 knives, use one | 300 rules, <0.5% TP |
| No ATT&CK mapping | No technique coverage view | Playing chess blind | Gaps in discovery/priv-esc |
| No source mapping | No data→use-case matrix | Unknown ingredients | DNS disabled breaks exfil rules |
| Not on attacker paths | No kill-chain alignment | Lock door, open windows | Encrypt detect but no lateral-move detect |

8) ASCII Diagram — Detection Product Loop

[ATT&CK Technique] → [Detection Contract] → [KQL Logic]
        ↓                     ↓                   ↓
 [Required Sources/Fields] → [Normalization/ASIM] → [Alert Enrichment]
        ↓                     ↓                   ↓
     [Grouping/Incidents] → [Severity Policy] → [SOAR Playbook]
        ↓
   [Metrics: Precision | Recall | FP/FN | Latency]
        ↓
   [Refactor/Retire]  ←——  [Purple Team Tests]

9) What’s Next

Next in this series: “Detection Contracts in Practice: A Step-by-Step Template (with KQL patterns and ATT&CK mapping).” We’ll publish a fill-in-the-blanks worksheet plus sample tests.


🌞 The Last Sun Rays…

Hook answers:

  • Don’t write reviews without watching the film—connect detections to telemetry and verify it’s present.
  • Don’t let every toaster trip the fire alarm—calibrate severity and group signals into incidents.
  • Don’t collect knives for the drawer—optimize for coverage quality, not rule count.

Your turn: If you could only fix one thing this week, would you choose severity discipline, schema normalization, or source↔use-case mapping—and how would you prove it worked (which metric first)?
