Chapter 3 — How Not to Design Detection Use-Cases (and What to Do Instead)

Sentinel Detection Use Case Design: How NOT to Design Your Rules

This guide covers common mistakes in designing Microsoft Sentinel detection use cases, from overly broad KQL rules to alerts that are never mapped to MITRE ATT&CK tactics. Designing effective detection use cases is the core skill of detection engineering. Related guides: Sentinel Testing Guide and Sentinel Threat Hunting. External references: MITRE ATT&CK Framework and Microsoft Sentinel Detection Rules.

1) Title + Hook

  • Building detections without the right logs is like writing movie reviews without watching the films.
  • Marking everything “High” severity is a smoke alarm that screams for toast and wildfires alike.
  • Ten sloppy rules beat one attacker—once. One precise rule beats ten attackers—daily.

This guide spotlights the anti-patterns that quietly wreck detection programs—and the fixes that make them resilient.


2) Why It’s Needed (Context)

Detection use-cases are your SIEM/SOAR’s north star. When they’re vague, noisy, or unmoored from telemetry, you pay in three currencies: alert fatigue, missed intrusions, and lost credibility with engineering and leadership. We’ll decode the classic mistakes and give you a playbook to align detections with MITRE ATT&CK (Adversarial Tactics, Techniques & Common Knowledge) and real attacker paths.


3) Core Concepts Explained Simply

A) Use-Cases Enabled but Logs Missing

  • Technical definition: Analytics rules exist, but prerequisite telemetry (tables/fields) is absent, late, or malformed.
  • Everyday example: Setting up a coffee machine with no water line.
  • Technical example: A credential-stuffing rule depends on SigninLogs risk state, but RiskyUsers connector isn’t enabled—rule never fires.
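
A pre-flight check along these lines can catch the gap before a rule is enabled. The sketch below is a minimal illustration in Python and assumes a hypothetical inventory snapshot: `TableStatus`, `missing_prerequisites`, the one-hour staleness limit, and the sample table and column names are illustrative, not part of any Sentinel API. In practice the inventory would be built from whatever your workspace reports about tables, columns, and last-ingested timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Set


@dataclass
class TableStatus:
    """Hypothetical snapshot of one workspace table."""
    columns: Set[str]
    last_event: Optional[datetime]  # None: the table has never received data


def missing_prerequisites(required, inventory, now, max_lag=timedelta(hours=1)):
    """Return reasons why a rule's required telemetry is absent, late, or incomplete."""
    problems = []
    for table, fields in required.items():
        status = inventory.get(table)
        if status is None or status.last_event is None:
            problems.append(f"{table}: no data ingested")
            continue
        if now - status.last_event > max_lag:
            problems.append(f"{table}: last event older than {max_lag}")
        for name in sorted(set(fields) - status.columns):
            problems.append(f"{table}.{name}: column missing")
    return problems


# Illustrative data: sign-in logs arrive, but the risk column is absent
# and the second table was never connected.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
inventory = {
    "SigninLogs": TableStatus({"UserPrincipalName", "IPAddress"}, now - timedelta(minutes=5)),
}
required = {
    "SigninLogs": ["UserPrincipalName", "RiskState"],
    "AADRiskyUsers": ["RiskLevel"],
}
problems = missing_prerequisites(required, inventory, now)
```

An empty list means the rule's inputs look healthy; anything else is a reason to hold the rule back or fix the connector first.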

B) Everything Marked “High” Severity

  • Technical definition: Flat severity model (all High/Critical) that ignores confidence, impact, and enrichment.
  • Everyday example: All emails marked “urgent”—soon, none are.
  • Technical example: Port scan, failed logins, and confirmed egress beaconing all assigned “High,” drowning triage.

C) No Incident / Alert Grouping

  • Technical definition: Alerts remain atomic; no correlation by entity/time/TTP (Tactics, Techniques, and Procedures).
  • Everyday example: Treating 20 notifications from the same delivery as 20 separate packages.
  • Technical example: Multiple SecurityEvent 4625 failures from one host generate 50 incidents instead of one grouped brute-force case.
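
The brute-force example can be sketched as entity-plus-time-window grouping. This is a simplified Python illustration, not Sentinel's own grouping logic: `group_alerts` is a hypothetical helper, the 30-minute window is an assumed value, and each incident's window is anchored at its first alert.

```python
from datetime import datetime, timedelta


def group_alerts(alerts, window=timedelta(minutes=30)):
    """Group (entity, timestamp) alerts into one incident per entity per window."""
    incidents = []
    latest = {}  # entity -> index of that entity's most recent incident
    for entity, ts in sorted(alerts, key=lambda a: a[1]):
        idx = latest.get(entity)
        if idx is not None and ts - incidents[idx]["start"] <= window:
            incidents[idx]["alerts"].append(ts)
        else:
            # First alert for this entity, or the previous window has expired.
            latest[entity] = len(incidents)
            incidents.append({"entity": entity, "start": ts, "alerts": [ts]})
    return incidents


# Illustrative data: 50 failed-logon alerts from one host, 20 seconds apart,
# plus a single alert from a second host.
base = datetime(2024, 1, 1, 9, 0)
alerts = [("host-a", base + timedelta(seconds=20 * i)) for i in range(50)]
alerts.append(("host-b", base + timedelta(seconds=5)))
incidents = group_alerts(alerts)
```

With this data the 50 alerts from `host-a` land in a single incident and `host-b` gets its own, so the analyst sees two cases instead of 51.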

D) Alerts with Zero Context

  • Technical definition: Alerts lack entity resolution, enrichment, or links to playbooks and knowledge.
  • Everyday example: A fire alarm with no floor or room number.
  • Technical example: “Suspicious PowerShell” with no command line, user SID, parent process, or MITRE technique tag.
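
One way to make missing context visible is to enrich the alert and list what is still absent. The Python sketch below is illustrative only: the asset inventory, the field names in `REQUIRED_CONTEXT`, and the `enrich` helper are assumptions chosen to mirror the PowerShell example, not a real alert schema.

```python
# Hypothetical asset inventory keyed by host name.
ASSET_INVENTORY = {
    "srv-db-01": {"criticality": "high", "internet_facing": False},
}

# Context an analyst needs before triaging a process-based alert (assumed list).
REQUIRED_CONTEXT = ["command_line", "user_sid", "parent_process", "attack_technique"]


def enrich(alert, assets):
    """Attach asset context and record which required fields are still missing."""
    enriched = dict(alert)
    enriched["asset"] = assets.get(alert.get("host"), {"criticality": "unknown"})
    enriched["missing_context"] = [f for f in REQUIRED_CONTEXT if not alert.get(f)]
    return enriched


alert = {
    "title": "Suspicious PowerShell",
    "host": "srv-db-01",
    "command_line": "powershell.exe -NoProfile -EncodedCommand <redacted>",
    "attack_technique": "T1059.001",
}
enriched = enrich(alert, ASSET_INVENTORY)
```

Here `enriched["missing_context"]` names the user SID and parent process as gaps, which is a concrete signal to fix the rule's projection rather than leave the analyst guessing.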

E) No Standard Parsing / Field Mismatch

  • Technical definition: Inconsistent schemas; fields named differently across sources; missing ASIM (Advanced Security Information Model) normalization.
  • Everyday example: Mixing metric and imperial tools in the same toolbox.
  • Technical example: src_ip vs SourceIP vs ClientIP break joins; URL field sometimes base64, sometimes plain.
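
The field-name mismatch can be illustrated with a small alias map applied before any join. This Python sketch assumes the ASIM name `SrcIpAddr` as the target; the alias table and the `normalize` helper are hypothetical, and in Sentinel this work is normally done by parsers rather than application code.

```python
# Source-specific names mapped to one normalized name (assumed alias list).
FIELD_ALIASES = {
    "src_ip": "SrcIpAddr",
    "SourceIP": "SrcIpAddr",
    "ClientIP": "SrcIpAddr",
}


def normalize(record):
    """Rename known aliases; leave every other field untouched."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


firewall_event = {"src_ip": "10.0.0.5", "action": "deny"}
proxy_event = {"ClientIP": "10.0.0.5", "url": "example.com"}

fw = normalize(firewall_event)
px = normalize(proxy_event)

# After normalization both sources can be matched on the same key.
same_source = fw["SrcIpAddr"] == px["SrcIpAddr"]
```

Before normalization the two records share no key at all, which is exactly why a join on the source address would silently return nothing.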

F) Poor KQL Hygiene

  • Technical definition: Inefficient or brittle KQL (Kusto Query Language): wildcard scans, no summarization windows, time drift, or unbounded joins.
  • Everyday example: Searching a library by reading every page of every book.
  • Technical example: | where tostring(CommandLine) contains "mimikatz" across * tables without time or table scoping.
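
A lightweight lint step can flag the most obvious of these patterns before a query ships. The Python sketch below is a rough heuristic over the query text, not a KQL parser: `lint_kql` and its three checks are assumptions, and the time-filter check will produce false findings for rules whose lookback is set outside the query.

```python
import re


def lint_kql(query):
    """Return heuristic findings for a KQL query string."""
    findings = []
    if re.search(r"^\s*(search|union)\s+\*", query, re.IGNORECASE):
        findings.append("scans all tables; name the table explicitly")
    if "TimeGenerated" not in query:
        findings.append("no TimeGenerated filter; bound the time range")
    if re.search(r"\bcontains\b", query):
        findings.append("uses contains; prefer has for whole-term matches")
    return findings


unscoped = 'union * | where tostring(CommandLine) contains "mimikatz"'
scoped = (
    "DeviceProcessEvents "
    "| where TimeGenerated > ago(1h) "
    '| where ProcessCommandLine has "mimikatz"'
)

unscoped_findings = lint_kql(unscoped)
scoped_findings = lint_kql(scoped)
```

The unscoped query trips all three checks, while the scoped version names one table, bounds the time range, and uses a term match, so it returns no findings.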

G) Quantity Over Quality (Rule Count Vanity)

  • Technical definition: Optimizing for number of rules, not precision, recall, or mean time to detect (MTTD).
  • Everyday example: Owning 50 kitchen knives but still using a butter knife.
  • Technical example: 300 rules with <0.5% true-positive rate; no retirement of deadweight rules.

H) No MITRE ATT&CK Coverage Mapping

  • Technical definition: Detections aren’t mapped to techniques/sub-techniques; gaps unknown.
  • Everyday example: Playing chess without knowing pieces or the board.
  • Technical example: Great coverage for execution (T1059) but nothing for discovery (T1087) or privilege escalation (T1068).
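
A coverage map can start as a simple count of rules per technique. The Python sketch below uses the three technique IDs from the example; the rule names and the `technique_coverage` helper are hypothetical, and a real map would be driven by the full ATT&CK technique list rather than a hand-written dictionary.

```python
# Techniques the team has decided it must cover (taken from the example above).
TECHNIQUE_TACTIC = {
    "T1059": "Execution",
    "T1087": "Discovery",
    "T1068": "Privilege Escalation",
}

# Hypothetical rule inventory: rule name -> mapped technique IDs.
RULES = {
    "Suspicious PowerShell": ["T1059"],
    "Encoded command line": ["T1059"],
}


def technique_coverage(rules, techniques):
    """Count how many rules map to each in-scope technique."""
    counts = {technique: 0 for technique in techniques}
    for mapped in rules.values():
        for technique in mapped:
            if technique in counts:
                counts[technique] += 1
    return counts


coverage = technique_coverage(RULES, TECHNIQUE_TACTIC)
gaps = sorted(t for t, count in coverage.items() if count == 0)
```

With this inventory `gaps` lists T1068 and T1087, which turns "we have lots of rules" into a specific backlog of uncovered techniques.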

I) No Log-Source → Use-Case Coverage Mapping

  • Technical definition: No matrix that shows which use-cases rely on which sources/fields.
  • Everyday example: Not knowing which ingredient makes which dish.
  • Technical example: Disabling DNS logs breaks exfiltration detections—but no one realizes until after an incident.
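
The matrix does not need tooling to be useful; even a dictionary answers "what breaks if this source goes away?". In the Python sketch below the use-case names and their source lists are illustrative assumptions, and `impacted_use_cases` is a hypothetical helper.

```python
# Hypothetical matrix: use-case -> log sources (tables) it depends on.
USE_CASE_SOURCES = {
    "DNS tunneling exfiltration": {"DnsEvents"},
    "C2 beaconing": {"DnsEvents", "CommonSecurityLog"},
    "Brute-force logon": {"SecurityEvent"},
}


def impacted_use_cases(source, matrix):
    """List the use-cases that lose an input if the given source is disabled."""
    return sorted(name for name, sources in matrix.items() if source in sources)


def unused_sources(ingested, matrix):
    """List ingested sources that no use-case depends on."""
    needed = set().union(*matrix.values()) if matrix else set()
    return sorted(set(ingested) - needed)


dns_impact = impacted_use_cases("DnsEvents", USE_CASE_SOURCES)
idle = unused_sources(["DnsEvents", "SecurityEvent", "Syslog"], USE_CASE_SOURCES)
```

Running the first check before a cost-saving change to DNS logging would surface both affected detections ahead of time, instead of after an incident. The second check shows the reverse view: data you pay to ingest that no detection uses.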

J) Detections Not Mapped to Attacker Paths

  • Technical definition: Rules exist in isolation, not aligned to attack chains/kill-chains or common adversary playbooks.
  • Everyday example: Locking your front door but leaving the windows wide open.
  • Technical example: Excellent ransomware encryption alerts, but zero coverage for initial access (phish), lateral movement (RDP), or data staging.

4) Real-World Case Study

Failure — The “Everything High” Breach

  • Situation: A healthcare provider had 240 Sentinel rules. 80% were “High.” No grouping, weak enrichment.
  • Impact: Analysts ignored 30+ failed-login bursts tied to a compromised VPN account; beaconing went unnoticed for 5 days.
  • Lesson: Severity discipline + grouping + enrichment would have collapsed 120 noisy alerts into 3 actionable incidents.

Success — Use-Case Contracts & ATT&CK Map

  • Situation: A fintech created “Detection Contracts”: each rule listed required fields, data sources, ATT&CK technique, severity rubric, and sample incidents. Built a source↔use-case matrix and an ATT&CK heatmap.
  • Impact: -42% alert volume, +31% true-positive rate, MTTD down from 6h to 90m.
  • Lesson: Treat detections as products with inputs/outputs and SLOs.

5) Action Framework — Prevent → Detect → Respond

Prevent (Design Right)

  • Define detection contracts:
    • Intent: threat, ATT&CK T#
    • Inputs: tables + fields (with ASIM names)
    • Logic: KQL with test cases
    • Severity rubric: impact × confidence
    • Ops: owner, SLOs (latency, FP rate), links to runbooks
  • Build the Coverage Matrix: Use-case (rows) × Sources/Fields (columns). Color by criticality.
  • Normalize early: Enforce ASIM (or your schema) at ingest; ban ad-hoc field names.
  • Set a severity policy: E.g., Critical = confirmed malicious + material impact; High = high confidence + privileged entity; Medium/Low with clear auto-closure criteria.
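
A detection contract and an impact × confidence rubric could be captured roughly as follows. This Python sketch is one possible shape, not a standard: the `DetectionContract` fields, the 1 to 3 scales, the score thresholds, and the sample values (owner, runbook path, SLO numbers) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed thresholds on impact * confidence, each scored 1 (low) to 3 (high).
SEVERITY_BY_SCORE = [(9, "Critical"), (6, "High"), (3, "Medium"), (1, "Low")]


def severity(impact, confidence):
    """Map impact and confidence scores (1-3 each) to a severity label."""
    if impact not in (1, 2, 3) or confidence not in (1, 2, 3):
        raise ValueError("impact and confidence must be 1, 2, or 3")
    score = impact * confidence
    for threshold, label in SEVERITY_BY_SCORE:
        if score >= threshold:
            return label


@dataclass
class DetectionContract:
    """One rule described as a product: intent, inputs, logic, and ops."""
    name: str
    attack_technique: str
    required_fields: Dict[str, List[str]]  # table -> fields the logic reads
    kql: str
    owner: str
    runbook: str
    max_latency_minutes: int
    max_false_positive_share: float


contract = DetectionContract(
    name="Brute-force logon",
    attack_technique="T1110",
    required_fields={"SecurityEvent": ["EventID", "Computer", "Account"]},
    kql=(
        "SecurityEvent | where EventID == 4625 "
        "| summarize Failures = count() by Computer, bin(TimeGenerated, 10m)"
    ),
    owner="detection-team",           # hypothetical owner
    runbook="runbooks/brute-force",   # hypothetical path
    max_latency_minutes=15,
    max_false_positive_share=0.2,
)
```

Keeping the rubric as data makes the policy reviewable: changing what counts as "High" becomes a one-line diff rather than a per-rule debate.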

Detect (Run Well)

  • Group intelligently: Entity-based (user, host, IP), time-windowed (e.g., 30–60 min), TTP-aware correlation.
  • Enrich alerts: Entity resolution (UEBA), asset tags, geolocation, exposure (internet-facing), vuln context (CVSS).
  • Harden KQL:
    • Scope tables and time ranges first; filter and project before joining.
    • Use summarize with bin() for time windows, make-series for time-series baselines, and toscalar() for computed thresholds.
    • Add null/format checks and time-zone normalization.
  • Measure quality: Track precision, recall, false-positive rate, false-negative rate, and rule runtime. Retire or refactor rules quarterly.
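
The quality metrics can be computed from triage outcomes with a few ratios. In the Python sketch below, "false-positive share" is defined as the fraction of fired alerts that were benign, which is an assumption (true negatives are rarely countable for detections); `rule_metrics`, `needs_review`, and the 5% precision floor are illustrative.

```python
def rule_metrics(tp, fp, fn):
    """Compute quality ratios for one rule.

    tp: alerts confirmed malicious; fp: alerts closed as benign;
    fn: known misses, for example simulations the rule did not catch.
    """
    fired = tp + fp
    actual = tp + fn
    return {
        "precision": tp / fired if fired else None,
        "recall": tp / actual if actual else None,
        "false_positive_share": fp / fired if fired else None,
        "false_negative_rate": fn / actual if actual else None,
    }


def needs_review(metrics, min_precision=0.05):
    """Flag rules that never fired or fall below the assumed precision floor."""
    precision = metrics["precision"]
    return precision is None or precision < min_precision


healthy = rule_metrics(tp=8, fp=2, fn=2)
noisy = rule_metrics(tp=1, fp=299, fn=0)
review_queue = [name for name, m in {"healthy": healthy, "noisy": noisy}.items()
                if needs_review(m)]
```

A quarterly pass over every rule with a check like `needs_review` gives the retire-or-refactor decision a consistent, numeric starting point.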

Respond (Improve Fast)

  • Playbooks (SOAR): Map each severity to a minimal response checklist; automate enrichment and ticketing.
  • Drill with simulations: Use Atomic Red Team/ATT&CK emulations; confirm end-to-end (log present → rule fires → grouped → playbook runs).
  • Feedback loop: Every false positive updates the contract (logic or enrichment). Every miss creates a backlog item with ATT&CK mapping.
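
The end-to-end drill (log present → rule fires → grouped → playbook runs) can be recorded as an ordered checklist so that each simulation reports the first stage that broke. The Python sketch below is a minimal illustration; the stage keys and the `first_broken_stage` helper are hypothetical, and how each stage is verified depends on your tooling.

```python
# Ordered stages of one simulation run, matching the drill described above.
STAGES = ["log_present", "rule_fired", "grouped", "playbook_ran"]


def first_broken_stage(result):
    """Return the first stage that did not succeed, or None if all passed."""
    for stage in STAGES:
        if not result.get(stage, False):
            return stage
    return None


# Illustrative outcome: telemetry arrived and the rule fired, but the alerts
# were never grouped, so the playbook stage is not meaningful yet.
drill = {"log_present": True, "rule_fired": True, "grouped": False, "playbook_ran": False}
broken = first_broken_stage(drill)
```

Reporting only the earliest failure keeps the follow-up focused: a miss at `log_present` is a data-source problem, while a miss at `grouped` is a correlation problem, and the two go to different backlogs.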

6) Key Differences to Keep in Mind

  1. Severity vs Priority — Severity = inherent risk; Priority = queue order (contextual).
    • Scenario: A “Medium” alert on a domain admin in production becomes top priority.
  2. Alert vs Incident — Alerts are signals; incidents are stories (grouped evidence).
    • Scenario: 15 brute-force alerts across users → 1 incident with attacker IP, timeframe, and impact.
  3. Rule Count vs Coverage Quality — More rules ≠ better defense.
    • Scenario: 60 well-mapped detections covering ATT&CK tactics beat 300 shallow ones.
  4. Detection Logic vs Enrichment — Logic finds; enrichment explains.
    • Scenario: A hash match (logic) + EDR verdict + VT score + asset criticality (enrichment) drives faster action.
  5. Schema Normalization vs Parser Sprawl — One language, fewer bugs.
    • Scenario: ASIM fields (SrcIpAddr, DstIpAddr, User) enable reusable joins and content packs.

7) Summary Table

Concept | Definition | Everyday Example | Technical Example
Logs missing | Rule needs data that isn’t there | Coffee machine w/o water | SigninLogs dependencies not connected
All High severity | Flat model; no nuance | Everything marked “urgent” | Port scan = High same as C2 beacon
No alert grouping | No correlation into incidents | 20 packages treated separately | 50 4625s = 50 incidents, not 1
Zero context | No enrichment/links | Fire alarm w/o floor | No command line, no parent PID
Field mismatch | Inconsistent schemas | Metric vs imperial mix | src_ip vs SourceIP breaks joins
Poor KQL hygiene | Inefficient/brittle queries | Reading every page to search | Unbounded contains across *
Rule vanity | Optimize for count not quality | 50 knives, use one | 300 rules, <0.5% TP
No ATT&CK mapping | No technique coverage view | Playing chess blind | Gaps in discovery/priv-esc
No source mapping | No data→use-case matrix | Unknown ingredients | DNS disabled breaks exfil rules
Not on attacker paths | No kill-chain alignment | Lock door, open windows | Encrypt detect but no lateral move detect

8) ASCII Diagram — Detection Product Loop

[ATT&CK Technique] → [Detection Contract] → [KQL Logic]
        ↓                     ↓                   ↓
 [Required Sources/Fields] → [Normalization/ASIM] → [Alert Enrichment]
        ↓                     ↓                   ↓
     [Grouping/Incidents] → [Severity Policy] → [SOAR Playbook]
        ↓
   [Metrics: Precision | Recall | FP/FN | Latency]
        ↓
   [Refactor/Retire]  ←——  [Purple Team Tests]

9) What’s Next

Next in this series: “Detection Contracts in Practice: A Step-by-Step Template (with KQL patterns and ATT&CK mapping).” We’ll publish a fill-in-the-blanks worksheet plus sample tests.


🌞 The Last Sun Rays…

Hook answers:

  • Don’t write reviews without watching the film—connect detections to telemetry and verify it’s present.
  • Don’t let every toaster trip the fire alarm—calibrate severity and group signals into incidents.
  • Don’t collect knives for the drawer—optimize for coverage quality, not rule count.

Your turn: If you could only fix one thing this week, would you choose severity discipline, schema normalization, or source↔use-case mapping—and how would you prove it worked (which metric first)?
