
Purview DLP Incident Management (IM) – From Alert to Outcome #07

Phase 3 – Investigation

Context Is the Real Investigation Engine.

Overview

A DLP investigation that starts and ends with the alert is not an investigation. It is a policy match review. The two things look similar from the outside and produce very different outcomes.

Quick recap

In the previous post, we introduced VisionStack’s 4-Context Investigation Model, the structured framework Tier 1 uses to gather context before making a triage decision:

User Context – Data Context – Activity Context – Environment Context

That post covered what to gather and why. This post covers what Tier 2 does with it, how context enrichment turns a triage package into an investigation decision, and where that process typically goes wrong.

Why context is the real work

The alert tells you that a policy matched. The evidence detail tells you what matched and where. Neither of those things tells you what the event means.

Meaning comes from context, and building context is the bulk of what a real DLP investigation involves. Most of the time, an experienced investigator isn’t uncovering new information. They are assembling existing information from multiple sources into a coherent picture that supports a defensible decision.

That assembly process is what separates an investigation from a guess. And it is the part most DLP programs under-invest in, because it requires tooling discipline, access to the right data sources, and a consistent methodology, none of which come pre-configured.

What Tier 2 is doing

When a triage package arrives from Tier 1, Tier 2 picks up with a richer access model and a different objective. Tier 1’s job was to assess whether the alert was worth escalating. Tier 2’s job is to determine what the event means and what, if anything, should happen next.

That requires going deeper into each of the four context dimensions, and in some cases, pulling in data sources that Tier 1 didn’t have access to.

Deepening user context

Tier 1 established who the user is and whether they have a prior DLP history. Tier 2 goes further.

  • Employment and HR signals: Is the user in a notice period? Have they recently had a performance review, role change, or access modification? These signals don’t confirm malicious intent, but they change the risk weighting of an otherwise ambiguous event. A bulk download by a long-tenured employee returning from leave is different from the same action by someone who submitted a resignation last week.
  • Access scope: Does this user’s role justify access to the data they were handling? A finance analyst accessing financial records is expected. The same analyst downloading HR records is not. Entra ID and directory data can inform this assessment, but in most environments it requires a manual check or integration with an IGA (Identity Governance and Administration) system.
  • Cross-incident correlation: Has this user appeared in other recent incidents – DLP, identity, or endpoint? A DLP alert that sits alongside a suspicious sign-in or a Defender for Endpoint alert for the same user in the same time window is a materially different investigation from a standalone DLP event. Defender XDR surfaces this automatically when it groups the related alerts into a single incident, but Tier 2 should verify this explicitly rather than assuming the platform caught everything.
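
As a minimal sketch of the explicit cross-incident check, assume alerts from the relevant sources have been exported into simple records. The `Alert` fields, the source labels, and the 24-hour window below are illustrative assumptions, not a Defender XDR schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    # Hypothetical export shape; adapt the fields to your own data.
    alert_id: str
    user: str            # UPN or object ID, as exported
    source: str          # e.g. "DLP", "Identity", "Endpoint"
    timestamp: datetime


def correlated_alerts(dlp_alert, all_alerts, window=timedelta(hours=24)):
    """Return other alerts for the same user within +/- window of the DLP alert."""
    matches = [
        a for a in all_alerts
        if a.user == dlp_alert.user
        and a.alert_id != dlp_alert.alert_id
        and abs(a.timestamp - dlp_alert.timestamp) <= window
    ]
    # Oldest first, so the investigator reads the sequence in order.
    return sorted(matches, key=lambda a: a.timestamp)
```

An empty result does not prove the event is standalone. It only shows that nothing else surfaced in the sources that were exported, which is worth stating in the investigation notes.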

Deepening data context

Tier 1 identified the file, the label, and the SIT match count. Tier 2 looks at what those signals mean in combination.

  • Label accuracy: Does the sensitivity label on the file reflect the content? A file labeled General that contains 40 high-confidence credit card numbers is a labeling failure and a DLP gap simultaneously. The DLP alert is valid, but the investigation should also flag the misclassification for remediation; that work doesn’t end when the incident closes.
  • SIT match quality: A high match count with high confidence on regulated data types (credit card numbers, national IDs, health record identifiers) carries a very different risk profile from a low-confidence partial match on a name pattern in a large document. Content Explorer gives Tier 2 access to the actual content for this assessment, subject to the RMS decryption caveat covered in Post 4.
  • Data lineage: Where did this file come from? Was it created by the user, downloaded from a business system, shared with them by a colleague, or pulled from a third-party integration? A file that originated in a regulated system and was subsequently moved by the user carries a different risk profile than a file the user assembled themselves. This is often hard to determine without additional tooling, but Activity Explorer can surface some of the movement history.
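
The label accuracy check can be sketched as a simple comparison between the applied label and the strength of the SIT matches. Everything below is an assumption for illustration: the label taxonomy in `LABEL_RANK`, the confidence and count thresholds, and the shape of the `sit_matches` records.

```python
# Hypothetical label taxonomy, lowest to highest sensitivity.
LABEL_RANK = {
    "Public": 0,
    "General": 1,
    "Confidential": 2,
    "Highly Confidential": 3,
}


def label_mismatch(label, sit_matches, min_confidence=85, min_count=10,
                   expected_rank=2):
    """True when strong SIT matches sit under a label below expected_rank.

    sit_matches: list of dicts with "count" and "confidence" keys.
    Thresholds are illustrative and should follow your own policy.
    """
    strong = sum(
        m["count"] for m in sit_matches if m["confidence"] >= min_confidence
    )
    # Unknown or missing labels are treated as the lowest rank.
    return strong >= min_count and LABEL_RANK.get(label, 0) < expected_rank
```

A `True` result is a prompt to flag the file for relabeling alongside the incident, not a verdict on the user’s intent.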

Deepening activity context

The activity context from triage gave Tier 2 the action type, destination, and timing. The investigation layer adds pattern and volume.

Behavioral baseline: Is this action consistent with what this user normally does? An analyst who regularly exports data to a SharePoint site for team sharing is behaving differently from one who has never done so and suddenly exports 300 files in a single session. Establishing a baseline requires historical activity data; Activity Explorer supports this with up to 30 days of label and activity history.
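
One way to make the baseline comparison concrete, assuming the user’s daily export counts have already been pulled into a list, is a median-based threshold. The function name, the multiplier `k`, and the choice of median absolute deviation are assumptions for this sketch; other baselining methods are equally valid.

```python
from statistics import median


def is_anomalous(history, observed, k=5.0):
    """Compare an observed count against median + k * MAD of past daily counts.

    Returns None when there is no history, because without a baseline
    the question cannot be answered either way.
    """
    if not history:
        return None
    med = median(history)
    mad = median([abs(x - med) for x in history])
    # Floor the spread at 1 so a perfectly flat history still allows
    # small day-to-day variation.
    threshold = med + k * max(mad, 1.0)
    return observed > threshold
```

The `None` branch matters as much as the other two: it separates "this is normal" from "we don’t know what normal is", which are different findings.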

Volume and velocity: A single file shared externally is a data point. Fifty files shared externally within a 20-minute window is a pattern. Advanced Hunting in Defender XDR lets Tier 2 build a time-scoped activity picture:

// Count file activity for one user in 5-minute bins across a single day.
// Replace <user-id> and the YYYY-MM-DD placeholders before running.
CloudAppEvents
| where AccountId == "<user-id>"
| where Timestamp between (datetime(YYYY-MM-DDT00:00:00Z) .. datetime(YYYY-MM-DDT23:59:59Z))
| where ActionType in ("FileUploaded", "FileShared", "FileSyncDownloadedFull")
| summarize count() by ActionType, Application, bin(Timestamp, 5m)
| order by Timestamp asc
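
Once rows from a query like the one above are exported, the busiest window can be found with a few lines of post-processing. This is a sketch that assumes each row has been reduced to a `(bin_start, count)` pair; the function name and the 20-minute default are illustrative.

```python
from datetime import datetime, timedelta


def peak_window(bins, window=timedelta(minutes=20)):
    """Return (window_start, total) for the busiest window.

    bins: iterable of (bin_start, count) pairs. Several rows may share a
    bin_start (for example, one per action type); they are summed together.
    Candidate windows start at each bin_start and cover [start, start + window).
    """
    ordered = sorted(bins)
    best = (None, 0)
    for i, (start, _) in enumerate(ordered):
        total = sum(c for t, c in ordered[i:] if t < start + window)
        if total > best[1]:
            best = (start, total)
    return best
```

The result gives Tier 2 a single figure to record, such as "50 events in the 20 minutes from 14:00", rather than a table of bins the next reader has to interpret.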

Destination analysis: External domain is a broad category. A known business partner domain is different from a personal webmail address, which is different from a domain registered two weeks ago. Basic domain intelligence – registration date, category, reputation – can be pulled from threat intelligence sources or assessed manually when the destination is unusual.
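
The three destination categories above can be expressed as a small lookup. This is a sketch under stated assumptions: the partner and webmail sets are lists your team would maintain, the registration date comes from whatever domain intelligence source you use, and the 30-day cutoff for "newly registered" is arbitrary.

```python
from datetime import date


def classify_destination(domain, partner_domains, webmail_domains,
                         registered_on=None, today=None, new_domain_days=30):
    """Bucket an external domain into a coarse risk category."""
    domain = domain.strip().lower()
    if domain in partner_domains:
        return "known-partner"
    if domain in webmail_domains:
        return "personal-webmail"
    if registered_on is not None:
        today = today or date.today()
        if (today - registered_on).days <= new_domain_days:
            return "newly-registered"
    # Nothing known either way: this is where manual assessment starts.
    return "unclassified-external"
```

The last bucket is deliberately not called "safe". An unclassified destination is simply one that nobody has assessed yet.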

Deepening environment context

Environment context is the dimension most likely to be under-weighted in a fast-moving investigation, and often the one that changes the risk assessment most significantly.

  • Device state: Was the action performed on a managed, compliant corporate device or an unmanaged personal one? Endpoint DLP surfaces device compliance state as part of the event context. An action on a non-compliant or unmanaged device has no enforced data boundary: anything copied or downloaded exists outside your organization’s security controls from that point forward.
  • Network context: Corporate network, VPN, and external connections carry different risk profiles for the same action. A file download over a corporate VPN is observable and logged at the network layer. The same download from a coffee shop Wi-Fi connection is not. This distinction matters for containment decisions: if data has already left a managed network boundary, containment options are more limited.
  • Application context: Endpoint DLP captures the application used for the action: browser, native Office client, sync client, or a third-party app. A file opened in Word and printed is a different risk vector from the same file uploaded via a browser to an unmanaged cloud storage service. The application context affects both risk assessment and the practical scope of containment.

Synthesizing across dimensions

The investigation decision isn’t made by looking at each context dimension in isolation. It comes from reading them together.

A few practical synthesis patterns:

  • Convergent risk signals: Multiple dimensions pointing in the same direction – departing employee, high-confidence SIT match on Highly Confidential data, external destination, unmanaged device, outside working hours – represent a convergent risk picture. Each signal alone might be explainable. Together they warrant escalation to Tier 3 and parallel Governance engagement.
  • Divergent signals: When dimensions point in different directions – unusual timing but managed device, corporate network, known destination, no prior DLP history, documented business justification – the investigation is pointing toward a false positive or a low-risk event. The correct outcome is closure with documented rationale, not escalation on the basis of a single elevated signal.
  • Genuinely ambiguous cases: Some investigations don’t resolve cleanly. Ambiguity is a valid outcome, but it needs to be documented as ambiguity, with a record of what was assessed and what remained unresolvable. Closing an ambiguous case as a false positive because it was inconclusive is a discipline failure. Escalating it to Tier 3 without a documented rationale is a bottleneck failure. The right call is a documented escalation that clearly states what could and couldn’t be determined.
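
The three patterns above can be sketched as a small decision function. The cutoffs here (three or more elevated dimensions for convergence, at most one elevated and nothing unresolved for closure) are assumptions chosen for illustration; the real thresholds are a judgment your program has to set and document.

```python
from dataclasses import dataclass, field

DIMENSIONS = ("user", "data", "activity", "environment")


@dataclass
class Decision:
    outcome: str
    elevated: list = field(default_factory=list)
    unresolved: list = field(default_factory=list)


def synthesize(signals):
    """Map per-dimension assessments to an investigation outcome.

    signals: dict of dimension -> "elevated", "normal", or "unknown".
    A missing dimension counts as "unknown".
    """
    elevated = [d for d in DIMENSIONS if signals.get(d) == "elevated"]
    unresolved = [d for d in DIMENSIONS
                  if signals.get(d, "unknown") == "unknown"]
    if len(elevated) >= 3:
        outcome = "escalate-tier3-and-governance"
    elif len(elevated) <= 1 and not unresolved:
        outcome = "close-with-documented-rationale"
    else:
        outcome = "documented-escalation-ambiguous"
    return Decision(outcome, elevated, unresolved)
```

The `unresolved` list is the part that prevents both failure modes: it stops an inconclusive case from being closed as a false positive, and it gives Tier 3 a record of what could not be determined.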

Where context enrichment commonly breaks down

  • Tooling gaps: Tier 2 investigators who can see incidents in Defender XDR but can’t access Content Explorer or Activity Explorer because the Purview role groups weren’t configured correctly are working with one hand behind their back. The investigation is bounded by what the access model allows, not by what the evidence contains.
  • Time pressure collapsing the process: In high-volume environments, Tier 2 can start behaving like a faster Tier 1, reviewing context quickly and making decisions based on the most visible signal rather than a full assessment. The result is the same inconsistency problem that the triage framework was designed to solve, just one layer up.
  • No documented baseline: If you don’t know what normal looks like for a given user or department, you can’t assess whether a behavior is anomalous. Building activity baselines proactively – not just reactively when an incident arrives – is an operational investment that pays off in investigation quality.

What’s coming next

Context enrichment produces an investigation decision. The next post covers what happens after that decision is made. Specifically, how containment and remediation actions are scoped, executed, and documented, and why the closure discipline at the end of an incident is just as important as the investigation that preceded it.
