If you’ve ever built a “sensitive files modified in the last X days” report from Purview Content Explorer exports and thought, “Nice, we’re capturing the real state of the tenant’s data estate”… there’s a decent chance you’ve been measuring something else.
Because in Microsoft 365, “Last Modified” isn’t always one truth. Sometimes it’s the file’s own embedded history. Sometimes it’s SharePoint/OneDrive’s service-side history. And if you mix them up, your “recent data estate” reporting can be quietly inaccurate.
The fun part: it still looks correct. The charts render. The KPIs look plausible. Everyone nods.
(And then you make a decision based on a PDF that claims it was last modified in 2011.)
I built a simple 60-day view (the charts in question are highlighted in blue):
Then I swapped one field.
Version A (uses LastModified – as exported from Purview Content Explorer)
Version B (uses ItemLastModifiedTime – obtained from SharePoint Online via Microsoft Graph API)
Same scope, same window, same tenant. Different “Last Modified”.
“Sensitive files modified (60d)” using LastModified showing 63 + top site bar

“Externally shared sensitive files modified (60d)” using LastModified showing 6 + trend

“Sensitive files modified (60d)” using ItemLastModifiedTime showing 239 + top site bar

“Externally shared sensitive files modified (60d)” using ItemLastModifiedTime showing 7 + trend
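
To make the field swap concrete, here is a minimal sketch of the same 60-day count computed against each column. The three records, the fixed "now" date and the helper name `count_modified_within` are invented for illustration; only the two column names come from the report above.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def count_modified_within(records, field, days, now):
    """Count records whose `field` timestamp falls inside the last `days` days."""
    cutoff = now - timedelta(days=days)
    return sum(
        1 for r in records
        if r.get(field) is not None and r[field] >= cutoff
    )


# Hypothetical rows: the same files, carrying both timestamps.
records = [
    {"name": "contract.pdf",
     "LastModified": datetime(2011, 5, 4, tzinfo=UTC),           # embedded in the file
     "ItemLastModifiedTime": datetime(2026, 2, 10, tzinfo=UTC)},  # service-side
    {"name": "budget.xlsx",
     "LastModified": datetime(2026, 2, 20, tzinfo=UTC),
     "ItemLastModifiedTime": datetime(2026, 2, 20, tzinfo=UTC)},
    {"name": "notes.docx",
     "LastModified": datetime(2019, 8, 1, tzinfo=UTC),
     "ItemLastModifiedTime": datetime(2025, 10, 1, tzinfo=UTC)},
]

now = datetime(2026, 3, 1, tzinfo=UTC)  # fixed so the example is reproducible

version_a = count_modified_within(records, "LastModified", 60, now)
version_b = count_modified_within(records, "ItemLastModifiedTime", 60, now)
print(version_a, version_b)
```

The filter logic is identical in both calls. Only the column changes, and that is enough to move a file in or out of the "recent" bucket.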

Some file formats carry their own embedded metadata, including modification timestamps and “last saved by”-style information.
Purview itself is explicit about this in eDiscovery/export metadata:
That means the value can reflect the file’s internal history, potentially from long before the file ever entered your SharePoint/OneDrive tenant.
This is why you’ll often see the effect on “rich” formats like PDF and Office documents. PDFs can store metadata in a document information dictionary and/or XMP metadata streams, and standards work has long acknowledged the “multiple metadata containers” reality inside PDFs.
So if a PDF was last edited in 2011, then uploaded into SharePoint in 2026, it can still legitimately carry a 2011 “modified” value as part of its embedded metadata.
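
As a rough illustration of what "embedded" means here, the sketch below parses a date string in the format PDFs use in their document information dictionary (`D:YYYYMMDDHHmmSS` plus an optional offset). The helper name `parse_pdf_date` and the sample values are made up, and treating a missing offset as UTC is a simplifying assumption, not something the PDF format guarantees.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSSOHH'mm'; everything after the year is optional.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<oh>\d{2})'?(?P<om>\d{2})?'?)?)?$"
)


def parse_pdf_date(value):
    """Turn a PDF-style date string into a timezone-aware datetime."""
    m = _PDF_DATE.match(value.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    g = m.groupdict()
    sign = g["sign"]
    if sign in (None, "Z"):
        tz = timezone.utc  # assumption: no offset is treated as UTC
    else:
        offset = timedelta(hours=int(g["oh"] or 0), minutes=int(g["om"] or 0))
        tz = timezone(offset if sign == "+" else -offset)
    return datetime(
        int(g["year"]), int(g["month"] or 1), int(g["day"] or 1),
        int(g["hour"] or 0), int(g["minute"] or 0), int(g["second"] or 0),
        tzinfo=tz,
    )


# Hypothetical values: what the file says about itself vs. when it was uploaded.
embedded_modified = parse_pdf_date("D:20110315093000+01'00'")
uploaded_to_tenant = datetime(2026, 1, 12, tzinfo=timezone.utc)

print(embedded_modified.isoformat())
print(embedded_modified < uploaded_to_tenant)
```

The value travels inside the file, so uploading the document does nothing to change it.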
SharePoint/OneDrive also maintains item-level metadata representing what happened in the service.
Microsoft Graph draws a clean line between:
The Graph fileSystemInfo resource is very explicit:
That is the “two clocks” model in plain English.
So when you use SharePoint/OneDrive item timestamps (like ItemLastModifiedTime in the above example), you’re grounding “recent activity” in tenant reality, which is what most security reporting actually intends.
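
A small sketch of reading both clocks from a Graph driveItem response. The `lastModifiedDateTime` and `fileSystemInfo.lastModifiedDateTime` property names are the real Graph ones; the sample payload, the helper names, and the idea that the service-side property corresponds to the ItemLastModifiedTime column used above are assumptions for illustration. No network call is made, the JSON stands in for a `GET /drives/{drive-id}/items/{item-id}` response.

```python
import json
from datetime import datetime


def parse_graph_time(value):
    """Parse an ISO 8601 UTC timestamp as returned by Graph (trailing 'Z')."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def two_clocks(drive_item):
    """Return the service-side and client-reported modified times of a driveItem."""
    service = parse_graph_time(drive_item["lastModifiedDateTime"])
    fs_info = drive_item.get("fileSystemInfo") or {}
    client_raw = fs_info.get("lastModifiedDateTime")
    client = parse_graph_time(client_raw) if client_raw else None
    return {"service": service, "client": client}


# Made-up payload shaped like a driveItem response.
sample = json.loads("""
{
  "name": "contract.pdf",
  "lastModifiedDateTime": "2026-02-10T14:22:05Z",
  "fileSystemInfo": {
    "createdDateTime": "2011-05-04T09:00:00Z",
    "lastModifiedDateTime": "2011-05-04T09:00:00Z"
  }
}
""")

clocks = two_clocks(sample)
gap_days = (clocks["service"] - clocks["client"]).days
print(clocks["service"].date(), clocks["client"].date(), gap_days)
```

Keeping both values side by side, rather than collapsing them into one "modified" column, makes it obvious which clock a given chart is using.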
This lines up with how file formats behave:
So when a system tries to populate “Last Modified” from document metadata, PDFs and Office docs tend to return a meaningful value, even if it’s not what you want for “recent tenant activity”. Plain text formats often have less embedded metadata to extract, so they’re less likely to produce that misleading “ancient last modified” effect.
(Translation: Office Docs/PDFs show up with a backstory. CSVs show up with a blank name tag.)
If your report is meant to answer questions like:
…then the service-side item timestamp is usually the relevant one.
If you accidentally use document metadata (LastModified in the above example) for those questions, you can end up with:
The key point isn’t “Content Explorer is wrong”. It’s that “Last Modified” is overloaded language, and the field you choose must match the question you’re answering.
A simple pattern that avoids future confusion:
If you want one extra “tell me when something smells” metric:
“Last Modified” can trick you because it can mean either:
Both are legitimate. They just answer different questions.
Security reporting almost always wants tenant activity. Using the wrong clock doesn’t just skew the story. It replaces it.