Architecture
End-to-end flow
flowchart TD
user([End user]) -->|invokes| copilot[Microsoft 365 Copilot<br/>declarative agent surface]
copilot -->|invokes| fda[Fabric Data Agent<br/>NL2DAX tool]
fda -->|generated DAX, executed over XMLA| model[(Power BI<br/>semantic model)]
subgraph capture [CAPTURE SURFACES]
direction LR
A["(A) MS Graph<br/>aiInteractionHistory<br/>prompt + response text<br/>per user"]
B["(B) Office 365 Mgmt API<br/>CopilotInteraction<br/>metadata / index<br/>(RecordType 261)"]
C["(C) Workspace monitoring<br/>semantic-model ops<br/>EXECUTED DAX + metrics"]
end
copilot -.-> A
copilot -.-> B
model -.-> C
A --> collector
B --> collector
C --> collector
collector[["FDA_Collector<br/>scheduled Fabric notebook · service principal<br/>pull A,B,C with watermarks + dedup<br/>parse / extract DAX, normalize, hash<br/>ingest to landing tables"]]
collector --> raw["Eventhouse (KQL)<br/>Raw_GraphInteractions · Raw_AuditInteractions · Raw_ExecutedDax"]
raw -->|update policies / correlation| curated[("FdaInteractions<br/>curated triple — one row per interaction")]
curated --> app["FdaObservability.App<br/>C# WinForms · interactive AAD → Kusto<br/>search · review detail · configuration"]
curated --> dash["Real-Time Dashboard<br/>fleet health & tuning trends"]
Why three sources
No single API returns the full triple (prompt, response, and generated DAX) for live M365-originated calls:
- The raw `CopilotInteraction` audit record (RecordType 261, via the Office 365 Management Activity API) carries metadata only (`ThreadId`, `MessageIds`, `AppHost`, `AccessedResources`, user, timestamp), not the prompt or response text, and not the generated DAX.
- MS Graph `aiInteractionHistory.getAllEnterpriseInteractions` returns the prompt and response text (`AiEnterpriseInteraction.Read.All`, app-only). It is the cleanest programmatic text source, but it is keyed per user, and its coverage of FDA-via-M365 depends on which experiences write to the interaction-history service.
- The generated DAX is most reliably recovered from workspace monitoring of the semantic model: the FDA executes its NL2DAX output over the XMLA endpoint, and that execution is logged with the full DAX text in `EventText`, plus `ExecutionMetrics`, duration, CPU, and the executing user's identity.
So: Graph/Audit give question + answer + identity; workspace monitoring gives the DAX; correlation joins them.
Correlation key
The collector builds a deterministic correlation key for each interaction:
Executed-DAX events (source C) are matched to interactions (sources A/B) by executingUser + nearest timestamp
within a configurable window (default ±90s) and the same semanticModelId. When multiple DAX executions fall in the
window, all are attached to the interaction as an ordered array (an FDA turn can emit several validation/probe
queries before the final one). The notebook records MatchConfidence (Exact / Windowed / Unmatched) so the review
app can flag low-confidence joins instead of hiding the uncertainty.
The mechanics — pairing prompts to responses, the window join, orphan handling, and confidence scoring — are detailed in Correlation model.
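The window join described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the collector's actual code: the `DaxEvent` and `Interaction` classes, their field names, and the `correlate` function are hypothetical, and the interaction is assumed to already carry a semantic model ID. Because this section does not define when a match counts as Exact, the sketch only assigns Windowed or Unmatched.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class DaxEvent:
    # Hypothetical shape of one executed-DAX record from source C.
    executing_user: str
    semantic_model_id: str
    timestamp: datetime
    dax_text: str


@dataclass
class Interaction:
    # Hypothetical shape of one interaction from sources A/B.
    user: str
    semantic_model_id: str
    timestamp: datetime
    prompt: str
    dax_queries: list = field(default_factory=list)
    match_confidence: str = "Unmatched"


def correlate(interactions, dax_events, window_s=90):
    """Attach each DAX event to the nearest interaction for the same user
    and semantic model, if one lies within +/- window_s seconds.

    Returns (interactions, orphans), where orphans are unmatched DAX events.
    """
    window = timedelta(seconds=window_s)
    orphans = []
    # Process events in time order so each interaction's array stays ordered.
    for event in sorted(dax_events, key=lambda e: e.timestamp):
        candidates = [
            i for i in interactions
            if i.user == event.executing_user
            and i.semantic_model_id == event.semantic_model_id
            and abs(i.timestamp - event.timestamp) <= window
        ]
        if not candidates:
            orphans.append(event)
            continue
        nearest = min(candidates, key=lambda i: abs(i.timestamp - event.timestamp))
        nearest.dax_queries.append(event.dax_text)
        nearest.match_confidence = "Windowed"
    return interactions, orphans
```

Matching from the event side (each DAX event picks its nearest interaction) keeps one execution from being attached to two adjacent turns, while still letting a single turn collect several probe queries in order.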
Capability & limitation notes (be explicit with stakeholders)
- Reasoning / chain-of-thought for live M365 calls is not exposed. You get the rephrased question (`ApplicationContext`, where present) and the concrete grounding artifact (the DAX), not the model's internal reasoning tokens. For richer step-by-step capture, use `FDA_SDK_Replay` to re-ask sampled questions through the Fabric Data Agent SDK, which returns `get_run_details()` steps including generated queries and errors. That is a reconstruction on the FDA side, not the original M365 turn.
- Preview surfaces: Purview DSPM-for-AI audit for FDA and the FDA SDK are in preview; treat them as best-effort enrichment. The GA spine is Office 365 Management API + workspace monitoring.
- Latency: audit and Graph records can lag the live interaction by anywhere from a few minutes to about 30 minutes, while workspace monitoring is near-real-time. The collector re-scans a trailing window so that late-arriving records are back-filled and de-duplicated.
- PII / governance: prompts and responses can contain sensitive data. The Eventhouse should inherit the workspace sensitivity label; restrict the KQL database and the review app to authorized reviewers. The schema keeps raw payloads in `Raw_*` tables with shorter retention and a curated, optionally-redactable `FdaInteractions`.
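The trailing-window re-scan mentioned in the latency note can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the function names, the record fields (`source`, `id`, `timestamp`), the in-memory `seen_keys` set, and the 60-minute default, which was chosen only because it exceeds the roughly 30-minute lag noted above. The real collector may persist watermarks and keys differently.

```python
import hashlib
from datetime import datetime, timedelta


def scan_window(watermark, now, trailing=timedelta(minutes=60)):
    """Return (start, end) for the next pull.

    The start is moved back behind the watermark so that records which
    arrived late are read again on the next run.
    """
    return watermark - trailing, now


def record_key(record):
    """Stable hash of the fields assumed to identify a record uniquely."""
    raw = "|".join([record["source"], record["id"], record["timestamp"]])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def drop_seen(records, seen_keys):
    """Return only records not ingested before; updates seen_keys in place."""
    fresh = []
    for record in records:
        key = record_key(record)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(record)
    return fresh
```

Re-reading an overlapping window means most records are pulled more than once, so the hash-based filter is what keeps the landing tables free of duplicates; the trade-off is extra read volume in exchange for not missing late arrivals.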