Architecture

End-to-end flow

flowchart TD
    user([End user]) -->|invokes| copilot[Microsoft 365 Copilot<br/>declarative agent surface]
    copilot -->|invokes| fda[Fabric Data Agent<br/>NL2DAX tool]
    fda -->|generated DAX, executed over XMLA| model[(Power BI<br/>semantic model)]

    subgraph capture [CAPTURE SURFACES]
        direction LR
        A["(A) MS Graph<br/>aiInteractionHistory<br/>prompt + response text<br/>per user"]
        B["(B) Office 365 Mgmt API<br/>CopilotInteraction<br/>metadata / index<br/>(RecordType 261)"]
        C["(C) Workspace monitoring<br/>semantic-model ops<br/>EXECUTED DAX + metrics"]
    end

    copilot -.-> A
    copilot -.-> B
    model -.-> C

    A --> collector
    B --> collector
    C --> collector

    collector[["FDA_Collector<br/>scheduled Fabric notebook · service principal<br/>pull A,B,C with watermarks + dedup<br/>parse / extract DAX, normalize, hash<br/>ingest to landing tables"]]

    collector --> raw["Eventhouse (KQL)<br/>Raw_GraphInteractions · Raw_AuditInteractions · Raw_ExecutedDax"]
    raw -->|update policies / correlation| curated[("FdaInteractions<br/>curated triple — one row per interaction")]
    curated --> app["FdaObservability.App<br/>C# WinForms · interactive AAD → Kusto<br/>search · review detail · configuration"]
    curated --> dash["Real-Time Dashboard<br/>fleet health & tuning trends"]

Why three sources

No single API returns the full triple (question, answer, and generated DAX) for live M365-originated calls:

The raw CopilotInteraction audit record (RecordType 261, via Office 365 Management Activity API) carries metadata only — ThreadId, MessageIds, AppHost, AccessedResources, user, timestamp — not the prompt or response text, and not the generated DAX.
MS Graph aiInteractionHistory.getAllEnterpriseInteractions returns the prompt and response text; it requires the AiEnterpriseInteraction.Read.All application (app-only) permission. It is the cleanest programmatic text source, but it is keyed per user, and its coverage of FDA-via-M365 depends on which experiences write to the interaction-history service.
The generated DAX is most reliably recovered from workspace monitoring of the semantic model: the FDA executes its NL2DAX output over the XMLA endpoint, and that execution is logged with the full DAX text in EventText, plus ExecutionMetrics, duration, CPU, and the executing user's identity.

In short: Graph supplies the question and answer text, the audit log supplies identity and thread metadata, workspace monitoring supplies the DAX, and correlation joins them.

Correlation key

The collector builds a deterministic correlation key for each interaction:

corr = hash( executingUser , floor(timestamp to 1-minute bucket) , workspaceId , semanticModelId )
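The key above can be sketched in Python as follows. This is a minimal illustration, not the collector's actual code: the choice of SHA-256, the lower-casing of identifiers, the unit-separator delimiter, and the function name `correlation_key` are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone


def correlation_key(executing_user: str, timestamp: datetime,
                    workspace_id: str, semantic_model_id: str) -> str:
    """Deterministic key: same user, minute bucket, workspace and model -> same hash."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    # Normalize to UTC, then floor to the start of the 1-minute bucket.
    bucket = timestamp.astimezone(timezone.utc).replace(second=0, microsecond=0)
    parts = [
        executing_user.strip().lower(),      # ASSUMPTION: identities compared case-insensitively
        bucket.strftime("%Y-%m-%dT%H:%MZ"),
        workspace_id.strip().lower(),
        semantic_model_id.strip().lower(),
    ]
    # An unambiguous separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join(parts)
    # ASSUMPTION: SHA-256; the source only says "hash".
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two events that straddle a minute boundary fall into different buckets and therefore get different keys, so a key of this shape cannot by itself pair every DAX execution with its interaction.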

Executed-DAX events (source C) are matched to interactions (sources A/B) by executingUser + nearest timestamp within a configurable window (default ±90s) and the same semanticModelId. When multiple DAX executions fall in the window, all are attached to the interaction as an ordered array (an FDA turn can emit several validation/probe queries before the final one). The notebook records MatchConfidence (Exact / Windowed / Unmatched) so the review app can flag low-confidence joins instead of hiding the uncertainty.
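The window match described above could look roughly like the sketch below. The class and field names are hypothetical, and the rule that separates Exact from Windowed (a 1-second tolerance here) is an assumption for illustration only; the real scoring rules live in the Correlation model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DaxEvent:  # one executed-DAX row from source C (illustrative field names)
    executing_user: str
    semantic_model_id: str
    timestamp: datetime
    event_text: str


@dataclass
class Interaction:  # one prompt/response interaction from sources A/B
    user: str
    semantic_model_id: str
    timestamp: datetime
    dax: List[str] = field(default_factory=list)
    nearest_gap: Optional[timedelta] = None
    match_confidence: str = "Unmatched"


def match_events(interactions, events,
                 window=timedelta(seconds=90),
                 exact_tolerance=timedelta(seconds=1)):
    # Process events in time order so each interaction's DAX list stays ordered.
    for ev in sorted(events, key=lambda e: e.timestamp):
        candidates = [
            it for it in interactions
            if it.user.lower() == ev.executing_user.lower()
            and it.semantic_model_id == ev.semantic_model_id
            and abs(ev.timestamp - it.timestamp) <= window
        ]
        if not candidates:
            continue  # orphan event: no interaction within the window
        # Attach the event to the interaction with the nearest timestamp.
        nearest = min(candidates, key=lambda it: abs(ev.timestamp - it.timestamp))
        gap = abs(ev.timestamp - nearest.timestamp)
        nearest.dax.append(ev.event_text)
        if nearest.nearest_gap is None or gap < nearest.nearest_gap:
            nearest.nearest_gap = gap
    for it in interactions:
        if it.nearest_gap is None:
            it.match_confidence = "Unmatched"
        elif it.nearest_gap <= exact_tolerance:  # ASSUMPTION: what counts as "Exact"
            it.match_confidence = "Exact"
        else:
            it.match_confidence = "Windowed"
    return interactions
```

Each event is assigned to at most one interaction, so a probe query is never counted twice when two turns from the same user fall close together.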

The mechanics — pairing prompts to responses, the window join, orphan handling, and confidence scoring — are detailed in Correlation model.

Capability & limitation notes (be explicit with stakeholders)

Reasoning / chain-of-thought for live M365 calls is not exposed. You get the rephrased question (ApplicationContext, where present) and the concrete grounding artifact (the DAX), not the model's internal reasoning tokens. For richer step-by-step capture, use FDA_SDK_Replay to re-ask sampled questions through the Fabric Data Agent SDK, which returns get_run_details() steps including generated queries and errors. That is a reconstruction on the FDA side, not the original M365 turn.
Preview surfaces: Purview DSPM-for-AI audit for FDA and the FDA SDK are in preview; treat them as best-effort enrichment. The GA spine is the Office 365 Management Activity API plus workspace monitoring.
Latency: audit and Graph records can lag the live interaction by anywhere from a few minutes to roughly 30 minutes, while workspace monitoring is near real time. On each run the collector re-scans a trailing window so that late-arriving records are back-filled and de-duplicated.
PII / governance: prompts and responses can contain sensitive data. The Eventhouse should inherit the workspace sensitivity label; restrict the KQL database and the review app to authorized reviewers. The schema keeps raw payloads in the Raw_* tables, which have shorter retention, and keeps FdaInteractions as the curated, optionally redactable table.
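The trailing-window re-scan mentioned in the latency note can be sketched as below. This is an illustration under stated assumptions, not the FDA_Collector implementation: the 45-minute lookback, the `id` and `timestamp` record fields, and the function names are hypothetical, and an in-memory dict stands in for the landing table.

```python
from datetime import datetime, timedelta


def scan_start(watermark: datetime,
               lookback: timedelta = timedelta(minutes=45)) -> datetime:
    # ASSUMPTION: a lookback somewhat larger than the ~30 min worst-case lag.
    return watermark - lookback


def merge_batch(landed: dict, batch: list, watermark: datetime):
    """Insert unseen records into `landed` (id -> record) and advance the watermark.

    Returns (new_watermark, number_of_new_records). Records already landed are
    skipped, so re-scanning the trailing window does not create duplicates.
    """
    new_count = 0
    for rec in batch:
        if rec["id"] in landed:
            continue  # already ingested on an earlier run
        landed[rec["id"]] = rec
        new_count += 1
        # The watermark only moves forward; a late record never pulls it back.
        watermark = max(watermark, rec["timestamp"])
    return watermark, new_count
```

Each run would query a source from `scan_start(watermark)` onward and pass the results to `merge_batch`, so a record that arrives late but carries an older timestamp is still picked up on a later run.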