Data model & schema

The Eventhouse holds landing tables (Raw_*, written by the collector) and one curated table (FdaInteractions, the troubleshooting/tuning triple). Schema is defined in fabric/kql/01_tables.kql; policies in 02_policies.kql.

flowchart LR
    subgraph raw [Landing tables · short retention]
        G[Raw_GraphInteractions]
        A[Raw_AuditInteractions]
        D[Raw_ExecutedDax]
        S[Raw_SdkRuns]
    end
    G --> C[(FdaInteractions<br/>curated triple)]
    A --> C
    D --> C
    W[Watermarks] -. incremental state .- C
    S -. side-by-side compare .- C

All tables are created with .create-merge, which is idempotent: the command is safe to re-run, and it adds missing columns without dropping existing columns or data.
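As a rough sketch of what such an idempotent definition can look like, the script below declares the Watermarks table using the columns documented later in this section. The actual contents of 01_tables.kql are not reproduced here, and wrapping the command in `.execute database script` is an assumption made only so that the comment lines and the command can be run as one unit.

```kql
.execute database script <|
// Creates Watermarks if it is missing; if it already exists,
// any columns listed here that the table lacks are appended.
.create-merge table Watermarks (
    Source: string,
    LastTimestamp: datetime,
    LastRunAt: datetime,
    Detail: string
)
```

Re-running the script against an existing table leaves its current columns and rows untouched, which is what makes a schema file of this shape safe to apply on every deployment.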

FdaInteractions (curated)

One row per interaction — the troubleshooting/tuning triple. Written by the collector via .set-or-append FdaInteractions <| .... This is the table the review app and dashboard read.

| Column | Type | Description |
| --- | --- | --- |
| `InteractionId` | string | Stable key: Graph id, else audit `RecordId`, else correlation hash (`dax-…` for orphans) |
| `Timestamp` | datetime | Interaction time (response time for paired rows, DAX time for orphans) |
| `User` | string | UPN of the end user |
| `AppHost` | string | `M365` for paired interactions; empty for monitoring-only orphans |
| `ThreadId` | string | Conversation / session id |
| `CapacityId` | string | Fabric capacity (reserved; populated from scope where available) |
| `WorkspaceId` / `WorkspaceName` | string | Workspace hosting the semantic model |
| `AgentId` / `AgentName` | string | FDA identity, when resolvable from audit |
| `SemanticModelId` / `SemanticModelName` | string | The queried model (`ItemId` / `ItemName`) |
| `Question` | string | Original user prompt |
| `RephrasedQuestion` | string | From `ApplicationContext`, when present |
| `GeneratedDax` | string | Best-available generated DAX (DSPM/SDK/extracted); may equal `ExecutedDax` |
| `ExecutedDax` | string | Final/primary executed DAX from monitoring |
| `DaxQueries` | dynamic | Ordered array of `{dax, durationMs, cpuMs, status, ts}` objects, one per execution in the turn |
| `Answer` | string | Response text |
| `DurationMs` | long | Summed query duration across the turn |
| `CpuTimeMs` | long | Summed CPU time across the turn |
| `Status` | string | `Ok` \| `Error` \| `Unknown` |
| `ErrorMessage` | string | Error detail, when present |
| `MatchConfidence` | string | `Exact` \| `Windowed` \| `Unmatched` (see correlation) |
| `Sources` | dynamic | Which surfaces fed the row, e.g. `["graph","monitoring"]` |
| `CorrelationKey` | string | Deterministic `hash(user, minute-bucket, model)` |
| `IngestedAt` | datetime | When the collector wrote the row |
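Because `DaxQueries` stores every execution of a turn in one dynamic array, per-query analysis needs the array expanded into rows. The query below is a minimal sketch of that pattern for failed interactions. The 7-day window and the output column names (`QueryIndex`, `Item`, `QueryDurationMs`, and so on) are illustrative assumptions, not part of the schema.

```kql
FdaInteractions
// Illustrative filter: failed interactions from the last 7 days
| where Timestamp > ago(7d) and Status == "Error"
// One output row per element of DaxQueries; QueryIndex keeps the array position
| mv-expand with_itemindex=QueryIndex Item = DaxQueries
| project
    InteractionId,
    Timestamp,
    Question,
    QueryIndex,
    Dax = tostring(Item.dax),
    QueryDurationMs = tolong(Item.durationMs),
    QueryCpuMs = tolong(Item.cpuMs),
    QueryStatus = tostring(Item.status),
    ErrorMessage
| order by Timestamp desc, QueryIndex asc
```

Keeping the item index makes it possible to tell which execution in a multi-query turn failed or ran slowly, since the array is documented as ordered.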

Raw_GraphInteractions (source A)

MS Graph aiInteractionHistory — prompt/response text, one row per message.

| Column | Type | Description |
| --- | --- | --- |
| `InteractionId` | string | aiInteraction id |
| `ConversationId` | string | sessionId / thread |
| `CreatedDateTime` | datetime | Message time |
| `User` | string | UPN the export was pulled for |
| `UserId` | string | Object id |
| `AppClass` | string | e.g. `IPM.SkypeTeams.Message.Copilot.BizChat` |
| `InteractionType` | string | `userPrompt` \| `aiResponse` |
| `Body` | string | Prompt or response text (plain) |
| `BodyType` | string | `text` \| `html` |
| `Contexts` | dynamic | Grounding resources referenced |
| `Mentions` | dynamic | @-mentions |
| `Raw` | dynamic | Full aiInteraction object |
| `IngestedAt` | datetime | Ingest time |
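Since this table holds one row per message, prompts and responses arrive as separate rows that share a `ConversationId`. As a hedged example of working with that shape, the query below counts both message types per conversation and surfaces conversations where the counts differ. The 1-day window and the output column names are assumptions chosen for illustration.

```kql
Raw_GraphInteractions
// Illustrative window: messages from the last day
| where CreatedDateTime > ago(1d)
| summarize
    Prompts = countif(InteractionType == "userPrompt"),
    Responses = countif(InteractionType == "aiResponse"),
    FirstMessage = min(CreatedDateTime)
    by ConversationId, User
// Keep only conversations whose prompt and response counts do not match
| where Prompts != Responses
| order by FirstMessage desc
```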

Raw_AuditInteractions (source B)

Office 365 Management Activity API — CopilotInteraction metadata / index (RecordType 261).

| Column | Type | Description |
| --- | --- | --- |
| `RecordId` | string | Audit record id |
| `CreationTime` | datetime | Record time |
| `Operation` | string | `CopilotInteraction` |
| `RecordType` | int | `261` |
| `Workload` | string | `Copilot` |
| `User` / `UserId` | string | End user |
| `AppHost` | string | `Bing` / `Teams` / `Office` / `M365App` / … |
| `AgentId` / `AgentName` | string | `CopilotStudio.*` or Fabric data agent id, when present |
| `ThreadId` | string | Conversation id |
| `MessageIds` | dynamic | Message id list |
| `AISystemPlugin` | dynamic | Plugin/agent descriptors |
| `AccessedResources` | dynamic | Resources the turn touched |
| `ClientRegion` | string | Client region |
| `AuditData` | dynamic | Full raw audit record |
| `IngestedAt` | datetime | Ingest time |

Raw_ExecutedDax (source C)

Workspace monitoring SemanticModelLogs — executed DAX + performance.

| Column | Type | Description |
| --- | --- | --- |
| `XmlaRequestId` | string | `OperationId` / request id |
| `Timestamp` | datetime | Execution time |
| `OperationName` | string | `QueryEnd` / `ExecuteQueryEnd` / `CommandEnd` |
| `OperationDetail` | string | `OperationDetailName` |
| `ExecutingUser` | string | Identity FDA impersonated (the end user) |
| `ApplicationName` | string | Helps confirm FDA / Assistant origin |
| `ApplicationContext` | dynamic | Request context |
| `WorkspaceId` / `WorkspaceName` | string | Workspace |
| `SemanticModelId` / `SemanticModelName` | string | `ItemId` / `ItemName` |
| `Dax` | string | `EventText`: the DAX query |
| `DurationMs` | long | Query duration |
| `CpuTimeMs` | long | CPU time |
| `Status` | string | Execution status |
| `ExecutionMetrics` | dynamic | Engine metrics |
| `IngestedAt` | datetime | Ingest time |
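The performance columns in this table lend themselves to a quick per-model profile. The following is a sketch of one such summary, assuming a 1-day lookback and a top-20 cut-off; both values, and the output column names, are arbitrary choices for the example.

```kql
Raw_ExecutedDax
// Illustrative window: executions from the last day
| where Timestamp > ago(1d)
| summarize
    Executions = count(),
    AvgDurationMs = avg(DurationMs),
    P95DurationMs = percentile(DurationMs, 95),
    TotalCpuMs = sum(CpuTimeMs)
    by SemanticModelName
// Slowest models first, judged by 95th-percentile duration
| top 20 by P95DurationMs desc
```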

Raw_SdkRuns (source D, optional)

FDA SDK replay output — richest reasoning/steps for sampled questions. A reconstruction on the FDA side, not the live M365 turn. See SDK replay.

| Column | Type | Description |
| --- | --- | --- |
| `RunId` | string | Replay run id |
| `Question` | string | Replayed question |
| `Answer` | string | SDK answer |
| `GeneratedQueries` | dynamic | Generated DAX/SQL/KQL strings pulled from steps |
| `Steps` | dynamic | Full run-steps (tool calls, inputs, outputs, errors) |
| `Status` | string | `Ok` \| `Error` |
| `Error` | string | Error detail |
| `DataAgentName` | string | Published agent name |
| `ReplayedAt` | datetime | Replay time |

Watermarks (incremental state)

One row per source watermark — drives incremental, idempotent collection.

| Column | Type | Description |
| --- | --- | --- |
| `Source` | string | `graph` \| `audit` \| `monitoring` \| `curated` |
| `LastTimestamp` | datetime | High-water mark for the source |
| `LastRunAt` | datetime | When the collector last advanced it |
| `Detail` | string | Free-text run note (e.g. row count) |
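One way to check collection health from this table is to look at how far each source's high-water mark trails the current time. The query below is a sketch of that check. The `arg_max` step is a defensive assumption in case more than one row per source is ever present, and `LagBehindNow` is a name introduced only for this example.

```kql
Watermarks
// Keep the most recently advanced row for each source
| summarize arg_max(LastRunAt, LastTimestamp, Detail) by Source
// Timespan between now and the source's high-water mark
| extend LagBehindNow = now() - LastTimestamp
| project Source, LastTimestamp, LastRunAt, LagBehindNow, Detail
| order by LagBehindNow desc
```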

Policies

Defined in 02_policies.kql. Two policy families:

Retention (PII-aware)

Raw payloads can contain sensitive prompt/response text, so they are retained for a shorter period than the curated triple:

| Table | Soft-delete retention |
| --- | --- |
| `Raw_GraphInteractions`, `Raw_AuditInteractions`, `Raw_ExecutedDax` | 90 days |
| `Raw_SdkRuns` | 365 days |
| `FdaInteractions` | 730 days |
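The retention periods above could be expressed with commands along the following lines. This is a sketch based only on the values in the table; the real 02_policies.kql may be organised differently, and the `.execute database script` wrapper is an assumption used to group the commands.

```kql
.execute database script <|
// Landing tables that can hold raw prompt/response payloads: 90 days
.alter-merge table Raw_GraphInteractions policy retention softdelete = 90d
.alter-merge table Raw_AuditInteractions policy retention softdelete = 90d
.alter-merge table Raw_ExecutedDax policy retention softdelete = 90d
// SDK replay output: 365 days
.alter-merge table Raw_SdkRuns policy retention softdelete = 365d
// Curated table: 730 days
.alter-merge table FdaInteractions policy retention softdelete = 730d
```

Using `.alter-merge` changes only the soft-delete period and leaves any other retention setting on the table as it is.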

Ingestion batching (low latency)

Raw_GraphInteractions, Raw_AuditInteractions, Raw_ExecutedDax, and FdaInteractions use a 30-second / 500-item batching policy: a batch is sealed as soon as either limit is reached, which keeps data available for near-real-time review.
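A batching policy with these two limits might be applied as sketched below. The text does not state a size limit, so `MaximumRawDataSizeMB` is deliberately left out of the policy object; the script wrapper is again an assumption.

```kql
.execute database script <|
// 30-second / 500-item batching on the three landing tables and the curated table.
// No size limit is given in the text, so none is set here.
.alter tables (Raw_GraphInteractions, Raw_AuditInteractions, Raw_ExecutedDax, FdaInteractions) policy ingestionbatching '{"MaximumBatchingTimeSpan": "00:00:30", "MaximumNumberOfItems": 500}'
```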

Governance

Prompts and responses can contain sensitive data. The Eventhouse should inherit the workspace sensitivity label, and access to the KQL database and the review app should be restricted to authorized reviewers. The split between the short-lived Raw_* tables and the curated, optionally redactable FdaInteractions table is deliberate.