ADR-014: AI Governance, Model Lineage, and Human Oversight

Status: Proposed
Date: 2026-03-29
Author: Atlas Architecture
Depends on: ADR-007 (Workflow Ontology Engine), ADR-008 (EBA Risk Matrix Engine), ADR-009 (Data Provider Plugin Architecture), ADR-010 (Ontology Lifecycle, Conflict Resolution & Conflict-Driven Investigation), ADR-013 (Analyst Interaction Layer)
Impacts: Investigation modules, workflow builder, lineage APIs, analyst review UX, observability


Table of Contents

  1. Context & Problem Statement
  2. Decision Summary
  3. Governed AI Surfaces
  4. AI Run Classification
  5. Decision Boundaries & Human Oversight
  6. AI Run Ledger
  7. Prompt, Policy, and Tool Governance
  8. Replay, Reproducibility, and Challenge
  9. Atlas Lineage Integration
  10. EU AI Act Positioning
  11. Implementation Direction
  12. Migration Path
  13. Competitive Differentiation
  14. Rejected Alternatives

1. Context & Problem Statement

Atlas already uses AI in material parts of the product:

  • investigation modules invoke LLMs during tool-calling, analysis, ontology extraction, and corroboration
  • the workflow builder uses AI to turn policy documents into executable workflow schemas
  • quality-scoring and related internal evaluators use model judgments

The current codebase already contains strong governance-adjacent building blocks:

  • src/temporal/activities.py centralizes most investigation LLM calls through llm_invoke_with_retry()
  • src/workflows/activities/audit_activities.py persists workflow decisions and evidence hashes
  • src/api/lineage_router.py exposes investigation lineage and provenance
  • ADR-008 requires deterministic, non-LLM risk scoring
  • ADR-010 introduces mutation provenance, evidence snapshots, and replay-oriented evaluation
  • ADR-013 introduces explicit human review surfaces and approval flows

However, Atlas does not yet define a unified governance model for AI-assisted operations:

  1. No canonical AI run record. Some runs can be traced through Langfuse and raw outputs, but there is no first-class ledger entry that ties a model invocation to its prompts, tools, evidence inputs, downstream mutations, and human decisions.
  2. Prompt governance is fragmented. Some prompts are versioned in agent_prompts; others are inline in Python modules or builder logic. There is no concept of an effective prompt bundle or policy hash.
  3. Lineage stops at the module boundary. Existing lineage reconstructs module provenance, but not model/provider/prompt/tool provenance.
  4. Human oversight is present but not classified. Review phases, wizard approvals, and score overrides exist, but Atlas does not explicitly define which AI outputs are advisory, which may stage writes, and which must never finalize decisions.
  5. Replay guarantees are underspecified for AI. Risk scoring is deterministic by ADR-008, but stochastic model behavior requires clearer rules about exact replay, best-effort replay, and what evidence must be preserved.
  6. Product positioning is weaker than it could be. “AI Act compliant evidence collection” is not the right claim. Atlas needs a stronger architectural position: AI-governed compliance intelligence with traceable assistance and explicit human accountability.

This ADR defines that governance model.


2. Decision Summary

Atlas will introduce a unified AI Governance Layer with the following properties:

  • Canonical AI run ledger for every material AI-assisted step
  • Run classification that distinguishes advisory, extraction, recommendation, and decision-constraining AI
  • Explicit decision boundaries so stochastic AI cannot silently become the final compliance decision-maker
  • Prompt, policy, and tool lineage recorded alongside model/provider metadata
  • Replay and challenge support that ties AI outputs to source evidence, downstream mutations, and human override paths
  • Lineage API integration so AI provenance becomes visible through the same evidence and ontology lineage surfaces already used in Atlas
  • EU AI Act-ready governance posture focused on traceability, logging, documentation, human oversight, and auditability without making unsupported legal claims

3. Governed AI Surfaces

ADR-014 applies to all material AI-assisted steps in Atlas.

In Scope

  1. Investigation module execution
    • tool-calling and analysis in src/temporal/activities.py
    • ontology extraction and corroboration in the same module
    • segment-driven risk framing loaded from Settings and injected into prompts at runtime
  2. Workflow builder generation
    • requirement extraction
    • ambiguity clarification
    • schema generation
    • builder-session approval outputs
  3. Quality and judgment models
    • evaluator or judge-model flows such as quality scoring
  4. AI-generated explanations
    • summaries, rationale text, evidence synthesis, and analyst-facing explanations that may affect review workload or risk interpretation

Out of Scope

  • deterministic risk-matrix scoring logic under ADR-008
  • deterministic reference-data resolution under ADR-008a
  • non-production experiments, local scripts, and development-only prompts

Materiality Rule

An AI-assisted step is material if it can:

  • create or modify ontology mutations
  • influence workflow routing or review prioritization
  • change what evidence is presented to a human reviewer
  • influence risk interpretation, score explanations, or conflict handling
  • generate schemas or policies that may later control regulated workflows
  • change its behavior based on configured runtime policy context such as active client segment settings
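
The materiality rule reads naturally as a predicate over the effects a step can have. A minimal sketch, assuming a hypothetical `Effect` flag set whose names are illustrative and not existing Atlas identifiers:

```python
from enum import Flag, auto


class Effect(Flag):
    """Hypothetical flags mirroring the materiality criteria listed above."""
    NONE = 0
    ONTOLOGY_MUTATION = auto()         # creates or modifies ontology mutations
    ROUTING_OR_PRIORITY = auto()       # influences routing or review prioritization
    REVIEWER_EVIDENCE = auto()         # changes what evidence a reviewer sees
    RISK_INTERPRETATION = auto()       # influences risk interpretation or conflict handling
    SCHEMA_OR_POLICY = auto()          # generates schemas/policies for regulated workflows
    POLICY_CONTEXT_DEPENDENT = auto()  # behavior depends on runtime policy context


def is_material(effects: Effect) -> bool:
    """A step is material if it can have at least one of the listed effects."""
    return effects != Effect.NONE
```

Any single effect is sufficient, so a step that only changes reviewer-facing evidence is governed just like one that stages mutations.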

4. AI Run Classification

Every governed AI run must declare a run_class.

| Run Class | Description | Typical Atlas Examples | Default Authority |
| --- | --- | --- | --- |
| advisory | Generates summaries, explanations, or drafting assistance | report summaries, builder helper text | No direct writes to authoritative state |
| extraction_normalization | Converts unstructured evidence into typed fields, entities, or relationships | ontology extraction, corroboration, document field extraction | May stage mutations with provenance |
| recommendation | Proposes actions, routes, scores, or investigations | suggested review outcome, escalation recommendation, schema proposal | Cannot finalize regulatory outcome |
| decision_constraining | Can trigger protective controls that materially affect workflow behavior | freeze, escalate, hold-for-review, analyst task creation | Allowed only under explicit workflow/ontology policy and auditable review logic |

Design Rule

Atlas v1 does not allow a stochastic model to directly finalize irreversible compliance outcomes such as:

  • customer approval
  • customer rejection
  • offboarding
  • final risk score mutation in the deterministic ADR-008 scorer
  • final analyst override acceptance

Decision-constraining AI may trigger protective actions, such as pending_review, freeze, or escalate, when the workflow/schema policy explicitly allows it.
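
The design rule can be expressed as a small authorization check. This is a sketch under stated assumptions: the outcome and action strings, and the `may_apply` helper, are hypothetical names chosen for illustration; only the four run classes and the three protective actions come from this ADR.

```python
from enum import Enum


class RunClass(str, Enum):
    ADVISORY = "advisory"
    EXTRACTION_NORMALIZATION = "extraction_normalization"
    RECOMMENDATION = "recommendation"
    DECISION_CONSTRAINING = "decision_constraining"


# Outcomes no stochastic run may finalize in v1 (string names are illustrative).
IRREVERSIBLE_OUTCOMES = frozenset({
    "customer_approval", "customer_rejection", "offboarding",
    "final_risk_score_mutation", "final_override_acceptance",
})

# Protective actions named in the design rule.
PROTECTIVE_ACTIONS = frozenset({"pending_review", "freeze", "escalate"})


def may_apply(run_class: RunClass, action: str, policy_allows: frozenset) -> bool:
    """Return True only if an AI run of this class may apply `action` directly."""
    if action in IRREVERSIBLE_OUTCOMES:
        return False  # never finalized by a stochastic model
    if run_class is RunClass.DECISION_CONSTRAINING:
        # Protective actions only, and only when workflow/schema policy allows them.
        return action in PROTECTIVE_ACTIONS and action in policy_allows
    return False  # other classes stage, draft, or recommend; they do not apply actions
```

Checking the irreversible set first means a misconfigured policy that lists `offboarding` as allowed still cannot grant that authority to a model.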


5. Decision Boundaries & Human Oversight

Core Principle

AI can assist, classify, extract, and recommend. Humans or deterministic policy engines remain accountable for final regulated decisions.

Oversight Rules

  1. Advisory outputs may be displayed without approval but must be labeled as AI-assisted.
  2. Extraction / normalization outputs may create staged ontology mutations only when they include evidence references, confidence metadata, and replay metadata.
  3. Recommendation outputs must route through an explicit human or deterministic gate before changing authoritative business state.
  4. Decision-constraining outputs may only apply protective actions defined in workflow or ontology policy and must produce a reviewable audit trail.
  5. Workflow-builder outputs must always pass through human approval before becoming active workflow definitions.
  6. Risk-matrix overrides remain human-authored derived evaluations under ADR-008, not AI-written final decisions.
  7. Segment-configured AI behavior must be reviewable as policy-driven behavior, not hidden prompt magic. If segment settings materially change severity framing, recommendation language, or compliance interpretation, that segment configuration must be auditable and replayable.
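
Oversight rule 2 lends itself to a validation gate in front of the staging path. The following sketch assumes a hypothetical `StagedMutation` shape and uses a link to the producing AI run as the replay metadata; the actual mutation model is defined elsewhere (ADR-010) and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StagedMutation:
    """Hypothetical shape of an AI-proposed ontology mutation."""
    field_path: str
    value: str
    evidence_refs: tuple = ()
    confidence: Optional[float] = None
    ai_run_id: Optional[str] = None  # replay metadata: link to the producing run


def staging_errors(mutation: StagedMutation) -> list:
    """Return the reasons a mutation may not be staged; an empty list means it may."""
    errors = []
    if not mutation.evidence_refs:
        errors.append("missing evidence references")
    if mutation.confidence is None or not 0.0 <= mutation.confidence <= 1.0:
        errors.append("missing or invalid confidence metadata")
    if not mutation.ai_run_id:
        errors.append("missing replay metadata (ai_run_id)")
    return errors
```

Returning every failed condition, rather than stopping at the first, gives a reviewer the complete reason a staged write was refused.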

Human Oversight Surfaces

ADR-014 relies on and extends the surfaces defined in ADR-013:

  • workflow review phases
  • task inbox and execution views
  • builder-session approval steps
  • evaluation verification and override workflows

These surfaces must eventually expose:

  • whether content was AI-generated
  • which run produced it
  • what evidence it relied on
  • how to challenge, override, or re-run it

6. AI Run Ledger

Atlas will maintain a canonical AI run ledger for every material AI-assisted step.

Required Fields

ai_run_record:
  id: uuid
  parent_context:
    investigation_id: uuid?
    workflow_execution_id: uuid?
    phase_id: string?
    module_name: string?
    company_id: uuid?
  run_class: advisory | extraction_normalization | recommendation | decision_constraining
  purpose: string

  model:
    provider: string
    model_name: string
    model_version: string?
    endpoint: string?
    routing_vendor: string?

  prompt_governance:
    prompt_template_id: string?
    prompt_version: string?
    effective_prompt_hash: string
    policy_bundle_version: string?
    segment_config_hash: string?

  tool_governance:
    tool_manifest_hash: string?
    tool_names: [string]
    mcp_servers: [string]

  data_context:
    ontology_schema_version_id: uuid?
    workflow_schema_version_id: uuid?
    matrix_schema_id: uuid?
    active_segment_code: string?
    active_segment_id: uuid?
    active_segment_version: string?
    source_contract_set: json?
    evidence_snapshot_ids: [uuid]

  traceability:
    langfuse_trace_id: string?
    input_hash: string?
    output_hash: string?
  status: succeeded | failed | superseded
  started_at: timestamp
  completed_at: timestamp?

  outputs:
    raw_output_ref: string?
    artifact_ids: [string]
    downstream_mutation_ids: [uuid]
    downstream_decision_ids: [uuid]

Ledger Principles

  • one record per material run, not per token stream chunk
  • append-only for audit integrity
  • linked to existing workflow, investigation, ontology, and evaluation records
  • usable even when external tracing systems are unavailable

Langfuse remains valuable, but it is not the system of record for governance-critical lineage.
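
One way to make the append-only principle enforceable at the storage layer is to reject updates and deletes with database triggers. The sketch below uses SQLite from the Python standard library purely for illustration; the table layout, the `open_ledger` and `append_run` helpers, and the choice of database are assumptions, not a description of Atlas's persistence layer.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

DDL = """
CREATE TABLE ai_run_record (
    id TEXT PRIMARY KEY,
    run_class TEXT NOT NULL,
    purpose TEXT NOT NULL,
    effective_prompt_hash TEXT NOT NULL,
    started_at TEXT NOT NULL,
    body TEXT NOT NULL  -- remaining ledger fields serialized as JSON
);
CREATE TRIGGER ai_run_no_update BEFORE UPDATE ON ai_run_record
BEGIN SELECT RAISE(ABORT, 'ai_run_record is append-only'); END;
CREATE TRIGGER ai_run_no_delete BEFORE DELETE ON ai_run_record
BEGIN SELECT RAISE(ABORT, 'ai_run_record is append-only'); END;
"""


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Create the ledger table and the triggers that block UPDATE and DELETE."""
    conn = sqlite3.connect(path)
    conn.executescript(DDL)
    return conn


def append_run(conn, run_class, purpose, effective_prompt_hash, **fields) -> str:
    """Insert one record per material run and return its generated ID."""
    run_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO ai_run_record VALUES (?, ?, ?, ?, ?, ?)",
        (run_id, run_class, purpose, effective_prompt_hash,
         datetime.now(timezone.utc).isoformat(),
         json.dumps(fields, sort_keys=True)),
    )
    conn.commit()
    return run_id
```

Because the record lives in a local store, it stays usable when an external tracing system is unavailable. In this sketch a later status such as `superseded` would have to be written as a new linked record; how Atlas models that is not specified here.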


7. Prompt, Policy, and Tool Governance

Prompt Governance

Atlas must distinguish between:

  • stored prompts from agent_prompts
  • inline prompts embedded in Python code
  • composed prompts assembled from templates, validation feedback, and runtime context

Every governed run must therefore record the effective prompt hash, even when no single prompt row exists in the database.
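
A simple way to obtain an effective prompt hash regardless of where the prompt came from is to hash the final message list that is actually sent to the model. This is a minimal sketch; the message shape (a list of role/content dicts) and the `sha256:` prefix are assumptions.

```python
import hashlib
import json


def effective_prompt_hash(messages: list) -> str:
    """SHA-256 over the exact messages sent to the model, in order.

    Keys are sorted so dict key order does not change the hash; message
    order is preserved because it does change model behavior.
    """
    canonical = json.dumps(
        messages, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing after composition means stored, inline, and composed prompts all yield a comparable value, including any runtime context injected into the template.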

Policy Bundle

A policy_bundle_version should identify the full bundle of runtime governance assumptions, including:

  • prompt template versions or hashes
  • system policy instructions
  • tool allowlists
  • model routing defaults
  • confidence thresholds or guardrails
  • active segment configuration or its stable hash
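
One possible derivation of `policy_bundle_version` is a content hash over the bundle's components, so that any change to a component produces a new version. The `PolicyBundle` class and its field names below are hypothetical; a sequential or semantic version scheme would satisfy the ADR equally well.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyBundle:
    """Hypothetical container for the runtime governance assumptions listed above."""
    prompt_hashes: dict            # template id -> prompt hash
    system_policy_hash: str
    tool_allowlist: frozenset
    model_routing_defaults: dict
    confidence_thresholds: dict
    segment_config_hash: Optional[str] = None

    def version(self) -> str:
        """Content-derived version: identical bundles always yield the same value."""
        payload = {
            "prompt_hashes": self.prompt_hashes,
            "system_policy_hash": self.system_policy_hash,
            "tool_allowlist": sorted(self.tool_allowlist),  # order-independent
            "model_routing_defaults": self.model_routing_defaults,
            "confidence_thresholds": self.confidence_thresholds,
            "segment_config_hash": self.segment_config_hash,
        }
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

With this approach a segment configuration change surfaces as a new bundle version without any manual version bump.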

Segment Governance

Atlas currently allows client segments to be configured through Settings and injects the active segment into investigation prompt context. This is not merely presentation-layer metadata. Segment settings can alter:

  • risk severity framing
  • recommendation language
  • regulatory interpretation context
  • relative screening priorities

ADR-014 therefore treats segments as governed runtime policy inputs.

Every material AI run that uses segment context must record:

  • the active segment identity (code and durable ID where available)
  • a stable segment configuration hash or version
  • whether the segment was injected into the effective prompt/policy bundle

Changing an active segment is therefore governance-relevant in the same way as changing a prompt, tool allowlist, or policy bundle.

Tool Governance

For any tool-using run, Atlas must record:

  • tool names exposed to the model
  • MCP servers or HTTP integrations involved
  • a stable tool manifest hash where possible

This is critical because the effective behavior of a tool-calling model is defined by both the prompt and the allowed tool surface.


8. Replay, Reproducibility, and Challenge

Replay Classes

ADR-014 distinguishes three replay modes:

  1. Exact deterministic replay
    • required for ADR-008 risk scoring
    • same input and versioned config must produce the same result
  2. Governed best-effort replay
    • for stochastic AI runs
    • Atlas must preserve enough metadata and evidence to explain what happened and re-run under materially equivalent conditions, including the active segment configuration when one shaped the run
  3. Evidence-only replay
    • when the original external model version is no longer available
    • Atlas must still reconstruct the evidence basis, prompt/policy hash, and downstream effects
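
The three replay classes can be selected from two facts about a run: whether the step is deterministic, and whether the original model version can still be invoked. A minimal sketch with hypothetical names:

```python
from enum import Enum


class ReplayMode(Enum):
    EXACT_DETERMINISTIC = "exact_deterministic"
    GOVERNED_BEST_EFFORT = "governed_best_effort"
    EVIDENCE_ONLY = "evidence_only"


def select_replay_mode(is_deterministic: bool, model_available: bool) -> ReplayMode:
    """Pick the strongest replay guarantee the run can support."""
    if is_deterministic:
        # e.g. ADR-008 risk scoring: same input and config, same result
        return ReplayMode.EXACT_DETERMINISTIC
    if model_available:
        # stochastic run that can be re-run under materially equivalent conditions
        return ReplayMode.GOVERNED_BEST_EFFORT
    # original model version is gone: reconstruct evidence, hashes, and effects only
    return ReplayMode.EVIDENCE_ONLY
```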

Challenge Requirements

Any material AI-assisted output must support a challenge path:

  • view the evidence basis
  • identify the producing AI run
  • identify the segment/policy context that shaped the run
  • see downstream mutations or decisions
  • mark the output disputed, overridden, or superseded
  • trigger re-review or re-run where policy allows

Challengeability is a product capability, not just a logging requirement.
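
The challenge path implies a small state model for AI-assisted outputs. The sketch below is an assumption throughout: the ADR names the disputed, overridden, and superseded markers, but the `ACTIVE` state, the allowed transitions, and the requirement that every challenge carries a reason are illustrative choices.

```python
from dataclasses import dataclass
from enum import Enum


class OutputState(Enum):
    ACTIVE = "active"
    DISPUTED = "disputed"
    OVERRIDDEN = "overridden"
    SUPERSEDED = "superseded"


# Assumed transition table: overridden and superseded are treated as terminal.
ALLOWED = {
    OutputState.ACTIVE: {OutputState.DISPUTED, OutputState.OVERRIDDEN, OutputState.SUPERSEDED},
    OutputState.DISPUTED: {OutputState.ACTIVE, OutputState.OVERRIDDEN, OutputState.SUPERSEDED},
    OutputState.OVERRIDDEN: set(),
    OutputState.SUPERSEDED: set(),
}


@dataclass(frozen=True)
class ChallengeEvent:
    ai_run_id: str   # the producing AI run
    actor: str       # who raised the challenge
    new_state: OutputState
    reason: str


def apply_challenge(current: OutputState, event: ChallengeEvent) -> OutputState:
    """Validate a challenge against the transition table and return the new state."""
    if not event.reason.strip():
        raise ValueError("a challenge must carry a reason")
    if event.new_state not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {event.new_state.value}")
    return event.new_state
```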


9. Atlas Lineage Integration

Atlas already exposes investigation lineage. ADR-014 extends lineage from:

module -> field provenance

to:

module -> ai run -> prompt/policy/tool context -> segment context -> evidence snapshot -> downstream mutation / decision

Lineage Expansion Rules

  1. Investigation lineage must be able to show which AI runs contributed to a module result.
  2. Entity and field lineage must be able to show whether a value came from deterministic transformation, source mapping, analyst input, or AI-assisted extraction.
  3. Workflow execution history must show whether a recommendation, draft, or escalation was AI-assisted.
  4. Score explanations may be AI-generated, but the score itself remains governed by ADR-008 deterministic logic.
  5. Where an active segment influenced an AI run, lineage must expose that segment context as part of the run's governing policy inputs.
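
The extended lineage chain can be modeled as a tree whose root is a module result and whose leaves are downstream mutations or decisions. The following sketch assumes a hypothetical `LineageNode` type and node kinds; it is not the response shape of `src/api/lineage_router.py`.

```python
from dataclasses import dataclass, field


@dataclass
class LineageNode:
    """Hypothetical lineage node, e.g. kind='ai_run', ref='run-1'."""
    kind: str
    ref: str
    children: list = field(default_factory=list)


def render_paths(node: LineageNode, prefix: tuple = ()) -> list:
    """Return every root-to-leaf path as a 'kind:ref -> kind:ref' string."""
    path = prefix + (f"{node.kind}:{node.ref}",)
    if not node.children:
        return [" -> ".join(path)]
    paths = []
    for child in node.children:
        paths.extend(render_paths(child, path))
    return paths
```

A usage example that builds one chain in the order given above:

```python
mutation = LineageNode("mutation", "mut-9")
snapshot = LineageNode("evidence_snapshot", "snap-3", [mutation])
segment = LineageNode("segment_context", "seg-A", [snapshot])
policy = LineageNode("policy_context", "bundle-1", [segment])
run = LineageNode("ai_run", "run-1", [policy])
module = LineageNode("module", "ownership", [run])
print(render_paths(module)[0])
```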

10. EU AI Act Positioning

ADR-014 does not declare Atlas legally “EU AI Act compliant” in the abstract.

Instead, Atlas is designed to provide the governance capabilities that sophisticated providers and deployers need in order to operate AI systems in a defensible way, including:

  • logging and traceability
  • documentation and version pinning
  • human oversight
  • challenge and override support
  • quality monitoring
  • evidence-backed outputs

Product Claim Boundary

Atlas should position this capability as:

  • AI-governed compliance intelligence
  • traceable AI assistance with human accountability
  • AI Act-ready governance controls

Atlas should avoid claiming:

  • that “evidence itself is AI Act compliant”
  • that technical controls alone satisfy all provider or deployer obligations
  • that all AI outputs are exactly reproducible across future model/provider changes

This ADR is therefore architecture and governance guidance, not legal advice.


11. Implementation Direction

Existing Extension Points

The codebase already provides strong insertion points:

  • src/temporal/activities.py for investigation-side AI run recording
  • src/workflows/activities/audit_activities.py for workflow-side audit persistence
  • src/workflows/schema/models.py for schema-level audit policy expansion
  • src/api/lineage_router.py for provenance exposure
  • workflow builder session APIs and approval flow for human governance over generated schemas
  • risk matrix evaluation verification and override flows for downstream accountability

Directional Implementation Rules

  1. Standardize material AI invocation behind a common recording path.
  2. Persist AI run records locally even when Langfuse is enabled.
  3. Extend workflow audit events to reference AI run IDs where relevant.
  4. Expose AI provenance through lineage APIs instead of creating a disconnected governance silo.
  5. Keep deterministic scoring separate from stochastic explanation or recommendation layers.
  6. Prefer staged writes plus review over silent auto-application in high-impact contexts.
  7. Treat active segment settings as governed policy context and include them in prompt/policy hashing, run lineage, and replay metadata wherever they influence model behavior.
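
Rules 1 and 2 suggest a single wrapper through which material model calls pass, so that a run record is written whether the call succeeds or fails. The sketch below is illustrative only: `recorded_invoke` is a hypothetical helper, `invoke` stands in for whatever callable actually reaches the model (it is not the signature of `llm_invoke_with_retry()`), and the list-based ledger is a placeholder for real persistence.

```python
import hashlib
import uuid
from datetime import datetime, timezone
from typing import Callable


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def recorded_invoke(invoke: Callable[[str], str], prompt: str, *,
                    run_class: str, purpose: str, ledger: list) -> str:
    """Call the model and append exactly one run record, even on failure."""
    record = {
        "id": str(uuid.uuid4()),
        "run_class": run_class,
        "purpose": purpose,
        "effective_prompt_hash": _sha256(prompt),
        "output_hash": None,
        "status": None,
        "started_at": _now(),
        "completed_at": None,
    }
    try:
        output = invoke(prompt)
        record["status"] = "succeeded"
        record["output_hash"] = _sha256(output)
        return output
    except Exception:
        record["status"] = "failed"
        raise  # the caller still sees the original error
    finally:
        record["completed_at"] = _now()
        ledger.append(record)  # local write, independent of external tracing
```

Writing the record in `finally` keeps failed runs visible in the ledger, which matters for audit as much as successful ones.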

12. Migration Path

Phase 1: AI Run Recording

  • define the canonical AI run record schema
  • capture investigation-side material runs
  • capture workflow-builder material runs
  • persist effective prompt hashes and model/provider metadata
  • persist active segment identity and configuration hash for runs that use segment context

Phase 2: Audit and Lineage Integration

  • link AI run records to workflow decisions, ontology mutations, and raw outputs
  • extend lineage APIs to include AI provenance
  • expose AI-generated labels in analyst-facing read paths

Phase 3: Human Challenge and Override

  • add challenge / dispute / supersede flows for material AI outputs
  • surface create-override flows already present in the risk API
  • add explicit review affordances for AI-assisted findings and recommendations

Phase 4: Governance Hardening

  • unify prompt/policy bundle versioning
  • standardize allowed models and providers by policy
  • add internal evaluations for quality, drift, and governance regressions

13. Competitive Differentiation

Most competitors can claim some combination of:

  • AI-assisted onboarding
  • workflow automation
  • sanctions / adverse media enrichment
  • configurable risk rules

ADR-014 differentiates Atlas when combined with ADR-008 through ADR-010 because it creates:

  • AI assistance tied to evidence, not black-box outputs
  • ontology-aware lineage instead of isolated screening alerts
  • explicit human accountability instead of vague “human in the loop” claims
  • replayable, challengeable AI-assisted compliance operations

The moat is not “we use AI.” The moat is:

Atlas makes AI-assisted KYB inspectable, governable, and operationally auditable at field, workflow, and score level.


14. Rejected Alternatives

Alternative 1: Treat Langfuse as the governance system of record

Rejected because observability is not the same as governance. External tracing can supplement, but not replace, first-class internal audit records.

Alternative 2: Govern only final workflow decisions

Rejected because extraction, recommendation, and schema-generation runs can materially shape downstream outcomes even when they do not directly press the final “approve” button.

Alternative 3: Allow fully autonomous AI final decisions in v1

Rejected because Atlas's strategic advantage is trustworthy, reviewable compliance intelligence. Silent autonomous final decisions weaken auditability and make regulatory positioning more fragile.

Alternative 4: Keep prompt governance informal

Rejected because model lineage without effective prompt/policy lineage is incomplete. In a tool-using system, prompt + tools + model jointly determine behavior.