Mutation queue & provenance

Every change Atlas makes to the knowledge graph is recorded. The mutation queue (src/mutation_queue/) is the audit backbone: it captures before/after values with provider attribution, detects conflicts, and generates review tasks when a change needs a human. This is what makes the graph defensible — you can always ask "who changed this, when, and why?".

Mutations

A mutation records the entity, the provider responsible, the mutation type (created / updated / merged / matched), and the before/after values. Together, the mutations for an entity form its complete provenance timeline.
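To illustrate how per-entity mutations add up to a provenance timeline, here is a minimal sketch. The `Mutation` class, its field names, and the `recorded_at` timestamp are assumptions made for illustration; they are not taken from src/mutation_queue/.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class Mutation:
    """Hypothetical in-memory shape of one mutation (not the real model)."""
    entity_id: str
    provider: str
    mutation_type: str      # created / updated / merged / matched
    before: Any
    after: Any
    recorded_at: datetime   # assumed field, used only to order the timeline


def provenance_timelines(mutations: Iterable[Mutation]) -> Dict[str, List[Mutation]]:
    """Group mutations by entity and order each group oldest-first."""
    timelines: Dict[str, List[Mutation]] = defaultdict(list)
    for mutation in mutations:
        timelines[mutation.entity_id].append(mutation)
    for entries in timelines.values():
        entries.sort(key=lambda m: m.recorded_at)
    return dict(timelines)
```

Reading one entity's list from start to end then answers "who changed this, when, and why?" for that entity.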

Conflicts and review

When survivorship cannot confidently pick a winner, or a change is significant enough to warrant oversight, a conflict is raised and a review task is assigned. This is the human-in-the-loop step required by ADR-014 and ADR-017.
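The decision to escalate can be pictured as a small predicate. This is a sketch only: the `Candidate` type, the 0.0 to 1.0 trust scale, the `min_margin` threshold, and the `significant` flag are all hypothetical, and the real survivorship rules may weigh other factors.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Candidate:
    """Hypothetical value plus the trust of the source that supplied it."""
    value: Any
    source_trust: float  # assumed scale: 0.0 (untrusted) to 1.0 (fully trusted)


def needs_review(current: Candidate, incoming: Candidate, *,
                 significant: bool = False, min_margin: float = 0.2) -> bool:
    """Return True when a human should decide between the two values."""
    if current.value == incoming.value:
        return False  # identical values: nothing to decide
    if significant:
        return True   # significant changes always get oversight
    # Trust scores too close together: no confident winner.
    return abs(incoming.source_trust - current.source_trust) < min_margin
```

With these assumed numbers, two sources at 0.90 and 0.85 trust would be escalated, while 0.30 versus 0.90 would be settled automatically.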

API surface

The mutation queue exposes its own router (src/mutation_queue/router.py):

| Endpoint | Purpose |
| --- | --- |
| GET /mutations | List mutations (paged) |
| POST /mutations/batch | Bulk-process mutations |
| GET /conflicts | List open review tasks |
| GET /conflicts/{id} | Conflict detail |
| POST /conflicts/{id}/resolve | Apply a manual resolution |
| GET /entity/{id}/provenance | Full mutation history for an entity |

In the frontend, conflicts surface in the analyst's review surfaces and the report's conflicts panel; resolutions flow back through this API. See Frontend architecture and API → Routers.
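As a rough illustration of how a client might address these routes, the sketch below builds requests with the standard library without sending them. The base URL, the `page` query parameter, and the JSON body keys (`resolution`, `reason`) are assumptions; the actual request schema is defined in router.py.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host and port


def list_mutations_request(page: int = 1) -> Request:
    """Build GET /mutations; the `page` parameter name is assumed."""
    return Request(f"{BASE_URL}/mutations?page={page}", method="GET")


def resolve_conflict_request(conflict_id: str, resolution: str, reason: str) -> Request:
    """Build POST /conflicts/{id}/resolve with an assumed JSON payload."""
    body = json.dumps({"resolution": resolution, "reason": reason}).encode("utf-8")
    return Request(
        f"{BASE_URL}/conflicts/{conflict_id}/resolve",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would send it; that call is left out here so the sketch does not depend on a running server.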

Relationship to events

Alongside mutations, the system emits lightweight domain events (src/events/domain_events/entity.py) — EntityDiscovered, EntityLinked, EntityMerged, EntityUpdated — which map onto investigation activity-log entries and the graph projection. Mutations are the durable audit record; events are the in-flight signal that something changed.
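The mapping from events to activity-log entries might look like the following sketch. Only the four event names come from the text above; the fields on each event and the wording of the log entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDiscovered:
    entity_id: str


@dataclass(frozen=True)
class EntityLinked:
    entity_id: str
    linked_to: str   # assumed field


@dataclass(frozen=True)
class EntityMerged:
    entity_id: str
    merged_into: str  # assumed field


@dataclass(frozen=True)
class EntityUpdated:
    entity_id: str
    field_path: str   # assumed field


def to_activity_entry(event: object) -> str:
    """Render one domain event as a human-readable activity-log line."""
    if isinstance(event, EntityDiscovered):
        return f"Discovered entity {event.entity_id}"
    if isinstance(event, EntityLinked):
        return f"Linked entity {event.entity_id} to {event.linked_to}"
    if isinstance(event, EntityMerged):
        return f"Merged entity {event.entity_id} into {event.merged_into}"
    if isinstance(event, EntityUpdated):
        return f"Updated {event.field_path} on entity {event.entity_id}"
    raise TypeError(f"Unsupported event type: {type(event).__name__}")
```

The events carry just enough to describe what happened; the before/after values stay in the mutation record.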


Deep dive: record shapes & resolution

This section reflects src/mutation_queue/repository.py, conflict_repository.py, and router.py.

What a mutation records

A mutation row captures the full provenance of a single field change. Key fields (ontology_mutations):

| Field | Meaning |
| --- | --- |
| subject_kind | entity or relationship |
| field_path / cardinality_key | the attribute (and array-element key) being changed |
| previous_value / proposed_value | before/after (JSONB) |
| source_type / source_id / source_trust | who proposed it and how trusted |
| conflict_detected / conflict_id | whether it raised a conflict |
| status | pending / accepted / rejected / merged |
| batch_id / investigation_id / evidence_snapshot_id | the run that produced it |
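For orientation, the columns above could be mirrored in application code roughly as follows. The column names and the allowed `subject_kind` and `status` values come from the table; the Python types, the defaults, and the validation are assumptions and may not match repository.py.

```python
from dataclasses import dataclass
from typing import Any, Optional

SUBJECT_KINDS = {"entity", "relationship"}
STATUSES = {"pending", "accepted", "rejected", "merged"}


@dataclass
class OntologyMutation:
    """Hypothetical mirror of one ontology_mutations row."""
    subject_kind: str
    field_path: str
    previous_value: Any          # JSONB in the database
    proposed_value: Any          # JSONB in the database
    source_type: str
    source_id: str
    source_trust: float          # assumed numeric
    cardinality_key: Optional[str] = None   # set only for array elements
    conflict_detected: bool = False
    conflict_id: Optional[str] = None
    status: str = "pending"      # assumed default
    batch_id: Optional[str] = None
    investigation_id: Optional[str] = None
    evidence_snapshot_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the sets listed in the table.
        if self.subject_kind not in SUBJECT_KINDS:
            raise ValueError(f"unknown subject_kind: {self.subject_kind!r}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
```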

What a conflict records

A conflict freezes the current value and captures the incoming one, plus investigation context and a resolution outcome.

Endpoints

| Endpoint | Purpose |
| --- | --- |
| POST /api/mutations/process-batch | Ingest a provider batch → mutations (+ conflicts) |
| GET /api/mutations/{id} | One mutation |
| GET /api/mutations/provenance/{entity_id} | An entity's full field provenance |
| GET /api/mutations/conflicts | Open conflicts (open / investigating) |
| POST /api/mutations/conflicts/{id}/resolve | Resolve (accept_incoming / keep_current / override) |

Resolution writes the chosen value back, records who/when/why, and unfreezes the field provenance, closing the audit loop. This is the human-in-the-loop step required by ADR-014.
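The three resolution actions can be sketched as a single function. The action names come from the endpoint table; the `Conflict` class, its field names, and the "resolved" status are assumptions for illustration rather than the shapes used in conflict_repository.py.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class Conflict:
    """Hypothetical conflict record holding the frozen and incoming values."""
    current_value: Any
    incoming_value: Any
    status: str = "open"
    resolved_value: Any = None
    resolved_by: Optional[str] = None
    resolved_at: Optional[datetime] = None
    reason: Optional[str] = None


def resolve(conflict: Conflict, action: str, *, by: str, reason: str,
            override_value: Any = None) -> Any:
    """Pick the surviving value and record who/when/why on the conflict."""
    if action == "accept_incoming":
        value = conflict.incoming_value
    elif action == "keep_current":
        value = conflict.current_value
    elif action == "override":
        value = override_value  # analyst supplies a third value
    else:
        raise ValueError(f"unknown resolution action: {action!r}")
    conflict.resolved_value = value
    conflict.resolved_by = by
    conflict.resolved_at = datetime.now(timezone.utc)
    conflict.reason = reason
    conflict.status = "resolved"  # assumed terminal status
    return value
```

Keeping the who/when/why on the same record as the chosen value is what lets the provenance timeline explain a manual decision later.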