The ontology (v3.5)
The ontology is the heart of Atlas. It defines the entity types, their attributes, the
relationships between them, the risk categories, and — crucially — the survivorship rules that
decide which provider's data wins when sources disagree. It is a versioned YAML document
(schemas/ontology/ontology_v3.5.yaml, currently v3.5.2) loaded into the database at startup.
The file name stays ontology_v3.5.yaml across minor bumps (v3.5 → v3.5.1 → v3.5.2); history lives
in git. A companion migration (migrations/V130__ontology_schema_v3_5_2.sql) publishes the YAML
byte-for-byte into the ontology_schema_versions table, and a CI drift check enforces parity
between the file and the published copy. The active schema is loaded via
SchemaCache.initialize(pool), and the app refuses to boot if the schema cannot be loaded.
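The parity rule can be pictured as a digest comparison. The sketch below is illustrative only: it assumes the published copy has already been read out as bytes, the sample bytes are made up, and has_drifted is a hypothetical helper rather than the project's actual CI check.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def has_drifted(yaml_bytes: bytes, published_bytes: bytes) -> bool:
    """True when the published copy is not byte-for-byte equal to the YAML file.

    Hypothetical helper: how the real check extracts the published copy
    from the migration is not described in this document.
    """
    return digest(yaml_bytes) != digest(published_bytes)


# Made-up sample content, used only to show the comparison.
source = b"version: 3.5.2\n"
same_copy = b"version: 3.5.2\n"
crlf_copy = b"version: 3.5.2\r\n"  # a line-ending change alone counts as drift
```

Because the comparison is on raw bytes, whitespace or line-ending changes are flagged just like semantic changes, which is what a byte-for-byte parity rule implies.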
Entity types
The eight entity types are:
| Entity type | Purpose |
|---|---|
| LegalEntity | Companies and organisations — the primary subject |
| Person | Individuals — directors, shareholders, UBOs |
| Address | Registered and operating addresses |
| Document | Filings and source documents |
| Domain | Web domains owned by an entity |
| SanctionsMatch | A match against a sanctions list |
| PEPExposure | Politically-exposed-person exposure |
| AdverseMedia | Adverse media mentions (with sentiment) |
Relationship types
| Relationship | From → To | Meaning |
|---|---|---|
| Directorship | Person → LegalEntity | Director / officer role |
| Ownership | Entity → LegalEntity | Shareholding / beneficial ownership |
| RegisteredAt | LegalEntity → Address | Registered/operating address |
| DocumentsEntity | Document → Entity | A document evidences an entity |
| OwnsDomain | LegalEntity → Domain | Domain ownership |
| MatchedTo | Person → SanctionsMatch | Screening match |
| MentionedIn | Entity → AdverseMedia | Adverse-media mention |
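As a rough illustration, the table above can be read as a set of endpoint constraints. The sketch assumes that "Entity" in the From/To column means any entity type; that reading, and the helper name endpoints_valid, are assumptions rather than documented behaviour.

```python
# Endpoint types copied from the relationship table. "Entity" is treated as
# a wildcard for any entity type; this is an assumption, not a documented rule.
RELATIONSHIP_ENDPOINTS = {
    "Directorship": ("Person", "LegalEntity"),
    "Ownership": ("Entity", "LegalEntity"),
    "RegisteredAt": ("LegalEntity", "Address"),
    "DocumentsEntity": ("Document", "Entity"),
    "OwnsDomain": ("LegalEntity", "Domain"),
    "MatchedTo": ("Person", "SanctionsMatch"),
    "MentionedIn": ("Entity", "AdverseMedia"),
}


def endpoints_valid(relationship: str, from_type: str, to_type: str) -> bool:
    """Check a (from, to) pair against the declared endpoint types."""
    if relationship not in RELATIONSHIP_ENDPOINTS:
        return False
    expected_from, expected_to = RELATIONSHIP_ENDPOINTS[relationship]
    # A side matches when it is the wildcard or names the same type.
    return (expected_from in ("Entity", from_type)
            and expected_to in ("Entity", to_type))
```

For example, endpoints_valid("Directorship", "LegalEntity", "Person") is rejected because the direction is reversed.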
Risk categories
Findings are classified into eleven categories, used by both module rules and the risk matrix:
sanctions · pep · adverse_media · ownership · governance · corporate_status ·
jurisdiction · digital_footprint · regulatory · data_quality · secrecy_jurisdiction
Survivorship rules
When two sources disagree about an attribute, survivorship decides the winner. The ontology declares four strategies, a per-attribute default, and trust weights at two levels.
- Strategies: most_trusted, most_recent, most_complete, combine. Each attribute in the schema declares which strategy applies (for example legal_name: most_trusted, registration_numbers: combine).
- Module trust ranks the OSINT modules by authority: cir (company registry) is most authoritative at 10, down to dfwo (digital footprint) at 3.
- Provider trust assigns per-field confidence to external providers. For example, NorthData's registration_number is trusted at 0.99, while OSINT's _default is 0.70 and KVK's Dutch postal_code is 0.95.
- Protected fields can never be silently overwritten by a provider. On Person these include is_pep, pep_position, is_sanctioned, and sanctions_matches; they originate from the screening crews (SPEPWS, AMLRR), not from registries.
See Claims & survivorship for how these rules execute at merge time, ADR-010 for ontology lifecycle, and ADR-020 for field provenance.
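To make the four strategies concrete, here is a minimal sketch of how one attribute might be resolved. The Claim shape and the resolve function are hypothetical, and the reading of most_complete as "longest value" is an assumption; the real merge logic lives in Claims & survivorship and may differ. Protected fields are not modelled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any


@dataclass(frozen=True)
class Claim:
    """One source's assertion about an attribute (hypothetical shape)."""
    value: Any
    trust: float        # e.g. a provider trust weight such as 0.99
    observed_on: date


def resolve(strategy: str, claims: list[Claim]) -> Any:
    """Pick the surviving value for one attribute under a given strategy."""
    if not claims:
        raise ValueError("no claims to resolve")
    if strategy == "most_trusted":
        return max(claims, key=lambda c: c.trust).value
    if strategy == "most_recent":
        return max(claims, key=lambda c: c.observed_on).value
    if strategy == "most_complete":
        # Assumption: completeness is approximated by the length of the value.
        return max(claims, key=lambda c: len(str(c.value))).value
    if strategy == "combine":
        # Union of all values, keeping the position of the first occurrence.
        merged: list[Any] = []
        for claim in claims:
            values = claim.value if isinstance(claim.value, list) else [claim.value]
            for v in values:
                if v not in merged:
                    merged.append(v)
        return merged
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up claims for illustration only.
names = [
    Claim("Acme B.V.", 0.99, date(2023, 1, 5)),
    Claim("ACME", 0.70, date(2024, 3, 1)),
]
numbers = [
    Claim(["NL-1"], 0.99, date(2023, 1, 5)),
    Claim(["NL-1", "DE-2"], 0.70, date(2024, 3, 1)),
]
```

With this data, resolve("most_trusted", names) keeps the higher-trust name while resolve("most_recent", names) keeps the newer one, which is why the strategy is declared per attribute.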
Crew output contract
The ontology also defines the JSON contract every OSINT crew must return — entities,
relationships, risk_indicators, and a summary with a data_quality_score. This is what
binds the LLM crews to the ontology: their output is validated and mapped into the entity types
above before resolution.
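A minimal sketch of what such a validation step could look like follows. Only the four top-level keys, data_quality_score, and the eight entity type names come from this page; the "type" key on each entity and the function name contract_errors are assumptions made for illustration.

```python
ENTITY_TYPES = {
    "LegalEntity", "Person", "Address", "Document",
    "Domain", "SanctionsMatch", "PEPExposure", "AdverseMedia",
}
REQUIRED_KEYS = ("entities", "relationships", "risk_indicators", "summary")


def contract_errors(payload: dict) -> list[str]:
    """Return contract violations; an empty list means the payload passes."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    summary = payload.get("summary")
    if isinstance(summary, dict) and "data_quality_score" not in summary:
        errors.append("summary lacks data_quality_score")
    for index, entity in enumerate(payload.get("entities", [])):
        # Assumption: each entity carries its type under a "type" key.
        if entity.get("type") not in ENTITY_TYPES:
            errors.append(f"entities[{index}]: unknown type {entity.get('type')!r}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a caller report every violation in a crew's output at once.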
For the field-by-field reference, see Reference → Ontology schema.