
Ontology schema reference

A field-level companion to the Ontology architecture page. The source of truth is schemas/ontology/ontology_v3.5.yaml (currently v3.5.2), published into the ontology_schema_versions table and validated for drift in CI.

Top-level structure

version: "3.5.2"
name: "Enhanced OSINT Schema (Complete)"
output_format: # JSON contract every crew must return
entity_types: # 8 entity types
relationship_types: # 7 relationship types
risk_categories: # 11 categories
crew_instructions: # per-module guidance
compliance_rules:
risk_scoring:
survivorship: # strategies, module_trust, provider_trust, protected_fields

Entity types

| Type | Selected attributes |
| --- | --- |
| LegalEntity | legal_name (required), registration_number, registration_numbers (combine), jurisdiction, country_code (ISO 3166-1), lei (ISO 17442), vat_number, status, financials, sheets, events, segment_codes, contact_info, capital_history |
| Person | name, date_of_birth, nationalities, is_pep 🔒, pep_position / pep_source / pep_details / pep_category 🔒, is_sanctioned / sanctions_matches / sanctions_source / sanctions_lists 🔒 |
| Address | street_address, postal_code, city, country, latitude, longitude, premises_type |
| Document | source documents / filings |
| Domain | web domains |
| SanctionsMatch | a sanctions-list match |
| PEPExposure | PEP exposure detail |
| AdverseMedia | adverse-media mention, with sentiment |

🔒 = protected field.

Relationship types

| Type | Source → Target |
| --- | --- |
| Directorship | Person → LegalEntity |
| Ownership | Entity → LegalEntity |
| RegisteredAt | LegalEntity → Address |
| DocumentsEntity | Document → Entity |
| OwnsDomain | LegalEntity → Domain |
| MatchedTo | Person → SanctionsMatch |
| MentionedIn | Entity → AdverseMedia |
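
The source and target constraints above lend themselves to a simple lookup check. The following is a minimal sketch, not the project's validator: the function name `endpoints_valid` is hypothetical, and it assumes that "Entity" in the table means "any of the eight entity types", which the schema file may define more narrowly.

```python
# Entity types and relationship endpoints copied from the tables on this page.
ENTITY_TYPES = {
    "LegalEntity", "Person", "Address", "Document",
    "Domain", "SanctionsMatch", "PEPExposure", "AdverseMedia",
}

RELATIONSHIP_ENDPOINTS = {
    "Directorship": ("Person", "LegalEntity"),
    "Ownership": ("Entity", "LegalEntity"),
    "RegisteredAt": ("LegalEntity", "Address"),
    "DocumentsEntity": ("Document", "Entity"),
    "OwnsDomain": ("LegalEntity", "Domain"),
    "MatchedTo": ("Person", "SanctionsMatch"),
    "MentionedIn": ("Entity", "AdverseMedia"),
}


def _matches(expected: str, actual: str) -> bool:
    # Assumption: "Entity" is a wildcard for any known entity type.
    return actual in ENTITY_TYPES and expected in ("Entity", actual)


def endpoints_valid(rel_type: str, source_type: str, target_type: str) -> bool:
    """Return True if the source/target types fit the relationship type."""
    if rel_type not in RELATIONSHIP_ENDPOINTS:
        return False
    expected_source, expected_target = RELATIONSHIP_ENDPOINTS[rel_type]
    return _matches(expected_source, source_type) and _matches(expected_target, target_type)
```

For example, `endpoints_valid("Directorship", "Person", "LegalEntity")` passes, while swapping the two types fails.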

Risk categories

sanctions · pep · adverse_media · ownership · governance · corporate_status · jurisdiction · digital_footprint · regulatory · data_quality · secrecy_jurisdiction

Survivorship

Strategies

most_trusted (source priority cir > mebo > roa > frls > spepws > amlrr > dfwo) · most_recent · most_complete · combine.

Module trust

| Module | Weight |
| --- | --- |
| cir | 10 |
| mebo | 8 |
| roa | 7 |
| frls | 6 |
| spepws | 5 |
| amlrr | 4 |
| dfwo | 3 |
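
To illustrate how the strategies could resolve competing values, here is a rough sketch using the module weights above. The `Claim` shape, its `observed_on` field, and the function bodies are assumptions made for illustration; the actual resolver may differ, and `most_complete` is left out because this page does not define how completeness is measured.

```python
from dataclasses import dataclass
from datetime import date

# Module weights copied from the table above.
MODULE_TRUST = {"cir": 10, "mebo": 8, "roa": 7, "frls": 6, "spepws": 5, "amlrr": 4, "dfwo": 3}


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record: which module asserted which value, and when."""
    module: str
    value: object
    observed_on: date


def most_trusted(claims):
    # Pick the value from the module with the highest trust weight.
    return max(claims, key=lambda c: MODULE_TRUST[c.module]).value


def most_recent(claims):
    # Pick the value with the latest observation date.
    return max(claims, key=lambda c: c.observed_on).value


def combine(claims):
    # Union of all values, de-duplicated, in first-seen order.
    merged = []
    for claim in claims:
        values = claim.value if isinstance(claim.value, list) else [claim.value]
        for value in values:
            if value not in merged:
                merged.append(value)
    return merged
```

With one `dfwo` claim from 2024 and one `cir` claim from 2023, `most_trusted` would return the `cir` value and `most_recent` the `dfwo` value.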

Provider trust (selected)

| Provider · field | Weight |
| --- | --- |
| northdata · registration_number | 0.99 |
| northdata · jurisdiction | 0.99 |
| northdata · legal_name | 0.98 |
| northdata · lei / country_code | 0.95 |
| northdata · _default | 0.80 |
| kvk · postal_code | 0.95 |
| kvk · _default | 0.85 |
| osint · premises_type / full_address | 0.80 |
| osint · _default | 0.70 |
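
The `_default` rows suggest a per-field lookup with a per-provider fallback. A minimal sketch of that lookup, assuming the weights are held in a nested mapping; the function name `provider_trust` and the choice to raise on an unknown provider are illustrative, and only the selected weights shown above are included.

```python
# Selected provider weights copied from the table above (not the full list).
PROVIDER_TRUST = {
    "northdata": {
        "registration_number": 0.99,
        "jurisdiction": 0.99,
        "legal_name": 0.98,
        "lei": 0.95,
        "country_code": 0.95,
        "_default": 0.80,
    },
    "kvk": {"postal_code": 0.95, "_default": 0.85},
    "osint": {"premises_type": 0.80, "full_address": 0.80, "_default": 0.70},
}


def provider_trust(provider: str, field: str) -> float:
    """Return the field-specific weight, falling back to the provider's _default."""
    weights = PROVIDER_TRUST.get(provider)
    if weights is None:
        # Assumption: unknown providers are an error rather than given a global default.
        raise KeyError(f"no trust weights configured for provider {provider!r}")
    return weights.get(field, weights["_default"])
```

Here `provider_trust("kvk", "postal_code")` uses the explicit 0.95, while `provider_trust("kvk", "legal_name")` falls back to 0.85.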

Protected fields

| Entity | Fields |
| --- | --- |
| Person | is_pep, pep_position, pep_source, pep_details, pep_category, is_sanctioned, sanctions_matches, sanctions_source, sanctions_lists |
| LegalEntity | PEP / sanctions flags |

These are never overwritten by a provider; they originate from the SPEPWS / AMLRR crews. See Claims & survivorship.
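
One way to picture this rule is a merge step that drops protected fields unless the writer is one of the two originating crews. This is a sketch under assumptions: `apply_update` and the `writer` argument are hypothetical names, and silently skipping a rejected field (rather than raising or logging) is an illustrative choice.

```python
# Protected Person fields copied from the table above.
PROTECTED_PERSON_FIELDS = frozenset({
    "is_pep", "pep_position", "pep_source", "pep_details", "pep_category",
    "is_sanctioned", "sanctions_matches", "sanctions_source", "sanctions_lists",
})

# Crews that originate protected values, per this page.
ALLOWED_WRITERS = frozenset({"spepws", "amlrr"})


def apply_update(record: dict, updates: dict, writer: str) -> dict:
    """Return a merged copy of record, ignoring protected fields from other writers."""
    merged = dict(record)
    for field, value in updates.items():
        if field in PROTECTED_PERSON_FIELDS and writer not in ALLOWED_WRITERS:
            continue  # a provider may not overwrite a protected field
        merged[field] = value
    return merged
```

A provider update that tries to set `is_pep` would leave the existing flag untouched while still applying its unprotected fields.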

Crew output contract

Every crew returns:

{
  "entities": [{ "type": "...", "name": "...", "properties": {}, "confidence": 0.0 }],
  "relationships": [{ "type": "...", "source": "...", "target": "...", "properties": {} }],
  "risk_indicators": [{ "indicator_name": "...", "severity": "critical|high|medium|low",
                        "category": "...", "description": "...", "evidence": "..." }],
  "summary": { "data_quality_score": 0.0, "total_entities": 0,
               "total_relationships": 0, "key_findings": [] }
}
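
A consumer might sanity-check a crew payload against this contract before ingesting it. The sketch below is illustrative only: `validate_crew_output` is a hypothetical helper, and it assumes that `category` must be one of the eleven risk categories listed on this page, which the contract shown here does not state explicitly.

```python
REQUIRED_KEYS = {"entities", "relationships", "risk_indicators", "summary"}
SEVERITIES = {"critical", "high", "medium", "low"}

# Risk categories copied from this page; treating them as the allowed
# values of "category" is an assumption.
RISK_CATEGORIES = {
    "sanctions", "pep", "adverse_media", "ownership", "governance",
    "corporate_status", "jurisdiction", "digital_footprint", "regulatory",
    "data_quality", "secrecy_jurisdiction",
}


def validate_crew_output(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = [
        f"missing top-level key: {key}"
        for key in sorted(REQUIRED_KEYS - set(payload))
    ]
    for i, indicator in enumerate(payload.get("risk_indicators", [])):
        severity = indicator.get("severity")
        if severity not in SEVERITIES:
            errors.append(f"risk_indicators[{i}]: unknown severity {severity!r}")
        category = indicator.get("category")
        if category not in RISK_CATEGORIES:
            errors.append(f"risk_indicators[{i}]: unknown category {category!r}")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to report every contract violation in a single pass.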

Versioning

Minor bumps keep the ontology_v3.5.yaml filename; history lives in git. Each version is published byte-equal into ontology_schema_versions by a companion migration (e.g. V130__ontology_schema_v3_5_2.sql), and a CI drift check enforces parity between YAML and database. Schema lifecycle is governed by ADR-010.
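
The byte-equality requirement can be pictured as a direct comparison of the YAML file's bytes with the published row's bytes. This is a minimal sketch, not the actual CI job: `drift_report` is a hypothetical name, and how the bytes are read from the file and from ontology_schema_versions is left out.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    # Short, stable fingerprint for log output.
    return hashlib.sha256(data).hexdigest()


def drift_report(yaml_bytes: bytes, published_bytes: bytes) -> Optional[str]:
    """Return None when the two copies are byte-equal, else a description of the drift."""
    if yaml_bytes == published_bytes:
        return None
    return (
        "schema drift detected: "
        f"yaml sha256={sha256_hex(yaml_bytes)} "
        f"published sha256={sha256_hex(published_bytes)}"
    )
```

Because the comparison is on raw bytes, even a trailing-newline or line-ending difference would count as drift.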