
Add an ontology entity type or field

The ontology is the contract every other subsystem depends on, so changes here ripple — but in a controlled, validated way. This guide covers adding a field to an existing type (common, easy), adding a whole entity type (rarer, touches resolution), and adding a graph relationship. Read Ontology and Entity resolution first.

Add a field to an existing type

This is the common case (it's how country_code, lei, and vat_number were added in v3.5.2).

  1. Edit schemas/ontology/ontology_v3.5.yaml — add the attribute under the entity type, with a type, optional required, and a survivorship strategy:
    LegalEntity:
      attributes:
        my_new_field:
          type: string
          survivorship: most_trusted
  2. Add provider trust for the new field under survivorship.provider_trust.<provider> so survivorship knows how much to trust each source for it.
  3. Map it in any provider's mapping_spec.yaml that can supply it (see Add a provider).
  4. Version and migrate. A minor bump keeps the YAML filename; add a companion migration (migrations/V###__ontology_schema_v3_5_x.sql) that publishes the YAML, byte for byte, into ontology_schema_versions. A CI drift check enforces YAML↔DB parity.
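
To make steps 1 and 2 concrete, here is a minimal sketch of what a most_trusted survivorship strategy could look like. It is not Atlas's implementation: the PROVIDER_TRUST table, the provider names, the score range, and the most_trusted() signature are all assumptions made for illustration. The idea is simply that, for a given field, the non-empty value from the highest-trust provider survives.

```python
from typing import Optional

# Hypothetical trust table (provider -> field -> score). In Atlas the real values
# are declared under survivorship.provider_trust.<provider> in the ontology YAML.
PROVIDER_TRUST = {
    "registry_a": {"my_new_field": 0.9},
    "vendor_b": {"my_new_field": 0.6},
}


def most_trusted(
    field: str,
    candidates: dict,
    trust: dict = PROVIDER_TRUST,
    default_trust: float = 0.0,
) -> Optional[str]:
    """Return the non-empty value supplied by the most trusted provider for `field`.

    `candidates` maps provider name -> value (or None). Providers without a
    trust entry for the field fall back to `default_trust`.
    """
    best_value = None
    best_score = None
    # Iterate in sorted order so ties resolve the same way on every run.
    for provider in sorted(candidates):
        value = candidates[provider]
        if value is None or value == "":
            continue  # an empty value never wins, however trusted the source
        score = trust.get(provider, {}).get(field, default_trust)
        if best_score is None or score > best_score:
            best_value, best_score = value, score
    return best_value
```

The sketch also shows why step 2 matters: a provider with no trust entry for the new field falls back to the default score and loses to any provider that has one.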

Add a new entity type

A new type touches resolution because Atlas must know how to match it. Files to touch:

File                                   Change
schemas/ontology/ontology_v3.5.yaml    Define the entity_type with attributes
src/ontology/registry.py               Register the type's schema/class
src/ontology/entity_matcher.py         Add a _match_<type>() and dispatch it in generate_matching_key()
src/ontology/blocking.py               Add blocking-key generation for the type
src/ontology/reconciliation.py         Handle the type in _entities_match() / clustering

Design the matching key carefully. Atlas's resolution quality comes from good keys: a strong identifier (registration number, normalized) gives a high-confidence key; a name-only key is weaker. Follow the existing pattern — identifier-first with a normalized-name fallback, and a blocking key that's cheap enough to group on. See the concrete thresholds in Entity resolution.

Add a graph relationship

Relationships are schema-driven: the Cypher is generated from the ontology, and nothing is hard-coded.

  1. Declare it under relationship_types in the ontology YAML, including a neo4j_type:
    relationship_types:
      MyRelation:
        source: LegalEntity
        target: Person
        neo4j_type: "MY_RELATION"
  2. The CypherGenerator reads neo4j_type automatically (falling back to upper-snake-case), and Neo4jSyncService will project instances during graph sync. Every generated query includes tenant_id in MERGE/MATCH keys for isolation.
  3. Query it:
    MATCH (le:LegalEntity {tenant_id:$tenant_id})-[:MY_RELATION]->(p:Person {tenant_id:$tenant_id})
    WHERE le.id = $entity_id RETURN p
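
As a rough illustration of step 2, the sketch below resolves the relationship type (explicit neo4j_type, otherwise upper-snake-case of the declared name) and builds a tenant-scoped MERGE. It is not the actual CypherGenerator: the function names, the query shape, the $source_id and $target_id parameters, and the choice to store tenant_id on the relationship itself are assumptions.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def upper_snake(name: str) -> str:
    """Convert a CamelCase name to UPPER_SNAKE_CASE, e.g. MyRelation -> MY_RELATION."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).upper()


def resolve_neo4j_type(rel_name: str, spec: dict) -> str:
    """Prefer the declared neo4j_type; otherwise derive it from the relationship name."""
    return spec.get("neo4j_type") or upper_snake(rel_name)


def merge_relationship_cypher(rel_name: str, spec: dict) -> str:
    """Build a MERGE statement whose node and relationship keys all include tenant_id."""
    rel_type = resolve_neo4j_type(rel_name, spec)
    # Labels and relationship types cannot be passed as Cypher parameters,
    # so validate them before interpolating them into the query text.
    for token in (spec["source"], spec["target"], rel_type):
        if not _IDENTIFIER.fullmatch(token):
            raise ValueError("unsafe identifier in ontology: %r" % token)
    return (
        f"MATCH (s:{spec['source']} {{tenant_id: $tenant_id, id: $source_id}})\n"
        f"MATCH (t:{spec['target']} {{tenant_id: $tenant_id, id: $target_id}})\n"
        f"MERGE (s)-[r:{rel_type} {{tenant_id: $tenant_id}}]->(t)\n"
        "RETURN r"
    )
```

Calling merge_relationship_cypher("MyRelation", {"source": "LegalEntity", "target": "Person"}) produces a query that merges a MY_RELATION edge between two nodes matched within the same tenant, even though no neo4j_type was declared.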

Invariants

  • Determinism: a given entity must always produce the same matching key.
  • Protected fields: if your new field is a screening signal, add it to protected_fields so providers can't overwrite it.
  • Parity: the YAML and the ontology_schema_versions row must match — CI enforces it.
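
The parity invariant is strict: byte equality means that even a changed line ending counts as drift. The sketch below shows the kind of comparison a drift check might perform. It is an assumption about the approach, not the CI job itself; parity_error() is a hypothetical helper, and how the published bytes are fetched from ontology_schema_versions is left out.

```python
import hashlib
from typing import Optional


def parity_error(yaml_bytes: bytes, published_bytes: bytes) -> Optional[str]:
    """Return None when the two payloads are byte-equal, else a short diagnostic."""
    if yaml_bytes == published_bytes:
        return None
    # Digests make the mismatch easy to report without dumping both payloads.
    return "ontology drift: yaml sha256=%s, db sha256=%s" % (
        hashlib.sha256(yaml_bytes).hexdigest(),
        hashlib.sha256(published_bytes).hexdigest(),
    )
```

In practice the first argument would come from reading the YAML file in binary mode (for example with pathlib.Path.read_bytes()) so that no newline or encoding translation happens before the comparison.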

Next: Add a risk matrix to score on your new field.