Reference data registry

The reference-data registry (src/reference_data/, ADR-008a) holds versioned datasets that rules, matrices, and resolution consult — for example compliance lists, jurisdiction risk ratings, and thresholds. Keeping these as versioned data (not code constants) means they can be updated, audited, and rolled back without a deployment.

Concepts

| Concept | Meaning |
| --- | --- |
| Dataset | A named, versioned blob of reference data (a list_key + version + data) |
| Type | The schema a dataset must validate against |
| Adapter | Transforms an external source into a dataset on import |
| Resolver | Resolves the active dataset version for a tenant at runtime |
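
To make the first three concepts concrete, here is a minimal sketch of how they could be modelled. It is not the registry's actual code: the class names, the integer version, the CSV column names, and the "jurisdiction_risk" example are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Dataset:
    """A named, versioned blob of reference data."""
    list_key: str  # dataset name
    version: int   # assumption: integer versions; the real scheme may differ
    data: Any


@dataclass(frozen=True)
class DatasetType:
    """Hypothetical stand-in for a type: a name plus a validation check."""
    name: str
    is_valid: Callable[[Any], bool]


def _is_rating_map(data: Any) -> bool:
    # Hypothetical schema: a mapping of jurisdiction code -> rating label.
    return isinstance(data, dict) and all(
        isinstance(code, str) and rating in {"low", "medium", "high"}
        for code, rating in data.items()
    )


risk_type = DatasetType(name="jurisdiction_risk", is_valid=_is_rating_map)


def csv_adapter(list_key: str, version: int, csv_text: str) -> Dataset:
    """Hypothetical adapter: turns CSV text with code,rating columns into a Dataset."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Dataset(list_key, version, {row["code"]: row["rating"] for row in rows})


ds = csv_adapter("jurisdiction_risk", 1, "code,rating\nDE,low\nXX,high\n")
print(ds.list_key, ds.version, risk_type.is_valid(ds.data))
```

The adapter only reshapes the external source; whether the result is acceptable is decided separately by the type's check, which mirrors the split between adapters and type schemas described above.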

Lifecycle

  1. Seed. seed.py performs idempotent initial population at startup.
  2. Import. New versions are imported via adapters and validated (validator.py) against their type schema.
  3. Resolve. resolver.py returns the correct version for the tenant when a rule or matrix needs it.
  4. Version & diff. Versions are retained; diffs between versions are inspectable.
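
The four lifecycle steps can be sketched with a small in-memory stand-in. This is an illustration under assumptions, not the contents of seed.py, validator.py, or resolver.py: the class, its method names, the per-tenant pinning rule, and the use of a list of strings as the dataset payload are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def is_list_of_str(data: object) -> bool:
    """Hypothetical type check: the payload must be a list of strings."""
    return isinstance(data, list) and all(isinstance(x, str) for x in data)


class InMemoryRegistry:
    """Hypothetical stand-in for the registry, keeping every version in memory."""

    def __init__(self) -> None:
        self._versions: Dict[str, Dict[int, List[str]]] = {}
        # Assumption: a tenant may be pinned to a version; otherwise latest wins.
        self._pinned: Dict[Tuple[str, str], int] = {}

    def seed(self, list_key: str, data: List[str]) -> None:
        # Idempotent: writes version 1 only if the dataset does not exist yet.
        if list_key not in self._versions:
            self._versions[list_key] = {1: list(data)}

    def import_version(
        self, list_key: str, data: List[str], validate: Callable[[object], bool]
    ) -> int:
        # Validate before storing, so an invalid import never becomes a version.
        if not validate(data):
            raise ValueError(f"{list_key}: data does not match its type schema")
        versions = self._versions.setdefault(list_key, {})
        new_version = max(versions, default=0) + 1
        versions[new_version] = list(data)
        return new_version

    def resolve(self, tenant: str, list_key: str) -> Tuple[int, List[str]]:
        versions = self._versions[list_key]
        version = self._pinned.get((tenant, list_key), max(versions))
        return version, versions[version]

    def diff(self, list_key: str, old: int, new: int) -> Dict[str, List[str]]:
        a = set(self._versions[list_key][old])
        b = set(self._versions[list_key][new])
        return {"added": sorted(b - a), "removed": sorted(a - b)}


registry = InMemoryRegistry()
registry.seed("sanctions", ["ACME"])
registry.seed("sanctions", ["OTHER"])  # no-op: the dataset is already seeded
v2 = registry.import_version("sanctions", ["ACME", "GLOBEX"], is_list_of_str)
version, data = registry.resolve("tenant-a", "sanctions")
changes = registry.diff("sanctions", 1, v2)
print(version, data, changes)
```

Because earlier versions are never overwritten, the diff between any two versions can be computed on demand rather than stored.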

API surface

The reference-data router exposes:

| Endpoint | Purpose |
| --- | --- |
| GET /datasets | List datasets and versions |
| GET /types | List dataset types/schemas |
| GET /adapters | List import adapters |
| import/export | Bring versions in and out, with validation |
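
As a rough illustration of calling the three listing endpoints, the sketch below builds the requests with the Python standard library. The base URL and router prefix are assumptions (this page does not state where the router is mounted), and the response format is not documented here, so the example stops at constructing the requests.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Assumption: hypothetical host and mount point; replace with your deployment's.
BASE_URL = "http://localhost:8000/reference-data/"


def build_get(path: str) -> Request:
    """Build a GET request for a listing endpoint under the assumed base URL."""
    url = urljoin(BASE_URL, path.lstrip("/"))
    return Request(url, method="GET", headers={"Accept": "application/json"})


listing_requests = [build_get(p) for p in ("/datasets", "/types", "/adapters")]
for req in listing_requests:
    print(req.get_method(), req.full_url)
```

Each request could then be sent with urllib.request.urlopen(req). The import and export operations are omitted because their exact paths and payloads are not given above.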

Why versioned, not constant

When a compliance list changes, you import a new version and the resolver serves it. No code change is needed, and the prior version remains available for audit and rollback. This pairs with the risk-matrix engine and source contracts to keep policy data out of code.
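
A small sketch of why this matters for rollback, assuming a hypothetical version history and an "active version" pointer (neither is the registry's real storage model): the rule code never changes, only the version it reads.

```python
from typing import Dict, List

# Hypothetical history for one compliance list: version -> entries.
history: Dict[int, List[str]] = {1: ["ACME"], 2: ["ACME", "GLOBEX"]}
active = {"version": 2}  # assumption: a single pointer selects the served version


def is_listed(name: str) -> bool:
    """A rule consults whichever version is currently active."""
    return name in history[active["version"]]


before_rollback = is_listed("GLOBEX")  # checked against version 2
active["version"] = 1                  # rollback: repoint, no redeploy
after_rollback = is_listed("GLOBEX")   # checked against version 1
print(before_rollback, after_rollback, sorted(history))
```

Version 2 stays in the history after the rollback, so it can still be inspected for audit.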