Reference data registry
The reference-data registry (src/reference_data/, ADR-008a) holds versioned datasets
that rules, matrices, and resolution consult, for example compliance lists, jurisdiction risk ratings, and
thresholds. Keeping these as versioned data (not code constants) means they can be updated,
audited, and rolled back without a deployment.
Concepts
| Concept | Meaning |
|---|---|
| Dataset | A named, versioned blob of reference data (a list_key + version + data) |
| Type | The schema a dataset must validate against |
| Adapter | Transforms an external source into a dataset on import |
| Resolver | Resolves the active dataset version for a tenant at runtime |
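
The concepts in the table can be pictured as a few small Python types. This is a minimal sketch, not the registry's actual model: only list_key, version, and data come from the table, while the class names, the integer version, the `validate` predicate standing in for a schema, and `text_adapter` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Dataset:
    """A named, versioned blob of reference data."""
    list_key: str
    version: int  # assumption: versions are plain integers
    data: Any


@dataclass(frozen=True)
class DatasetType:
    """Hypothetical type: a name plus a predicate standing in for a schema."""
    name: str
    validate: Callable[[Any], bool]


def is_string_list(data: Any) -> bool:
    # Stand-in schema check: the data must be a list of strings.
    return isinstance(data, list) and all(isinstance(x, str) for x in data)


def text_adapter(list_key: str, version: int, raw: str) -> Dataset:
    # Hypothetical adapter: turns a newline-separated export into a dataset,
    # trimming whitespace and dropping blank lines.
    entries = [line.strip() for line in raw.splitlines() if line.strip()]
    return Dataset(list_key=list_key, version=version, data=entries)


string_list_type = DatasetType(name="string_list", validate=is_string_list)
example = text_adapter("example_list", 1, "alpha\n beta \n\n")
is_valid = string_list_type.validate(example.data)
```

Freezing the dataclasses reflects the idea that a stored version is never edited in place; a change is expressed as a new version.
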
Lifecycle
- Seed. seed.py performs idempotent initial population at startup.
- Import. New versions are imported via adapters and validated (validator.py) against their type schema.
- Resolve. resolver.py returns the correct version for the tenant when a rule or matrix needs it.
- Version & diff. Versions are retained; diffs between versions are inspectable.
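
The lifecycle steps can be sketched end to end with an in-memory stand-in. Everything here is hypothetical: `InMemoryRegistry`, its method names, and the resolution policy (latest version unless a tenant has a pinned version) are assumptions, not the behavior of seed.py, validator.py, or resolver.py.

```python
from typing import Any, Callable, Dict, List, Tuple


class InMemoryRegistry:
    """Hypothetical stand-in for the registry; not the real implementation."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, int], Any] = {}  # (list_key, version) -> data
        self._pins: Dict[Tuple[str, str], int] = {}      # (tenant, list_key) -> version

    def versions_of(self, list_key: str) -> List[int]:
        return sorted(v for (k, v) in self._versions if k == list_key)

    def seed(self, list_key: str, data: Any) -> bool:
        """Idempotent: create version 1 only if the dataset has no versions yet."""
        if self.versions_of(list_key):
            return False
        self._versions[(list_key, 1)] = data
        return True

    def import_version(
        self,
        list_key: str,
        raw: str,
        adapter: Callable[[str], Any],
        validate: Callable[[Any], bool],
    ) -> int:
        """Adapt the external source, validate it, then store it as a new version."""
        data = adapter(raw)
        if not validate(data):
            raise ValueError(f"{list_key}: imported data failed validation")
        existing = self.versions_of(list_key)
        version = existing[-1] + 1 if existing else 1
        self._versions[(list_key, version)] = data
        return version

    def pin(self, tenant: str, list_key: str, version: int) -> None:
        if (list_key, version) not in self._versions:
            raise KeyError(f"{list_key} has no version {version}")
        self._pins[(tenant, list_key)] = version

    def resolve(self, tenant: str, list_key: str) -> Any:
        """Assumed policy: the tenant's pinned version, else the latest one."""
        versions = self.versions_of(list_key)
        if not versions:
            raise KeyError(f"unknown dataset {list_key}")
        version = self._pins.get((tenant, list_key), versions[-1])
        return self._versions[(list_key, version)]

    def diff(self, list_key: str, old: int, new: int) -> Dict[str, List[str]]:
        # Assumes both versions hold lists of strings.
        before = set(self._versions[(list_key, old)])
        after = set(self._versions[(list_key, new)])
        return {"added": sorted(after - before), "removed": sorted(before - after)}


def is_nonempty_string_list(data: Any) -> bool:
    return isinstance(data, list) and bool(data) and all(isinstance(x, str) for x in data)


registry = InMemoryRegistry()
first_seed = registry.seed("example_list", ["alpha", "beta"])
second_seed = registry.seed("example_list", ["ignored"])  # no-op: already seeded
v2 = registry.import_version(
    "example_list", "beta\ngamma", adapter=str.split, validate=is_nonempty_string_list
)
registry.pin("tenant-b", "example_list", 1)
latest_for_a = registry.resolve("tenant-a", "example_list")
pinned_for_b = registry.resolve("tenant-b", "example_list")
changes = registry.diff("example_list", 1, v2)
```

Because earlier versions stay in the store, a tenant pinned to version 1 keeps seeing it after version 2 is imported, and the diff between the two remains available.
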
API surface
The reference-data router exposes:
| Endpoint | Purpose |
|---|---|
| GET /datasets | List datasets and versions |
| GET /types | List dataset types/schemas |
| GET /adapters | List import adapters |
| import/export | Bring versions in and out, with validation |
Why versioned, not constant
A compliance list changes; you import a new version and the resolver serves it — no code change, and the prior version remains for audit and rollback. This pairs with the risk-matrix engine and source contracts to keep policy data out of code.
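
The update-and-rollback argument can be illustrated with a version history that has a movable active pointer. This is a sketch under stated assumptions: `VersionedList`, `publish`, `rollback`, the audit log format, and the `is_listed` rule are hypothetical names chosen for the example, and the registry may track the active version differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VersionedList:
    """Hypothetical history of one dataset with a movable 'active' pointer."""
    versions: Dict[int, Any] = field(default_factory=dict)
    active: int = 0
    audit_log: List[str] = field(default_factory=list)

    def publish(self, data: Any) -> int:
        # Store the data as the next version and make it active.
        version = max(self.versions, default=0) + 1
        self.versions[version] = data
        self.active = version
        self.audit_log.append(f"publish v{version}")
        return version

    def rollback(self, version: int) -> None:
        # Move the pointer back; no stored version is deleted or edited.
        if version not in self.versions:
            raise KeyError(f"unknown version {version}")
        self.active = version
        self.audit_log.append(f"rollback to v{version}")

    def current(self) -> Any:
        return self.versions[self.active]


def is_listed(history: VersionedList, name: str) -> bool:
    # The rule reads the active data at call time; it holds no hard-coded list.
    return name in history.current()


history = VersionedList()
history.publish(["alpha"])
history.publish(["alpha", "beta"])
listed_after_update = is_listed(history, "beta")
history.rollback(1)
listed_after_rollback = is_listed(history, "beta")
```

The rule function never changes between the update and the rollback; only the data it reads does, which is the property that keeps policy data out of code.
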