ADR-009: Data Provider Plugin Architecture
Status: Proposed
Date: 2026-03-29
Author: Atlas Architecture
Depends on: ADR-008a (Reference Data Registry)
First implementation: KVK (Dutch Chamber of Commerce)
Context
Atlas currently has a single hardcoded data provider integration — NorthData — with its routing, mapping, and credential management baked into src/integrations/northdata/. The abstract base (DataProvider in src/integrations/base.py) defines a clean interface, and the ProviderRouter handles country-based lookup, but several architectural gaps make it difficult to add new providers as self-contained, toggleable plugins:
- No formal plugin lifecycle. Providers are registered via a Python decorator (@register_provider), but there is no database-backed registry that tracks plugin metadata (version, author, changelog, health status), nor a way to install or uninstall a plugin without redeploying the application.
- Country exclusivity is not enforced. The data_provider_countries table allows multiple providers to cover the same country at different priorities, which is intentional for fallback. However, there is no concept of a primary authoritative provider per country: the one whose data should be treated as the official registry source for that jurisdiction.
- Ontology mapping is provider-internal. Each provider implements map_to_ontology() as an opaque method. There is no shared, inspectable mapping specification that a compliance officer could review to understand how KVK fields become Atlas entity attributes.
- No test harness contract. Providers have a test_connection() method but no structured way to validate that a provider's ontology mapping actually produces valid output for a set of known test cases.
This ADR defines a plugin architecture for data providers and uses the KVK (Kamer van Koophandel — Dutch Chamber of Commerce) integration as its first implementation.
KVK API Research
Overview
The KVK (Kamer van Koophandel) is the Dutch Chamber of Commerce that maintains the Handelsregister (Trade Register) — the authoritative commercial registry for all businesses operating in the Netherlands. The KVK Developer Portal (https://developers.kvk.nl) provides REST APIs for programmatic access.
Available APIs
| API | Base URL | Version | Purpose |
|---|---|---|---|
| Zoeken (Search) | https://api.kvk.nl/api/v2/zoeken | v2 | Search by name, KVK number, RSIN, location, or trade name |
| Basisprofiel (Basic Profile) | https://api.kvk.nl/api/v1/basisprofielen/{kvkNummer} | v1 | Core registration: statutory name, legal form, owner/main establishment, SBI activities |
| Vestigingsprofiel (Branch Profile) | https://api.kvk.nl/api/v1/vestigingsprofielen/{vestigingsnummer} | v1 | Per-branch detail: address, activities, employees, BAG/GPS geo-data |
| Naamgeving (Trade Names) | https://api.kvk.nl/api/v1/naamgevingen/{kvkNummer} | v1 | Statutory name, all trade names, non-commercial names |
Authentication & Pricing
- Authentication: API key passed as HTTP header. Keys are issued via the KVK Developer Portal after subscription approval.
- Pricing: Per-call usage fees, varying by API. A free test environment is available.
- Rate limits: Published per subscription tier; typical quota is request-count-based rather than rate-based.
Test Environment
KVK provides a sandbox at https://api.kvk.nl/test/api/ with fictitious company data and a shared test API key (l7xx1f2691f2520d487b902f4e0b57a0b197). This enables integration testing without incurring charges or touching real data.
Data Model (Basisprofiel response)
The Basisprofiel is the primary endpoint for Atlas. Key fields:
| Field | Type | Description |
|---|---|---|
| kvkNummer | string (8 digits) | KVK registration number |
| indNonMailing | string | Non-mailing indicator |
| statutaireNaam | string | Statutory (legal) name |
| handelsnamen | array | Trade names |
| sbiActiviteiten | array | SBI activity codes (Dutch NACE equivalent) |
| rechtsvorm | string | Legal form code (e.g., "BV", "NV", "Eenmanszaak") |
| formeleRegistratiedatum | date | Formal registration date in Handelsregister |
| materieleRegistratie | object | Material registration dates (start, end) |
| totaalWerkzamePersonen | integer | Total employees |
| eigenaar | object | Owner details (for sole proprietorships) |
| hoofdvestiging | object | Main establishment with vestigingsnummer, address, SBI activities |
| vestigingen | array | All branch establishments |
| _links | object | HAL/HATEOAS links to related Vestigingsprofiel, Naamgeving resources |
Vestigingsprofiel Fields
| Field | Type | Description |
|---|---|---|
| vestigingsnummer | string (12 digits) | Branch establishment number |
| kvkNummer | string | Parent KVK number |
| eersteHandelsnaam | string | Primary trade name |
| adressen | array | Physical and postal addresses with BAG ID, GPS coordinates |
| websites | array | Website URLs |
| sbiActiviteiten | array | Branch-level SBI codes |
| totaalWerkzamePersonen | integer | Employees at this branch |
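The identifier formats in the two field tables (8-digit kvkNummer, 12-digit vestigingsnummer) lend themselves to a cheap pre-flight check before any paid API call. The following is a minimal sketch; the helper names are hypothetical and not part of the existing Atlas codebase.

```python
import re

# Lengths come from the field tables above; the helper names are hypothetical.
_KVK_NUMBER = re.compile(r"[0-9]{8}")
_BRANCH_NUMBER = re.compile(r"[0-9]{12}")


def is_valid_kvk_number(value: str) -> bool:
    """True when value is exactly 8 ASCII digits (leading zeros allowed)."""
    return _KVK_NUMBER.fullmatch(value) is not None


def is_valid_branch_number(value: str) -> bool:
    """True when value is exactly 12 ASCII digits."""
    return _BRANCH_NUMBER.fullmatch(value) is not None
```

Both identifiers are kept as strings rather than integers so that leading zeros survive.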
Limitations
- No financial statements. KVK does not serve annual accounts via API — those are published separately and available through services like NorthData.
- No UBO data. The KVK UBO register API is restricted to banks, notary offices, and designated obliged entities under the Dutch Wwft. General API subscribers cannot access it.
- No person details. The API does not return director/shareholder names or ownership percentages. These are available through the physical extract (uittreksel) but not via API.
- Netherlands only. KVK covers exclusively Dutch registrations (country code NL).
Implications for Atlas
KVK is a company registry provider with high trust for registration data but limited scope — it covers company data but not person, financials, filings, or ownership capabilities. Atlas must combine KVK data with other providers (e.g., NorthData for financials and persons) to build a complete entity profile.
Decision
1. Plugin Architecture
Introduce a formal plugin system that wraps the existing DataProvider base class with lifecycle management, country exclusivity rules, and inspectable ontology mapping specifications.
1.1 Plugin Package Structure
Each plugin is a self-contained Python package under src/integrations/:
src/integrations/kvk/
├── __init__.py # @register_provider("kvk") decorator
├── plugin.py # PluginManifest dataclass
├── client.py # KVKProvider(DataProvider) implementation
├── mapper.py # KVKMapper with ontology mapping
├── mapping_spec.yaml # Declarative field mapping (inspectable)
├── test_fixtures/ # Known-good test data from sandbox
│ ├── basisprofiel_bv.json
│ ├── vestigingsprofiel.json
│ └── expected_ontology.json
└── README.md # Human-readable plugin docs
1.2 Plugin Manifest
Every plugin declares a PluginManifest that is stored in the data_providers table and displayed in the UI:
from dataclasses import dataclass


@dataclass
class PluginManifest:
    name: str  # "kvk"
    display_name: str  # "KVK (Dutch Chamber of Commerce)"
    version: str  # "1.0.0" (semver)
    description: str  # Human-readable summary
    provider_type: str  # "company_registry" | "screening" | "financial" | "composite"
    country_codes: list[str]  # ["NL"], exclusive claim
    capabilities: list[str]  # ["company"], subset of standard capabilities
    trust_level: float  # 0.97 for official registry
    requires_credentials: list[str]  # ["api_key"]
    api_base_url: str  # "https://api.kvk.nl/api"
    test_base_url: str  # "https://api.kvk.nl/test/api"
    documentation_url: str  # "https://developers.kvk.nl/documentation"
    rate_limit: dict  # {"requests_per_minute": 60}
    author: str  # "Atlas Team"
    changelog: list[dict]  # [{"version": "1.0.0", "date": "2026-03-29", "notes": "Initial release"}]
1.3 Plugin Lifecycle States
disabled ──▶ enabled ──▶ disabled
                │
                ▼
             degraded (health check failing)
- disabled: Plugin code is deployed but inactive. No API calls are made. Country routes are not advertised.
- enabled: Plugin actively serves requests for its declared countries. Health checks run periodically.
- degraded: Plugin is enabled but health checks are failing. Router falls back to lower-priority providers. Alert raised.
State transitions are persisted via the existing data_providers.enabled column plus a new health_status column.
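Because the lifecycle state is not stored directly but derived from two columns, it may help to see the derivation spelled out. This is a sketch under the assumption that health_status takes the values 'healthy', 'degraded', or 'unknown' (as in the schema change in Section 5.2); the enum and function names are illustrative only.

```python
from enum import Enum


class PluginState(str, Enum):
    DISABLED = "disabled"
    ENABLED = "enabled"
    DEGRADED = "degraded"


def derive_state(enabled: bool, health_status: str) -> PluginState:
    """Combine data_providers.enabled and health_status into one lifecycle state."""
    if not enabled:
        # A disabled plugin stays disabled regardless of its last health result.
        return PluginState.DISABLED
    if health_status == "degraded":
        return PluginState.DEGRADED
    # 'healthy' and 'unknown' are both treated as plain enabled in this sketch.
    return PluginState.ENABLED
```

Treating 'unknown' as enabled is a design choice of the sketch: a freshly enabled plugin has no health history yet and should still serve requests.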
2. Country Exclusivity Model
2.1 Authority Tiers
Each provider-country binding has an authority_tier:
| Tier | Meaning | Example |
|---|---|---|
| primary | Authoritative registry source for this country. At most one per country. | KVK for NL |
| supplementary | Provides additional data (financials, persons, events) not available from the primary. | NorthData for NL |
| fallback | Used if primary and supplementary are unavailable. | OpenCorporates for NL |
Constraint: At most one provider may hold the primary tier for any given country_code. This is enforced with a partial unique index:
CREATE UNIQUE INDEX uq_primary_provider_per_country
ON data_provider_countries(country_code)
WHERE authority_tier = 'primary' AND enabled = true;
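The behaviour of the partial unique index can be demonstrated in isolation. The sketch below uses an in-memory SQLite database as a stand-in for the production database (an assumption made only so the example is self-contained), with a reduced table and `enabled` stored as 0/1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_provider_countries (
    provider_id    INTEGER NOT NULL,
    country_code   TEXT    NOT NULL,
    authority_tier TEXT    NOT NULL DEFAULT 'fallback',
    enabled        INTEGER NOT NULL DEFAULT 0
);
-- Only rows matching the WHERE clause participate in the uniqueness check.
CREATE UNIQUE INDEX uq_primary_provider_per_country
    ON data_provider_countries(country_code)
    WHERE authority_tier = 'primary' AND enabled = 1;
""")


def add_binding(provider_id: int, country: str, tier: str, enabled: int) -> None:
    conn.execute(
        "INSERT INTO data_provider_countries VALUES (?, ?, ?, ?)",
        (provider_id, country, tier, enabled),
    )


add_binding(1, "NL", "primary", 1)        # first enabled primary: accepted
add_binding(2, "NL", "supplementary", 1)  # other tiers are unconstrained
add_binding(3, "NL", "primary", 0)        # disabled primary: outside the index

try:
    add_binding(4, "NL", "primary", 1)    # second enabled primary: rejected
    conflict = False
except sqlite3.IntegrityError:
    conflict = True

row_count = conn.execute("SELECT COUNT(*) FROM data_provider_countries").fetchone()[0]
```

Note that a disabled primary row is allowed to coexist with an enabled one, which is why enablement needs its own conflict check.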
2.2 Country Routing with Authority Tiers
The ProviderRouter is enhanced to respect authority tiers:
async def fetch_company_complete(
    self,
    registration_number: str,
    country_code: str,
    required_capabilities: list[str] | None = None,
) -> ProviderResponse:
    """
    Fetch company data by composing responses from multiple authority tiers.

    1. Call the PRIMARY provider (if any) for authoritative registry data.
    2. Call SUPPLEMENTARY providers for additional capabilities not covered by PRIMARY.
    3. Fall back to FALLBACK providers if primary/supplementary fail.
    4. Merge all responses respecting trust-weighted survivorship.
    """
For the Netherlands, a typical flow would be:
- KVK (primary, capabilities: company) → LegalEntity with statutory name, legal form, registration date, addresses, SBI codes.
- NorthData (supplementary, capabilities: company, person, financials, filings, events, ownership) → Enriches with directors, shareholders, financials, events.
- The reconciliation engine merges both, with KVK winning on registry fields (name, registration number, legal form) due to higher trust (0.97 vs NorthData's 0.95).
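One plausible reading of "trust-weighted survivorship" is that, per field, the value from the provider with the highest trust survives. The sketch below illustrates that reading only; the real reconciliation engine is not specified in this ADR, and `FieldValue` and `merge_by_trust` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldValue:
    value: Any
    provider: str
    trust: float


def merge_by_trust(*responses: dict[str, FieldValue]) -> dict[str, FieldValue]:
    """Per field, keep the candidate with the highest trust score."""
    merged: dict[str, FieldValue] = {}
    for response in responses:
        for field, candidate in response.items():
            current = merged.get(field)
            # Strict '>' means the earlier response wins ties.
            if current is None or candidate.trust > current.trust:
                merged[field] = candidate
    return merged


kvk = {"legal_name": FieldValue("Test BV", "kvk", 0.97)}
northdata = {
    "legal_name": FieldValue("Test B.V.", "northdata", 0.95),
    "directors": FieldValue(["J. Jansen"], "northdata", 0.95),
}
merged = merge_by_trust(kvk, northdata)
```

Here KVK wins on legal_name, while directors survives from NorthData because KVK supplies no competing value.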
2.3 Schema Changes
Extend data_provider_countries:
ALTER TABLE data_provider_countries
ADD COLUMN authority_tier VARCHAR(20) NOT NULL DEFAULT 'fallback'
CHECK (authority_tier IN ('primary', 'supplementary', 'fallback'));
3. Declarative Ontology Mapping
3.1 Mapping Specification
Each plugin ships a mapping_spec.yaml that declaratively maps provider fields to ontology attributes. This file is inspectable by compliance officers and validated at plugin registration time.
# src/integrations/kvk/mapping_spec.yaml
plugin: kvk
version: "1.0.0"
target_schema: "ontology_schema_v3"
entity_mappings:
  LegalEntity:
    source: "basisprofiel"
    identity:
      registration_number: "$.kvkNummer"
      jurisdiction: "'NL'" # constant
    attributes:
      legal_name: "$.statutaireNaam"
      entity_type:
        source: "$.rechtsvorm"
        transform: "kvk_rechtsvorm_to_entity_type"
      status:
        source: "$.materieleRegistratie"
        transform: "kvk_registratie_to_status"
      incorporation_date: "$.formeleRegistratiedatum"
      registration_numbers:
        - type: "kvk"
          value: "$.kvkNummer"
        - type: "vestigingsnummer"
          value: "$.hoofdvestiging.vestigingsnummer"
      trade_names: "$.handelsnamen[*].naam"
      sbi_codes: "$.sbiActiviteiten[*].sbiCode"
      total_employees: "$.totaalWerkzamePersonen"
      is_secrecy_jurisdiction: "false" # NL is not a secrecy jurisdiction
      fatf_grey_list: "false" # NL is not on FATF grey list
      eu_high_risk_third_country: "false"
  Address:
    source: "$.hoofdvestiging.adressen[*]"
    filter: "type == 'bezoekadres'"
    identity:
      external_id: "$.volledigAdres"
    attributes:
      street: "$.straatnaam"
      house_number: "$.huisnummer"
      house_number_addition: "$.huisnummerToevoeging"
      postal_code: "$.postcode"
      city: "$.plaats"
      country: "'NL'"
      type:
        source: "$.type"
        transform: "kvk_adres_type_map"
      bag_id: "$.bagId"
      gps_latitude: "$.gpsLatitude"
      gps_longitude: "$.gpsLongitude"
relationship_mappings:
  RegisteredAt:
    source_entity: "LegalEntity"
    target_entity: "Address"
    attributes:
      relationship_type: "'registered_office'"
      is_current: "true"
# Value transforms (referenced above)
transforms:
  kvk_rechtsvorm_to_entity_type:
    "Besloten Vennootschap": "Private Limited Company (BV)"
    "Naamloze Vennootschap": "Public Limited Company (NV)"
    "Eenmanszaak": "Sole Proprietorship"
    "Vennootschap Onder Firma": "General Partnership (VOF)"
    "Commanditaire Vennootschap": "Limited Partnership (CV)"
    "Stichting": "Foundation"
    "Vereniging": "Association"
    "Cooperatie": "Cooperative"
    "Maatschap": "Professional Partnership"
    _default: "Other"
  kvk_adres_type_map:
    "bezoekadres": "visiting"
    "postadres": "postal"
    _default: "other"
  kvk_registratie_to_status:
    _logic: |
      If materieleRegistratie.datumEinde is set → "dissolved"
      Else → "active"
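To make the spec's semantics concrete, here is a minimal sketch of how a mapper might evaluate the attribute rules. It handles only the subset of expressions used above (quoted constants, bare true/false, dotted paths, and `[*]` fan-out) and lookup-table transforms; it is not a full JSONPath implementation, and `resolve` and `map_attributes` are hypothetical names. The spec is written as a Python dict to keep the example dependency-free.

```python
from typing import Any


def resolve(expr: str, payload: dict) -> Any:
    """Evaluate a constant or a simple '$.a.b[*].c' path against payload."""
    if expr.startswith("'") and expr.endswith("'"):
        return expr[1:-1]  # quoted constant such as 'NL'
    if expr in ("true", "false"):
        return expr == "true"
    if not expr.startswith("$."):
        raise ValueError(f"unsupported expression: {expr}")
    nodes = [payload]
    fan_out = False
    for segment in expr[2:].split("."):
        wildcard = segment.endswith("[*]")
        key = segment[:-3] if wildcard else segment
        nodes = [n[key] for n in nodes if isinstance(n, dict) and key in n]
        if wildcard:
            fan_out = True
            # Flatten each list so later segments apply to every element.
            nodes = [item for n in nodes if isinstance(n, list) for item in n]
    if fan_out:
        return nodes
    return nodes[0] if nodes else None


def map_attributes(attributes: dict, payload: dict, transforms: dict) -> dict:
    """Apply each attribute rule; dict rules go through a lookup-table transform."""
    result = {}
    for target, rule in attributes.items():
        if isinstance(rule, str):
            result[target] = resolve(rule, payload)
        else:
            raw = resolve(rule["source"], payload)
            table = transforms[rule["transform"]]
            result[target] = table.get(raw, table.get("_default"))
    return result


attributes = {
    "legal_name": "$.statutaireNaam",
    "jurisdiction": "'NL'",
    "entity_type": {"source": "$.rechtsvorm", "transform": "kvk_rechtsvorm_to_entity_type"},
    "trade_names": "$.handelsnamen[*].naam",
}
transforms = {
    "kvk_rechtsvorm_to_entity_type": {
        "Besloten Vennootschap": "Private Limited Company (BV)",
        "_default": "Other",
    }
}
payload = {  # fictitious data shaped like the Basisprofiel table
    "statutaireNaam": "Test BV",
    "rechtsvorm": "Besloten Vennootschap",
    "handelsnamen": [{"naam": "Test"}, {"naam": "Test Shop"}],
}
mapped = map_attributes(attributes, payload, transforms)
```

Transforms expressed as `_logic` (such as kvk_registratie_to_status) are not lookup tables and would need a dedicated function in the mapper module.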
3.2 Mapping Validation
At plugin registration (or update), the system validates the mapping spec against the current ontology schema:
- All target attributes exist in ontology_schema_v3.yaml.
- All required ontology fields are either mapped or have a default.
- All transform functions are implemented in the mapper module.
- Test fixtures produce valid ontology output (entity type checks, required field checks).
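The first three checks above can be sketched as a pure function that returns a list of problems. The inputs (the set of schema attributes, the set of required attributes, and the set of implemented transform names) are assumed to have been extracted from the ontology schema and the mapper module beforehand; the function name is hypothetical, and defaults are ignored for brevity.

```python
def validate_attributes(
    attributes: dict,
    schema_attributes: set[str],
    required_attributes: set[str],
    known_transforms: set[str],
) -> list[str]:
    """Return human-readable validation errors; an empty list means valid."""
    errors = []
    for target, rule in attributes.items():
        if target not in schema_attributes:
            errors.append(f"unknown ontology attribute: {target}")
        if isinstance(rule, dict):
            transform = rule.get("transform")
            if transform is not None and transform not in known_transforms:
                errors.append(f"transform not implemented: {transform}")
    # Required attributes that the spec never mentions.
    for missing in sorted(required_attributes - set(attributes)):
        errors.append(f"required attribute not mapped: {missing}")
    return errors


errors = validate_attributes(
    attributes={
        "legal_name": "$.statutaireNaam",
        "entity_type": {"source": "$.rechtsvorm", "transform": "kvk_rechtsvorm_to_entity_type"},
        "nickname": "$.bijnaam",
    },
    schema_attributes={"legal_name", "entity_type", "status"},
    required_attributes={"legal_name", "status"},
    known_transforms=set(),
)
```

Collecting every error rather than failing on the first makes the result suitable for display in a registration or UI validation report.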
3.3 Trust Scores
Extend the ontology schema's provider_trust section with KVK-specific field-level scores:
provider_trust:
  kvk:
    legal_name: 0.99 # Authoritative statutory name
    registration_number: 1.00 # The source of truth for KVK numbers
    jurisdiction: 1.00 # Always NL
    status: 0.98 # Direct from Handelsregister
    incorporation_date: 0.99 # Formal registration date
    entity_type: 0.99 # Legal form from registry
    trade_names: 0.97 # Registered trade names
    addresses: 0.95 # BAG-validated addresses
    sbi_codes: 0.95 # Registered activity codes
    # KVK does NOT provide these, so no trust scores:
    # directors, shareholders, financials, events, ownership
4. KVK Provider Implementation
4.1 Client
# src/integrations/kvk/client.py
@register_provider("kvk")
class KVKProvider(DataProvider):
    """
    KVK (Dutch Chamber of Commerce) data provider.

    Coverage: NL only
    Capabilities: company
    Trust: 0.97
    """

    @property
    def name(self) -> str:
        return "kvk"

    @property
    def display_name(self) -> str:
        return "KVK (Dutch Chamber of Commerce)"

    @property
    def supported_countries(self) -> list[str]:
        return ["NL"]

    @property
    def trust_level(self) -> float:
        return 0.97

    @property
    def capabilities(self) -> list[str]:
        return ["company"]

    async def fetch_company_complete(
        self,
        registration_number: str,
        country_code: str,
        **options,
    ) -> Optional[ProviderResponse]:
        """
        Fetch from Basisprofiel + Vestigingsprofiel in a single logical operation.

        Strategy:
        1. GET /v1/basisprofielen/{kvkNummer}?geoData=True
           (includes eigenaar, hoofdvestiging, vestigingen)
        2. For each vestigingsnummer in the response, GET
           /v1/vestigingsprofielen/{vestigingsnummer}?geoData=True
           (parallel, for full address + SBI detail)
        3. GET /v1/naamgevingen/{kvkNummer}
           (complete trade name history)
        4. Bundle all responses into a single ProviderResponse.

        Although this is technically 3+ HTTP calls, they are all
        against the same registry state and bundled atomically
        into one ProviderResponse with one raw_response blob.
        """
        ...

    def map_to_ontology(self, response: ProviderResponse) -> OntologyMapping:
        """Apply mapping_spec.yaml to transform KVK data to ontology format."""
        ...

    async def _test_api_call(self) -> dict:
        """
        Hit the test environment with a known fictitious KVK number.
        Uses the sandbox at https://api.kvk.nl/test/api/.
        """
        ...
4.2 API Call Composition
The fetch_company_complete method composes multiple KVK endpoints into one atomic response:
┌───────────────────────────────────────────┐
│ KVKProvider.fetch_company_complete() │
│ │
│ 1. GET basisprofielen/{kvkNummer} │──▶ Core registration data
│ ?geoData=True │
│ │
│ 2. GET vestigingsprofielen/{nr} │──▶ Branch details + BAG/GPS
│ (parallel for each vestigingsnummer) │
│ │
│ 3. GET naamgevingen/{kvkNummer} │──▶ Full trade name history
│ │
│ Bundle into single ProviderResponse │
│ with combined raw_response │
└───────────────────────────────────────────┘
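The composition in the diagram can be sketched with asyncio. The HTTP layer is abstracted behind an injected `get` coroutine so the example runs without network access; the shape of `vestigingen` (a list of objects carrying `vestigingsnummer`) is an assumption based on the field table, and `fetch_bundle` is a hypothetical name. Unlike the numbered strategy, this sketch issues the Naamgeving call concurrently with the branch calls, since it does not depend on them.

```python
import asyncio
from typing import Awaitable, Callable

Fetch = Callable[[str], Awaitable[dict]]


async def fetch_bundle(kvk_nummer: str, get: Fetch) -> dict:
    """Fetch Basisprofiel first, then branches and trade names concurrently."""
    basis = await get(f"/v1/basisprofielen/{kvk_nummer}")
    numbers = [v["vestigingsnummer"] for v in basis.get("vestigingen", [])]
    branch_calls = [get(f"/v1/vestigingsprofielen/{n}") for n in numbers]
    names_call = get(f"/v1/naamgevingen/{kvk_nummer}")
    # gather preserves argument order, so the last result is the Naamgeving one.
    *branches, names = await asyncio.gather(*branch_calls, names_call)
    return {
        "basisprofiel": basis,
        "vestigingsprofielen": branches,
        "naamgevingen": names,
    }


async def fake_get(path: str) -> dict:
    """Stand-in for an HTTP client; returns canned, fictitious data."""
    if path.startswith("/v1/basisprofielen/"):
        return {
            "kvkNummer": "12345678",
            "vestigingen": [
                {"vestigingsnummer": "000000000001"},
                {"vestigingsnummer": "000000000002"},
            ],
        }
    return {"path": path}


bundle = asyncio.run(fetch_bundle("12345678", fake_get))
```

With the default behaviour of `asyncio.gather`, the first failing call propagates its exception to the caller, so a partial bundle is never returned.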
4.3 Error Handling
| KVK HTTP Status | Atlas Behaviour |
|---|---|
| 200 | Parse and return |
| 400 | ProviderError — invalid KVK number format |
| 401 / 403 | ProviderAuthError — API key invalid or expired |
| 404 | ProviderNotFoundError — KVK number not in Handelsregister |
| 429 | ProviderRateLimitError with Retry-After header |
| 500+ | ProviderAPIError — KVK service unavailable; router falls back |
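The table translates directly into a small dispatch function. The exception names follow the table, but their definitions and hierarchy here are assumptions made so the sketch is self-contained; the real classes presumably live alongside DataProvider. Only the delta-seconds form of Retry-After is parsed.

```python
from typing import Optional


class ProviderError(Exception):
    """Base class (assumed hierarchy, for illustration only)."""


class ProviderAuthError(ProviderError): ...
class ProviderNotFoundError(ProviderError): ...
class ProviderAPIError(ProviderError): ...


class ProviderRateLimitError(ProviderError):
    def __init__(self, message: str, retry_after: Optional[int] = None):
        super().__init__(message)
        self.retry_after = retry_after


def error_for_status(status: int, retry_after: Optional[str] = None) -> Optional[ProviderError]:
    """Map a KVK HTTP status to the error to raise, or None for 200."""
    if status == 200:
        return None
    if status == 400:
        return ProviderError("invalid KVK number format")
    if status in (401, 403):
        return ProviderAuthError("API key invalid or expired")
    if status == 404:
        return ProviderNotFoundError("KVK number not in Handelsregister")
    if status == 429:
        seconds = int(retry_after) if retry_after and retry_after.isdigit() else None
        return ProviderRateLimitError("rate limit exceeded", retry_after=seconds)
    if status >= 500:
        return ProviderAPIError("KVK service unavailable")
    return ProviderError(f"unexpected KVK status {status}")
```

Returning the exception instead of raising it keeps the mapping trivially testable; the client would raise whatever non-None value comes back.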
5. Database Schema Changes
5.1 Extend data_provider_countries
-- Authority tier for country routing
ALTER TABLE data_provider_countries
ADD COLUMN authority_tier VARCHAR(20) NOT NULL DEFAULT 'fallback'
CHECK (authority_tier IN ('primary', 'supplementary', 'fallback'));
-- At most one enabled primary provider per country
CREATE UNIQUE INDEX uq_primary_provider_per_country
ON data_provider_countries(country_code)
WHERE authority_tier = 'primary' AND enabled = true;
5.2 Extend data_providers
-- Plugin metadata columns
ALTER TABLE data_providers
ADD COLUMN plugin_version VARCHAR(20),
ADD COLUMN health_status VARCHAR(20) NOT NULL DEFAULT 'unknown'
CHECK (health_status IN ('healthy', 'degraded', 'unknown')),
ADD COLUMN last_health_check TIMESTAMPTZ,
ADD COLUMN mapping_spec JSONB,
ADD COLUMN documentation_url TEXT,
ADD COLUMN test_base_url TEXT,
ADD COLUMN changelog JSONB DEFAULT '[]';
5.3 Seed Data for KVK
INSERT INTO data_providers (
name, display_name, provider_type, enabled,
capabilities, trust_level, plugin_version,
settings, documentation_url, test_base_url
) VALUES (
'kvk',
'KVK (Dutch Chamber of Commerce)',
'company_registry',
false, -- disabled by default, admin must enable and provide credentials
'["company"]',
0.97,
'1.0.0',
'{"api_version": "v1", "rate_limit_rpm": 60, "stale_threshold_days": 30}',
'https://developers.kvk.nl/documentation',
'https://api.kvk.nl/test/api'
);
INSERT INTO data_provider_countries (
provider_id, country_code, priority, coverage_level,
authority_tier, enabled, country_config
) VALUES (
(SELECT id FROM data_providers WHERE name = 'kvk'),
'NL',
10, -- highest priority
'full',
'primary', -- authoritative for NL
false, -- mirrors provider enabled state
'{"registration_format": "8-digit numeric", "search_endpoint": "/v2/zoeken"}'
);
-- Update NorthData's NL entry to supplementary
UPDATE data_provider_countries
SET authority_tier = 'supplementary', priority = 50
WHERE provider_id = (SELECT id FROM data_providers WHERE name = 'northdata')
AND country_code = 'NL';
6. Plugin Management UI
The plugin management interface is accessible from Settings → Data Providers (existing page), enhanced with:
6.1 Plugin List View
Each plugin card shows:
- Display name, version, health status badge
- Country flags for covered countries
- Capability tags (company, person, financials, etc.)
- Enable/disable toggle
- Trust level indicator
6.2 Plugin Detail View
- Overview tab: Manifest metadata, changelog, documentation link.
- Countries tab: Coverage table with authority tier per country. Inline editing of tier (primary / supplementary / fallback).
- Mapping tab: Read-only render of mapping_spec.yaml showing source → ontology field mappings. This lets compliance officers verify exactly how external data becomes entity attributes.
- Credentials tab: Encrypted credential management (existing pattern).
- Health tab: Last check timestamp, response time trend, recent errors.
- Test tab: Run test connection against sandbox. Run mapping validation against test fixtures.
7. Health Checks
Each enabled plugin runs a periodic health check (configurable, default: every 5 minutes):
async def run_health_check(provider: DataProvider) -> HealthCheckResult:
    """
    1. Call provider.test_connection() against the production API.
    2. If test_base_url is configured, also verify sandbox connectivity.
    3. Record response time, success/failure, and any error details.
    4. Update data_providers.health_status and last_health_check.
    5. If status transitions to 'degraded', emit an alert event.
    """
Health checks are scheduled via Temporal as a recurring workflow, reusing the existing Temporal infrastructure.
8. Plugin Toggle Behaviour
When an admin disables a plugin:
- data_providers.enabled → false
- All data_provider_countries rows for this provider → enabled = false
- ProviderRouter immediately stops routing to this provider
- In-flight requests complete but no new ones are dispatched
- Health check scheduling is paused
When an admin enables a plugin:
- Validate that credentials are configured (non-empty credentials JSONB)
- Run a test connection; if it fails, block enablement with error message
- data_providers.enabled → true
- Re-enable country rows that were previously enabled
- If any country is claimed as primary and another provider already holds primary for that country, reject enablement with a conflict error
- Resume health check scheduling
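The enablement preconditions can be gathered into one validation step that runs before any row is updated. This is a sketch with hypothetical names; it assumes the caller has already run the test connection and looked up which provider currently holds the enabled primary tier for each country.

```python
def enablement_errors(
    provider: str,
    credentials: dict,
    connection_ok: bool,
    claimed_primary: set[str],
    enabled_primary_by_country: dict[str, str],
) -> list[str]:
    """Return the reasons enablement must be blocked; empty means proceed."""
    errors = []
    if not credentials:
        errors.append("credentials are not configured")
    if not connection_ok:
        errors.append("test connection failed")
    for country in sorted(claimed_primary):
        holder = enabled_primary_by_country.get(country)
        # A conflict exists only when a *different* provider holds the tier.
        if holder is not None and holder != provider:
            errors.append(f"{country}: primary tier already held by {holder}")
    return errors


blocked = enablement_errors(
    provider="kvk",
    credentials={},
    connection_ok=True,
    claimed_primary={"NL"},
    enabled_primary_by_country={"NL": "northdata"},
)
allowed = enablement_errors(
    provider="kvk",
    credentials={"api_key": "placeholder"},
    connection_ok=True,
    claimed_primary={"NL"},
    enabled_primary_by_country={},
)
```

Checking the conflict in application code gives a friendlier error than the unique index violation, which remains the final safeguard.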
Ontology Mapping: KVK → Atlas
Entity Mappings
| KVK Source | Atlas Entity Type | Atlas Attribute | Trust |
|---|---|---|---|
| kvkNummer | LegalEntity | registration_number | 1.00 |
| kvkNummer | LegalEntity | registration_numbers[].{type:"kvk"} | 1.00 |
| statutaireNaam | LegalEntity | legal_name | 0.99 |
| rechtsvorm (transformed) | LegalEntity | entity_type | 0.99 |
| formeleRegistratiedatum | LegalEntity | incorporation_date | 0.99 |
| materieleRegistratie.datumEinde | LegalEntity | status | 0.98 |
| 'NL' (constant) | LegalEntity | jurisdiction | 1.00 |
| handelsnamen[*].naam | LegalEntity | trade_names | 0.97 |
| sbiActiviteiten[*].sbiCode | LegalEntity | sbi_codes | 0.95 |
| totaalWerkzamePersonen | LegalEntity | total_employees | 0.90 |
| Address fields from vestigingsprofiel | Address | street, city, postal_code, country | 0.95 |
| bagId from vestigingsprofiel | Address | bag_id | 0.98 |
| GPS from vestigingsprofiel | Address | gps_latitude, gps_longitude | 0.95 |
Relationship Mappings
| Relationship | Source Entity | Target Entity | Attributes |
|---|---|---|---|
| RegisteredAt | LegalEntity | Address (bezoekadres) | is_current=true, type=registered_office |
| OperatesAt | LegalEntity | Address (per vestiging) | is_current=true, branch_number={vestigingsnummer} |
Unmapped Data (Preserved in extra_data)
KVK fields that do not have a direct ontology attribute are stored in the entity's extra_data JSONB:
- indNonMailing → extra_data.kvk.non_mailing_indicator
- sbiActiviteiten[*].sbiOmschrijving → extra_data.kvk.sbi_descriptions
- eigenaar → extra_data.kvk.owner_info (for sole proprietorships)
- hoofdvestiging.vestigingsnummer → stored as registration_numbers[].{type:"vestigingsnummer"}
Navigation
The plugin management UI is located at Settings → Data Providers, which is accessible from the existing Settings page at /settings. No new top-level navigation entries are required.
Within the Settings page, the Data Providers section will be expanded to include the plugin list, detail views, and country-authority management described in Section 6.
Migration Path
Phase 1: Schema & Plugin Infrastructure (1 sprint)
- Add authority_tier column and unique index to data_provider_countries.
- Add plugin metadata columns to data_providers.
- Implement PluginManifest dataclass and registration validation.
- Enhance ProviderRouter with authority-tier-aware routing.
- Implement mapping spec parser and validator.
- Update NorthData's NL entry to authority_tier = 'supplementary'.
Phase 2: KVK Provider (1 sprint)
- Implement KVKProvider client with Basisprofiel + Vestigingsprofiel + Naamgeving composition.
- Write mapping_spec.yaml with all field mappings and transforms.
- Create KVKMapper that applies the declarative mapping spec.
- Build test fixtures from KVK sandbox responses.
- Integration tests against KVK test environment.
- Seed migration for KVK provider and country entry.
Phase 3: Plugin UI & Health (1 sprint)
- Extend Settings → Data Providers with plugin cards, detail view, and country-authority management.
- Implement mapping spec viewer (read-only YAML render in Mapping tab).
- Build health check Temporal workflow.
- Add enable/disable toggle with credential and conflict validation.
- End-to-end test: search NL company → KVK primary + NorthData supplementary → merged ontology entity.
Total Effort: ~3 sprints
Testing Strategy
Unit Tests
- mapping_spec.yaml validation against ontology_schema_v3.yaml
- Transform functions: kvk_rechtsvorm_to_entity_type, kvk_adres_type_map, kvk_registratie_to_status
- Authority tier unique index enforcement (duplicate primary rejection)
- Plugin toggle logic (credential check, conflict detection, state transitions)
Integration Tests (KVK Sandbox)
- Basisprofiel fetch for known test KVK numbers → assert valid ProviderResponse
- Vestigingsprofiel fetch → assert address and SBI data
- Naamgeving fetch → assert trade name list
- Full fetch_company_complete → assert composed response
- map_to_ontology on sandbox data → assert valid OntologyMapping
- End-to-end: KVK (primary) + NorthData (supplementary) merge → assert trust-weighted survivorship
Fixture-Based Mapping Tests
Each plugin ships test_fixtures/ with saved API responses and expected ontology output. CI validates that the mapper produces the expected output — no live API call needed.
Security & Compliance
- Credential encryption: KVK API keys stored in data_providers.credentials using the existing encrypted JSONB pattern.
- Audit trail: All KVK API responses stored in data_provider_responses with full request/response context.
- Data minimisation: Only fetch data Atlas actually uses. The mapping spec makes this auditable: unmapped fields are not indexed or searchable, only preserved in raw response.
- GDPR: KVK data is public commercial register data (Handelsregister). No personal data is returned for BV/NV entities. For sole proprietorships (eenmanszaak), the eigenaar field may contain personal information; this is stored in raw response only, not promoted to Person entities unless a dedicated mapper is added.
- Sandbox isolation: Test environment uses fictitious data and a shared API key. Production credentials are never used in automated tests.
Rejected Alternatives
1. Hot-loadable Plugin Packages (Dynamic Import)
Considered allowing plugins to be installed as separate Python packages at runtime (e.g., pip install atlas-plugin-kvk). Rejected because:
- The deployment complexity outweighs the benefit for the current team size.
- Version compatibility between plugin and core is hard to enforce.
- All current providers are first-party; third-party plugins are not on the roadmap.
Dynamic loading can be reconsidered when Atlas supports a plugin marketplace.
2. Single Provider per Country (Strict Exclusivity)
Considered enforcing that each country has exactly one provider. Rejected because:
- KVK provides only company data; NorthData provides person, financials, and ownership for NL.
- Most jurisdictions will need a registry provider (authoritative identity) combined with a data aggregator (rich context).
- The authority-tier model achieves the underlying goal (clear ownership of which provider is the "official" source for a country) without losing composability.
3. GraphQL Federation for Provider Composition
Considered using GraphQL to compose provider responses. Rejected because:
- Atlas's API is REST/FastAPI; adding a GraphQL layer introduces unnecessary complexity.
- The ProviderRouter already handles composition and fallback effectively.
- The mapping spec YAML provides a simpler, more auditable alternative.