Backend architecture
The backend is a FastAPI application under src/. It is organised by responsibility into
subpackages that map closely onto the planes described in the system overview.
Package map
Subpackages
| Package | Responsibility | Key files |
|---|---|---|
| api/ | HTTP layer: app setup, middleware, 26 routers | main.py, auth.py, *_router.py |
| database/ | asyncpg pool + ~21 repositories | __init__.py, entity_repository.py, repositories.py |
| ontology/ | Entity resolution, reconciliation, canonical synthesis | entity_resolution.py, reconciliation.py, survivorship.py, schema_loader.py |
| integrations/ | Plugin loading, provider routing, credentials | plugin_loader.py, provider_router.py, translation_registry.py |
| graph/ | Neo4j/AGE clients and SQL→graph sync | neo4j_client.py, neo4j_sync.py, parity_service.py |
| temporal/ | Durable workflows + activities | workflows.py, activities.py, client.py, worker.py |
| mutation_queue/ | Mutation & conflict provenance | repository.py, conflict_repository.py, router.py |
| risk_matrix/ | Configurable risk evaluation engine | scorer.py, router.py, batch_workflow.py |
| risk_scoring.py | Legacy module-level scoring | — |
| reference_data/ | Versioned reference datasets | repository.py, resolver.py, seed.py |
| source_contracts/ | Declarative provider field contracts | router.py |
| ontology_schema/ | Schema versioning + field configs | version_manager.py, router.py |
| pipelines/ | LangGraph LLM pipelines + model factory | model_factory.py, mcp_client.py |
| tools/ | MCP tool adapters for crews | mcp_tools.py |
| services/ | Report generation, MinIO, geocoding | report_generator.py, minio_client.py |
| security/ | Credential encryption (AES-GCM) | credential_encryption.py |
| observability/ | Langfuse tracing lifecycle | — |
| settings/ | Pipeline config, MCP status, prompts | pipeline_config.py |
| models/ | Pydantic data models | investigation.py, report.py |
| events/ | Domain events | domain_events/entity.py |
| workflows/ | Experimental low-code workflow engine | router.py, engine/, builder/ |
Application startup
The FastAPI lifespan in src/api/main.py wires the system together at boot. Several steps are
intentionally fail-closed — if the ontology schema or OSINT files cannot load, the app refuses
to start rather than serve a half-initialised system.
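The fail-closed behaviour can be illustrated with a minimal, standard-library-only sketch. It is not the code in src/api/main.py: `StartupError`, `load_required_file`, `make_lifespan` and the `ontology_schema` attribute are hypothetical names chosen for illustration, and the `app` object is any object exposing a `state` namespace. The point it shows is that an exception raised before `yield` stops the application from ever reaching the serving state.

```python
import asyncio
from contextlib import asynccontextmanager
from pathlib import Path
from types import SimpleNamespace


class StartupError(RuntimeError):
    """Hypothetical error: a required resource could not be loaded at boot."""


def load_required_file(path: Path) -> str:
    """Read a file the app cannot run without; fail closed if it is unreadable."""
    try:
        return path.read_text(encoding="utf-8")
    except OSError as exc:
        raise StartupError(f"required file unavailable: {path}") from exc


def make_lifespan(schema_path: Path):
    """Build a lifespan context manager bound to one schema path (assumed layout)."""

    @asynccontextmanager
    async def lifespan(app):
        # Fail-closed step: if this raises, control never reaches `yield`,
        # so no request is ever served by a half-initialised app.
        app.state.ontology_schema = load_required_file(schema_path)
        try:
            yield
        finally:
            # Shutdown: drop what startup created.
            app.state.ontology_schema = None

    return lifespan
```

With FastAPI, a function of this shape would typically be passed as `FastAPI(lifespan=make_lifespan(path))`; here a `SimpleNamespace(state=SimpleNamespace())` stand-in is enough to exercise it with `asyncio.run`.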
Dependency injection
Repositories are created once during startup and injected into handlers via FastAPI Depends:
# Representative: see src/api/utils.py and the *_router.py modules
@router.get("/companies/{company_id}")
async def get_company(
    company_id: str,
    tenant=Depends(require_tenant),  # 401 if tenant missing
    repo: CompanyRepository = Depends(get_company_repo),
):
    return await repo.get_by_id(company_id)  # RLS scopes to tenant
require_tenant guarantees a tenant context exists; the repository's queries are automatically
scoped by Row-Level Security. See
API → Routers for the full router catalogue and
API → Request lifecycle for how a request flows through the middleware.
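The two dependencies used in the handler can be approximated without FastAPI to show the pattern: a provider that hands out an instance built once at startup, and a guard that rejects the request before any repository call. This is a sketch under assumptions, not the project's implementation: the `x-tenant-id` header, `MissingTenantError`, `init_repositories` and the in-memory `rows` are all hypothetical, and the functions are synchronous for brevity where the real handlers are async.

```python
from dataclasses import dataclass, field


class MissingTenantError(Exception):
    """Hypothetical stand-in for the 401 raised when no tenant context exists."""

    status_code = 401


@dataclass
class CompanyRepository:
    """Toy repository backed by a dict instead of the asyncpg pool."""

    rows: dict = field(default_factory=dict)

    def get_by_id(self, company_id):
        return self.rows.get(company_id)


# Filled once at startup and shared by every request afterwards.
_REPOSITORIES = {}


def init_repositories():
    """Startup hook (assumed): build each repository exactly once."""
    _REPOSITORIES["company"] = CompanyRepository(rows={"c1": {"name": "Acme"}})


def get_company_repo():
    # Provider: returns the shared instance rather than constructing a new one.
    return _REPOSITORIES["company"]


def require_tenant(headers):
    # Guard: fail before any data access if the tenant is missing.
    tenant = headers.get("x-tenant-id")  # header name is an assumption
    if not tenant:
        raise MissingTenantError("tenant context required")
    return tenant


def get_company(company_id, headers):
    """Handler equivalent: resolve both dependencies, then query."""
    require_tenant(headers)
    return get_company_repo().get_by_id(company_id)
```

In the FastAPI version, `Depends(require_tenant)` and `Depends(get_company_repo)` perform the same two calls on the handler's behalf, which keeps handlers free of construction and authorisation logic.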
Error semantics
The API maps domain failures to meaningful HTTP status codes (src/api/error_handlers.py):
| Exception | Status | Meaning |
|---|---|---|
| PluginValidationError | 503 | A plugin is mis-authored; the system fails closed |
| MissingTenantCredentialsError | 424 | A required provider credential is not configured for the tenant |
| HTTPException | 4xx/5xx | Explicit handler errors |
| unhandled | 500 | Unexpected error |
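The table above amounts to a lookup from exception type to status code with a 500 fallback. The sketch below mirrors it using locally defined stand-in classes; the real exceptions live in the codebase, `HTTPException` is FastAPI's, and `status_for` and `STATUS_BY_EXCEPTION` are hypothetical names rather than the contents of src/api/error_handlers.py.

```python
class PluginValidationError(Exception):
    """Stand-in: a plugin failed validation."""


class MissingTenantCredentialsError(Exception):
    """Stand-in: the tenant has no credential for a required provider."""


class HTTPException(Exception):
    """Stand-in for an explicit handler error that carries its own status."""

    def __init__(self, status_code, detail=""):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


# Domain exceptions with a fixed status, as listed in the table.
STATUS_BY_EXCEPTION = {
    PluginValidationError: 503,
    MissingTenantCredentialsError: 424,
}


def status_for(exc):
    """Return the HTTP status an exception maps to."""
    if isinstance(exc, HTTPException):
        return exc.status_code  # explicit errors keep the status they were raised with
    for exc_type, status in STATUS_BY_EXCEPTION.items():
        if isinstance(exc, exc_type):
            return status
    return 500  # anything unrecognised is an unexpected error
```

In a FastAPI application the same mapping is usually expressed by registering one handler per exception type, for example with `app.add_exception_handler`, instead of a single lookup function.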