
Backend architecture

The backend is a FastAPI application under src/. It is organised by responsibility into subpackages that map closely onto the planes described in the system overview.

Package map

Subpackages

| Package | Responsibility | Key files |
| --- | --- | --- |
| api/ | HTTP layer: app setup, middleware, 26 routers | main.py, auth.py, *_router.py |
| database/ | asyncpg pool + ~21 repositories | __init__.py, entity_repository.py, repositories.py |
| ontology/ | Entity resolution, reconciliation, canonical synthesis | entity_resolution.py, reconciliation.py, survivorship.py, schema_loader.py |
| integrations/ | Plugin loading, provider routing, credentials | plugin_loader.py, provider_router.py, translation_registry.py |
| graph/ | Neo4j/AGE clients and SQL→graph sync | neo4j_client.py, neo4j_sync.py, parity_service.py |
| temporal/ | Durable workflows + activities | workflows.py, activities.py, client.py, worker.py |
| mutation_queue/ | Mutation & conflict provenance | repository.py, conflict_repository.py, router.py |
| risk_matrix/ | Configurable risk evaluation engine | scorer.py, router.py, batch_workflow.py |
| risk_scoring.py | Legacy module-level scoring | |
| reference_data/ | Versioned reference datasets | repository.py, resolver.py, seed.py |
| source_contracts/ | Declarative provider field contracts | router.py |
| ontology_schema/ | Schema versioning + field configs | version_manager.py, router.py |
| pipelines/ | LangGraph LLM pipelines + model factory | model_factory.py, mcp_client.py |
| tools/ | MCP tool adapters for crews | mcp_tools.py |
| services/ | Report generation, MinIO, geocoding | report_generator.py, minio_client.py |
| security/ | Credential encryption (AES-GCM) | credential_encryption.py |
| observability/ | Langfuse tracing lifecycle | |
| settings/ | Pipeline config, MCP status, prompts | pipeline_config.py |
| models/ | Pydantic data models | investigation.py, report.py |
| events/ | Domain events | domain_events/entity.py |
| workflows/ | Experimental low-code workflow engine | router.py, engine/, builder/ |

Application startup

The FastAPI lifespan in src/api/main.py wires the system together at boot. Several steps intentionally fail closed: if the ontology schema or OSINT files cannot be loaded, the app refuses to start rather than serve a half-initialised system.
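As a rough illustration of the fail-closed idea, the sketch below shows a lifespan-style async context manager that raises before `yield` when a required resource is missing. It is not the project's actual startup code: `StartupError`, `load_ontology_schema`, `boot` and the `schema_path` attribute are hypothetical names, and a plain namespace stands in for the FastAPI app so the example runs with the standard library alone.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


class StartupError(RuntimeError):
    """Hypothetical error raised when a required resource cannot load."""


def load_ontology_schema(path: str) -> dict:
    # Hypothetical loader: a real one would parse and validate schema files.
    if not path:
        raise StartupError("ontology schema path is not configured")
    return {"path": path, "entities": ["company", "person"]}


@asynccontextmanager
async def lifespan(app):
    # Fail closed: an exception raised before `yield` aborts startup,
    # so the application never begins serving requests.
    app.state.ontology = load_ontology_schema(app.state.schema_path)
    yield
    # Shutdown: release whatever startup acquired.
    app.state.ontology = None


async def boot(app) -> dict:
    # Drives the lifespan the way an ASGI server would at boot.
    async with lifespan(app):
        return app.state.ontology


if __name__ == "__main__":
    app = SimpleNamespace(state=SimpleNamespace(schema_path="schemas/ontology.yaml"))
    print(asyncio.run(boot(app)))
```

In a real application the same function would be passed as `FastAPI(lifespan=lifespan)`; an exception raised during the startup half prevents the server from accepting traffic.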

Dependency injection

Repositories are created once during startup and injected into handlers via FastAPI Depends:

# Representative; see src/api/utils.py and the *_router.py modules.
# require_tenant, get_company_repo and CompanyRepository are defined elsewhere in the project.
from fastapi import APIRouter, Depends

router = APIRouter()


@router.get("/companies/{company_id}")
async def get_company(
    company_id: str,
    tenant=Depends(require_tenant),  # 401 if tenant missing
    repo: CompanyRepository = Depends(get_company_repo),
):
    return await repo.get_by_id(company_id)  # RLS scopes to tenant

require_tenant guarantees a tenant context exists; the repository's queries are automatically scoped by Row-Level Security. See API → Routers for the full router catalogue and API → Request lifecycle for how a request flows through the middleware.
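The "created once, injected everywhere" pattern can be sketched as follows. This is an assumption about how such wiring commonly looks, not a copy of src/api/utils.py: `init_repositories`, the in-memory `CompanyRepository` and `demo` are hypothetical, and namespaces stand in for the FastAPI app and request so the example needs only the standard library.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Optional


class CompanyRepository:
    """Stand-in repository; a real one would query the asyncpg pool."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    async def get_by_id(self, company_id: str) -> Optional[dict]:
        return self._rows.get(company_id)


def init_repositories(app: Any) -> None:
    # Hypothetical startup hook: build each repository exactly once
    # and keep it on the application state.
    app.state.company_repo = CompanyRepository({"c1": {"id": "c1", "name": "Acme"}})


def get_company_repo(request: Any) -> CompanyRepository:
    # Provider meant to be used as Depends(get_company_repo). In FastAPI,
    # `request` would be annotated as fastapi.Request so it gets injected.
    return request.app.state.company_repo


async def demo() -> Optional[dict]:
    app = SimpleNamespace(state=SimpleNamespace())
    init_repositories(app)
    request = SimpleNamespace(app=app)  # mimics Request.app
    return await get_company_repo(request).get_by_id("c1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the provider only reads from application state, every request receives the same repository instance, and tests can substitute a fake by assigning a different object to `app.state.company_repo`.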

Error semantics

The API maps domain failures to meaningful HTTP status codes (src/api/error_handlers.py):

| Exception | Status | Meaning |
| --- | --- | --- |
| PluginValidationError | 503 | A plugin is mis-authored; the system fails closed |
| MissingTenantCredentialsError | 424 | A required provider credential is not configured for the tenant |
| HTTPException | 4xx/5xx | Explicit handler errors |
| unhandled | 500 | Unexpected error |
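The mapping above can be expressed as a small lookup. The sketch below is illustrative only and does not reproduce src/api/error_handlers.py: the two exception classes are local stand-ins for the project's own, `STATUS_BY_EXCEPTION`, `status_for` and `to_response` are hypothetical names, and HTTPException is left out because it already carries its own status code.

```python
from typing import Tuple


class PluginValidationError(Exception):
    """Stand-in for the project's exception of the same name."""


class MissingTenantCredentialsError(Exception):
    """Stand-in for the project's exception of the same name."""


# Domain exception -> HTTP status, following the documented mapping.
STATUS_BY_EXCEPTION = {
    PluginValidationError: 503,
    MissingTenantCredentialsError: 424,
}


def status_for(exc: Exception) -> int:
    """Return the HTTP status for an exception; unknown errors become 500."""
    for exc_type, status in STATUS_BY_EXCEPTION.items():
        if isinstance(exc, exc_type):
            return status
    return 500


def to_response(exc: Exception) -> Tuple[int, dict]:
    """Build a (status, body) pair without leaking internals on a 500."""
    status = status_for(exc)
    detail = str(exc) if status != 500 else "Internal server error"
    return status, {"detail": detail}
```

In a FastAPI app, each entry would typically be registered with `app.add_exception_handler`, with the handler turning the result of `to_response` into a JSON response.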