Feature catalog
An exhaustive, categorised catalog of what TrustRelay Atlas does today, derived from a code-level capability audit. Each area links to its detailed page and to the relevant glossary terms. For capabilities planned but not yet shipped, see the Product roadmap.
Ingestion & data providers
Detailed: Ingestion pipelines · Plugins · Source contracts
- Declarative plugin architecture (plugin.yaml, mapping_spec.yaml, client.py, mapper.py, transforms.py, tests); a mis-authored plugin fails closed (HTTP 503).
- Two acquisition modes: sync (HTTP inside a Temporal activity) and async/investigation (a long-running Temporal child workflow agentic loop).
- Three bundled providers + a reference template: kvk (NL registry, sync), northdata (financials/ownership DE·AT·NL, sync), osint (7 crew modules, async), _example.
- TranslationRegistry builds a frozen (plugin, source_field) → (entity_type, target_field, confidence) index at boot and hard-fails boot on schema drift / unknown fields / wrong target_schema.
- ProviderRouter five-branch tenant-scoped credential resolution chain (→ MissingTenantCredentialsError, HTTP 424); never reads env vars from a plugin.
- Multi-provider trust merge (fetch_company_complete): primary / supplementary (parallel) / fallback (sequential) tiers; highest-trust value per field; records a conflict when two providers within a 0.02 trust delta disagree.
- Provider runs recorded with status + entity counts; a stale_check avoids re-fetching fresh data; per-provider rate limits and timeouts.
- Source contracts bound which ontology fields each provider may emit (versioned).
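The trust-merge rule (highest-trust value per field, with a conflict recorded when near-equal providers disagree) can be sketched as follows. This is a minimal illustration, not the fetch_company_complete implementation: FieldClaim and merge_field are hypothetical names, and treating the 0.02 delta as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Any

# From the catalog: providers within this trust delta that disagree raise a conflict.
CONFLICT_TRUST_DELTA = 0.02


@dataclass(frozen=True)
class FieldClaim:
    """One provider's value for a single field (hypothetical shape)."""
    provider: str
    value: Any
    trust: float


def merge_field(claims: list[FieldClaim]) -> tuple[FieldClaim, list[FieldClaim]]:
    """Return the highest-trust claim plus any claims that conflict with it."""
    if not claims:
        raise ValueError("at least one claim is required")
    winner = max(claims, key=lambda c: c.trust)
    # A conflict is a different value from a provider whose trust is close
    # to the winner's (assumed inclusive comparison).
    conflicts = [
        c for c in claims
        if c is not winner
        and c.value != winner.value
        and winner.trust - c.trust <= CONFLICT_TRUST_DELTA
    ]
    return winner, conflicts


claims = [
    FieldClaim("kvk", "Acme B.V.", 0.95),
    FieldClaim("northdata", "Acme BV", 0.94),
    FieldClaim("osint", "ACME", 0.70),
]
winner, conflicts = merge_field(claims)
print(winner.provider, [c.provider for c in conflicts])
```

In this sketch the osint value differs too, but its trust is far below the winner's, so it loses without being recorded as a conflict.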
Ontology & entity resolution
Detailed: Ontology · Entity resolution · Data model
- Versioned YAML ontology (v3.5.2) loaded into PostgreSQL at startup via SchemaCache; app fails to boot if it cannot load; a Flyway migration publishes it byte-equal with a CI drift check.
- 8 entity types (LegalEntity, Person, Address, Document, Domain, SanctionsMatch, PEPExposure, AdverseMedia) and 7 relationship types (Directorship, Ownership, RegisteredAt, DocumentsEntity, OwnsDomain, MatchedTo, MentionedIn).
- Entity resolution pipeline: blocking → candidate query → similarity scoring → three-way decision (MERGE ≥ 0.95 / NEEDS_REVIEW ≥ 0.60 / CREATE < 0.40).
- Blocking with type-specific keys (company-name normalisation strips 65+ legal suffixes + diacritics; person normalisation handles Turkish/Polish/Danish characters).
- Similarity scoring combining Jaro-Winkler, Levenshtein, token-set Jaccard, normalised-name exact-boost; field weights name 0.40 / id 0.30 / attr 0.15 / structural 0.15.
- Person matcher (is_same_person_v2): DOB-conflict veto, DOB exact-match boost, role-at-company boost, middle-initial conflict blocking, 5 DOB formats, name_only_match audit signal.
- Address reconciliation: Haversine clustering at 500 m, golden-record selection, never merges addresses across companies.
- Person de-duplication collapsing near-duplicates.
- UBO / ownership-chain computation: 25% baseline / 10% strict thresholds, max depth 10, circular-ownership detection.
- Multi-source disagreement as a first-class concept (ADR-017) with explicit missing-data signals.
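The field weights in the similarity-scoring step (name 0.40 / id 0.30 / attr 0.15 / structural 0.15) combine per-component similarities into one score. A minimal sketch, assuming each component is already a similarity in [0, 1] and that all four components are required; combined_similarity is a hypothetical name, and the string metrics that produce the name component are out of scope here.

```python
# Field weights as listed in the catalog; they sum to 1.0.
FIELD_WEIGHTS = {"name": 0.40, "id": 0.30, "attr": 0.15, "structural": 0.15}


def combined_similarity(components: dict[str, float]) -> float:
    """Weighted sum of per-component similarities, each expected in [0, 1]."""
    missing = FIELD_WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing similarity components: {sorted(missing)}")
    for name in FIELD_WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} similarity must be within [0, 1]")
    return sum(weight * components[name] for name, weight in FIELD_WEIGHTS.items())


score = combined_similarity(
    {"name": 0.9, "id": 1.0, "attr": 0.5, "structural": 0.0}
)
print(round(score, 3))
```

With these inputs the score comes to about 0.735, which the pipeline would then compare against its decision thresholds.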
Claims, survivorship & provenance
Detailed: Claims & survivorship · Mutation queue
- Per-attribute claims model (entity_claims): every attribute can carry multiple provider claims with source, confidence, trust, retrieved_at.
- Six survivorship strategies: most_trusted, most_recent, most_complete, combine/aggregate, most_specific, canonical.
- Two-level trust weighting: module trust (cir10 …dfwo3) and provider trust (per-field, e.g. NorthData registration_number 0.99, KVK postal_code 0.95, OSINT _default 0.70; unknown → 0.50).
- resolve_field_value pairwise decision; effective_confidence = (field_confidence + module_trust)/2.
- Protected fields: PEP & sanctions flags can only be set by authorised screening crews (spepws, amlrr) and never silently overwritten by registries.
- Deal-breaker LLM consolidation for genuinely conflicting multi-value fields, with a deterministic fallback.
- recompute_preferred_for_entity flips a single is_preferred flag atomically (orphan-loser guard).
- Canonical read path (synthesize_entity_attributes) emits CanonicalAttribute(value, source, trust, updated_at, alternative_claims?) with a single-source omission contract; legacy entity_data fallback.
- Every survivorship decision is a mutation (before/after + provider attribution); a conflict repository raises review tasks on significant changes.
- Field-level lineage (PropertyLineage) traces each canonical value to the provider + claim that produced it (ADR-020).
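The two-level trust weighting can be illustrated with a small lookup plus the effective_confidence formula quoted above. This is a sketch under assumptions: the table layout, the lowercase provider keys, and the provider_trust helper are hypothetical, and "unknown → 0.50" is read here as the fallback when neither the field nor a provider-level _default is listed.

```python
# Example values taken from the catalog; the dict layout is an assumption.
PROVIDER_TRUST = {
    "northdata": {"registration_number": 0.99},
    "kvk": {"postal_code": 0.95},
    "osint": {"_default": 0.70},
}
UNKNOWN_TRUST = 0.50


def provider_trust(provider: str, field: str) -> float:
    """Per-field trust, then the provider's _default, then the unknown fallback."""
    table = PROVIDER_TRUST.get(provider, {})
    return table.get(field, table.get("_default", UNKNOWN_TRUST))


def effective_confidence(field_confidence: float, module_trust: float) -> float:
    """Formula from the catalog: the mean of field confidence and module trust."""
    return (field_confidence + module_trust) / 2


print(provider_trust("northdata", "registration_number"))
print(provider_trust("osint", "website"))
print(provider_trust("unlisted_provider", "name"))
print(effective_confidence(0.9, 0.7))
```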
Risk scoring
Detailed: Risk scoring
- Per-module risk indicators (risk_rules.py) with severity (critical/high/medium/low) across 11 risk categories (sanctions, pep, adverse_media, ownership, governance, corporate_status, jurisdiction, digital_footprint, regulatory, data_quality, secrecy_jurisdiction).
- Configurable no-code risk-matrix engine: author schema (dimensions, factors, grid) → assign companies → evaluate (risk_matrix_schemas/assignments/evaluations).
- Three scoring methods: REFERENCE_LOOKUP (frozen reference dataset; max/avg/any_above), BOOLEAN, THRESHOLD_RANGES (numeric bucketing); missing method fails loud.
- OntologyMatrixMapper resolves factor wire-mappings (e.g. LegalEntity.jurisdiction) from the graph; scores capped at max_score, normalised per-dimension to 0–100.
- Three aggregation methods: weighted_average, weighted_max (EBA default 0.6·max + 0.4·weighted_avg), highest_dimension.
- Escalation ratchet (one-way; a sanctions hit forces ≥ high).
- Deterministic / reproducible scoring: four SHA-256 hashes (input / override / evaluation-fingerprint / output) with idempotent cached results.
- Frozen reference-data snapshots at publish time so historical scores stay stable.
- Batch portfolio re-evaluation (BatchReEvaluationWorkflow) with per-company retry/timeout and queryable progress; live-preview + migration heatmap of tier changes (ADR-018); EBA-aligned matrix support (ADR-019).
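The three aggregation methods can be sketched over per-dimension scores that are already normalised to 0–100. The function signatures, the dimension names, and the max_share parameter are assumptions for illustration; only the 0.6·max + 0.4·weighted_avg default comes from the catalog.

```python
def weighted_average(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weight-normalised mean of the dimension scores."""
    total = sum(weights[d] for d in scores)
    if total <= 0:
        raise ValueError("dimension weights must sum to a positive value")
    return sum(scores[d] * weights[d] for d in scores) / total


def weighted_max(
    scores: dict[str, float], weights: dict[str, float], max_share: float = 0.6
) -> float:
    """Blend of the worst dimension and the weighted average (default 0.6 / 0.4)."""
    return max_share * max(scores.values()) + (1 - max_share) * weighted_average(
        scores, weights
    )


def highest_dimension(scores: dict[str, float]) -> float:
    """The single worst dimension decides the overall score."""
    return max(scores.values())


# Hypothetical dimensions with equal weights.
scores = {"geography": 80.0, "customer": 40.0}
weights = {"geography": 1.0, "customer": 1.0}
print(weighted_average(scores, weights))
print(round(weighted_max(scores, weights), 2))
print(highest_dimension(scores))
```

For this input the weighted average is 60, the blended weighted_max is about 72, and highest_dimension is 80, which shows how the blend keeps one severe dimension from being averaged away.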
Investigations & OSINT modules
Detailed: Temporal workflows · What is Atlas
- Seven specialised OSINT modules run in parallel per investigation: CIR (company info), ROA (address), MEBO (management & UBO), SPEPWS (sanctions/PEP/watchlist), AMLRR (AML risk & adverse media), DFWO (digital footprint), FRLS (financial/regulatory/legal signals).
- Each module is an agentic crew (think → call-tool → observe) with its own agent YAML, system/user/analysis prompts, model config (OpenRouter), and tool assignments.
- MCP tool integration: Brightdata web scraping (incl. LinkedIn datasets), Exa semantic search, Tavily, Google Maps, an internal Digital-Footprint server.
- Investigation lifecycle: focused investigation, summary, rerun, cancel, delete, per-module transcripts, ontology view, logs + log summary, crew-activity, temporal-activity.
- Module output is bound to the ontology JSON contract (entities, relationships, risk_indicators, summary with data_quality_score) and validated before resolution.
- Evidence capture (EvidenceRepository, evidence mapper) with source + status; retention up to 200K chars/tool-result.
- Durable execution on Temporal; live progress panels (CrewActivityPanel, TemporalActivityPanel); per-tenant model instantiation (no global LLM key).
Knowledge graph
Detailed: Graph sync
- Dual-store: PostgreSQL system-of-record + Neo4j downstream read projection (Neo4jSyncService).
- Graph queries: company graph, entity-type schema, visualisation, ownership-chain, shortest path, common connections, centrality.
- Neo4j traversals: UBOs, ownership chains, connections, full entity view, risk-network, shared addresses, shared directors, address proximity, company stats.
- Sync orchestration (full / per-investigation / per-company / clean-all) + sync-candidate listing.
- Parity service auditing PostgreSQL ↔ Neo4j divergence (overall, by-type, per-investigation).
- Cypher query generation (cypher_generator), AGE client alongside the Neo4j client.
- Frontend GraphExplorer + cytoscape graph components; RiskNetworkGraph.
Reporting
Detailed: Reporting
- Server-side report generation from the canonical graph → HTML/PDF via Jinja2 templates (cover, sanctions table, risk grid, key-value grids, ownership tree, section sheets, sparklines).
- PII sanitisation before output (report_sanitizer.py).
- Reports stored in MinIO; async generation returns 202 Accepted; useReport polls until ready.
- ReportView tabs: Overview (risk gauge, critical findings, recommendations), Modules, Evidence, Ontology graph, Lineage (field provenance), Workflow timeline, Raw JSON.
- Ontology export router producing per-company PDF/DOCX; client-side PDF via @react-pdf/renderer for some views.
Analytics
Detailed: Risk scoring · Frontend
- Operations analytics endpoint + useOperationsAnalytics; Analytics page.
- Portfolio risk analytics: PortfolioRiskHeatmap, PortfolioRiskPieChart, PortfolioSpiderChart, RiskByCategory / Jurisdiction / Severity, RiskTimeline, RiskIndicatorTable, RiskOverview, CompanyRiskCard, PortfolioStatusPieChart.
- Company stats summary, company timeline, portfolio-status tracking with history.
- Metrics router + metrics_service; quality_scorer for data-quality scoring.
Multi-tenancy & security
Detailed: Security & multi-tenancy
- PostgreSQL Row-Level Security via a restricted atlas_app role with FORCE ROW LEVEL SECURITY; fail-closed (unset tenant GUC → zero rows); the pool refuses to boot if atlas_app auth fails (unless an emergency flag is set).
- Keycloak realm-per-tenant OIDC; PlatformAuthMiddleware validates RS256 JWTs (sig/exp/aud/issuer), builds AuthContext, sets app.current_tenant_id per request; 60 s TenantCache.
- Role-based authorisation (admin, analyst, viewer, workflow_editor); Studio/Settings admin-only; workflow-level RBAC.
- AES-256-GCM credential encryption with per-tenant HKDF-SHA256 subkeys (cross-tenant decrypt → InvalidTag), versioned key_id for rotation, master key in process memory only.
- Tenant-scoped credential repository + audit (audit / status / schema routers); rate limiting via slowapi + Redis keyed on {tenant_id}:{ip} → 429.
- Tenant + role admin + public tenant-resolution routers; request- and background-scoped tenant DB sessions; frontend keycloak-js with PKCE over a same-origin proxy.
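The per-tenant subkey idea behind the credential encryption can be sketched with a standard-library HKDF-SHA256 (RFC 5869). This is an illustration only: the info-string layout, the empty salt, and the tenant_subkey helper are assumptions, and the AES-256-GCM step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand with HMAC-SHA256, as specified in RFC 5869."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key material is too long")
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869: an absent salt is HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_subkey(master_key: bytes, tenant_id: str, key_id: str) -> bytes:
    """Derive a 32-byte AES-256 key bound to one tenant and one key version."""
    # Hypothetical info layout; binding tenant_id and key_id separates the keys.
    info = f"atlas-credentials:{key_id}:{tenant_id}".encode()
    return hkdf_sha256(master_key, salt=b"", info=info)


master = bytes(range(32))  # demo value only, never a real master key
key_a = tenant_subkey(master, "tenant-a", "v1")
key_b = tenant_subkey(master, "tenant-b", "v1")
print(len(key_a), key_a != key_b)
```

Because each tenant gets a different key, a ciphertext sealed for one tenant fails AES-GCM tag verification under another tenant's subkey, which is the behaviour the InvalidTag note in the list describes.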
Studio & no-code configuration
Detailed: Extending Atlas · Ontology
The Studio lets compliance engineers configure the platform without code:
- Mappings (MappingDesigner): visual source-field → ontology-field wiring with confidence, merge-strategy badges, conflict shield, drag-to-connect ports.
- Sources: data-provider + source-contract configuration; provider credentials via plugin JSON-Schema forms.
- Evaluations: run/preview risk-matrix evaluations and portfolio scoring.
- Workflows: the low-code workflow authoring surface.
- Ontology schema management: list/create/get/update/activate/delete schema versions, validate, active + parsed views, version diff/history.
- Data segments: CRUD, toggle, activate, deactivate-all.
- Risk-matrix authoring (DimensionEditor, FactorEditor, RiskLevelEditor, AggregationConfig, OntologyMappingEditor, MatrixPreview).
- Agent/crew configuration: per-agent config, available tools + assignments, agent prompts, model configs, crew-LLM settings, pipeline config; LLM model catalog (sync/local/providers); MCP server registry + health checks.
- Reference data management: FATF black/grey lists, EU high-risk third countries, EU tax blacklist, secrecy jurisdictions, industry/product risk taxonomies, PEP tiers, UBO thresholds, sanctions defaults.
- SchemaWizard / builder: generate workflow & ontology schema from documents (capability mapper, requirement extractor, schema generator, semantic validator).
API surface
Detailed: API overview · Routers
- ~30 FastAPI domain routers behind a same-origin /api proxy: health, data_providers, data_provider_credentials, settings, auth, role_admin, tenant_admin, public_tenant, risk, ontology (+ entities / reconciliation / resolution / company / export), temporal, graph, metrics, analytics, investigation, report, evidence, lineage, company, people, entity_preview, mutation, source_contract, reference_data, risk_matrix, pipeline.
- Company endpoints (stats summary, bulk create, portfolio-status patch, ontology view, timeline, ownership-chain); dedicated people + entity-preview routers.
- Async report generation (202 Accepted); Temporal router (workflow start/cancel/history); request sanitisation, rate-limit middleware, user-message localisation, OpenAPI docs.
Frontend (Atlas Console)
Detailed: Frontend
- React 18 + TypeScript + Vite + Palantir Blueprint + React Query + Zustand.
- Pages: Dashboard, Companies, CompanyDetailV2, Investigations (list/detail/new/focused), GraphExplorer, OntologyExplorer, MappingDesigner, RiskCenter, RiskMatrices, RiskCategories, Analytics, Reports, TaskInbox, WorkflowBuilder/Execution/Schemas, Studio, SchemaWizard, ApiDocs, Documentation.
- Live investigation panels (CrewActivityPanel, TemporalActivityPanel); report lineage/provenance/ontology/conflict/evidence/timeline views.
- Workflow engine UX (low-code, feature-flagged): PhaseTimeline, PhaseInteraction, ReviewPanel, PortalForm, DocumentUploader, SlaCountdown, DraftStatusIndicator, DegradedModeBanner, TaskRow; visual + YAML builder with preview/publish; draft auto-save.
Compliance & auditability
Detailed: Compliance coverage
- KYB/AML positioning across sanctions, PEP, adverse-media, UBO/beneficial-ownership, secrecy-jurisdiction handling.
- EBA-aligned risk-matrix engine + EU/FATF reference datasets.
- Auditability by design: claims + survivorship + mutations + conflicts + field-level lineage = defensible canonical values; protected screening fields; deterministic, hash-fingerprinted evaluations with frozen reference snapshots; AI-governance / model-lineage provenance (ADR-014).
Not yet in Atlas
The following high-demand capabilities are not in Atlas today — they are tracked, with build difficulty and phasing, on the Product roadmap: managed case lifecycle (queues / maker-checker / disposition), a customer-facing onboarding portal, perpetual-KYC continuous re-screening & change detection, a deterministic licensed-list screening engine, goAML STR/SAR filing, an AI copilot, field-level PII / GDPR DSR tooling, and a bitemporal graph. Real-time transaction monitoring is deliberately out of scope (off-thesis — Atlas is the resolved-ownership system of record that feeds a monitoring product).