Feature catalog

An exhaustive, categorised catalog of what TrustRelay Atlas does today, derived from a code-level capability audit. Each area links to its detailed page and to the relevant glossary terms. For capabilities planned but not yet shipped, see the Product roadmap.

How to read this

This catalog is the single index of current capability. Terms in bold-linked form resolve to the glossary; each section header links to the page that explains it in depth. "Today" means present in the codebase now — the roadmap covers what's next.

Ingestion & data providers

Detailed: Ingestion pipelines · Plugins · Source contracts

  • Declarative plugin architecture (plugin.yaml, mapping_spec.yaml, client.py, mapper.py, transforms.py, tests); a mis-authored plugin fails closed (HTTP 503).
  • Two acquisition modes: sync (HTTP inside a Temporal activity) and async/investigation (a long-running Temporal child workflow that runs an agentic loop).
  • Three bundled providers + a reference template: kvk (NL registry, sync), northdata (financials/ownership DE·AT·NL, sync), osint (7 crew modules, async), _example.
  • TranslationRegistry builds a frozen (plugin, source_field) → (entity_type, target_field, confidence) index at boot and hard-fails boot on schema drift / unknown fields / wrong target_schema.
  • ProviderRouter resolves credentials through a five-branch, tenant-scoped chain (unresolved → MissingTenantCredentialsError, HTTP 424) and never reads env vars from a plugin.
  • Multi-provider trust merge (fetch_company_complete): primary / supplementary (parallel) / fallback (sequential) tiers; highest-trust value per field; records a conflict when two providers within a 0.02 trust delta disagree.
  • Provider runs recorded with status + entity counts; a stale_check avoids re-fetching fresh data; per-provider rate limits and timeouts.
  • Source contracts bound which ontology fields each provider may emit (versioned).
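The per-field trust merge listed above can be sketched roughly as follows. This is a minimal illustration and not the actual fetch_company_complete code: the ProviderValue type, the merge_field helper, and the inclusive comparison against the delta are assumptions. Only the highest-trust-wins rule and the 0.02 conflict delta come from the list above.

```python
from dataclasses import dataclass
from typing import Any, Optional

CONFLICT_TRUST_DELTA = 0.02  # from the catalog; treating it as inclusive is an assumption


@dataclass(frozen=True)
class ProviderValue:
    """One provider's value for a single field (hypothetical shape)."""
    provider: str
    value: Any
    trust: float


def merge_field(candidates: list) -> tuple:
    """Return (winner, conflicts) for one field.

    winner is the highest-trust ProviderValue (or None if there are no candidates);
    conflicts lists provider pairs whose trust is within the delta but whose values differ.
    """
    if not candidates:
        return None, []
    ranked = sorted(candidates, key=lambda c: c.trust, reverse=True)
    winner: Optional[ProviderValue] = ranked[0]
    conflicts = []
    for i, a in enumerate(ranked):
        for b in ranked[i + 1:]:
            close = abs(a.trust - b.trust) <= CONFLICT_TRUST_DELTA
            if close and a.value != b.value:
                conflicts.append((a.provider, b.provider))
    return winner, conflicts
```

Under these assumptions, two registries at trust 0.95 and 0.94 that report different postal codes yield a winner (the 0.95 source) plus one recorded conflict, while a 0.70 source that disagrees is simply outranked.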

Ontology & entity resolution

Detailed: Ontology · Entity resolution · Data model

  • Versioned YAML ontology (v3.5.2) loaded into PostgreSQL at startup via SchemaCache; the app fails to boot if the ontology cannot load; a Flyway migration publishes it byte-for-byte, guarded by a CI drift check.
  • 8 entity types (LegalEntity, Person, Address, Document, Domain, SanctionsMatch, PEPExposure, AdverseMedia) and 7 relationship types (Directorship, Ownership, RegisteredAt, DocumentsEntity, OwnsDomain, MatchedTo, MentionedIn).
  • Entity resolution pipeline: blocking → candidate query → similarity scoring → three-way decision (MERGE ≥ 0.95 / NEEDS_REVIEW ≥ 0.60 / CREATE < 0.40).
  • Blocking with type-specific keys (company-name normalisation strips 65+ legal suffixes + diacritics; person normalisation handles Turkish/Polish/Danish characters).
  • Similarity scoring combining Jaro-Winkler, Levenshtein, token-set Jaccard, normalised-name exact-boost; field weights name 0.40 / id 0.30 / attr 0.15 / structural 0.15.
  • Person matcher (is_same_person_v2): DOB-conflict veto, DOB exact-match boost, role-at-company boost, middle-initial conflict blocking, 5 DOB formats, name_only_match audit signal.
  • Address reconciliation: Haversine clustering at 500 m, golden-record selection, never merges addresses across companies.
  • Person de-duplication collapsing near-duplicates.
  • UBO / ownership-chain computation: 25% baseline / 10% strict thresholds, max depth 10, circular-ownership detection.
  • Multi-source disagreement as a first-class concept (ADR-017) with explicit missing-data signals.
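As an illustration of the UBO / ownership-chain item above, the sketch below computes effective ownership by multiplying shares along each chain and summing across chains. That multiplication rule, the edge tuple shape, and the function names are assumptions made for the example. The 25% / 10% thresholds, the depth limit of 10, and the need to detect circular ownership come from the list.

```python
from collections import defaultdict

BASELINE_THRESHOLD = 0.25  # 25% baseline threshold from the catalog
STRICT_THRESHOLD = 0.10    # 10% strict threshold from the catalog
MAX_DEPTH = 10             # maximum chain depth from the catalog


def effective_ownership(edges, target, max_depth=MAX_DEPTH):
    """Sum the products of shares along every acyclic chain that ends at `target`.

    `edges` is a list of (owner, owned, share) tuples with share in 0..1
    (hypothetical input shape). Returns (stakes, cycles).
    """
    owners_of = defaultdict(list)
    for owner, owned, share in edges:
        owners_of[owned].append((owner, share))

    stakes = defaultdict(float)
    cycles = []

    def walk(node, share, path):
        if len(path) > max_depth:
            return  # depth limit reached: do not follow further edges
        for owner, pct in owners_of[node]:
            if owner in path:
                cycles.append(path + [owner])  # circular ownership detected
                continue
            stakes[owner] += share * pct
            walk(owner, share * pct, path + [owner])

    walk(target, 1.0, [target])
    return dict(stakes), cycles


def beneficial_owners(edges, target, threshold=BASELINE_THRESHOLD):
    """Owners whose effective stake meets the threshold.

    Intermediate holding companies are included here; filtering to natural
    persons is left out of this sketch.
    """
    stakes, _ = effective_ownership(edges, target)
    return {owner: stake for owner, stake in stakes.items() if stake >= threshold}
```

Passing threshold=STRICT_THRESHOLD applies the 10% variant. Cycles are reported rather than followed, so a loop in the ownership graph cannot inflate a stake or recurse forever.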

Claims, survivorship & provenance

Detailed: Claims & survivorship · Mutation queue

  • Per-attribute claims model (entity_claims): every attribute can carry multiple provider claims with source, confidence, trust, retrieved_at.
  • Six survivorship strategies: most_trusted, most_recent, most_complete, combine/aggregate, most_specific, canonical.
  • Two-level trust weighting: module trust (cir 10 … dfwo 3) and provider trust (per-field, e.g. NorthData registration_number 0.99, KVK postal_code 0.95, OSINT _default 0.70; unknown → 0.50).
  • resolve_field_value pairwise decision; effective_confidence = (field_confidence + module_trust)/2.
  • Protected fields: PEP & sanctions flags can only be set by authorised screening crews (spepws, amlrr) and never silently overwritten by registries.
  • Deal-breaker LLM consolidation for genuinely conflicting multi-value fields, with a deterministic fallback.
  • recompute_preferred_for_entity flips a single is_preferred flag atomically (orphan-loser guard).
  • Canonical read path (synthesize_entity_attributes) emits CanonicalAttribute(value, source, trust, updated_at, alternative_claims?) with a single-source omission contract; legacy entity_data fallback.
  • Every survivorship decision is a mutation (before/after + provider attribution); a conflict repository raises review tasks on significant changes.
  • Field-level lineage (PropertyLineage) traces each canonical value to the provider + claim that produced it (ADR-020).
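To make the claims model more concrete, here is a minimal sketch of two of the six survivorship strategies (most_trusted and most_recent) together with a single-preferred-flag update. The Claim class, the lookup order (field-level trust, then the provider's _default, then 0.50), the tie-breaking rules, and the recompute_preferred helper are assumptions for illustration. They are not the signatures of resolve_field_value or recompute_preferred_for_entity. The trust values are the ones quoted in the list above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any

# Per-field provider trust quoted in the catalog; the table layout is an assumption.
PROVIDER_TRUST = {
    ("northdata", "registration_number"): 0.99,
    ("kvk", "postal_code"): 0.95,
    ("osint", "_default"): 0.70,
}
UNKNOWN_TRUST = 0.50  # catalog: unknown -> 0.50


def provider_trust(provider: str, field: str) -> float:
    """Field-level trust, else the provider's _default, else 0.50 (assumed order)."""
    default = PROVIDER_TRUST.get((provider, "_default"), UNKNOWN_TRUST)
    return PROVIDER_TRUST.get((provider, field), default)


@dataclass
class Claim:
    """One provider claim for one attribute (hypothetical shape)."""
    source: str
    field: str
    value: Any
    confidence: float
    retrieved_at: datetime
    is_preferred: bool = False

    @property
    def trust(self) -> float:
        return provider_trust(self.source, self.field)


def most_trusted(claims):
    # Ties on trust fall back to the newer claim (an assumption).
    return max(claims, key=lambda c: (c.trust, c.retrieved_at))


def most_recent(claims):
    # Ties on timestamp fall back to the more trusted claim (an assumption).
    return max(claims, key=lambda c: (c.retrieved_at, c.trust))


STRATEGIES = {"most_trusted": most_trusted, "most_recent": most_recent}


def recompute_preferred(claims, strategy="most_trusted"):
    """Mark exactly one claim as preferred and return it."""
    winner = STRATEGIES[strategy](claims)
    for claim in claims:
        claim.is_preferred = claim is winner
    return winner
```

Because the flag is reassigned for every claim in one pass, a claim that previously won cannot stay flagged after losing, which is the property the orphan-loser guard is described as protecting.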

Risk scoring

Detailed: Risk scoring

  • Per-module risk indicators (risk_rules.py) with severity (critical/high/medium/low) across 11 risk categories (sanctions, pep, adverse_media, ownership, governance, corporate_status, jurisdiction, digital_footprint, regulatory, data_quality, secrecy_jurisdiction).
  • Configurable no-code risk-matrix engine: author schema (dimensions, factors, grid) → assign companies → evaluate (risk_matrix_schemas/assignments/evaluations).
  • Three scoring methods: REFERENCE_LOOKUP (frozen reference dataset; max/avg/any_above), BOOLEAN, THRESHOLD_RANGES (numeric bucketing); missing method fails loud.
  • OntologyMatrixMapper resolves factor wire-mappings (e.g. LegalEntity.jurisdiction) from the graph; scores capped at max_score, normalised per-dimension to 0–100.
  • Three aggregation methods: weighted_average, weighted_max (EBA default 0.6·max + 0.4·weighted_avg), highest_dimension.
  • Escalation ratchet (one-way; a sanctions hit forces ≥ high).
  • Deterministic / reproducible scoring: four SHA-256 hashes (input / override / evaluation-fingerprint / output) with idempotent cached results.
  • Frozen reference-data snapshots at publish time so historical scores stay stable.
  • Batch portfolio re-evaluation (BatchReEvaluationWorkflow) with per-company retry/timeout and queryable progress; live-preview + migration heatmap of tier changes (ADR-018); EBA-aligned matrix support (ADR-019).
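The three aggregation methods and the escalation ratchet can be sketched in a few lines. The function signatures, the dict-based inputs, and the tier names are assumptions for illustration. The 0.6·max + 0.4·weighted_avg default and the one-way behaviour of the ratchet are taken from the list above.

```python
TIERS = ["low", "medium", "high", "critical"]  # assumed tier ordering


def weighted_average(scores: dict, weights: dict) -> float:
    """Weight each normalised dimension score (0-100) and divide by total weight."""
    total_weight = sum(weights[dim] for dim in scores)
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


def highest_dimension(scores: dict) -> float:
    """The single worst dimension decides the overall score."""
    return max(scores.values())


def weighted_max(scores: dict, weights: dict, max_share: float = 0.6) -> float:
    """Blend of the worst dimension and the weighted average (0.6 / 0.4 by default)."""
    blended_avg = (1 - max_share) * weighted_average(scores, weights)
    return max_share * highest_dimension(scores) + blended_avg


def ratchet(tier: str, floor: str) -> str:
    """One-way escalation: raise `tier` to at least `floor`, never lower it."""
    return max(tier, floor, key=TIERS.index)
```

With equally weighted dimension scores of 80 and 40, weighted_average gives 60, highest_dimension gives 80, and weighted_max gives 0.6·80 + 0.4·60 = 72. A sanctions hit would then be applied as ratchet(tier, "high"), which lifts "medium" to "high" and leaves "critical" unchanged.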

Investigations & OSINT modules

Detailed: Temporal workflows · What is Atlas

  • Seven specialised OSINT modules run in parallel per investigation: CIR (company info), ROA (address), MEBO (management & UBO), SPEPWS (sanctions/PEP/watchlist), AMLRR (AML risk & adverse media), DFWO (digital footprint), FRLS (financial/regulatory/legal signals).
  • Each module is an agentic crew (think → call-tool → observe) with its own agent YAML, system/user/analysis prompts, model config (OpenRouter), and tool assignments.
  • MCP tool integration: Brightdata web scraping (incl. LinkedIn datasets), Exa semantic search, Tavily, Google Maps, an internal Digital-Footprint server.
  • Investigation lifecycle: focused investigation, summary, rerun, cancel, delete, per-module transcripts, ontology view, logs + log summary, crew-activity, temporal-activity.
  • Module output is bound to the ontology JSON contract (entities, relationships, risk_indicators, summary with data_quality_score) and validated before resolution.
  • Evidence capture (EvidenceRepository, evidence mapper) with source + status; retains up to 200K characters per tool result.
  • Durable execution on Temporal; live progress panels (CrewActivityPanel, TemporalActivityPanel); per-tenant model instantiation (no global LLM key).
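The pre-resolution validation of module output might look something like the sketch below. Only the four top-level keys and the presence of data_quality_score inside summary come from the list above. The container types checked here, the error-message format, and the validate_module_output name are assumptions, and the real contract is presumably richer.

```python
REQUIRED_KEYS = ("entities", "relationships", "risk_indicators", "summary")
LIST_KEYS = ("entities", "relationships", "risk_indicators")  # assumed to be arrays


def validate_module_output(payload: dict) -> list:
    """Return a list of contract violations; an empty list means the payload passes."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    for key in LIST_KEYS:
        if key in payload and not isinstance(payload[key], list):
            errors.append(f"{key} must be a list")
    summary = payload.get("summary")
    if summary is not None:
        if not isinstance(summary, dict):
            errors.append("summary must be an object")
        elif not isinstance(summary.get("data_quality_score"), (int, float)):
            errors.append("summary.data_quality_score must be a number")
    return errors
```

Returning every violation at once, rather than raising on the first, makes it easier to show a crew's full set of contract failures in a transcript or log.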

Knowledge graph

Detailed: Graph sync

  • Dual-store: PostgreSQL system-of-record + Neo4j downstream read projection (Neo4jSyncService).
  • Graph queries: company graph, entity-type schema, visualisation, ownership-chain, shortest path, common connections, centrality.
  • Neo4j traversals: UBOs, ownership chains, connections, full entity view, risk-network, shared addresses, shared directors, address proximity, company stats.
  • Sync orchestration (full / per-investigation / per-company / clean-all) + sync-candidate listing.
  • Parity service auditing PostgreSQL ↔ Neo4j divergence (overall, by-type, per-investigation).
  • Cypher query generation (cypher_generator), AGE client alongside the Neo4j client.
  • Frontend GraphExplorer + cytoscape graph components; RiskNetworkGraph.

Reporting

Detailed: Reporting

  • Server-side report generation from the canonical graph → HTML/PDF via Jinja2 templates (cover, sanctions table, risk grid, key-value grids, ownership tree, section sheets, sparklines).
  • PII sanitisation before output (report_sanitizer.py).
  • Reports stored in MinIO; async generation returns 202 Accepted; useReport polls until ready.
  • ReportView tabs: Overview (risk gauge, critical findings, recommendations), Modules, Evidence, Ontology graph, Lineage (field provenance), Workflow timeline, Raw JSON.
  • Ontology export router producing per-company PDF/DOCX; client-side PDF via @react-pdf/renderer for some views.

Analytics

Detailed: Risk scoring · Frontend

  • Operations analytics endpoint + useOperationsAnalytics; Analytics page.
  • Portfolio risk analytics: PortfolioRiskHeatmap, PortfolioRiskPieChart, PortfolioSpiderChart, RiskByCategory / Jurisdiction / Severity, RiskTimeline, RiskIndicatorTable, RiskOverview, CompanyRiskCard, PortfolioStatusPieChart.
  • Company stats summary, company timeline, portfolio-status tracking with history.
  • Metrics router + metrics_service; quality_scorer for data-quality scoring.

Multi-tenancy & security

Detailed: Security & multi-tenancy

  • PostgreSQL Row-Level Security via a restricted atlas_app role with FORCE ROW LEVEL SECURITY; fail-closed (unset tenant GUC → zero rows); the pool refuses to boot if atlas_app auth fails (unless an emergency flag is set).
  • Keycloak realm-per-tenant OIDC; PlatformAuthMiddleware validates RS256 JWTs (sig/exp/aud/issuer), builds AuthContext, sets app.current_tenant_id per request; 60 s TenantCache.
  • Role-based authorisation (admin, analyst, viewer, workflow_editor); Studio/Settings admin-only; workflow-level RBAC.
  • AES-256-GCM credential encryption with per-tenant HKDF-SHA256 subkeys (cross-tenant decrypt → InvalidTag), versioned key_id for rotation, master key in process memory only.
  • Tenant-scoped credential repository + audit (audit / status / schema routers); rate limiting via slowapi + Redis keyed on {tenant_id}:{ip} → 429.
  • Tenant + role admin + public tenant-resolution routers; request- and background-scoped tenant DB sessions; frontend keycloak-js with PKCE over a same-origin proxy.
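The per-tenant subkey derivation can be illustrated with a standard-library HKDF-SHA256 (RFC 5869). The info-string layout, the empty salt, and the tenant_subkey name are assumptions made for this sketch. The AES-256-GCM encryption step itself is omitted because it is not in the Python standard library; in practice a library such as cryptography provides both HKDF and AESGCM.

```python
import hashlib
import hmac


def hkdf_sha256(master_key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: an empty salt is treated as hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, master_key, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_subkey(master_key: bytes, tenant_id: str, key_id: str) -> bytes:
    """Derive a 32-byte (AES-256) subkey bound to one tenant and one key version.

    The info layout below is hypothetical; any unambiguous encoding would do.
    """
    info = f"tenant:{tenant_id}|key:{key_id}".encode()
    return hkdf_sha256(master_key, info)
```

Because the tenant identifier is part of the derivation input, ciphertext written under one tenant's subkey fails authentication under another tenant's subkey, which is the behaviour the list describes as cross-tenant decrypt → InvalidTag. Including key_id in the input gives each rotation its own subkey.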

Studio & no-code configuration

Detailed: Extending Atlas · Ontology

The Studio lets compliance engineers configure the platform without code:

  • Mappings (MappingDesigner): visual source-field → ontology-field wiring with confidence, merge-strategy badges, conflict shield, drag-to-connect ports.
  • Sources: data-provider + source-contract configuration; provider credentials via plugin JSON-Schema forms.
  • Evaluations: run/preview risk-matrix evaluations and portfolio scoring.
  • Workflows: the low-code workflow authoring surface.
  • Ontology schema management: list/create/get/update/activate/delete schema versions, validate, active + parsed views, version diff/history.
  • Data segments: CRUD, toggle, activate, deactivate-all.
  • Risk-matrix authoring (DimensionEditor, FactorEditor, RiskLevelEditor, AggregationConfig, OntologyMappingEditor, MatrixPreview).
  • Agent/crew configuration: per-agent config, available tools + assignments, agent prompts, model configs, crew-LLM settings, pipeline config; LLM model catalog (sync/local/providers); MCP server registry + health checks.
  • Reference data management: FATF black/grey lists, EU high-risk third countries, EU tax blacklist, secrecy jurisdictions, industry/product risk taxonomies, PEP tiers, UBO thresholds, sanctions defaults.
  • SchemaWizard / builder: generate workflow & ontology schema from documents (capability mapper, requirement extractor, schema generator, semantic validator).

API surface

Detailed: API overview · Routers

  • ~30 FastAPI domain routers behind a same-origin /api proxy: health, data_providers, data_provider_credentials, settings, auth, role_admin, tenant_admin, public_tenant, risk, ontology (+ entities / reconciliation / resolution / company / export), temporal, graph, metrics, analytics, investigation, report, evidence, lineage, company, people, entity_preview, mutation, source_contract, reference_data, risk_matrix, pipeline.
  • Company endpoints (stats summary, bulk create, portfolio-status patch, ontology view, timeline, ownership-chain); dedicated people + entity-preview routers.
  • Async report generation (202 Accepted); Temporal router (workflow start/cancel/history); request sanitisation, rate-limit middleware, user-message localisation, OpenAPI docs.

Frontend (Atlas Console)

Detailed: Frontend

  • React 18 + TypeScript + Vite + Palantir Blueprint + React Query + Zustand.
  • Pages: Dashboard, Companies, CompanyDetailV2, Investigations (list/detail/new/focused), GraphExplorer, OntologyExplorer, MappingDesigner, RiskCenter, RiskMatrices, RiskCategories, Analytics, Reports, TaskInbox, WorkflowBuilder/Execution/Schemas, Studio, SchemaWizard, ApiDocs, Documentation.
  • Live investigation panels (CrewActivityPanel, TemporalActivityPanel); report lineage/provenance/ontology/conflict/evidence/timeline views.
  • Workflow engine UX (low-code, feature-flagged): PhaseTimeline, PhaseInteraction, ReviewPanel, PortalForm, DocumentUploader, SlaCountdown, DraftStatusIndicator, DegradedModeBanner, TaskRow; visual + YAML builder with preview/publish; draft auto-save.

Compliance & auditability

Detailed: Compliance coverage

  • KYB/AML positioning across sanctions, PEP, adverse-media, UBO/beneficial-ownership, secrecy-jurisdiction handling.
  • EBA-aligned risk-matrix engine + EU/FATF reference datasets.
  • Auditability by design: claims + survivorship + mutations + conflicts + field-level lineage = defensible canonical values; protected screening fields; deterministic, hash-fingerprinted evaluations with frozen reference snapshots; AI-governance / model-lineage provenance (ADR-014).

Not yet in Atlas

The following high-demand capabilities are not in Atlas today — they are tracked, with build difficulty and phasing, on the Product roadmap: managed case lifecycle (queues / maker-checker / disposition), a customer-facing onboarding portal, perpetual-KYC continuous re-screening & change detection, a deterministic licensed-list screening engine, goAML STR/SAR filing, an AI copilot, field-level PII / GDPR DSR tooling, and a bitemporal graph. Real-time transaction monitoring is deliberately out of scope (off-thesis — Atlas is the resolved-ownership system of record that feeds a monitoring product).