Add an OSINT module (crew)

An OSINT module is an LLM-driven crew that the investigation workflow runs in parallel with the other modules. Adding one means registering a crew slug, writing the agent and its prompts, and declaring how its findings map into the ontology. Read Temporal workflows and Plugins first.

How a module runs

Each module is an activity inside InvestigationWorkflow (src/temporal/workflows.py). The seven modules are dispatched fully in parallel, each as its own execute_module activity with a 20-minute timeout and up to 3 attempts with exponential backoff between them. Their findings are then reconciled. Your new module joins that fan-out.
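The fan-out semantics described above can be illustrated without Temporal. The following is a minimal sketch in plain asyncio, not the workflow's actual code: run_with_retry, fan_out, and base_delay_s are hypothetical names, the initial backoff delay is an assumption (the source does not state it), and the timeout is assumed to apply per attempt.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable

ModuleFn = Callable[[], Awaitable[dict]]


async def run_with_retry(
    fn: ModuleFn,
    timeout_s: float = 20 * 60,  # 20-minute timeout, assumed to be per attempt
    max_attempts: int = 3,       # 3 attempts, as stated in the text
    base_delay_s: float = 1.0,   # assumed initial backoff; not given in the source
) -> dict:
    """Run one module with a timeout and exponential backoff between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(fn(), timeout=timeout_s)
        except Exception:
            if attempt == max_attempts:
                raise
            # Delay doubles after each failed attempt: base, 2*base, 4*base, ...
            await asyncio.sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


async def fan_out(
    modules: dict[str, ModuleFn], **retry_kwargs: float
) -> dict[str, dict | BaseException]:
    """Dispatch every module concurrently; one failure does not cancel the rest."""
    results = await asyncio.gather(
        *(run_with_retry(fn, **retry_kwargs) for fn in modules.values()),
        return_exceptions=True,
    )
    return dict(zip(modules.keys(), results))
```

With return_exceptions=True, a module that exhausts its attempts shows up as an exception object in the result map while the other modules' findings remain available for reconciliation.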

1. Register the crew slug

Add your slug to the OSINT crew set in src/integrations/plugin_loader.py (the _OSINT_CREW_SLUGS frozenset). This is what tells the loader your crew is a valid OSINT module.

2. Create the agent + prompts

plugins/osint/agents/<crew_name>/
├── __init__.py
├── agent.py          # the LangGraph agentic loop for this module
└── prompts/
    ├── system.md     # the crew's instructions
    └── tools.md      # tool-use guidance

The agent runs a think → call-tool → observe loop. Each tool result is truncated to MAX_LLM_TOOL_RESULT_CHARS (12,000 characters) before it enters the model's context, while a much longer copy, capped at MAX_EVIDENCE_TOOL_RESULT_CHARS (200,000 characters), is saved as evidence.
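The two limits can be pictured as two views of the same raw tool output. This is a minimal sketch, assuming simple prefix truncation; the constant names and values come from the text, while ToolObservation and split_tool_result are hypothetical names and the real agent may truncate differently (for example with a marker or by tokens).

```python
from dataclasses import dataclass

MAX_LLM_TOOL_RESULT_CHARS = 12_000        # value from the text
MAX_EVIDENCE_TOOL_RESULT_CHARS = 200_000  # value from the text


@dataclass(frozen=True)
class ToolObservation:
    """Hypothetical container for the two views of one tool result."""

    for_model: str     # short view that goes into the LLM context
    for_evidence: str  # long view that is persisted as evidence


def split_tool_result(raw: str) -> ToolObservation:
    """Cut one raw tool result down to the model view and the evidence view."""
    return ToolObservation(
        for_model=raw[:MAX_LLM_TOOL_RESULT_CHARS],
        for_evidence=raw[:MAX_EVIDENCE_TOOL_RESULT_CHARS],
    )
```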

3. Declare findings → ontology mapping

In plugins/osint/mapping_spec.yaml, add a crew_extractions.<slug> block describing the entities, relationships, and risk indicators your crew produces:

crew_extractions:
  <slug>:
    entity_mappings:
      LegalEntity:
        source: "$"
        identity: { registration_number: "$.company.registration_number" }
        attributes:
          legal_name: { source: "$.company.legal_name", required: true }
        nested_extractions: []
    relationship_mappings: {}

risk_extractions:
  - crew_type: "<slug>"
    source_array_field: "$.risk_indicators[*]"
    field_mappings:
      severity: { source: "$.severity", transform: "to_severity" }

The TranslationRegistry walks these crew_extractions at boot and validates every mapped field against the active ontology — so a typo or a non-existent field fails the boot, not a production run.
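To make the fail-at-boot behavior concrete, here is a sketch of that kind of check over an already-parsed mapping spec. It is not the TranslationRegistry's code: validate_crew_extractions, MappingValidationError, and the ontology shape (entity type mapped to a set of field names) are all assumptions for illustration.

```python
class MappingValidationError(ValueError):
    """Hypothetical error raised when a mapping references an unknown field."""


def validate_crew_extractions(spec: dict, ontology: dict[str, set[str]]) -> None:
    """Check every mapped entity type and field against the ontology.

    `spec` is the parsed mapping file; `ontology` maps each entity type to the
    set of field names it defines (an assumed, simplified shape).
    """
    errors: list[str] = []
    for slug, crew in spec.get("crew_extractions", {}).items():
        for entity_type, mapping in crew.get("entity_mappings", {}).items():
            known = ontology.get(entity_type)
            if known is None:
                errors.append(f"{slug}: unknown entity type {entity_type!r}")
                continue
            fields = list(mapping.get("identity", {})) + list(mapping.get("attributes", {}))
            for field in fields:
                if field not in known:
                    errors.append(f"{slug}.{entity_type}: unknown field {field!r}")
    if errors:
        # Raising here is what turns a typo into a boot failure.
        raise MappingValidationError("; ".join(errors))
```

Collecting all errors before raising means a single boot attempt reports every bad field at once rather than one per restart.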

4. Register extractions in the ontology registry

Update src/ontology/registry.py with the CrewExtraction / RelationshipExtraction / RiskExtraction entries for your crew so the projector knows how to turn your findings into ontology entities, relationships, and risk indicators.

5. Bind MCP tools

Crews access tools through create_mcp_client(tenant_id), which resolves the tenant's MCP servers (web search, scraping, maps). Tools are tenant-scoped — a tenant without OSINT credentials gets a MissingTenantCredentialsError (HTTP 424) rather than a silent platform fallback. To add a new tool/server, register it (per-tenant via the credentials store for OSINT, or in the platform mcp_servers table for shared tools) and list it in the server's available_tools.
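The no-silent-fallback rule can be sketched as follows. This is a stand-in, not create_mcp_client itself: resolve_tenant_tools and the in-memory store are hypothetical, and the exception class below is a local illustration of the error named in the text, with the 424 status attached as an assumed attribute.

```python
class MissingTenantCredentialsError(Exception):
    """Local stand-in for the error named in the text (surfaced as HTTP 424)."""

    status_code = 424


def resolve_tenant_tools(tenant_id: str, store: dict[str, dict[str, list[str]]]) -> list[str]:
    """Return the tool names a tenant may use, or fail loudly.

    `store` maps tenant id -> server name -> that server's available_tools
    (a hypothetical in-memory shape standing in for the credentials store).
    """
    servers = store.get(tenant_id)
    if not servers:
        # No fallback to platform-wide credentials: the caller sees the gap.
        raise MissingTenantCredentialsError(
            f"tenant {tenant_id!r} has no OSINT credentials configured"
        )
    return sorted(tool for tools in servers.values() for tool in tools)
```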

6. Add risk rules

Risk rules for the module live alongside the others (src/temporal/risk_rules.py / src/ontology/risk_rules.py). They pattern-match on the entities/relationships your crew produced and emit RiskIndicators with a severity and one of the eleven risk categories.
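A risk rule of this kind might look like the sketch below. Everything in it is illustrative: the RiskIndicator dataclass is a local stand-in, the entity dictionaries and their status field are assumed shapes, and "corporate_status" is a placeholder rather than one of the actual risk categories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskIndicator:
    """Local stand-in for the project's risk indicator type."""

    category: str
    severity: str
    reason: str


def dissolved_entity_rule(entities: list[dict]) -> list[RiskIndicator]:
    """Flag every LegalEntity whose (hypothetical) status field is 'dissolved'."""
    indicators: list[RiskIndicator] = []
    for entity in entities:
        if entity.get("type") == "LegalEntity" and entity.get("status") == "dissolved":
            indicators.append(
                RiskIndicator(
                    category="corporate_status",  # placeholder category name
                    severity="high",
                    reason=f"{entity.get('legal_name', 'unknown entity')} is dissolved",
                )
            )
    return indicators
```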

Invariants to respect

  • Your crew's output must conform to the ontology's crew output contract (entities, relationships, risk_indicators, summary) — it is validated, and validation failure re-prompts the model up to the iteration limit.
  • The investigation tolerates partial failure: if your module fails, the others still complete.
  • The activity inherits the tenant context; all persistence is RLS-scoped.
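The first invariant, validate and re-prompt up to the iteration limit, can be sketched as a small loop. The four required keys come from the text; contract_errors, run_until_valid, the default limit of 5, and the idea of passing validation errors back as feedback are assumptions about how such a loop could be written, not the project's implementation.

```python
from typing import Callable

# Top-level keys of the crew output contract, as listed in the text.
REQUIRED_KEYS = ("entities", "relationships", "risk_indicators", "summary")


def contract_errors(output: dict) -> list[str]:
    """Return one message per missing top-level key (empty list means valid)."""
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in output]


def run_until_valid(
    call_model: Callable[[list[str]], dict],
    max_iterations: int = 5,  # assumed limit; the real value is not in the source
) -> dict:
    """Call the model, feeding validation errors back, until the output is valid."""
    feedback: list[str] = []
    for _ in range(max_iterations):
        output = call_model(feedback)
        feedback = contract_errors(output)
        if not feedback:
            return output
    raise ValueError(f"output still invalid after {max_iterations} iterations: {feedback}")
```

A real contract check would also validate the shape of each entity and relationship; the key check here only shows where the re-prompt hooks in.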

Next: Add an entity type if your module needs a new kind of entity.