Security & multi-tenancy
Atlas is multi-tenant by design, and tenant isolation is enforced where it is hardest to bypass: in the database, with PostgreSQL Row-Level Security (RLS). Authentication is delegated to Keycloak, provider credentials are encrypted per tenant, and the architecture is fail-closed — a missing tenant context denies access rather than leaking data. See ADR-022 and ADR-024.
Three layers of isolation
- Identity: each tenant has its own Keycloak realm. A JWT carries the tenant and user.
- Application: PlatformAuthMiddleware validates the JWT, extracts the tenant, and stores it on request.state. require_tenant denies any request without a tenant context.
- Database: the request's tenant is set on the connection (SET app.current_tenant), and RLS policies filter every query. Even a bug in a handler cannot return another tenant's rows.
Request → tenant → RLS
The application connects through a restricted atlas_app role specifically so that RLS is
enforced — a superuser connection would bypass policies. A guarded
allow_owner_db_fallback flag exists only for emergency recovery.
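The binding step in this flow can be sketched as follows. This is a minimal illustration, not the code in src/database/connection.py: bind_tenant, MissingTenantError, and RecordingCursor are hypothetical names, a DB-API style cursor with %s placeholders is assumed, and PostgreSQL's set_config(name, value, is_local) is used as the parameterizable form of SET LOCAL. The setting name app.current_tenant follows the text above.

```python
from typing import Any, Optional


class MissingTenantError(Exception):
    """Raised when a request reaches the database layer without a tenant."""


def bind_tenant(cursor: Any, tenant_id: Optional[str]) -> None:
    """Bind the request's tenant to the current transaction.

    Hypothetical helper: assumes a DB-API cursor with %s placeholders.
    Passing the tenant id as a parameter keeps it out of the SQL text.
    """
    if not tenant_id:
        # Fail closed: never run tenant-scoped queries without a tenant.
        raise MissingTenantError("no tenant context on request")
    cursor.execute(
        "SELECT set_config('app.current_tenant', %s, true)", (tenant_id,)
    )


class RecordingCursor:
    """Stand-in cursor that records statements instead of running them."""

    def __init__(self) -> None:
        self.statements = []

    def execute(self, sql: str, params: tuple = ()) -> None:
        self.statements.append((sql, params))
```

With is_local set to true the setting lasts only for the current transaction, so a pooled connection does not carry one request's tenant into the next.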
Why isolation can't leak
If tenant B asks for a record that belongs to tenant A, the RLS predicate appended by PostgreSQL returns zero rows and the API responds 404 — there is no code path that returns the row.
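The behavior can be modeled in plain Python. This is only a toy model of the RLS predicate, not PostgreSQL itself; Record, TABLE, visible_rows, and get_record are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    id: int
    tenant_id: str
    title: str


# Toy table standing in for an RLS-protected PostgreSQL table.
TABLE = [
    Record(1, "tenant-a", "A's report"),
    Record(2, "tenant-b", "B's report"),
]


def visible_rows(current_tenant: Optional[str]) -> List[Record]:
    """Model of the RLS predicate: only rows matching the bound tenant.

    An unset tenant matches nothing, mirroring the fail-closed behavior.
    """
    if current_tenant is None:
        return []
    return [r for r in TABLE if r.tenant_id == current_tenant]


def get_record(current_tenant: Optional[str], record_id: int) -> Tuple[int, Optional[Record]]:
    """Hypothetical handler: it looks up by id only and never checks tenants."""
    for row in visible_rows(current_tenant):
        if row.id == record_id:
            return 200, row
    return 404, None
```

The handler contains no tenant check at all. Because the filtering happens below it, forgetting a check in handler code cannot expose another tenant's row.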
Authentication & authorization
- Authentication: Keycloak OIDC; the backend validates JWTs (src/integrations/keycloak_client.py, src/api/auth.py). The frontend uses keycloak-js with PKCE over a same-origin proxy.
- Authorization: role-based. Roles such as admin, analyst, viewer, and workflow_editor gate endpoints; Studio/Settings are admin-only. Workflow-level RBAC lives in src/workflows/auth.py.
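A role gate of this kind might look like the sketch below. The names Forbidden, require_any_role, and AREA_ROLES are hypothetical, and the 403 status is an assumption; only the role names and the admin-only rule for Studio/Settings come from the text above.

```python
from typing import Iterable


class Forbidden(Exception):
    """Assumed to be mapped to HTTP 403 by the API layer."""

    status_code = 403


# Hypothetical mapping of protected areas to the roles allowed to use them.
AREA_ROLES = {
    "studio": {"admin"},
    "settings": {"admin"},
}


def require_any_role(user_roles: Iterable[str], allowed: Iterable[str]) -> None:
    """Raise Forbidden unless the user holds at least one allowed role."""
    if not set(user_roles) & set(allowed):
        raise Forbidden("insufficient role")
```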
Credential encryption
Provider and LLM credentials are stored encrypted at rest in data_provider_credentials and
decrypted only when a client is instantiated. EnvKeyEncryptor (src/security/credential_encryption.py)
uses AES-GCM with a key from the environment.
Credential resolution is tenant-scoped with a fallback chain; if no usable credential exists for a
required provider, the system raises MissingTenantCredentialsError (HTTP 424) rather than
falling back to someone else's key. See Plugins.
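The resolution rule can be sketched as below. The store layout, the scope names ("override", "default"), and the provider name in the usage are assumptions for illustration; the real fallback chain may differ. What the sketch preserves is that every lookup is keyed by the requesting tenant, so the chain can only end in a credential belonging to that tenant or in an error.

```python
from typing import Dict, Tuple


class MissingTenantCredentialsError(Exception):
    """Surfaced to clients as HTTP 424 (Failed Dependency)."""

    status_code = 424


# Hypothetical store keyed by (tenant_id, provider, scope); the values stand
# in for encrypted rows of data_provider_credentials.
CredentialStore = Dict[Tuple[str, str, str], str]


def resolve_credential(store: CredentialStore, tenant_id: str, provider: str) -> str:
    """Walk an assumed fallback chain that never leaves the tenant."""
    # Assumed scopes, most specific first.
    for scope in ("override", "default"):
        value = store.get((tenant_id, provider, scope))
        if value:
            return value
    raise MissingTenantCredentialsError(
        f"no usable {provider!r} credential for tenant {tenant_id!r}"
    )
```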
Defense-in-depth summary
| Control | Mechanism |
|---|---|
| Tenant data isolation | PostgreSQL RLS via restricted atlas_app role |
| Identity | Keycloak realm per tenant, OIDC/JWT |
| Authorization | Role-based endpoint guards |
| Secrets at rest | AES-GCM credential encryption |
| Fail-closed | require_tenant, schema/registry boot checks, 424 on missing creds |
| Rate limiting | slowapi middleware + Redis |
| Transport | nginx ingress + TLS (cert-manager) |
Deep dive: enforcement internals
This reflects src/database/connection.py, src/api/auth.py, src/api/rate_limit.py, and
src/security/credential_encryption.py.
The fail-closed pool
The pool authenticates as the restricted atlas_app role, and tenant tables have
FORCE ROW LEVEL SECURITY enabled, so even the application role cannot bypass RLS.
If the atlas_app credentials fail and the emergency ALLOW_OWNER_DB_FALLBACK flag is not set,
the service refuses to boot rather than silently falling back to an RLS-bypassing owner role.
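The boot decision reduces to a small truth table, sketched below. BootError and choose_db_role are hypothetical names and the returned "owner" label is a placeholder; the actual logic lives in src/database/connection.py and may be structured differently.

```python
import logging

logger = logging.getLogger("atlas.boot")


class BootError(RuntimeError):
    """Raised to abort startup instead of running without RLS."""


def choose_db_role(app_login_ok: bool, allow_owner_db_fallback: bool) -> str:
    """Pick the database role for the pool, failing closed by default."""
    if app_login_ok:
        # Normal path: the restricted role, subject to RLS.
        return "atlas_app"
    if allow_owner_db_fallback:
        # Emergency recovery only: this path bypasses RLS, so say so loudly.
        logger.warning("ALLOW_OWNER_DB_FALLBACK is set: RLS is NOT enforced")
        return "owner"
    raise BootError("atlas_app login failed and owner fallback is disabled")
```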
Per-request tenant binding
A 60-second TenantCache avoids a tenant lookup on every request. Exempt paths (/health,
/tenants/resolve, docs) skip auth. Rate limiting keys on {tenant_id}:{ip} (falling back to
anon:{ip}), backed by Redis, returning 429 with Retry-After.
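Both pieces are small enough to sketch. The key format and the 60-second TTL come from the text above; the class internals, the injectable clock, and the in-process dictionary are assumptions, and the Redis-backed counter itself is not modeled.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def rate_limit_key(tenant_id: Optional[str], ip: str) -> str:
    """Key on tenant and IP, falling back to anon:{ip} without a tenant."""
    return f"{tenant_id}:{ip}" if tenant_id else f"anon:{ip}"


class TenantCache:
    """Minimal TTL cache; the clock is injectable so expiry can be tested."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, key: str, value: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller performs a fresh lookup.
            del self._entries[key]
            return None
        return value
```

A short TTL bounds how long a disabled or renamed tenant can keep being served from cache, at the cost of one lookup per tenant per minute.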
Credential encryption — AES-256-GCM + HKDF per tenant
Each tenant gets a derived subkey, so ciphertext from one tenant cannot be decrypted in another tenant's context:
- EncryptedPayload stores nonce || ciphertext+tag plus a key_id (versioned for rotation).
- Cross-tenant decrypt uses the wrong subkey → InvalidTag (the decrypt fails); the same happens on tampering. A key_id mismatch raises a distinct ValueError so rotation is unambiguous.
- The master key lives in process memory only: never logged, never written to disk.
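The per-tenant derivation can be illustrated with HKDF-SHA256 (RFC 5869) written against the standard library. This is a sketch of the derivation step only: the info string layout, the default key_id, and the function names are assumptions, and the AES-GCM encryption that would consume the subkey is omitted because it needs a third-party library.

```python
import hashlib
import hmac


def hkdf_sha256(master_key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256, using only hmac and hashlib."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested length too large for HKDF-SHA256")
    # Extract: an empty salt is defined as hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, master_key, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_subkey(master_key: bytes, tenant_id: str, key_id: str = "v1") -> bytes:
    """Derive a 32-byte (AES-256) subkey bound to one tenant and key version.

    The info layout is hypothetical; what matters is that tenant_id and
    key_id both feed the derivation.
    """
    info = f"atlas-credentials:{key_id}:{tenant_id}".encode("utf-8")
    return hkdf_sha256(master_key, info, length=32)
```

Because the tenant id is part of the derivation input, decrypting tenant A's payload in tenant B's context uses a different AES key, which is why authentication fails instead of returning plaintext.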
Invariants
- No tenant context ⇒ no rows. An unset app.current_tenant_id makes RLS-protected queries return empty, never another tenant's data.
- atlas_app cannot bypass RLS (FORCE RLS); only the explicit emergency flag changes that, with a loud warning.
- Tenant id is required, never defaulted: missing it is an explicit error, not a silent fallback.
To add a tenant-scoped endpoint, see Add a frontend feature → backend.