Defensive Intelligence

Security

Built for sensitive environments.

OBEL is not a wrapper. It's a security layer between your team and every AI model — with enforced controls that can't be bypassed from the outside.

The problem

Most organisations don't know what their teams are sending to AI.

When you give a team direct access to an AI model — through a web interface or an API key — you lose visibility over what's being submitted. Employees paste customer records, financial data, internal credentials, and sensitive correspondence into chat windows without thinking twice.

The model provider sees it. The prompt logs capture it. And unless something is actively scrubbing and auditing every interaction, your organisation has no record of what left — and no way to stop it happening again.

PII leakage

Customer names, tax file numbers, Medicare details, and contact information are routinely pasted into prompts — sent in plaintext to third-party model providers.

Secret exposure

API keys, database connection strings, and internal credentials appear in prompts when developers ask models for help with code. Once sent, you can't unsend them.

Classification violations

Documents marked PROTECTED or SENSITIVE are summarised, translated, or analysed by models with no visibility into what classification level the content carries.

No audit trail

Without a record of what was submitted to which model and when, incident response is guesswork. You can't investigate what you didn't log.

How it works

Every request passes through a security pipeline before it reaches any model.

OBEL sits between your users and the AI providers. Nothing reaches a model that hasn't been scrubbed, classified, and authorised. The pipeline is sequential and non-bypassable — each step must complete before the next begins. If any step fails, the request is denied.

01

User submits a prompt

A team member types a message in the OBEL interface. At this point, the input is untrusted — it may contain PII, secrets, classified references, or prompt injection attempts. Nothing is assumed to be clean.

02

PII scrubber runs

The scrubber scans the raw text against a rule set covering Australian government identifiers, financial data, contact information, and secrets. Every match is replaced with a typed placeholder. The hit type and count are logged as a security event; the original value is never forwarded.
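A minimal sketch of typed-placeholder scrubbing, assuming a small regex rule set. The patterns, the RULES table, and the scrub function are illustrative stand-ins, not OBEL's actual rules.

```python
import re

# Hypothetical rule set for illustration only; real rules would be broader
# and validated (for example, checksum tests on government identifiers).
RULES = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("PHONE", re.compile(r"(?<!\d)(?:\+61 ?4|04)\d{2} ?\d{3} ?\d{3}(?!\d)")),
    ("TFN", re.compile(r"\b\d{3} ?\d{3} ?\d{3}\b")),
]


def scrub(text):
    """Replace every match with a typed placeholder.

    Returns the cleaned text and a dict of hit counts keyed by type.
    The matched values themselves are not returned.
    """
    hits = {}
    for label, pattern in RULES:
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            hits[label] = count
    return text, hits
```

Because each placeholder carries its type, the model can still tell that a sentence refers to an email address or a phone number even though the value is gone.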

03

ARGUS-i™ classifies the content

The scrubbed text is passed to the ARGUS-i classification engine, which assigns a protective marking based on the content. By default, content marked PROTECTED or above is blocked before inference begins. The classification result and rationale are written to the audit vault.

04

Cost and quota are checked

Before any model call is made, the user's remaining budget is verified. If they've reached their limit, the request is blocked. No inference runs and no charge is incurred. This prevents surprise spend overruns and enforces organisational policy automatically.
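One way to picture a budget check that runs before inference, as a sketch under stated assumptions: the Budget class, the per-request cost estimate, and the function names are hypothetical, and amounts are kept in integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when a request would take the user past their limit."""


@dataclass
class Budget:
    limit_cents: int      # hypothetical monthly limit
    spent_cents: int = 0  # amount already charged this period


def run_inference(budget, estimated_cents, call_model):
    """Check the budget first; only call the model if the check passes."""
    if budget.spent_cents + estimated_cents > budget.limit_cents:
        # Blocked before inference: the model is never called, nothing is charged.
        raise QuotaExceeded("budget limit reached")
    result = call_model()
    budget.spent_cents += estimated_cents
    return result
```

The ordering is the point of the sketch: the check sits in front of the model call, so a blocked request cannot incur a charge.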

05

The scrubbed prompt is sent to the model

Only at this point — after scrubbing, classification, and authorisation — does OBEL forward the request to the selected model. The model receives the cleaned text with placeholders, not the original sensitive content.

06

The response is returned and the audit record is written

The model response is returned to the user immediately. In parallel, OBEL commits an audit record to the tamper-evident GitHub vault — capturing the session ID, classification result, scrubber hits, model used, and token cost. This is non-blocking: the audit write never delays the response.
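The non-blocking audit write can be sketched as a background task that is submitted after the reply is ready. This is an illustration only: the thread pool, the in-memory audit_log list, and the sleep that stands in for a slow commit are all assumptions, not a description of how OBEL dispatches its commits.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Single background worker so audit records are written in submission order.
_audit_pool = ThreadPoolExecutor(max_workers=1)
audit_log = []  # stand-in for the audit vault


def write_audit(record):
    """Pretend to commit a record; the sleep simulates a slow remote write."""
    time.sleep(0.3)
    audit_log.append(record)


def respond(session_id, model_reply, meta):
    """Return the reply right away and hand the audit write to the background."""
    record = {"session_id": session_id, **meta}
    future = _audit_pool.submit(write_audit, record)
    return model_reply, future
```

The caller gets the reply without waiting on the write; the returned future is only there so the sketch can confirm the record eventually lands.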

Design principles

Security decisions that were made at the architecture level, not bolted on after.

FAIL-SHUT, always

Every security gate in OBEL is designed to deny by default. If the classifier encounters an error, the request is blocked — not passed through. If the scrubber can't complete, the prompt never leaves. There is no graceful degradation that opens a hole. We call this FAIL-SHUT, and it is non-negotiable across every layer of the pipeline.
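A deny-by-default pipeline runner might look like the following sketch. The Decision type, the step names, and run_pipeline are hypothetical; the idea being illustrated is only that an error in any gate produces a denial and stops the later steps from running.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    reason: str
    prompt: str = ""  # populated only when every gate has passed


def run_pipeline(prompt, steps):
    """Run (name, function) steps in order; any exception denies the request."""
    current = prompt
    for name, step in steps:
        try:
            current = step(current)
        except Exception as exc:
            # FAIL-SHUT: stop here, never fall through to later steps.
            return Decision(False, f"{name} failed: {exc}")
    return Decision(True, "all gates passed", current)
```

There is no branch in which a failed step returns an allowed decision, which is what distinguishes this from a design that logs the error and carries on.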

Scrub first, ask nothing

PII scrubbing happens before anything else — before classification, before cost checks, before the LLM is selected. Your users never need to remember to redact sensitive information. The scrubber identifies and replaces names, government identifiers, financial details, contact data, and secrets automatically. The original values are never forwarded.

Every action is recorded

OBEL maintains an append-only audit trail for every prompt, classification decision, scrubber hit, and cost event. Audit records are committed to a private GitHub repository — making them tamper-evident by design. If a record were altered after the fact, the commit chain would break. This is intentional: your audit trail should be something you can stake a compliance report on.
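The tamper-evidence property comes from each record committing to the one before it. The sketch below is a simplified analogue of that idea using SHA-256 over JSON, not Git's actual object format; append_record and verify are hypothetical names.

```python
import hashlib
import json


def _digest(parent, payload):
    """Hash the payload together with the parent hash."""
    body = json.dumps({"parent": parent, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record whose hash covers the previous record's hash."""
    parent = chain[-1]["sha"] if chain else ""
    chain.append({"parent": parent, "payload": payload, "sha": _digest(parent, payload)})


def verify(chain):
    """Recompute every hash; any edited record breaks the chain from that point."""
    parent = ""
    for entry in chain:
        if entry["parent"] != parent or entry["sha"] != _digest(parent, entry["payload"]):
            return False
        parent = entry["sha"]
    return True
```

Editing an earlier payload changes its hash, which no longer matches the value stored in the record or referenced by its successor, so verification fails.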

Isolation is structural, not policy

Multi-tenancy is enforced at the database layer through row-level security, not application-level filtering. Every query is scoped to the authenticated organisation before it reaches the database. A bug in application code cannot expose another tenant's data, because the database itself refuses the query. The service-role key — which bypasses RLS — is never accessible to client-side code or users.

Technical controls

What's enforced at every layer.

Each control below is enforced in code and cannot be disabled by users or administrators. Changes to these components require an explicit architectural decision logged against the production baseline.

PII Scrubbing

  • Every prompt passes through the scrubber before reaching any model
  • Detects names, emails, phone numbers, tax file numbers, ABNs, credit cards
  • Replaced with typed placeholders — the LLM never sees the original value
  • PII events logged to a tamper-evident security_events table
  • No bypass, no dry-run, no opt-out — enforced at the gateway

ARGUS-i™ Sovereign Classification

  • Every message classified: UNOFFICIAL → OFFICIAL → OFFICIAL:SENSITIVE → PROTECTED → SECRET → TOP SECRET
  • PROTECTED and above are hard-blocked — inference never starts
  • FAIL-SHUT gate: if the classifier errors, the request is denied
  • Classification rationale logged alongside every blocked request
  • Sovereign schema is version-locked and immutable per release
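The ordered markings and the hard-block rule can be sketched as an ordered enum plus a gate function. The Marking enum mirrors the levels listed above; the gate function and the classify callable passed into it are hypothetical and say nothing about how ARGUS-i actually assigns a marking.

```python
from enum import IntEnum


class Marking(IntEnum):
    UNOFFICIAL = 0
    OFFICIAL = 1
    OFFICIAL_SENSITIVE = 2
    PROTECTED = 3
    SECRET = 4
    TOP_SECRET = 5


def gate(classify, text, threshold=Marking.PROTECTED):
    """Return True only if the text is classified strictly below the threshold."""
    try:
        marking = classify(text)
    except Exception:
        return False  # FAIL-SHUT: a classifier error denies the request
    # An unexpected return value is also treated as a denial.
    return isinstance(marking, Marking) and marking < threshold
```

Using an ordered type means "PROTECTED and above" is a single comparison rather than a list of names that could drift out of sync with the schema.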

AES-256-GCM Vault

  • All API keys and secrets encrypted at rest using AES-256-GCM
  • Per-record nonces: every encryption uses a fresh nonce, avoiding the nonce reuse that would undermine GCM
  • Decryption only occurs server-side; keys never leave in plaintext
  • Vault key stored in environment, never in the database
  • Key hint (last 4 chars) stored for rotation verification

GitHub Audit Trail

  • Every LLM interaction committed to a private GitHub repository
  • Commits are append-only — no deletion without breaking the chain
  • Commit SHA stored on each session row for cross-reference
  • Audit commits are non-blocking — never slow down user responses
  • Tamper-evident record suitable for compliance and incident review

Data Isolation

  • Strict row-level security on every database table
  • Every query is scoped to the authenticated user's organisation
  • Service-role access restricted to background system tasks only
  • No cross-tenant data access — ever
  • Supabase service role key never exposed to the browser

Cost Governance

  • Per-user and per-org monthly spend limits enforced at the gateway
  • Budget checks happen before inference — no charge on blocked requests
  • Admins can view real-time usage for every member
  • Prepay credit model — impossible to overspend your balance
  • Usage costs tracked and visible to organisation admins

Architecture

Where does unscrubbed data actually live?

The short answer: unscrubbed prompts never leave the OBEL data plane. The PII scrubber and ARGUS-i™ classifier run server-side inside the request handler — before any bytes are forwarded to an LLM endpoint. What the model receives is always the cleaned version.

Control Plane: what users see

  • Next.js web application (Vercel, Sydney region for AU customers)
  • Authentication & identity management (Clerk)
  • Admin dashboard, audit log viewer, usage reporting
  • Organisation settings, model configuration, persona management
  • Billing and subscription management (Stripe)

The control plane handles identity, configuration, and UI. It does not process raw prompt content — that happens exclusively in the data plane.

Data Plane: where prompts live

  • Server-side API request handler (Next.js API route, Vercel serverless)
  • PII scrubber runs here — raw prompt never leaves this boundary
  • ARGUS-i™ classifier runs here — classification happens before any model call
  • Scrubbed prompt forwarded to selected LLM provider (OpenAI, Anthropic, etc.)
  • Audit vault commit dispatched asynchronously after response is returned

The data plane is where security is enforced. It runs in an isolated server-side context. For Gov Highside, the equivalent component runs inside your classified network — no external calls.

Key guarantee: Raw, unscrubbed prompt text exists only inside the server-side request handler for the duration of the scrubbing and classification pass. It is never stored in plaintext anywhere (only the scrubbed version is persisted), and it is never forwarded to NinthLABS staff, telemetry systems, or any third party. The LLM provider you have selected and configured receives only the scrubbed version.

Scrubber tuning

What happens when the scrubber flags something? Can it be tuned?

Yes — and understanding the false positive model matters for IT teams rolling out OBEL across departments with different sensitivity requirements.

What happens on a scrubber hit

  • The matched value is replaced with a typed placeholder (e.g., [EMAIL], [TFN], [PHONE])
  • The original value is never forwarded to the model or stored in the conversation log
  • A security event row is written to the audit log, capturing the hit type and count — not the original value
  • The user sees the placeholder in their conversation — they know the scrubber acted
  • Admins can view scrubber hit counts in the audit log and filter by user or date range
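To illustrate a security event that records hit type and count without the original value, here is a minimal sketch. The build_security_event function and the field names are hypothetical, not the actual security_events schema.

```python
from collections import Counter


def build_security_event(session_id, matches):
    """Summarise scrubber hits for the audit log.

    `matches` is a list of (hit_type, original_value) pairs. Only the types
    are counted; the original values are deliberately dropped.
    """
    counts = Counter(hit_type for hit_type, _original in matches)
    return {
        "session_id": session_id,
        "hits": dict(counts),
        "total": sum(counts.values()),
    }
```

An admin filtering by user or date range would see, for example, that a session produced two EMAIL hits and one TFN hit, but nothing in the record would reveal what those values were.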

Per-organisation governance controls

  • Org admins can configure the ARGUS-i™ block threshold: PROTECTED, SECRET, or TOP SECRET — below that threshold, requests are flagged but not blocked
  • The 'Log Clean Conversations' setting controls whether sessions with zero scrubber hits are committed to the audit vault
  • Custom AI profiles (Personas) can be configured with department-specific instructions that reduce friction for known safe workflows
  • Sovereign block events are logged with rationale — admins can review and identify patterns that indicate false positives
  • SIEM integration (Enterprise+) allows security teams to correlate scrubber events with other data sources

False positive handling

  • The scrubber uses deterministic rules (regex + pattern matching) — not ML inference — which makes its behaviour predictable and auditable
  • Replacements are typed: a Tax File Number is always [TFN], not a generic [REDACTED] — this preserves the meaning of the sentence for the model
  • If a pattern fires incorrectly (e.g., a product SKU matching a phone pattern), admins can report it via the support channel for a rule adjustment
  • The sovereign classification block threshold is configurable per deployment — a department that works with PROTECTED documents regularly can have a higher threshold agreed at the contract level
  • Audit export allows security teams to review block rationale in bulk and identify systematic false positive patterns
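The SKU example can be made concrete with a sketch of a false positive and one possible rule adjustment. Both patterns and the "SKU-" prefix convention are invented for illustration; an actual adjustment would depend on the organisation's own data.

```python
import re

# A loose phone rule: any ten-digit run starting with 04.
PHONE_LOOSE = re.compile(r"\b04\d{8}\b")

# Adjusted rule: same pattern, but skipped when it directly follows a
# hypothetical "SKU-" prefix.
PHONE_ADJUSTED = re.compile(r"(?<!SKU-)\b04\d{8}\b")

text = "Restock SKU-0412345678 and call 0498765432."

loose_text, loose_hits = PHONE_LOOSE.subn("[PHONE]", text)
adjusted_text, adjusted_hits = PHONE_ADJUSTED.subn("[PHONE]", text)
```

Because the rules are deterministic, the same input always produces the same hits, so a reported false positive can be reproduced exactly and the adjusted rule checked against it.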

What cannot be tuned

  • The scrubber itself cannot be disabled — it runs on every request without exception
  • FAIL-SHUT behaviour cannot be changed to FAIL-OPEN — if the classifier errors, the request is denied
  • Security events (scrubber hit logs) cannot be suppressed — every hit produces an audit record
  • The sovereign schema (PSPF classification levels) is version-locked — it cannot be modified per user
  • These constraints are intentional and non-negotiable — they are what the product's compliance posture is built on

Responsible disclosure

Found a vulnerability?

We take every security report seriously. If you've found something — regardless of severity — email us directly. We won't downplay it, and we won't keep you in the dark.

We acknowledge within 24 hours and aim to respond with an assessment within 48 hours. Critical issues are triaged immediately.

security@ninthlabs.ai

Ready to get started?

Start a 14-day free trial — full access to the security pipeline, no credit card required. Upgrade when you're ready to bring your own keys or connect a model pack.

For government or air-gapped deployments, see the Gov Highside page.