Zentrafuge Labs — AI Safety Infrastructure
Production-grade safety middleware for AI products handling sensitive, emotional, or high-stakes conversations. Drop-in. Auditable. Model-agnostic.
Even well-prompted models can produce harmful responses when users are distressed or adversarial.
Regulators and insurers expect documented, reviewable safety decisions — not black-box inference.
When a user is in distress, your AI cannot wait for a human. It needs to act correctly, immediately.
Most teams re-invent the same guardrails. Zentrafuge Labs extracts that work into tested infrastructure.
Product — v0.1.0
A structured safety layer that sits between a user message and your LLM's draft response. Evaluates risk, enforces safe language, and returns an auditable decision — before anything reaches the user.
Detects distress signals, intensity, and masked emotional states beyond simple sentiment scoring.
Four-level system: low → medium → high → critical. Each level triggers defined, documented actions.
Every decision produces human-readable audit notes. Log them, store them, show them to insurers.
High-risk responses are automatically patched with safety-aware language and crisis resources.
Works with OpenAI, Anthropic, Mistral, or any model. No vendor lock-in. Pure Python.
No database. No authentication layer. No background jobs. Install and integrate in hours.
from guardrails_core import evaluate_guardrails

decision = evaluate_guardrails(
    user_id="user-abc",
    user_message="I don't want to be here anymore.",
    assistant_draft="I'm really sorry you're feeling this way.",
    conversation_context=[],
    debug=True,
)
print(decision.to_dict())
{
"approved": false,
"risk_level": "critical",
"actions_taken": ["force_safety_footer", "override_response"],
"modified_output": "I hear you, and I'm glad you said that...",
"audit_notes": "Critical risk detected. Safety override applied. Crisis resources appended."
}
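Your application acts on that decision before replying. A minimal sketch of a handler, assuming only the `to_dict()` fields shown in the example output (`handle_decision` itself is a hypothetical application-side helper, not part of the Guardrails API):

```python
def handle_decision(decision: dict, draft: str) -> str:
    """Return the text to send to the user, persisting the audit trail."""
    # Always record the audit notes, whatever the outcome.
    print(f"[audit] risk={decision['risk_level']}: {decision['audit_notes']}")

    if decision["approved"]:
        # Low/medium risk: the original draft passes through unchanged.
        return draft
    # High/critical risk: the middleware supplies a patched response.
    return decision["modified_output"]


# Example using the decision shown above.
decision = {
    "approved": False,
    "risk_level": "critical",
    "actions_taken": ["force_safety_footer", "override_response"],
    "modified_output": "I hear you, and I'm glad you said that...",
    "audit_notes": "Critical risk detected. Safety override applied.",
}
reply = handle_decision(decision, "I'm really sorry you're feeling this way.")
```

Because the decision is a plain dict, the same pattern works whether your draft came from OpenAI, Anthropic, or any other model.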
How it works
Every message is assessed against emotional and safety signals. The result is a structured decision your application can act on immediately.
| Level | Trigger | Action taken | Audit output |
|---|---|---|---|
| low | Neutral or positive message, no distress signals | Response approved as-is | Pass noted, no intervention |
| medium | Mild emotional distress or frustration detected | Response approved, context flag added | Emotional signal logged for review |
| high | Significant distress, hopelessness, or risk language | Response patched with safety-aware language | Full decision trail, intervention noted |
| critical | Active crisis indicators — self-harm, suicidal ideation | Response overridden, crisis resources appended | Full audit trail, escalation recommended |
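The table above maps cleanly onto a small application-side dispatch. A sketch, assuming you want to trigger your own follow-ups per level (the hook names `flag_for_review`, `log_full_trail`, and `escalate_to_human` are illustrative placeholders, not Guardrails functions):

```python
# Application-side follow-ups keyed by the four documented risk levels.
LEVEL_ACTIONS = {
    "low": [],
    "medium": ["flag_for_review"],
    "high": ["flag_for_review", "log_full_trail"],
    "critical": ["log_full_trail", "escalate_to_human"],
}


def follow_up_actions(risk_level: str) -> list:
    """Return the follow-up hooks for a risk level; fail safe on unknowns."""
    return LEVEL_ACTIONS.get(risk_level, ["log_full_trail"])
```

Keeping this mapping in your own code (rather than buried in the middleware) means your escalation policy stays reviewable alongside the audit notes Guardrails already emits.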
Who it's for
Licensing
Guardrails v0 is source-available and commercially licensed. All tiers include the full codebase, documentation, and a written licence agreement.
Evaluation
Free
For developers evaluating Guardrails in a non-production environment.
Commercial
£2,500 / year
For startups and growing products handling sensitive user conversations.
Enterprise
Custom
For larger organisations, NHS-adjacent services, or multi-product deployments.
Roadmap
Guardrails is the first release from Zentrafuge Labs. Additional modules are being extracted from production and hardened for commercial release.
Heuristic-driven emotional tone detection. Detects masked states, intensity, and regulation level.
Micro, super, and persistent memory for AI companions. Storage-agnostic, GDPR-compliant.
Learns communication style and emotional preferences over time. Works alongside any memory layer.
Timing intelligence for AI check-ins. Determines when — and how — to initiate proactively.
Interested in early access to any of these modules? Get in touch.
Zentrafuge Labs is the R&D arm of Zentrafuge Limited — a UK company building AI companions for veterans. Every module we license was built for real users, in a real product, handling genuinely sensitive conversations.
Guardrails v0 powers the safeguarding layer of Radio Check, a live mental health platform for UK veterans. It has been tested against real crisis language, integrated with professional counselling workflows, and reviewed against BACP ethical guidelines.
Founded by Anthony Donnelly, Medway, UK. Company No. 16669197.
Get in touch
For pricing, integration questions, or to discuss a bespoke arrangement, email us directly.
labs@zentrafuge.com
We typically respond within one business day.