Skip to content

Preclinical

Adversarial testing platform for healthcare AI agents.

Preclinical runs adversarial multi-turn conversations against your healthcare AI agents, then grades the transcripts against safety rubrics. It simulates patients using red-team techniques to probe for weaknesses in triage accuracy, medical advice safety, and compliance.

What it does

  1. Adversarial Pen Testing -- AI-powered attacks that probe for weaknesses using medical-specific vectors
  2. Rubric-Based Grading -- Automated evaluation against configurable criteria
  3. Multi-Provider Support -- Test agents across Vapi, LiveKit, Pipecat, OpenAI, and Browser
  4. Evidence-Based Reports -- Detailed explanations with transcript quotes for every verdict

Quick Start

git clone https://github.com/Mentat-Lab/preclinical.git
cd preclinical
cp .env.example .env
# Add model credentials (OpenAI or Anthropic)
docker compose up

Create an agent, pick scenarios, and start a test run.

The Problem

Healthcare AI agents are being deployed rapidly, but without standardized testing frameworks. This creates risks:

Risk Description
Safety Agents may provide dangerous medical advice
Compliance Agents may violate HIPAA or state regulations
Quality Agents may have poor accuracy, tone, or response times
Trust Healthcare organizations can't confidently deploy AI agents

Use It Your Way

Open http://localhost:3000, create an agent, and start testing.

pip install preclinical
preclinical run <agent-id> --creative --watch

/plugin marketplace add Mentat-Lab/preclinical
/plugin install preclinical@preclinical
/preclinical:setup
Then use /preclinical:run, /preclinical:benchmark, /preclinical:diagnose, and more.

If you clone the repo, the plugin loads automatically — no install step needed.

npx skills add Mentat-Lab/preclinical
Then just ask: "Test my healthcare agent with emergency scenarios"

Works with Cursor, Windsurf, Copilot, Cline, and more.

from preclinical import Preclinical

client = Preclinical()
run = client.run("agent-id", creative_mode=True)
print(f"Pass rate: {run.pass_rate}%")

Next Steps