Skip to main content

For the past two years, the conversation around AI agent safety has been dominated by philosophy — hypothetical risks, alignment debates, and corporate responsibility pledges. Meanwhile, development teams shipping AI agents to production have been left without practical tools to actually test whether their agents behave safely under adversarial conditions. This week, Microsoft changed that equation by open-sourcing two complementary tools — RAMPART and Clarity — that treat AI safety as an engineering discipline rather than a boardroom talking point.

TL;DR

  • Microsoft open-sourced RAMPART (Risk Assessment and Measurement Platform for Agentic Red Teaming) and Clarity on 20 May 2026, bringing AI agent safety testing into CI/CD pipelines
  • RAMPART is a pytest-native framework built on PyRIT that embeds automated red-team tests directly into your build pipeline — catching prompt injection, data exfiltration, and behavioural regressions before deployment
  • Clarity acts as an AI-powered “structured sounding board” that pressure-tests product assumptions and surfaces edge cases before a single line of code is written
  • The tools support statistical trial policies (e.g., “this action must be safe in at least 80% of runs”), accounting for the probabilistic nature of LLM behaviour
  • Together, they shift AI safety from a one-off review to a continuous, living engineering practice embedded in your development workflow

The Problem: AI Agents Ship Without Safety Tests

If you are building traditional web applications, you have decades of battle-tested security testing infrastructure — OWASP ZAP, Burp Suite, SAST/DAST scanners, penetration testing frameworks. Your CI/CD pipeline likely runs security checks on every commit.

AI agents? Not so much. Most teams shipping agentic applications today rely on manual red-teaming sessions, ad-hoc prompt testing, or — let us be honest — hope. The problem is not that teams do not care about safety. It is that the tooling has not existed to make safety testing as routine as running pytest.

As Ram Shankar Siva Kumar, Microsoft’s AI red team founder, put it: “It’s high time we stop talking about AI safety as a philosophy and start thinking about AI safety as an engineering discipline.”

RAMPART: Red-Team Testing That Lives in Your Pipeline

RAMPART is the more immediately actionable of the two tools. Built on top of Microsoft’s PyRIT (Python Risk Identification Tool) — which has been open-source for over two years — RAMPART is a pytest-native framework specifically designed for agentic AI applications.

What It Actually Does

RAMPART lets you write test cases that attack and probe your AI agents, covering both adversarial scenarios (prompt injection, data exfiltration, jailbreaking) and benign failure modes (behavioural regression, exceeding approved tool use, hallucinated actions).

The critical difference from PyRIT is the target audience. Where PyRIT is optimised for black-box discovery by security researchers after a system is built, RAMPART is built for engineers as the system is being built. It slots directly into your CI/CD pipeline, running alongside your unit tests and integration tests.

Statistical Trials: Embracing Probabilistic Behaviour

This is where RAMPART gets genuinely interesting. Traditional software either passes or fails a test. AI agents are probabilistic — the same input can produce different outputs across runs. RAMPART handles this by supporting statistical trial policies.

You can set thresholds like “this action must be safe in at least 80% of runs across 300 iterations.” When Microsoft’s red team identified a single attack vector, RAMPART was able to generate close to 100 different variants and test across approximately 300 iterations — turning one finding into comprehensive regression coverage.

For development teams, this means you can quantify your agent’s safety posture rather than relying on pass/fail binaries that do not reflect real-world LLM behaviour.

Integration Is Minimal

RAMPART requires only an adapter connecting your agent to the test suite. If your team already uses pytest (and in 2026, who does not?), the barrier to adoption is remarkably low. Write your test scenarios, connect your agent, set your safety thresholds, and let your pipeline do the rest.

Clarity: Catching Bad Assumptions Before You Write Code

Where RAMPART catches problems in code, Clarity catches them in thinking. It is an AI agent that acts as a structured sounding board — posing the kinds of questions that experienced architects, product managers, and security engineers would ask.

Clarity guides teams through four structured stages:

  1. Problem clarification — Are you solving the right problem?
  2. Solution exploration — Have you considered alternatives?
  3. Failure analysis — What happens when things go wrong?
  4. Decision tracking — Why did you choose this approach?

Microsoft described it as “an AI thinking partner that pushes back.” The goal is to surface edge cases and security implications before development begins — when changing course is cheap. As the team noted: “We wanted to give product managers and engineers a way to pressure-test their assumptions at the start of a project, when the right conversation can save months of rework.”

Why This Matters for Your Development Team

The release of RAMPART and Clarity marks a turning point for three reasons:

1. AI Safety Becomes a Build-Time Concern

Safety is no longer something you bolt on at the end with a manual review. It becomes a continuous, automated check that runs on every commit — the same way we treat security scanning for traditional applications. This is the DevSecOps model applied to AI agents.

2. Red-Team Findings Become Regression Tests

Every vulnerability your security team discovers can be encoded as a repeatable RAMPART test. That finding never slips through again. Over time, your safety test suite becomes a living record of every attack vector your agents have been hardened against.

3. The Probabilistic Testing Gap Closes

The statistical trial approach is a genuine innovation. Until now, testing probabilistic AI systems with deterministic test frameworks has been awkward at best. RAMPART’s threshold-based approach finally gives teams a principled way to assert safety properties over non-deterministic outputs.

Getting Started: A Practical Checklist

If your team is building or deploying AI agents, here is how to start incorporating these tools:

  • Audit your current agent testing: What safety tests do you run today? Most teams will find the answer is “not enough”
  • Add RAMPART to your CI pipeline: Start with the most critical attack vectors — prompt injection and unauthorised tool use — and expand coverage iteratively
  • Set statistical thresholds: Define acceptable safety rates for each agent action, accounting for probabilistic behaviour
  • Run Clarity sessions on new agent features: Before your next sprint planning, use Clarity to pressure-test the assumptions behind any new agentic feature
  • Encode every incident: When something goes wrong in production, turn it into a RAMPART test so it never happens again

The Bigger Picture

RAMPART and Clarity are part of a broader industry shift. We have previously written about the Five Eyes agentic AI guidance and the security blind spots in AI-generated code. What Microsoft has done is provide the tooling to actually implement the practices these frameworks recommend.

The message is clear: if you are shipping AI agents without automated safety testing in your pipeline, you are operating with the same risk posture as a team shipping web applications without a WAF or SAST scanner. The tools now exist. The excuses do not.

At REPTILEHAUS, we build and deploy AI agent systems for clients across Dublin and beyond — and safety testing is baked into every engagement from day one. If your team is looking to integrate AI agents into your workflows and wants to get the safety engineering right from the start, get in touch.


📷 Photo by Jefferson Santos (@jefflssantos) on Unsplash