How do I test a Decagon agent?

Connect your Decagon support agent to TestMu AI, ingest the same return policies and product docs your Agent Operating Procedures (AOPs) run on, and let TestMu AI auto-generate scenarios. An AI evaluator then chats with your agent like a real customer across thousands of intents, scoring every answer and action call with a clear pass or fail.

Why should I test agents built on Decagon?

Decagon ships Agent Operating Procedures fast, but a live AOP still quotes your refund policy, processes cancellations, and calls into Zendesk, Salesforce, or Intercom. Editing one procedure or one help article can quietly break a path that worked yesterday. Independent testing catches invented policies, wrong retrievals, broken refund actions, and mis-escalations before a customer hits them.

What can TestMu AI test in a Decagon agent?

TestMu AI covers AOP adherence across thousands of intents, knowledge grounding and hallucination guardrails on your help center content, correct action calls like refunds or order lookups, sensitive-topic escalation and human handoff, multi-turn memory, and regression after every procedure change. Each response is scored on 9 quality metrics with customizable pass/fail thresholds.

How does TestMu AI compare to Decagon Simulations and Watchtower?

Decagon's Simulations and Watchtower are built into the platform you author AOPs in. TestMu AI is an independent QA layer that probes the same agent from the outside, the way a customer would, with no access to your internal logic. Many teams run both: Decagon's tools while authoring, TestMu AI as a separate gate that signs off each release.

Can TestMu AI test my Decagon agent if it answers calls?

Yes. If your Decagon agent runs on a voice channel through Decagon Voice, the same evaluator validates spoken conversations against the 9 quality metrics, so a caller asking to cancel a subscription gets the same accurate, grounded answer your chat customers do.

Can I automate Decagon regression testing in CI/CD?

Yes. TestMu AI supports scheduled runs using preset frequencies or full custom cron expressions with IANA timezone support. You can also trigger a run from your CI/CD pipeline whenever an AOP changes, so regressions surface before deployment, not in production.

Test the Support Agents You Build on Decagon

Deploy autonomous AI evaluators against your Decagon agents and the AOPs they run. Replay refund and cancellation conversations to catch invented policies and broken action calls before real customers do.

Deep Dive into Decagon Testing

AI-native evaluators that generate scenarios, replay every AOP, and score your Decagon agents on 9 quality metrics before customers hit them.

AOP Conversations

Knowledge Grounding

Scenario Generation

Go-Live Assessment

Test the AOPs Your Agent Runs On

Decagon agents run the Agent Operating Procedures your team writes. Replay every AOP branch across 9 quality metrics to confirm customers get the path you intended.

Procedure Adherence

When a customer says their package never arrived, confirm the agent follows your reshipment AOP and checks eligibility instead of jumping to a refund it was never allowed to offer.

Scenarios From Your Help Center

Auto-generate 60 to 100 or more test conversations from your return policies, product docs, and the Zendesk or Intercom knowledge your agent already reads.

Go-Live Verdict Per AOP

Get a Green, Yellow, or Red readiness call before you ship a new Agent Operating Procedure or change an existing refund flow.

Complete Coverage for Your Decagon AOPs

Confidence by Evaluation Volume

HIGH (100+ evaluations), MEDIUM (50-99), LOW (20-49), VERY LOW (below 20). Know whether a passing AOP was tested enough to trust at launch.
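The bands above can be expressed as a small lookup. This is only an illustrative sketch: the function name `confidence_level` is hypothetical and is not part of any TestMu AI API; only the cutoffs come from the text.

```python
def confidence_level(evaluations: int) -> str:
    """Map an evaluation count to the confidence bands listed above."""
    if evaluations < 0:
        raise ValueError("evaluation count cannot be negative")
    if evaluations >= 100:
        return "HIGH"
    if evaluations >= 50:
        return "MEDIUM"
    if evaluations >= 20:
        return "LOW"
    return "VERY LOW"


# Print the band for a few sample evaluation counts.
for count in (120, 75, 20, 5):
    print(count, confidence_level(count))
```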

9 Quality Metrics Across Every Conversation

Bias, hallucination, completeness, context awareness, response quality, flow, user satisfaction, file handling, and file accuracy, scored on every refund, lookup, and cancellation path.
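Elsewhere the page mentions customizable pass/fail thresholds for these metrics. A minimal sketch of such a check might look like the following. It assumes every metric is normalized to a 0 to 1 score where higher is better; that scale, the default of 0.8, and the function name `failing_metrics` are all assumptions for illustration, not documented TestMu AI behavior.

```python
# Metric names taken from the list above; the identifiers are illustrative.
METRICS = (
    "bias", "hallucination", "completeness", "context_awareness",
    "response_quality", "flow", "user_satisfaction",
    "file_handling", "file_accuracy",
)


def failing_metrics(scores: dict, thresholds: dict, default: float = 0.8) -> list:
    """Return the metrics whose score falls below its threshold.

    Assumes scores in [0, 1] with higher meaning better (hypothetical scale).
    Metrics without a custom threshold use `default`.
    """
    return [m for m in METRICS if scores[m] < thresholds.get(m, default)]


# One conversation that scores well everywhere except hallucination.
scores = {m: 0.9 for m in METRICS}
scores["hallucination"] = 0.5
print(failing_metrics(scores, thresholds={}))
```

Raising a single threshold, for example `{"flow": 0.95}`, tightens that metric without touching the others.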

4-Dimension Go-Live Assessment

Each AOP run scores Functional Completeness, Quality Standards, Risk Profile, and Operational Readiness, each weighted at 25%.
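With four equal weights, the overall score is simply the mean of the four dimension scores. The sketch below assumes each dimension is scored from 0 to 100; that scale and the name `go_live_score` are assumptions, and the cutoffs that turn a score into a Green, Yellow, or Red verdict are not stated here, so they are left out.

```python
# Equal 25% weights, as described above. Key names are illustrative.
WEIGHTS = {
    "functional_completeness": 0.25,
    "quality_standards": 0.25,
    "risk_profile": 0.25,
    "operational_readiness": 0.25,
}


def go_live_score(scores: dict) -> float:
    """Combine the four dimension scores into one weighted score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


example = {
    "functional_completeness": 90,
    "quality_standards": 80,
    "risk_profile": 70,
    "operational_readiness": 60,
}
print(go_live_score(example))  # 75.0
```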

Pass/Fail Analysis Output

Pinpoint every match and discrepancy in your Decagon agent's answers and action calls, tracked as Pass, Fail, or Partial against your criteria.
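As a rough illustration of the three-way tracking, the following sketch tallies per-criterion outcomes for one run. The `Outcome` enum and `summarize` helper are hypothetical names, not a TestMu AI export format.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    PARTIAL = "Partial"


def summarize(outcomes: list) -> dict:
    """Count how many criteria ended as Pass, Fail, or Partial."""
    counts = Counter(outcomes)
    # Always report all three labels, even when a count is zero.
    return {outcome.value: counts.get(outcome, 0) for outcome in Outcome}


run = [Outcome.PASS, Outcome.PASS, Outcome.PARTIAL, Outcome.FAIL, Outcome.PASS]
print(summarize(run))  # {'Pass': 3, 'Fail': 1, 'Partial': 1}
```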

Built for Every Layer of Decagon QA

Project and Environment Management

Keep a separate test project per AOP, manage staging and production agent endpoints, and scope variables, with bulk creation supported.

Test Profiles and Personas

Run scenarios across calm, confused, and frustrated customer personas, with reusable order and account data injected per run.
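Conceptually, this is a cross product of scenarios and personas, with the same reusable data attached to each run. The sketch below shows that expansion; the scenario names, the `order_id` field, and the `build_runs` helper are made up for illustration, and only the three persona labels come from the text.

```python
from itertools import product

# Persona labels from the description above.
PERSONAS = ("calm", "confused", "frustrated")


def build_runs(scenarios: list, test_data: dict) -> list:
    """Pair every scenario with every persona and attach reusable test data."""
    return [
        # Copy the data so one run cannot mutate another run's inputs.
        {"scenario": scenario, "persona": persona, "data": dict(test_data)}
        for scenario, persona in product(scenarios, PERSONAS)
    ]


runs = build_runs(
    ["package_never_arrived", "cancel_subscription"],  # hypothetical scenarios
    {"order_id": "ORDER-123"},                         # hypothetical test data
)
print(len(runs))  # 6
```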

Custom Validation Criteria

Define evidence-based pass/fail rules per scenario, with High/Medium/Low confidence tracking on every refund and policy answer.

Security and Infrastructure

Execute via HyperExecute with optional secure tunnels for firewall-restricted Decagon agent endpoints.

Scheduling Engine

Automate runs using preset frequencies or full custom cron expressions with IANA timezone support.
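To see why the timezone matters, consider a daily run at 09:00 in a named IANA zone, which a standard five-field cron expression would write as `0 9 * * *`. The sketch below uses only Python's standard library to find the next such run and express it in UTC. It is not TestMu AI's scheduler, and `next_daily_run` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_run(now: datetime, hour: int, minute: int, tz_name: str) -> datetime:
    """Return the next daily run at hour:minute in tz_name, expressed in UTC.

    `now` must be timezone-aware.
    """
    tz = ZoneInfo(tz_name)
    local_now = now.astimezone(tz)
    candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= local_now:
        # Today's slot has already passed, so schedule tomorrow's.
        candidate += timedelta(days=1)
    return candidate.astimezone(timezone.utc)


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(next_daily_run(now, 9, 0, "America/New_York"))  # 2024-01-15 14:00:00+00:00
```

Because the wall-clock time is fixed in the named zone, the UTC instant shifts automatically when that zone changes its offset for daylight saving time.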

Observability and Reporting

Track agent quality across AOP releases with unified dashboards, exportable reports, and real-time regression trends.

Success Stories of TestMu AI (Formerly LambdaTest)

50%

reduction in test execution time

“HyperExecute is a highly reliable test execution platform and has excellent customer support.”

Sagar Uday Kumar

Sr. Engineering Manager

Some Love from our Customers

As Best Egg expanded its product offerings and entered new markets, we knew our old testing infrastructure couldn’t keep up.
With support from Tenny Agustin, our Engineering Operations Lead, we modernized our approach with

TestMu AI

Best Egg

Excited to Share My Learning Journey with Kane AI & Lambda Tool!
I'm pleased to announce that I've recently gained hands-on experience exploring Kane AI through the Lambda Tool and it’s been a fantastic journey of upskilling!

KaneAI

Suryateja Goud

See how is #Futureready to enable blazing-fast test orchestration seamlessly integrated with organizations' existing CI/CD platforms, using #Microsoft Azure.

TestMu AI

Microsoft India

TestMu AI for Enterprise

Get access to solutions built on enterprise-grade security, privacy, and compliance

Advanced access controls
Advanced data retention rules
Advanced Local Testing
Premium Support options
Early access to beta features
Private Slack Channel
Unlimited Manual Accessibility DevTools Tests