How do I test a Relevance AI agent?

Connect your Relevance AI agent to TestMu AI, ingest your sales docs, support docs, or another knowledge source, and let it auto-generate scenarios. An AI evaluator then chats with your agent like a real lead or customer across thousands of scenarios, checking its answers, knowledge grounding, and the actions it calls, and returns a clear pass or fail for each.

Why should I test agents built on Relevance AI?

Relevance AI lets ops, sales, and support teams build an AI workforce without code, but a launched BDR or support agent still researches accounts, writes to your CRM, sends outreach, and quotes your knowledge base. A small edit to a prompt, action, or data source can quietly break a workflow that worked yesterday. Independent testing catches wrong answers, invented case studies, broken action calls, and lost context before a prospect ever sees them.

What can TestMu AI test in a Relevance AI agent?

Conversation accuracy across thousands of scenarios, correct tool and action calls like CRM updates and email sends, knowledge grounding and hallucination guardrails, multi-turn memory, handoffs between agents on your canvas, escalation to a human, and regression after every change. Each response is scored on 9 quality metrics with customizable pass/fail thresholds.

How does TestMu AI test multi-agent teams on Relevance AI?

An AI evaluator chats with your team's entry point and follows the work as it routes between agents, for example from a research agent to an outreach agent to a human. It checks that context survives each handoff, that routing and escalation rules fire correctly, and that every agent stays accurate and on task, scoring each step on the same 9 quality metrics.

Can I test the tools and actions my Relevance AI agent calls?

Yes. When your agent updates a CRM, sends an email through a campaign tool, searches the web, or calls a custom API, TestMu AI verifies the action fires with the right parameters, not hallucinated inputs. It tracks each call as Pass, Fail, or Partial against your criteria so a broken or wrong call is caught before it touches a live account.
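
To make the Pass, Fail, or Partial idea concrete, here is a minimal Python sketch of how one recorded action call could be graded against an expected call. It is not TestMu AI's implementation: the dictionary shape, the `grade_action_call` name, and the exact rule for Partial are assumptions chosen for illustration.

```python
from typing import Any, Dict, Optional


def grade_action_call(expected: Dict[str, Any], actual: Optional[Dict[str, Any]]) -> str:
    """Grade one recorded action call against the expected call.

    Assumed rule (hypothetical, not taken from TestMu AI):
    - Fail: no call was made, the wrong action was called, or every
      expected parameter is wrong.
    - Pass: the right action was called and every expected parameter matches.
    - Partial: the right action was called but only some parameters match.
    """
    if actual is None or actual.get("name") != expected["name"]:
        return "Fail"

    expected_params = expected.get("params", {})
    actual_params = actual.get("params", {})
    mismatched = [
        key for key, value in expected_params.items()
        if actual_params.get(key) != value
    ]

    if not mismatched:
        return "Pass"
    if len(mismatched) == len(expected_params):
        return "Fail"
    return "Partial"


# Hypothetical CRM update the scenario expects the agent to make.
expected_call = {"name": "update_crm", "params": {"account_id": "A-1", "stage": "Qualified"}}
actual_call = {"name": "update_crm", "params": {"account_id": "A-1", "stage": "Lost"}}
print(grade_action_call(expected_call, actual_call))  # Partial
```

Comparing parameters one by one is what separates a call that merely fired from a call that fired with the right inputs.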

Can I automate Relevance AI regression testing in CI/CD?

Yes. TestMu AI supports scheduled runs using preset frequencies or full custom cron expressions with IANA timezone support. You can also trigger runs from your CI/CD pipeline to catch regressions every time you change a prompt, action, or knowledge source before you redeploy the agent.
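
Why the IANA timezone matters: a standard five-field cron expression such as `0 9 * * 1-5` (09:00 on weekdays) describes wall-clock time, so the matching UTC instant shifts when daylight saving time changes. The sketch below uses only the Python standard library to show that shift for `America/New_York`. The function name is hypothetical, and the exact cron syntax TestMu AI accepts is not stated here, so treat the expression as an illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+


def local_run_in_utc(year: int, month: int, day: int,
                     hour: int, minute: int, tz_name: str) -> datetime:
    """Return the UTC instant for a wall-clock run time in an IANA zone."""
    local = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# The same 09:00 weekday run, once in winter (EST) and once in summer (EDT).
winter = local_run_in_utc(2025, 1, 15, 9, 0, "America/New_York")
summer = local_run_in_utc(2025, 7, 15, 9, 0, "America/New_York")

print(winter.isoformat())  # 2025-01-15T14:00:00+00:00
print(summer.isoformat())  # 2025-07-15T13:00:00+00:00
```

A schedule pinned to a fixed UTC offset would drift by an hour against local business hours twice a year; a named IANA zone avoids that.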

Does this replace Relevance AI's own tooling?

No, it complements it. TestMu AI is an independent QA layer that evaluates your agent from the outside, the way a customer would. Keep building in Relevance AI and use KaneAI and agent testing to prove the agent behaves correctly before and after every release.

Test the AI Workforce You Build on Relevance AI

Deploy autonomous AI evaluators to test the agents you build on Relevance AI across thousands of scenarios. Catch hallucinated answers, broken actions, and lost context before a real account hits them.

Trusted by 2M+ users globally

Deep Dive into Relevance AI Testing

AI-native evaluators that plan, run, and score your Relevance AI agent conversations, tool calls, and handoffs across thousands of scenarios.

Agent Conversations

Multi-Agent Teams

Scenario Generation

Go-Live Assessment

Test Your Relevance AI Agent Conversations

Your Relevance AI support or sales agent answers customers and qualifies leads across chat and phone. Score every turn on 9 quality metrics.

9 Quality Metrics

Score bias, hallucination, completeness, context awareness, and response quality on every pricing or refund request.

Grounded in Your Knowledge

Confirm the agent answers from your uploaded docs and connected knowledge base instead of inventing a policy or case study.

Go-Live Verdict

Get a Green, Yellow, or Red production-readiness call before a BDR or support agent goes live.

Complete Relevance AI Testing Coverage

Confidence by Evaluation Volume

HIGH (100+ evaluations), MEDIUM (50-99), LOW (20-49), VERY LOW (below 20). More runs mean a more trustworthy verdict.
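
The bands above map directly to a small lookup. This Python sketch encodes exactly those cut-offs; the function name is hypothetical and is not part of any TestMu AI API.

```python
def confidence_level(evaluations: int) -> str:
    """Map an evaluation count to the confidence band listed above."""
    if evaluations < 0:
        raise ValueError("evaluations must be non-negative")
    if evaluations >= 100:
        return "HIGH"
    if evaluations >= 50:
        return "MEDIUM"
    if evaluations >= 20:
        return "LOW"
    return "VERY LOW"


print(confidence_level(120))  # HIGH
print(confidence_level(35))   # LOW
```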

9 Quality Metrics on Every Reply

Bias, hallucination, completeness, context awareness, response quality, flow, user satisfaction, file handling, and file accuracy on every answer.
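
As a rough illustration of customizable pass/fail thresholds over these nine metrics, the Python sketch below flags any metric whose score falls under its threshold. The metric names come from the list above; the 0.0 to 1.0 scale, the "higher is better" convention, the default threshold of 0.7, and the function name are all assumptions, not documented TestMu AI behavior.

```python
from typing import Dict, List

# Metric names taken from the list above; everything else is assumed.
METRICS = [
    "bias", "hallucination", "completeness", "context_awareness",
    "response_quality", "flow", "user_satisfaction",
    "file_handling", "file_accuracy",
]


def failing_metrics(scores: Dict[str, float],
                    thresholds: Dict[str, float],
                    default: float = 0.7) -> List[str]:
    """Return the metrics whose score is below its threshold.

    Assumes every score is normalized to 0.0-1.0 with higher meaning better,
    and that a missing score counts as 0.0.
    """
    return [
        metric for metric in METRICS
        if scores.get(metric, 0.0) < thresholds.get(metric, default)
    ]


scores = {metric: 0.9 for metric in METRICS}
scores["hallucination"] = 0.6
# Hold hallucination to a stricter bar than the default.
print(failing_metrics(scores, {"hallucination": 0.95}))  # ['hallucination']
```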

4-Dimension Go-Live Assessment

Each run scores Functional Completeness, Quality Standards, Risk Profile, and Operational Readiness, each weighted at 25%.
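
With four dimensions weighted at 25% each, the overall score is a plain average. The Python sketch below works that arithmetic through; the 0 to 100 scale, the key names, and the function name are assumptions, and the cut-offs that turn a score into a Green, Yellow, or Red verdict are not given here, so they are left out.

```python
from typing import Dict

# Dimension names and the 25% weights come from the text above.
WEIGHTS = {
    "functional_completeness": 0.25,
    "quality_standards": 0.25,
    "risk_profile": 0.25,
    "operational_readiness": 0.25,
}


def go_live_score(dimension_scores: Dict[str, float]) -> float:
    """Combine the four dimension scores using the equal 25% weights.

    Assumes each dimension is scored on a 0-100 scale (hypothetical).
    """
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise KeyError(f"missing dimension scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


run = {
    "functional_completeness": 80,
    "quality_standards": 90,
    "risk_profile": 70,
    "operational_readiness": 60,
}
print(go_live_score(run))  # 75.0
```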

Pass/Fail on Tools and Answers

Pinpoint every match and discrepancy in your Relevance AI agent's replies and action calls, tracked as Pass, Fail, or Partial against your criteria.

Built for Every Layer of Relevance AI QA

Project and Environment Management

Create test projects, manage environments, and scope variables with bulk creation across your agents.

Test Profiles and Personas

Inject reusable account data and run scenarios across a pre-built or custom persona library, from skeptical buyers to frustrated customers.

Custom Validation Criteria

Define evidence-based pass/fail rules per scenario, with High/Medium/Low confidence tracking on answers and action calls.

Security and Infrastructure

Execute via HyperExecute with optional secure tunnels for firewall-restricted Relevance AI agent endpoints.

Scheduling Engine

Automate runs using preset frequencies or full custom cron expressions with IANA timezone support.

Observability and Reporting

Monitor agent performance with unified dashboards, exportable reports, and real-time quality trends.

Success Stories of TestMu AI (Formerly LambdaTest)

50% reduction in test execution time

“HyperExecute is a highly reliable test execution platform and has excellent customer support.”

Sagar Uday Kumar

Sr. Engineering Manager

Some Love from our Customers

As Best Egg expanded its product offerings and entered new markets, we knew our old testing infrastructure couldn’t keep up.
With support from Tenny Agustin, our Engineering Operations Lead, we modernized our approach with

TestMu AI

Best Egg

Excited to Share My Learning Journey with Kane AI & Lambda Tool!
I'm pleased to announce that I've recently gained hands-on experience exploring Kane AI through the Lambda Tool and it’s been a fantastic journey of upskilling!

KaneAI

Suryateja Goud

See how is #Futureready to enable blazing-fast test orchestration seamlessly integrated with organizations' existing CI/CD platforms, using #Microsoft Azure.

TestMu AI

Microsoft India

TestMu AI for Enterprise

Get access to solutions built on enterprise-grade security, privacy, and compliance

Advanced access controls
Advanced data retention rules
Advanced Local Testing
Premium Support options
Early access to beta features
Private Slack Channel
Unlimited Manual Accessibility DevTools Tests