Agent Features & Metrics - Customer Reference Guide

Agent Types Overview

The platform supports 5 agent types, each designed for a specific testing scenario:


| Agent Type | Primary Use Case | Key Differentiator |
| --- | --- | --- |
| Chat | Test text-based chatbot agents | Multi-turn text conversation evaluation with 9 quality metrics |
| Voice | Test chatbot agents via audio conversations | Same as Chat, but conversations happen as audio (WAV) instead of text |
| Phone Caller Inbound | Test voice agents that receive calls | Pre-evaluation (live simulated calls) + Post-evaluation (production recording analysis) |
| Phone Caller Outbound | Test voice agents that make calls | Pre-evaluation (live outbound calls) + Post-evaluation (production recording analysis) |
| Image Analyzer | Validate AI-generated images | Image quality scoring against prompts and brand guidelines |

Chat Agent


📄 Workflow-Based Test Generation

Connect your knowledge sources and let the platform auto-generate test scenarios, with no manual scripting needed.

  • Document Upload: Upload knowledge base documents (PDF, text files) that your chatbot is expected to understand
  • Source Integrations: Connect Confluence, JIRA, or GitHub as knowledge sources
  • AI Test Generation: Automatically generates test scenarios from uploaded documents
  • Real-time Progress: Live streaming of test generation progress

🎭 Scenario Management

Build and manage the exact conversations you want to test, manually or via AI.

  • Manual Scenario Creation: Create test scenarios with title, description, and expected behavior
  • AI-Generated Scenarios: Auto-generate scenarios from uploaded knowledge sources
  • Validation Criteria: Define custom pass/fail criteria per scenario (e.g., "Agent must mention return policy", "Agent should not hallucinate product prices")
  • Special Instructions: Add specific instructions to guide scenario execution
  • Persona Assignment: Assign user personas to scenarios (e.g., "frustrated customer", "first-time user")
  • Test Profile Association: Link test data profiles to scenarios for data-driven testing
  • Scenario Deletion: Remove scenarios that are no longer needed

🗂️ Test Suites

Group related scenarios together and track results over time.

  • Suite Creation: Group multiple scenarios into test suites
  • Test Profile Selection: Assign a test data profile to the entire suite
  • Run History: View all past runs with status, score, and timestamps
  • Status Filtering: Filter results by Passed, Failed, In Progress

🔌 Endpoint Profiles

Configure how the platform connects to your agent's API, from simple REST calls to multi-phase auth flows.

  • Postman Collection Import: Upload Postman collections (with optional environment files) to configure your agent's API. Supports nested folders, collection variables, and environment variable substitution
  • Manual Configuration: Define endpoints via JSON with URL, method, headers, and body
  • Multi-Phase Execution: Configure suite_setup (login/auth), scenario_setup (session creation), and chat (conversation) phases
  • Variable Management: Define static values, auto-generated values (UUID, mobile number, timestamp, email), and extracted values from API responses
  • Retry & Caching: Configure retry-on-failure and cache suite setup results to speed up execution
  • Test Endpoint: Dry-run your endpoint profile to verify connectivity before running full evaluations
  • Import/Export: Export profiles as JSON and import across projects
  • Default Profile: Mark one profile as the default for quick evaluation runs
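
To make the three phases and the variable kinds more concrete, here is a minimal sketch that models a profile as a plain Python dictionary and resolves placeholders phase by phase. The key names (`variables`, `phases`, `extract`), the `{{name}}` placeholder syntax, and the example.com URLs are assumptions made for illustration; they are not the platform's actual profile schema.

```python
import re
import uuid

# Hypothetical profile layout: every key name and URL here is illustrative.
profile = {
    "variables": {
        "tenant": {"kind": "static", "value": "acme"},
        "session_id": {"kind": "generated", "generator": "uuid"},
    },
    "phases": {
        "suite_setup": {  # runs once per suite, e.g. login
            "method": "POST",
            "url": "https://agent.example.com/login",
            "body": {"tenant": "{{tenant}}"},
            "extract": {"token": "access_token"},  # variable <- response field
        },
        "scenario_setup": {  # runs once per scenario, e.g. session creation
            "method": "POST",
            "url": "https://agent.example.com/sessions/{{session_id}}",
            "headers": {"Authorization": "Bearer {{token}}"},
        },
        "chat": {  # runs once per conversation turn
            "method": "POST",
            "url": "https://agent.example.com/sessions/{{session_id}}/messages",
            "headers": {"Authorization": "Bearer {{token}}"},
            "body": {"message": "{{user_message}}"},
        },
    },
}

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def resolve_variables(spec):
    """Turn static and auto-generated variable definitions into values."""
    values = {}
    for name, definition in spec.items():
        if definition["kind"] == "static":
            values[name] = definition["value"]
        elif definition["kind"] == "generated" and definition["generator"] == "uuid":
            values[name] = str(uuid.uuid4())
    return values


def render(template, values):
    """Recursively substitute {{name}} placeholders in strings, dicts, and lists."""
    if isinstance(template, str):
        return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)
    if isinstance(template, dict):
        return {key: render(item, values) for key, item in template.items()}
    if isinstance(template, list):
        return [render(item, values) for item in template]
    return template


def extract(response, mapping, values):
    """Copy fields from a response dict into the variable store."""
    for variable, field in mapping.items():
        values[variable] = response[field]


values = resolve_variables(profile["variables"])
login = render(profile["phases"]["suite_setup"], values)
# Stand-in for the real login response; no network call is made here.
extract({"access_token": "abc123"}, login["extract"], values)
values["user_message"] = "Where is my order?"
chat = render(profile["phases"]["chat"], values)
```

Rendering one phase at a time matters in this sketch: the `token` variable only exists after `suite_setup` has run, so rendering the `chat` phase earlier would raise a `KeyError`.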

🗃️ Test Profiles (Test Data)

Create reusable data sets to power data-driven testing across multiple scenarios.

  • Custom Key-Value Data: Create reusable test data profiles with typed fields (string, number, boolean, email, URL, JSON)
  • Default Profile: Mark one profile as the default
  • Import/Export: Share test profiles across projects via JSON export/import
  • Data Injection: Test data is injected at runtime for data-driven scenario execution
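
As an illustration of what "typed fields" can mean in practice, the sketch below checks each value in a test profile against its declared type before it is used. The profile layout (field name mapped to a type and a value) and the specific checks are assumptions for this example; the platform's own validation rules may differ.

```python
import json
from urllib.parse import urlparse


def _is_json(value):
    """True if value is a string containing parseable JSON."""
    try:
        json.loads(value)
    except (TypeError, ValueError):
        return False
    return True


def _is_url(value):
    """True for http(s) URLs that include a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def _is_email(value):
    """Deliberately loose check: one '@' and a dot in the domain part."""
    return (
        isinstance(value, str)
        and value.count("@") == 1
        and "." in value.split("@")[1]
    )


# One check per field type listed above.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "email": _is_email,
    "url": _is_url,
    "json": _is_json,
}


def invalid_fields(profile):
    """Return the names of fields whose value fails its declared type check."""
    return [
        name
        for name, (kind, value) in profile.items()
        if not CHECKS[kind](value)
    ]


sample = {
    "customer_name": ("string", "Ada"),
    "order_count": ("number", "three"),
    "support_site": ("url", "https://example.com/help"),
    "payload": ("json", "{not valid"),
}
problems = invalid_fields(sample)  # ["order_count", "payload"]
```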

🧪 Playground

Interactively test your agent configuration before running a full evaluation suite.

  • Interactive Chat: Send messages to your configured agent endpoint in real-time
  • Multi-Turn Conversations: Maintain conversation context across multiple messages
  • Connection Testing: Test connectivity via cURL command verification
  • Schema Analysis: Automatic detection of request/response schema from your endpoint

⚡ Evaluation Execution

Run evaluations at scale with real-time feedback on your agent's quality.

  • Metric Selection: Choose which quality metrics to evaluate (or run all)
  • Endpoint Profile Selection: Pick which endpoint profile to evaluate against
  • HyperExecute Integration: Run evaluations at scale using LambdaTest's HyperExecute infrastructure with optional tunnel configuration for testing agents behind firewalls or private networks
  • Real-Time Streaming: Live progress updates during evaluation via Server-Sent Events

🎚️ Metric Threshold Configuration

Define exactly what "passing" means for your project, then enforce it automatically.

  • Per-Metric Thresholds: Set minimum acceptable score (0.0–1.0) for each metric
  • Higher/Lower is Better: Configure directionality for each metric
  • Named Configurations: Create named threshold configs (e.g., "Strict", "Default")
  • Active/Inactive Toggle: Enable or disable threshold configurations
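
The interaction between a threshold and its directionality can be sketched as follows. This is an illustration only: the `Threshold` class, the metric keys, and the choice to treat hallucination as a lower-is-better metric are assumptions, not the platform's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float                  # on the 0.0-1.0 scale
    higher_is_better: bool = True


def passes(threshold, score):
    """Compare a score to its threshold, respecting directionality."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score for {threshold.metric} must be in [0.0, 1.0]")
    if threshold.higher_is_better:
        return score >= threshold.limit
    return score <= threshold.limit


def evaluate(config, scores):
    """Return a pass/fail flag for every configured metric that has a score."""
    return {
        t.metric: passes(t, scores[t.metric])
        for t in config
        if t.metric in scores
    }


# A hypothetical "Strict" configuration.
strict = [
    Threshold("response_quality", 0.8),
    Threshold("hallucination", 0.1, higher_is_better=False),
]
result = evaluate(strict, {"response_quality": 0.85, "hallucination": 0.25})
# result == {"response_quality": True, "hallucination": False}
```

With a lower-is-better metric the configured value acts as a ceiling rather than a floor, which is why the 0.25 hallucination score fails against a 0.1 limit.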

🚦 Go-Live Assessment

Get a clear, defensible production-readiness verdict before you ship.

Production Readiness Verdicts

| Verdict | Meaning |
| --- | --- |
| 🟢 GREEN | Ready for production |
| 🟡 YELLOW | Ready with caveats |
| 🔴 RED | Not ready |

  • Overall Score: Weighted composite score (0–100)
  • Confidence Level: Based on number of evaluations run
    • HIGH: 100+ evaluations
    • MEDIUM: 50–99 evaluations
    • LOW: 20–49 evaluations
    • VERY_LOW: Fewer than 20 evaluations
  • Dimension Scores: Functional Completeness, Quality Standards, Risk Profile, Operational Readiness (each weighted 25%)
  • Scenario Coverage: Matrix showing well-tested vs. untested scenarios
  • Risk Assessment: AI-powered failure pattern analysis with prioritized action items
  • Validation Criteria Summary: Aggregated compliance rate across all criteria
  • AI Insights: Actionable recommendations for improvement
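
The confidence bands and the equal dimension weights listed above can be written out as a short sketch. It assumes each dimension is scored on the same 0-100 scale as the overall score and that the composite is a plain weighted sum; the source states the weights but not the exact formula, and the dictionary keys are illustrative.

```python
def confidence_level(evaluations):
    """Map the number of evaluations run to a confidence band."""
    if evaluations < 0:
        raise ValueError("evaluations cannot be negative")
    if evaluations >= 100:
        return "HIGH"
    if evaluations >= 50:
        return "MEDIUM"
    if evaluations >= 20:
        return "LOW"
    return "VERY_LOW"


# Four dimensions, each weighted 25%.
DIMENSION_WEIGHTS = {
    "functional_completeness": 0.25,
    "quality_standards": 0.25,
    "risk_profile": 0.25,
    "operational_readiness": 0.25,
}


def overall_score(dimension_scores):
    """Weighted sum of the four dimension scores (assumed 0-100 each)."""
    missing = DIMENSION_WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(
        weight * dimension_scores[name]
        for name, weight in DIMENSION_WEIGHTS.items()
    )


score = overall_score({
    "functional_completeness": 90,
    "quality_standards": 80,
    "risk_profile": 70,
    "operational_readiness": 60,
})  # 75.0
band = confidence_level(42)  # "LOW"
```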

🗓️ Scheduled Runs

Automate ongoing regression coverage without manual intervention.

  • Cron-Based Scheduling: Schedule evaluations with preset frequencies (Hourly, Daily, Weekdays, Weekly, Monthly) or custom cron expressions
  • Timezone Support: Full IANA timezone selection
  • Pause/Resume: Temporarily pause and resume scheduled runs
  • Run History: Track all scheduled execution results
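
For readers less familiar with cron syntax, the sketch below shows how the preset frequencies might map to five-field cron expressions and how such an expression is matched against a point in time. The preset-to-expression mapping is an assumption (the source does not say which minute or hour each preset fires at). The matcher handles only numeric fields with `*`, lists, ranges, and steps; it does not handle month or weekday names, or `7` for Sunday.

```python
from datetime import datetime

# Assumed expressions for the presets; the platform's actual times may differ.
PRESET_CRON = {
    "Hourly": "0 * * * *",
    "Daily": "0 9 * * *",
    "Weekdays": "0 9 * * 1-5",
    "Weekly": "0 9 * * 1",
    "Monthly": "0 9 1 * *",
}

# Allowed range per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand(field, low, high):
    """Expand one cron field into the set of integers it allows."""
    values = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = int(base)
            end = high if step_text else start  # "5/15" runs from 5 upward
        if not low <= start <= end <= high:
            raise ValueError(f"field {field!r} is outside {low}-{high}")
        values.update(range(start, end + 1, step))
    return values


def matches(expression, moment):
    """True if the (already localized) datetime satisfies the expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 space-separated fields")
    minute, hour, dom, month, dow = (
        expand(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    cron_weekday = moment.isoweekday() % 7  # cron counts Sunday as 0
    dom_ok = moment.day in dom
    dow_ok = cron_weekday in dow
    # Standard cron rule: when both day fields are restricted, either may match.
    if fields[2] != "*" and fields[4] != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        moment.minute in minute
        and moment.hour in hour
        and moment.month in month
        and day_ok
    )


monday_9am = datetime(2024, 1, 1, 9, 0)    # 1 January 2024 was a Monday
saturday_9am = datetime(2024, 1, 6, 9, 0)
runs_monday = matches(PRESET_CRON["Weekdays"], monday_9am)      # True
runs_saturday = matches(PRESET_CRON["Weekdays"], saturday_9am)  # False
```

The matcher expects a datetime that has already been converted to the schedule's timezone, so timezone handling stays outside this sketch.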

Voice Agent

Note: The Voice Agent is functionally identical to the Chat Agent, with one key difference: conversations happen as audio (WAV) instead of text. The platform conducts voice-based conversations with your agent and evaluates the audio interaction using the same quality metrics.


All features from the Chat Agent are available, including:

  • Workflow-based test generation with document upload and source integrations
  • Scenario management with AI generation, validation criteria, personas, and special instructions
  • Test suites with test profile selection and run history
  • Endpoint profiles with Postman collection import
  • Test profiles for data-driven testing
  • Playground for interactive testing
  • Evaluation execution with metric selection and HyperExecute integration
  • Metric threshold configuration
  • Go-Live assessment with production readiness verdict
  • Scheduled runs


Phone Caller Inbound Agent

Two Evaluation Modes

| Mode | What It Does |
| --- | --- |
| Pre-evaluation | The platform simulates customers calling your voice agent with live test calls, then evaluates the resulting conversations |
| Post-evaluation | Upload your production call recordings and transcripts from real customer interactions for evaluation on the platform |

📞 Phone Number Management

Register and manage the phone numbers your voice agent answers on.

  • Add Phone Numbers: Register your agent's phone numbers with country code selection (20+ countries supported)
  • Default Phone Number: Set a default number for quick test execution
  • Phone Number Display: Masked display for security with country flag identification
  • Edit & Delete: Update phone number details or remove numbers no longer in use

🎭 Scenario Management

Generate realistic inbound call scenarios at scale with AI or build them manually.

  • AI Scenario Generation: Generate up to 20 inbound test scenarios with configurable personas, languages, and special instructions
  • Manual Scenario Creation: Create scenarios with name, description, and expected output
  • Scenario Deletion: Remove scenarios no longer needed
  • Persona Selection: Choose from available personas or create custom ones to simulate different caller types
  • Language Support: Generate scenarios in multiple languages (English, Spanish, etc.)

🎙️ Voice Configuration (Per Scenario)

Control every detail of how the simulated caller sounds and behaves.

  • Voice Selection: Choose from a library of voices with audio preview (multiple voice providers and accents available)
  • Voice Preview: Listen to voice samples with animated waveform visualization before selecting
  • Background Sound: Enable simulated background noise (15 presets: cafe, street, factory, rain, crowd, market, train, radio interference, etc.)
  • Response Timing: Configure wait time after speech ends (0.5–5.0 seconds) to handle agents that speak in multiple sentences
  • Max Call Duration: Set maximum call length (60–1800 seconds) to prevent runaway calls
  • First Speaker: Choose who speaks first, the simulated user or the agent
👤 Agent Profiles (Inbound-Specific)

Create reusable caller personas to standardize how test calls are placed across suites.

  • Agent Profile Creation: Configure agent personas with name, phone number, voice, and background noise
  • Profile Library: Organization-level reusable agent profiles
  • Active/Inactive Toggle: Enable or disable profiles

🗂️ Test Suites

Batch your inbound scenarios into suites and run them all with a single action.

  • Suite Creation: Group multiple scenarios with per-scenario voice and phone configuration
  • Test Profile Association: Link test data for data-driven voice testing
  • Agent Profile Assignment: Associate agent profiles with suites
  • Run Suites: Execute all scenarios in a suite with a single action

📡 Call Execution & Monitoring

Trigger, track, and manage live test calls in real-time.

  • Initiate Test Calls: Trigger simulated inbound calls to your voice agent
  • Real-Time Monitoring: Track call status as calls progress
  • Call Duration Tracking: Live duration counter during active calls
  • Call Termination: End calls in progress if needed

Phone Caller Outbound Agent

Note: Phone Caller Outbound supports the same two evaluation modes as Inbound (Pre-evaluation and Post-evaluation) and shares all features, with a few key differences in pre-evaluation mode only.


Outbound-Specific Pre-evaluation Features

  • Scenario Generation: Generate up to 7 outbound test scenarios (vs. 20 for inbound)
  • Caller Profile Selection: Select an outbound caller profile when generating scenarios
  • Outbound Number Pool: Reserve phone numbers from the outbound pool for test calls
  • Passive Mode: Listen to outbound calls without interfering (for QA monitoring)
  • First Speaker Default: Agent speaks first (vs. simulator for inbound)

Outbound Pool Management

  • View Pool Status: See available outbound numbers
  • Reservations: View and manage per-suite number reservations
  • Clear Reservations: Release reserved numbers when done

Tip: All other features are identical to Phone Caller Inbound: phone number management, voice configuration, test suites, call execution, post-evaluation (voice analytics, recording upload, batch analysis), go-live assessment, metrics, and scheduling.


Image Analyzer Agent


🖼️ Image Analysis

Upload single images or batch-process up to 50 at once, via file upload or URL.

  • Single Image Analysis: Upload an image (or provide a URL) along with the original prompt used to generate it
  • Batch Analysis: Analyze up to 50 images at once with a shared prompt
  • Drag & Drop Upload: Drag and drop images directly into the upload area
  • Supported Formats: JPG, JPEG, PNG, GIF, WEBP, BMP (max 20 MB per image)
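
A client-side pre-check against the limits above could look like the following sketch. It is illustrative only: it inspects file names and sizes rather than image content, and it assumes "20 MB" means 20 × 1024 × 1024 bytes, which the source does not specify.

```python
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: 20 MB read as 20 MiB
MAX_BATCH_SIZE = 50


def validate_batch(images):
    """Check (filename, size_in_bytes) pairs and return a list of problems."""
    problems = []
    if len(images) > MAX_BATCH_SIZE:
        problems.append(
            f"batch has {len(images)} images; the limit is {MAX_BATCH_SIZE}"
        )
    for name, size in images:
        extension = PurePath(name).suffix.lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {extension or '(none)'}")
        if size > MAX_IMAGE_BYTES:
            problems.append(f"{name}: larger than 20 MB")
    return problems


batch = [
    ("hero-banner.PNG", 2_500_000),
    ("logo.tiff", 40_000),
    ("poster.jpg", 25 * 1024 * 1024),
]
issues = validate_batch(batch)
# issues lists logo.tiff (format) and poster.jpg (size); hero-banner.PNG passes
```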

🎯 Custom Evaluation Criteria

Define what "good" means for your images by choosing from three criteria types:


  • Allowed colors and prohibited colors
  • Required fonts
  • Logo requirements (text description)

All criteria support Active/Inactive toggling, create/edit/delete operations, and search by name, description, or type.


📋 Analysis History

  • Search: Find past analyses by image name or prompt text
  • Status Tracking: View analysis status (Pending, Completed, Failed)
  • Detailed View: Click any analysis to view full results
  • Bookmarking: Bookmark important analyses for quick access

📊 Analytics Dashboard

  • Overall Statistics: Average score, highest score, lowest score, total analyses count
  • Quality Trends: Daily score breakdown over the last 30 days with bar chart visualization
  • Prompt Performance: Top 20 prompts ranked by average quality score, showing min/max scores, usage count, and comparison vs. overall average
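
The prompt performance ranking can be approximated with a few lines of aggregation. This sketch assumes the overall average is taken over all analyses (not over per-prompt averages) and that the comparison is a simple difference; the source does not state either detail, and the field names are illustrative.

```python
from collections import defaultdict
from statistics import mean


def prompt_performance(analyses, top_n=20):
    """Rank prompts by average score from (prompt, score) pairs."""
    if not analyses:
        return []
    by_prompt = defaultdict(list)
    for prompt, score in analyses:
        by_prompt[prompt].append(score)
    overall = mean(score for _, score in analyses)
    rows = [
        {
            "prompt": prompt,
            "average": mean(scores),
            "min": min(scores),
            "max": max(scores),
            "uses": len(scores),
            "vs_overall": mean(scores) - overall,
        }
        for prompt, scores in by_prompt.items()
    ]
    rows.sort(key=lambda row: row["average"], reverse=True)
    return rows[:top_n]


history = [
    ("red sneaker on white background", 80),
    ("red sneaker on white background", 90),
    ("city skyline at dusk", 60),
]
ranking = prompt_performance(history)
# The sneaker prompt ranks first: average 85 over 2 uses, min 80, max 90.
```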

Shared Features Across Agents


🗂️ Project Management

  • Create Agents: Name, description, and agent type selection
  • Agent Listing: View all agents with type-specific icons and filtering
  • Agent Dashboard: Overview of workflows, suites, and recent activity

📋 Test Profiles

Available for: Chat · Voice · Phone Caller (Inbound/Outbound)
  • Custom key-value data with typed fields (string, number, boolean, email, URL, textarea, JSON)
  • Default profile marking
  • Import/Export as JSON

🎭 Personas

Available for: Chat · Voice · Phone Caller (Inbound/Outbound)
  • Pre-built persona library
  • Custom persona creation
  • Persona assignment to scenarios

🗓️ Scheduling

Available for: Chat · Voice · Phone Caller (Inbound/Outbound)
  • Cron-based scheduling (Hourly, Daily, Weekdays, Weekly, Monthly, Custom)
  • Timezone support (full IANA timezone list)
  • Pause/Resume/Delete scheduled runs
  • View next scheduled run time and run history

🚦 Go-Live Assessment

Available for: Chat · Voice · Phone Caller (Inbound/Outbound)

| Verdict | Score Range | Meaning |
| --- | --- | --- |
| 🟢 GREEN | ≥ 80 | Ready for Production |
| 🟡 YELLOW | 65–79 | Ready with Caveats |
| 🔴 RED | < 65 | Not Ready |

  • Overall score with confidence level
  • Scenario coverage analysis
  • AI-powered risk assessment and recommendations
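
The score ranges in the verdict table translate directly into a lookup. The sketch below assumes that fractional scores between 79 and 80 fall into YELLOW, a case the table's whole-number ranges do not address.

```python
def verdict(score):
    """Map an overall score (0-100) to a go-live verdict."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "GREEN"    # Ready for Production
    if score >= 65:
        return "YELLOW"   # Ready with Caveats
    return "RED"          # Not Ready


results = {score: verdict(score) for score in (92, 80, 79, 65, 64)}
# {92: "GREEN", 80: "GREEN", 79: "YELLOW", 65: "YELLOW", 64: "RED"}
```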

🌐 Environment Management

Available for: All agent types
  • Create and manage test environments
  • Variable management (name, value, type, persistence)
  • Per-environment variable scoping
  • Bulk variable creation

✅ Validation Criteria

Available for: Chat · Voice · Phone Caller (Inbound/Outbound)
  • Custom pass/fail criteria per scenario
  • Evidence-based validation with confidence levels (High/Medium/Low)
  • Compliance percentage tracking
  • Per-criterion results: Pass, Fail, or Unable to Verify
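
One way to turn per-criterion results into a compliance percentage is sketched below. It assumes that "Unable to Verify" results are left out of the denominator. That is a guess about the platform's behavior, and counting them as failures would give a lower rate. The lowercase result labels are also illustrative.

```python
from collections import Counter

VALID_RESULTS = {"pass", "fail", "unable_to_verify"}


def compliance_summary(results):
    """Summarize criterion results; rate is None when nothing was verifiable."""
    counts = Counter(results)
    unknown = set(counts) - VALID_RESULTS
    if unknown:
        raise ValueError(f"unexpected result values: {sorted(unknown)}")
    verified = counts["pass"] + counts["fail"]
    rate = 100.0 * counts["pass"] / verified if verified else None
    return {
        "pass": counts["pass"],
        "fail": counts["fail"],
        "unable_to_verify": counts["unable_to_verify"],
        "compliance_percent": rate,
    }


summary = compliance_summary(
    ["pass", "pass", "pass", "fail", "unable_to_verify"]
)
# 3 of the 4 verifiable criteria passed, so compliance_percent is 75.0
```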

Feature Availability Matrix

| Feature | Chat | Voice | Phone Caller Inbound | Phone Caller Outbound | Image Analyzer |
| --- | --- | --- | --- | --- | --- |
| Workflow & Document Upload | ✅ | ✅ | ❌ | ❌ | ❌ |
| AI Scenario Generation | ✅ | ✅ | ✅ (up to 20) | ✅ (up to 7) | ❌ |
| Manual Scenario Creation | ✅ | ✅ | ✅ | ✅ | ❌ |
| Test Suites | ✅ | ✅ | ✅ | ✅ | ❌ |
| Endpoint Profiles | ✅ | ✅ | ❌ | ❌ | ❌ |
| Test Profiles | ✅ | ✅ | ✅ | ✅ | ❌ |
| Personas | ✅ | ✅ | ✅ | ✅ | ❌ |
| Playground | ✅ | ✅ | ❌ | ❌ | ❌ |
| Audio-Based Conversations | ❌ | ✅ | ❌ | ❌ | ❌ |
| Phone Numbers | ❌ | ❌ | ✅ | ✅ | ❌ |
| Voice Selection (per scenario) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Background Noise Simulation | ❌ | ❌ | ✅ | ✅ | ❌ |
| Live Call Execution (Pre-eval) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Production Recording Upload (Post-eval) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Batch Recording Analysis | ❌ | ❌ | ✅ | ✅ | ❌ |
| Call Recording Playback | ❌ | ❌ | ✅ | ✅ | ❌ |
| DTMF Support | ❌ | ❌ | ✅ | ✅ | ❌ |
| Passive Mode | ❌ | ❌ | ❌ | ✅ | ❌ |
| Outbound Pool | ❌ | ❌ | ❌ | ✅ | ❌ |
| Image Upload & Analysis | ❌ | ❌ | ❌ | ❌ | ✅ |
| Custom Evaluation Criteria | ❌ | ❌ | ❌ | ❌ | ✅ |
| Brand Guideline Checks | ❌ | ❌ | ❌ | ❌ | ✅ |
| Image Analytics Dashboard | ❌ | ❌ | ❌ | ❌ | ✅ |
| Metric Thresholds | ✅ | ✅ | ✅ | ✅ | ❌ |
| Go-Live Assessment | ✅ | ✅ | ✅ | ✅ | ❌ |
| Validation Criteria | ✅ | ✅ | ✅ | ✅ | ❌ |
| Scheduled Runs | ✅ | ✅ | ✅ | ✅ | ❌ |
| HyperExecute Integration | ✅ | ✅ | ❌ | ❌ | ❌ |
| Import/Export Profiles | ✅ | ✅ | ❌ | ❌ | ❌ |

Metrics Availability Matrix

| Metric Category | Chat | Voice | Phone Caller Inbound | Phone Caller Outbound | Image Analyzer |
| --- | --- | --- | --- | --- | --- |
| Bias Detection | ✅ | ✅ | ❌ | ❌ | ❌ |
| Hallucination Detection | ✅ | ✅ | ❌ | ❌ | ❌ |
| Completeness | ✅ | ✅ | ❌ | ❌ | ❌ |
| Context Awareness | ✅ | ✅ | ❌ | ❌ | ❌ |
| Response Quality | ✅ | ✅ | ❌ | ❌ | ❌ |
| Conversation Flow | ✅ | ✅ | ❌ | ❌ | ❌ |
| User Satisfaction | ✅ | ✅ | ❌ | ❌ | ❌ |
| File Handling Quality | ✅ | ✅ | ❌ | ❌ | ❌ |
| File Generation Accuracy | ✅ | ✅ | ❌ | ❌ | ❌ |
| Latency & Interaction Dynamics | ❌ | ❌ | ✅ | ✅ | ❌ |
| Accuracy & Effectiveness (FCR, etc.) | ❌ | ❌ | ✅ | ✅ | ❌ |
| CSAT & User Experience | ❌ | ❌ | ✅ | ✅ | ❌ |
| Business Operational Metrics | ❌ | ❌ | ✅ | ✅ | ❌ |
| Audio Voice Quality | ❌ | ❌ | ✅ | ✅ | ❌ |
| STT Evaluation | ❌ | ❌ | ✅ | ✅ | ❌ |
| Issue Detection Tags | ❌ | ❌ | ✅ | ✅ | ❌ |
| Validation Criteria | ✅ | ✅ | ✅ | ✅ | ❌ |
| Image Quality Score | ❌ | ❌ | ❌ | ❌ | ✅ |
| Prompt Compliance | ❌ | ❌ | ❌ | ❌ | ✅ |
| Brand/Technical Spec Compliance | ❌ | ❌ | ❌ | ❌ | ✅ |
