# Agent Features & Metrics - Customer Reference Guide
## Agent Types Overview
The platform supports 5 agent types, each designed for a specific testing scenario:
| Agent Type | Primary Use Case | Key Differentiator |
|---|---|---|
| Chat | Test text-based chatbot agents | Multi-turn text conversation evaluation with 9 quality metrics |
| Voice | Test chatbot agents via audio conversations | Same as Chat, but conversations happen as audio (WAV) instead of text |
| Phone Caller Inbound | Test voice agents that receive calls | Pre-evaluation (live simulated calls) + Post-evaluation (production recording analysis) |
| Phone Caller Outbound | Test voice agents that make calls | Pre-evaluation (live outbound calls) + Post-evaluation (production recording analysis) |
| Image Analyzer | Validate AI-generated images | Image quality scoring against prompts and brand guidelines |
## Chat Agent
### Workflow-Based Test Generation

Connect your knowledge sources and let the platform auto-generate test scenarios – no manual scripting needed.

- Document Upload – Upload knowledge base documents (PDF, text files) that your chatbot is expected to understand
- Source Integrations – Connect Confluence, JIRA, or GitHub as knowledge sources
- AI Test Generation – Automatically generates test scenarios from uploaded documents
- Real-time Progress – Live streaming of test generation progress
### Scenario Management

Build and manage the exact conversations you want to test, manually or via AI.

- Manual Scenario Creation – Create test scenarios with title, description, and expected behavior
- AI-Generated Scenarios – Auto-generate scenarios from uploaded knowledge sources
- Validation Criteria – Define custom pass/fail criteria per scenario (e.g., "Agent must mention return policy", "Agent should not hallucinate product prices")
- Special Instructions – Add specific instructions to guide scenario execution
- Persona Assignment – Assign user personas to scenarios (e.g., "frustrated customer", "first-time user")
- Test Profile Association – Link test data profiles to scenarios for data-driven testing
- Scenario Deletion – Remove scenarios that are no longer needed
### Test Suites

Group related scenarios together and track results over time.

- Suite Creation – Group multiple scenarios into test suites
- Test Profile Selection – Assign a test data profile to the entire suite
- Run History – View all past runs with status, score, and timestamps
- Status Filtering – Filter results by Passed, Failed, In Progress
### Endpoint Profiles

Configure how the platform connects to your agent's API – supporting everything from simple REST calls to multi-phase auth flows.

- Postman Collection Import – Upload Postman collections (with optional environment files) to configure your agent's API. Supports nested folders, collection variables, and environment variable substitution
- Manual Configuration – Define endpoints via JSON with URL, method, headers, and body
- Multi-Phase Execution – Configure `suite_setup` (login/auth), `scenario_setup` (session creation), and `chat` (conversation) phases
- Variable Management – Define static values, auto-generated values (UUID, mobile number, timestamp, email), and extracted values from API responses
- Retry & Caching – Configure retry-on-failure and cache suite setup results to speed up execution
- Test Endpoint – Dry-run your endpoint profile to verify connectivity before running full evaluations
- Import/Export – Export profiles as JSON and import across projects
- Default Profile – Mark one profile as the default for quick evaluation runs
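For illustration, the multi-phase structure described above can be sketched as plain data. Everything here – the field names, the `{{...}}` placeholder syntax, and the `generate_value` helper – is hypothetical and not the platform's actual profile schema:

```python
import uuid
from datetime import datetime, timezone

# Hypothetical endpoint profile: the three documented phases plus variables.
# Field names and placeholder syntax are assumptions for illustration only.
profile = {
    "name": "staging-bot",
    "variables": {
        "base_url": {"type": "static", "value": "https://api.example.com"},
        "session_id": {"type": "generated", "generator": "uuid"},
        "auth_token": {"type": "extracted", "from_phase": "suite_setup", "json_path": "$.token"},
    },
    "phases": {
        "suite_setup": {"method": "POST", "url": "{{base_url}}/login", "cache": True, "retry": 2},
        "scenario_setup": {"method": "POST", "url": "{{base_url}}/sessions"},
        "chat": {"method": "POST", "url": "{{base_url}}/sessions/{{session_id}}/messages"},
    },
}

def generate_value(spec):
    """Resolve an auto-generated variable (UUID or timestamp) at runtime."""
    if spec["generator"] == "uuid":
        return str(uuid.uuid4())
    if spec["generator"] == "timestamp":
        return datetime.now(timezone.utc).isoformat()
    raise ValueError(f"unknown generator: {spec['generator']}")
```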
### Test Profiles (Test Data)

Create reusable data sets to power data-driven testing across multiple scenarios.

- Custom Key-Value Data – Create reusable test data profiles with typed fields (string, number, boolean, email, URL, JSON)
- Default Profile – Mark one profile as the default
- Import/Export – Share test profiles across projects via JSON export/import
- Data Injection – Test data is injected at runtime for data-driven scenario execution
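Runtime data injection can be pictured as placeholder substitution. The `{{field}}` placeholder syntax below is an assumption for illustration; the platform may use a different convention:

```python
import re

def inject(template: str, profile: dict) -> str:
    """Replace {{field}} placeholders in a scenario template with values
    from a test data profile. The placeholder syntax is illustrative."""
    def repl(match):
        key = match.group(1).strip()
        if key not in profile:
            raise KeyError(f"test profile has no field '{key}'")
        return str(profile[key])
    return re.sub(r"\{\{(.*?)\}\}", repl, template)

# Hypothetical test profile driving a scenario message
data = {"customer_email": "jane@example.com", "order_id": 48213}
msg = inject("I need help with order {{order_id}}, my email is {{customer_email}}.", data)
```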
### Playground

Interactively test your agent configuration before running a full evaluation suite.

- Interactive Chat – Send messages to your configured agent endpoint in real-time
- Multi-Turn Conversations – Maintain conversation context across multiple messages
- Connection Testing – Test connectivity via cURL command verification
- Schema Analysis – Automatic detection of request/response schema from your endpoint
### Evaluation Execution

Run evaluations at scale with real-time feedback on your agent's quality.

- Metric Selection – Choose which quality metrics to evaluate (or run all)
- Endpoint Profile Selection – Pick which endpoint profile to evaluate against
- HyperExecute Integration – Run evaluations at scale on LambdaTest's HyperExecute infrastructure, with optional tunnel configuration for testing agents behind firewalls or on private networks
- Real-Time Streaming – Live progress updates during evaluation via Server-Sent Events
### Metric Threshold Configuration

Define exactly what "passing" means for your project – then enforce it automatically.

- Per-Metric Thresholds – Set minimum acceptable score (0.0–1.0) for each metric
- Higher/Lower is Better – Configure directionality for each metric
- Named Configurations – Create named threshold configs (e.g., "Strict", "Default")
- Active/Inactive Toggle – Enable or disable threshold configurations
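A threshold check with directionality is straightforward to express. This sketch assumes the documented 0.0–1.0 scale; `metric_passes` is an illustrative helper, not a platform API:

```python
def metric_passes(score: float, threshold: float, higher_is_better: bool = True) -> bool:
    """Pass/fail a metric score against its configured threshold,
    honouring the metric's configured directionality."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("scores are on a 0.0-1.0 scale")
    # For "higher is better" metrics the threshold is a floor; otherwise a ceiling.
    return score >= threshold if higher_is_better else score <= threshold
```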
### Go-Live Assessment

Get a clear, defensible production-readiness verdict before you ship.

| Verdict | Meaning |
|---|---|
| 🟢 GREEN | Ready for production |
| 🟡 YELLOW | Ready with caveats |
| 🔴 RED | Not ready |

- Overall Score – Weighted composite score (0–100)
- Confidence Level – Based on number of evaluations run
  - HIGH: 100+ evaluations
  - MEDIUM: 50–99 evaluations
  - LOW: 20–49 evaluations
  - VERY_LOW: Fewer than 20 evaluations
- Dimension Scores – Functional Completeness, Quality Standards, Risk Profile, Operational Readiness (each weighted 25%)
- Scenario Coverage – Matrix showing well-tested vs. untested scenarios
- Risk Assessment – AI-powered failure pattern analysis with prioritized action items
- Validation Criteria Summary – Aggregated compliance rate across all criteria
- AI Insights – Actionable recommendations for improvement
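The confidence bands and the equal 25% dimension weighting above translate directly into code. The function names and dimension keys are illustrative:

```python
def confidence_level(evaluation_count: int) -> str:
    """Map the number of evaluations run to the assessment's confidence level."""
    if evaluation_count >= 100:
        return "HIGH"
    if evaluation_count >= 50:
        return "MEDIUM"
    if evaluation_count >= 20:
        return "LOW"
    return "VERY_LOW"

def overall_score(dimensions: dict) -> float:
    """Equal-weighted (25% each) composite of the four dimension scores (0-100)."""
    expected = {"functional_completeness", "quality_standards",
                "risk_profile", "operational_readiness"}
    if set(dimensions) != expected:
        raise ValueError(f"expected dimensions {sorted(expected)}")
    return sum(dimensions.values()) / 4
```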
### Scheduled Runs

Automate ongoing regression coverage without manual intervention.

- Cron-Based Scheduling – Schedule evaluations with preset frequencies (Hourly, Daily, Weekdays, Weekly, Monthly) or custom cron expressions
- Timezone Support – Full IANA timezone selection
- Pause/Resume – Temporarily pause and resume scheduled runs
- Run History – Track all scheduled execution results
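The presets map naturally onto standard five-field cron expressions. The mapping below is a sketch only; the exact expressions the platform generates (for instance, which hour a Daily run fires at) are assumptions:

```python
# Illustrative cron expressions for the documented presets; the run times
# (e.g. 09:00 for "Daily") are assumptions, not platform behaviour.
PRESETS = {
    "Hourly":   "0 * * * *",    # top of every hour
    "Daily":    "0 9 * * *",    # 09:00 every day
    "Weekdays": "0 9 * * 1-5",  # 09:00 Monday-Friday
    "Weekly":   "0 9 * * 1",    # 09:00 every Monday
    "Monthly":  "0 9 1 * *",    # 09:00 on the 1st of each month
}

def expression_for(preset: str, custom=None) -> str:
    """Resolve a preset name, or pass through a custom cron expression."""
    if custom is not None:
        return custom
    return PRESETS[preset]
```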
Chat agents are evaluated on 9 quality metrics, each scored on a 0–100% scale.
| # | Metric | What It Measures |
|---|---|---|
| 1 | Bias Detection | Detects biased, discriminatory, or unfair responses |
| 2 | Hallucination Detection | Identifies false, fabricated, or unsupported information |
| 3 | Completeness | Whether the agent fully addresses the user's question |
| 4 | Context Awareness | How well the agent understands and maintains conversation context |
| 5 | Response Quality | Overall quality, clarity, and helpfulness of responses |
| 6 | Conversation Flow | Natural flow and coherence across multi-turn conversations |
| 7 | User Satisfaction | Estimated end-user satisfaction based on conversation quality |
| 8 | File Handling Quality | Quality of file upload/download interactions (if applicable) |
| 9 | File Generation Accuracy | Accuracy of generated files against requirements (if applicable) |
Every evaluation result includes:
- Overall Score (weighted average of all metrics)
- Per-metric score with pass/fail badge (based on configured thresholds)
- Per-metric detailed analysis text
- Multi-turn conversation transcript (User/Agent exchanges)
- Strengths identified in the conversation
- Areas for improvement
- Actionable recommendations
- Validation criteria results (Pass/Fail/Unable to Verify per criterion with evidence and confidence level)
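The overall score is described above as a weighted average of all metrics. The platform's actual weights are not documented here, so this sketch defaults to equal weighting:

```python
def weighted_overall(scores: dict, weights=None) -> float:
    """Weighted average of per-metric scores (each 0-100).

    With no weights supplied, all metrics count equally; the platform's
    actual per-metric weights are not specified in this guide."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight
```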
## Voice Agent
The Voice agent is functionally identical to the Chat Agent, with one key difference: conversations happen as audio (WAV) instead of text. The platform conducts voice-based conversations with your agent and evaluates the audio interaction using the same quality metrics.
All features from the Chat Agent are available, including:
- Workflow-based test generation with document upload and source integrations
- Scenario management with AI generation, validation criteria, personas, and special instructions
- Test suites with test profile selection and run history
- Endpoint profiles with Postman collection import
- Test profiles for data-driven testing
- Playground for interactive testing
- Evaluation execution with metric selection and HyperExecute integration
- Metric threshold configuration
- Go-Live assessment with production readiness verdict
- Scheduled runs
Same 9 quality metrics as the Chat agent:
| # | Metric |
|---|---|
| 1 | Bias Detection |
| 2 | Hallucination Detection |
| 3 | Completeness |
| 4 | Context Awareness |
| 5 | Response Quality |
| 6 | Conversation Flow |
| 7 | User Satisfaction |
| 8 | File Handling Quality |
| 9 | File Generation Accuracy |
## Phone Caller Inbound Agent
| Mode | What It Does |
|---|---|
| Pre-evaluation | The platform simulates customers calling your voice agent with live test calls, then evaluates the resulting conversations |
| Post-evaluation | Upload your production call recordings and transcripts from real customer interactions for evaluation on the platform |
### Phone Number Management

Register and manage the phone numbers your voice agent answers on.

- Add Phone Numbers – Register your agent's phone numbers with country code selection (20+ countries supported)
- Default Phone Number – Set a default number for quick test execution
- Phone Number Display – Masked display for security with country flag identification
- Edit & Delete – Update phone number details or remove numbers no longer in use
### Scenario Management

Generate realistic inbound call scenarios at scale with AI or build them manually.

- AI Scenario Generation – Generate up to 20 inbound test scenarios with configurable personas, languages, and special instructions
- Manual Scenario Creation – Create scenarios with name, description, and expected output
- Scenario Deletion – Remove scenarios no longer needed
- Persona Selection – Choose from available personas or create custom ones to simulate different caller types
- Language Support – Generate scenarios in multiple languages (English, Spanish, etc.)
### Voice Configuration (Per Scenario)

Control every detail of how the simulated caller sounds and behaves.

- Voice Selection – Choose from a library of voices with audio preview (multiple voice providers and accents available)
- Voice Preview – Listen to voice samples with animated waveform visualization before selecting
- Background Sound – Enable simulated background noise (15 presets: cafe, street, factory, rain, crowd, market, train, radio interference, etc.)
- Response Timing – Configure wait time after speech ends (0.5–5.0 seconds) to handle agents that speak in multiple sentences
- Max Call Duration – Set maximum call length (60–1800 seconds) to prevent runaway calls
- First Speaker – Choose who speaks first: the simulated user or the agent
### Agent Profiles (Inbound-Specific)

Create reusable caller personas to standardize how test calls are placed across suites.

- Agent Profile Creation – Configure agent personas with name, phone number, voice, and background noise
- Profile Library – Organization-level reusable agent profiles
- Active/Inactive Toggle – Enable or disable profiles
### Test Suites

Batch your inbound scenarios into suites and run them all with a single action.

- Suite Creation – Group multiple scenarios with per-scenario voice and phone configuration
- Test Profile Association – Link test data for data-driven voice testing
- Agent Profile Assignment – Associate agent profiles with suites
- Run Suites – Execute all scenarios in a suite with a single action
### Call Execution & Monitoring

Trigger, track, and manage live test calls in real-time.

- Initiate Test Calls – Trigger simulated inbound calls to your voice agent
- Real-Time Monitoring – Track call status as calls progress
- Call Duration Tracking – Live duration counter during active calls
- Call Termination – End calls in progress if needed
### Voice Analytics

Upload and analyze real production recordings – no new calls needed.

- Recording Upload – Upload production call recordings (MP3, WAV) from real customer interactions
- Transcript Upload – Upload transcripts alongside recordings for STT comparison
- Batch Upload & Analysis – Upload and analyze multiple recordings in parallel
- Selective Metric Analysis – Choose specific metric categories or individual metrics to analyze
- Bookmarking – Bookmark important recordings for review
- Tagging – Organize recordings with custom tags
- Search & Filter – Find recordings by name, status, date, or tags
### Recording Playback

Listen to any call and follow along with a full, speaker-identified transcript.

- Audio Player – Play/pause controls with duration tracking
- Full Transcript – Speaker-identified transcript (agent vs. user)
- DTMF Detection – Phone keypad inputs (0–9, *, #) captured and displayed in transcript
- Download – Download recording audio and transcript files
### Go-Live Assessment

| Verdict | Score Range | Meaning |
|---|---|---|
| 🟢 GREEN | ≥ 80 | Ready for Production |
| 🟡 YELLOW | 65–79 | Ready with Caveats |
| 🔴 RED | < 65 | Not Ready |
| ⚪ NO_DATA | – | Insufficient data to assess |

- Overall Score & Confidence – Weighted score with confidence based on call volume (HIGH: 100+, MEDIUM: 50–99, LOW: 20–49, VERY_LOW: < 20)
- Dimension Scores – Functional Completeness, Quality Standards, Risk Profile, Operational Readiness
- Scenario Coverage – Well-tested vs. untested vs. high-risk scenarios
- Failure Pattern Analysis – AI-powered root cause analysis
- Validation Criteria Compliance – Aggregated pass/fail rates
- Prioritized Action Items – Improvement recommendations with expected impact
### Metric Configuration

Select exactly which metrics to run – skip what's not relevant to reduce time and cost.

- Configurable Metrics – Select which metric categories to evaluate per project, or choose individual metrics for granular control
- Category Toggles – Enable/disable entire metric categories
- Reduce Complexity – Skip non-critical metrics to reduce analysis time and cost
### Scheduled Runs

Keep coverage running continuously without manual effort.

- Cron-Based Scheduling – Schedule test executions with preset frequencies (Hourly, Daily, Weekdays, Weekly, Monthly) or custom cron expressions
- Timezone Support – Full IANA timezone selection
- Pause/Resume – Temporarily pause and resume scheduled runs
- Run History – Track all scheduled execution results
Phone Caller agents are evaluated across 8 metric categories with 30+ individual metrics.
### A. Conversation Flow & Interaction Dynamics
| Metric | Unit | What It Measures |
|---|---|---|
| Average Latency | ms | Time taken for the agent to respond after user stops speaking |
| Words Per Minute (WPM) | wpm | Agent's speaking speed |
| AI Talk Ratio | % | Percentage of call time the agent is speaking |
| User Talk Ratio | % | Percentage of call time the user is speaking |
| AI Interrupting User | % | How often the agent interrupts the user |
| User Interrupting AI | % | How often the user interrupts the agent |
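The talk-ratio and WPM metrics above can be derived from speaker-labelled segments of a call. This is an illustrative computation on hypothetical segment data, not the platform's implementation:

```python
def call_dynamics(segments, call_seconds):
    """Compute talk ratios and agent speaking speed from speaker-labelled
    segments given as (speaker, seconds, word_count) tuples.

    Illustrative only: the platform derives these from the call audio itself.
    """
    ai_time = sum(sec for who, sec, _ in segments if who == "agent")
    user_time = sum(sec for who, sec, _ in segments if who == "user")
    ai_words = sum(words for who, _, words in segments if who == "agent")
    return {
        "ai_talk_ratio": 100 * ai_time / call_seconds,      # % of call time
        "user_talk_ratio": 100 * user_time / call_seconds,  # % of call time
        "agent_wpm": 60 * ai_words / ai_time if ai_time else 0.0,
    }
```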
### B. Accuracy & Effectiveness
| Metric | Unit | What It Measures |
|---|---|---|
| First Call Resolution (FCR) | % | Whether the issue was resolved in a single call |
| Intent Recognition Accuracy | % | How accurately the agent understood user intent |
| Task Completion Success Rate | % | Percentage of assigned tasks completed successfully |
| Instruction Following | % | Adherence to configured instructions |
| Response Consistency | % | Consistency of responses to similar inputs |
### C. User Experience & Satisfaction
| Metric | Unit | What It Measures |
|---|---|---|
| CSAT (Customer Satisfaction) | % | Overall customer satisfaction score |
| CSAT Reason | Text | Explanation for the satisfaction score |
| User Sentiment | Text | Detected emotional sentiment from user speech |
| Early Termination | % | Percentage of calls NOT terminated prematurely |
### D. Business Operational Metrics
| Metric | Unit | What It Measures |
|---|---|---|
| Containment Rate | % | Issues resolved without human escalation |
| AI to Human Handoff Rate | % | Frequency of escalation to human agents |
### E. Audio Voice Quality
| Metric | Unit | What It Measures |
|---|---|---|
| Average Pitch | Hz | Voice pitch measurement (normal: 85–300 Hz) |
| Voice Quality Index | 0–5 scale | Composite voice quality score |
| Signal-to-Noise Ratio | % | Audio clarity vs. background noise |
### F. Speech-to-Text (STT) Evaluation
| Metric | Unit | What It Measures |
|---|---|---|
| STT Accuracy | % | Transcription accuracy |
| STT Verdict | Pass/Fail | Overall transcription quality judgment |
| STT Summary | Text | Detailed transcription quality notes |
| Mismatch Examples | List | Specific instances where transcription differed from actual speech |
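A common way to score a transcription against a reference transcript is word accuracy, i.e. 1 minus the word error rate (WER). The platform's exact STT accuracy formula is not documented here; this sketch uses word-level Levenshtein distance:

```python
def stt_accuracy(reference: str, hypothesis: str) -> float:
    """Word-level accuracy (%) = 100 * (1 - WER), computed with a standard
    word-level edit-distance DP. An illustrative formula, not necessarily
    the platform's."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # dp[i][j] = edit distance between first i reference and first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    wer = dp[-1][-1] / max(len(ref), 1)
    return max(0.0, 100 * (1 - wer))
```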
### G. Validation Results
| Metric | Unit | What It Measures |
|---|---|---|
| Compliance | % | Compliance rate against custom validation criteria |
| Pass/Fail/Unable to Verify | Count | Per-criterion validation breakdown |
### H. Detected Issue Tags (Automated)
The system automatically detects and flags the following issues in every call recording.
| Issue Tag | What It Detects |
|---|---|
| Latency Issues | Slow response times |
| Hallucination in Call Flow | Agent generating incorrect information |
| Transcript Issue | Transcription errors |
| Patchy Audio | Audio quality problems |
| Running in Loop | Agent repeating the same response |
| Incorrect STT | Speech recognition errors |
| Interruption Handling | Poor handling of user interruptions |
| Number Issue | Numeric data errors |
| Background Noise | Noise interference |
| No Response | Agent silence when a response was expected |
| BLANK/EMPTY STT | Empty transcription segments |
### Threshold Configurations

| Metric | 🟢 Excellent | 🟡 Good | 🔴 Poor |
|---|---|---|---|
| Average Latency | ≤ 1000 ms | ≤ 2500 ms | > 2500 ms |
| Words Per Minute | ≥ 160 (Fast) | 131–160 (Good) | < 110 (Slow) |
| Voice Quality Index | ≥ 2.5 / 5 | – | < 2.5 / 5 |
| Average Pitch | 85–300 Hz (Normal) | – | < 85 or > 300 Hz |
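The latency and pitch bands in the table translate directly into code. These helpers are illustrative, not a platform API:

```python
def latency_band(ms: float) -> str:
    """Classify average response latency per the threshold table above."""
    if ms <= 1000:
        return "Excellent"
    if ms <= 2500:
        return "Good"
    return "Poor"

def pitch_is_normal(hz: float) -> bool:
    """85-300 Hz is the documented normal pitch range."""
    return 85 <= hz <= 300
```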
## Phone Caller Outbound Agent
Phone Caller Outbound supports the same two evaluation modes as Inbound (Pre-evaluation and Post-evaluation) and shares all features, with a few key differences in pre-evaluation mode only.
### Outbound-Specific Pre-evaluation Features
- Scenario Generation – Generate up to 7 outbound test scenarios (vs. 20 for inbound)
- Caller Profile Selection – Select an outbound caller profile when generating scenarios
- Outbound Number Pool – Reserve phone numbers from the outbound pool for test calls
- Passive Mode – Listen to outbound calls without interfering (for QA monitoring)
- First Speaker Default – The agent speaks first (vs. the simulated user for inbound)
### Outbound Pool Management
- View Pool Status – See available outbound numbers
- Reservations – View and manage per-suite number reservations
- Clear Reservations – Release reserved numbers when done
All other features – phone number management, voice configuration, test suites, call execution, post-evaluation (voice analytics, recording upload, batch analysis), go-live assessment, metrics, and scheduling – are identical to Phone Caller Inbound.
Same as Phone Caller Inbound – all 8 categories and 30+ metrics.
## Image Analyzer Agent
### Image Analysis
Upload single images or batch-process up to 50 at once – via file upload or URL.

- Single Image Analysis – Upload an image (or provide a URL) along with the original prompt used to generate it
- Batch Analysis – Analyze up to 50 images at once with a shared prompt
- Drag & Drop Upload – Drag and drop images directly into the upload area
- Supported Formats – JPG, JPEG, PNG, GIF, WEBP, BMP (max 20 MB per image)
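The format and size limits above lend themselves to a client-side pre-flight check. This sketch assumes 20 MB means 20 × 1024 × 1024 bytes, which is an interpretation rather than a documented fact:

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}
MAX_BYTES = 20 * 1024 * 1024  # assuming "20 MB" means 20 MiB

def validate_upload(filename: str, size_bytes: int) -> list:
    """Pre-flight check mirroring the documented format and size limits.
    Returns a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 20 MB limit")
    return problems
```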
### Custom Evaluation Criteria

Define what "good" means for your images – choose from three criteria types:

- Brand Guidelines
  - Allowed colors and prohibited colors
  - Required fonts
  - Logo requirements (text description)
- Technical Specifications
  - Required dimensions (width × height)
  - Aspect ratio requirements
  - Allowed file formats
  - Maximum file size
  - Minimum resolution
- Custom Rules
  - Freeform rule text
  - Checklist items to validate
All criteria support Active/Inactive toggling, create/edit/delete operations, and search by name, description, or type.
### Analysis History

- Search – Find past analyses by image name or prompt text
- Status Tracking – View analysis status (Pending, Completed, Failed)
- Detailed View – Click any analysis to view full results
- Bookmarking – Bookmark important analyses for quick access
### Analytics Dashboard

- Overall Statistics – Average score, highest score, lowest score, total analyses count
- Quality Trends – Daily score breakdown over the last 30 days with bar chart visualization
- Prompt Performance – Top 20 prompts ranked by average quality score, showing min/max scores, usage count, and comparison vs. overall average
| Metric | Scale | What It Measures |
|---|---|---|
| Quality Score | 0–100 | Overall image quality and prompt adherence |
| Matches | List | Elements that correctly match the original prompt |
| Discrepancies | List | Missing or incorrect elements vs. the prompt |
| Overall Assessment | Text | Summary of how well the image matches the prompt |
| Detailed Observations | Text | In-depth analysis of specific image aspects |
Quality Score Interpretation:
| Score Range | Label | Rating |
|---|---|---|
| 90–100 | Excellent | 🟢 |
| 80–89 | Good | 🟢 |
| 60–79 | Fair | 🟡 |
| 0–59 | Poor | 🔴 |
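The score-to-label mapping above is unambiguous and easy to encode. `quality_label` is an illustrative helper, not a platform API:

```python
def quality_label(score: int):
    """Map a 0-100 quality score to its (label, rating colour) per the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be 0-100")
    if score >= 90:
        return ("Excellent", "green")
    if score >= 80:
        return ("Good", "green")
    if score >= 60:
        return ("Fair", "yellow")
    return ("Poor", "red")
```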
Custom Criteria Compliance:
For each active criterion, results show:
- Compliance status: PASS, FAIL, or PARTIAL (color-coded badges)
- Compliance details with specific observations
## Shared Features Across Agents
### Project Management

- Create Agents – Name, description, and agent type selection
- Agent Listing – View all agents with type-specific icons and filtering
- Agent Dashboard – Overview of workflows, suites, and recent activity
### Test Profiles
- Custom key-value data with typed fields (string, number, boolean, email, URL, textarea, JSON)
- Default profile marking
- Import/Export as JSON
### Personas
- Pre-built persona library
- Custom persona creation
- Persona assignment to scenarios
### Scheduling
- Cron-based scheduling (Hourly, Daily, Weekdays, Weekly, Monthly, Custom)
- Timezone support (full IANA timezone list)
- Pause/Resume/Delete scheduled runs
- View next scheduled run time and run history
### Go-Live Assessment

| Verdict | Score Range | Meaning |
|---|---|---|
| 🟢 GREEN | ≥ 80 | Ready for Production |
| 🟡 YELLOW | 65–79 | Ready with Caveats |
| 🔴 RED | < 65 | Not Ready |
- Overall score with confidence level
- Scenario coverage analysis
- AI-powered risk assessment and recommendations
### Environment Management
- Create and manage test environments
- Variable management (name, value, type, persistence)
- Per-environment variable scoping
- Bulk variable creation
### Validation Criteria
- Custom pass/fail criteria per scenario
- Evidence-based validation with confidence levels (High/Medium/Low)
- Compliance percentage tracking
- Per-criterion results: Pass, Fail, or Unable to Verify
## Feature Availability Matrix
| Feature | Chat | Voice | Phone Caller Inbound | Phone Caller Outbound | Image Analyzer |
|---|---|---|---|---|---|
| Workflow & Document Upload | ✅ | ✅ | ❌ | ❌ | ❌ |
| AI Scenario Generation | ✅ | ✅ | ✅ (up to 20) | ✅ (up to 7) | ❌ |
| Manual Scenario Creation | ✅ | ✅ | ✅ | ✅ | ❌ |
| Test Suites | ✅ | ✅ | ✅ | ✅ | ❌ |
| Endpoint Profiles | ✅ | ✅ | ❌ | ❌ | ❌ |
| Test Profiles | ✅ | ✅ | ✅ | ✅ | ❌ |
| Personas | ✅ | ✅ | ✅ | ✅ | ❌ |
| Playground | ✅ | ✅ | ❌ | ❌ | ❌ |
| Audio-Based Conversations | ❌ | ✅ | ✅ | ✅ | ❌ |
| Phone Numbers | ❌ | ❌ | ✅ | ✅ | ❌ |
| Voice Selection (per scenario) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Background Noise Simulation | ❌ | ❌ | ✅ | ✅ | ❌ |
| Live Call Execution (Pre-eval) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Production Recording Upload (Post-eval) | ❌ | ❌ | ✅ | ✅ | ❌ |
| Batch Recording Analysis | ❌ | ❌ | ✅ | ✅ | ❌ |
| Call Recording Playback | ❌ | ❌ | ✅ | ✅ | ❌ |
| DTMF Support | ❌ | ❌ | ✅ | ✅ | ❌ |
| Passive Mode | ❌ | ❌ | ❌ | ✅ | ❌ |
| Outbound Pool | ❌ | ❌ | ❌ | ✅ | ❌ |
| Image Upload & Analysis | ❌ | ❌ | ❌ | ❌ | ✅ |
| Custom Evaluation Criteria | ❌ | ❌ | ❌ | ❌ | ✅ |
| Brand Guideline Checks | ❌ | ❌ | ❌ | ❌ | ✅ |
| Image Analytics Dashboard | ❌ | ❌ | ❌ | ❌ | ✅ |
| Metric Thresholds | ✅ | ✅ | ✅ | ✅ | ❌ |
| Go-Live Assessment | ✅ | ✅ | ✅ | ✅ | ❌ |
| Validation Criteria | ✅ | ✅ | ✅ | ✅ | ❌ |
| Scheduled Runs | ✅ | ✅ | ✅ | ✅ | ❌ |
| HyperExecute Integration | ✅ | ✅ | ❌ | ❌ | ❌ |
| Import/Export Profiles | ✅ | ✅ | ✅ | ✅ | ❌ |
## Metrics Availability Matrix
| Metric Category | Chat | Voice | Phone Caller Inbound | Phone Caller Outbound | Image Analyzer |
|---|---|---|---|---|---|
| Bias Detection | ✅ | ✅ | ❌ | ❌ | ❌ |
| Hallucination Detection | ✅ | ✅ | ❌ | ❌ | ❌ |
| Completeness | ✅ | ✅ | ❌ | ❌ | ❌ |
| Context Awareness | ✅ | ✅ | ❌ | ❌ | ❌ |
| Response Quality | ✅ | ✅ | ❌ | ❌ | ❌ |
| Conversation Flow | ✅ | ✅ | ❌ | ❌ | ❌ |
| User Satisfaction | ✅ | ✅ | ❌ | ❌ | ❌ |
| File Handling Quality | ✅ | ✅ | ❌ | ❌ | ❌ |
| File Generation Accuracy | ✅ | ✅ | ❌ | ❌ | ❌ |
| Latency & Interaction Dynamics | ❌ | ❌ | ✅ | ✅ | ❌ |
| Accuracy & Effectiveness (FCR, etc.) | ❌ | ❌ | ✅ | ✅ | ❌ |
| CSAT & User Experience | ❌ | ❌ | ✅ | ✅ | ❌ |
| Business Operational Metrics | ❌ | ❌ | ✅ | ✅ | ❌ |
| Audio Voice Quality | ❌ | ❌ | ✅ | ✅ | ❌ |
| STT Evaluation | ❌ | ❌ | ✅ | ✅ | ❌ |
| Issue Detection Tags | ❌ | ❌ | ✅ | ✅ | ❌ |
| Validation Criteria | ✅ | ✅ | ✅ | ✅ | ❌ |
| Image Quality Score | ❌ | ❌ | ❌ | ❌ | ✅ |
| Prompt Compliance | ❌ | ❌ | ❌ | ❌ | ✅ |
| Brand/Technical Spec Compliance | ❌ | ❌ | ❌ | ❌ | ✅ |
