Optimizing Virtual Supervisor forms for AI Scoring
Creating evaluation forms that work well for both human and AI review helps improve scoring accuracy and consistency in Genesys Cloud Quality Management. This article outlines best practices, examples, and design principles.
AI Scoring evaluates conversations based solely on the transcript and the evaluation form questions. It does not consider metadata such as routing paths, timestamps, or platform-level data. To ensure accurate results, design your forms with clear, measurable, and transcript-driven questions.
Well-designed virtual supervisor forms make AI scoring more accurate, efficient, and fair. By focusing on transcript-based, measurable questions and concise help text, organizations can ensure reliable AI evaluations and reduce manual review effort.
Human reviewers can interpret tone, intent, and context beyond the transcript. AI scoring models, however, can only analyze what is said or written.
To create forms suitable for both:
- Write questions that rely only on text-based evidence.
- Avoid emotional or subjective phrasing.
- Use consistent terminology throughout the form (for example, always “agent” and “customer”).
- Provide clear help text that defines what counts as a “Yes,” “No,” or specific response.
Example:
✅ Good: “Did the agent greet the customer using a standard phrase such as ‘hi,’ ‘hello,’ or ‘good morning’?”
❌ Avoid: “Did the agent sound friendly when greeting the customer?”
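To make “transcript-only evidence” concrete, the sketch below expresses a greeting check as a simple rule against the agent’s opening utterance. It only illustrates what can be verified from text; the phrase list and function are hypothetical and not part of the Genesys Cloud product.

```python
# Illustrative only: a rule-style view of a transcript-verifiable question.
# The greeting phrases and function are hypothetical, not a Genesys Cloud feature.
STANDARD_GREETINGS = ("hi", "hello", "good morning", "good afternoon")

def agent_greeted_customer(transcript_lines):
    """Return True if the agent's first utterance contains a standard greeting."""
    for speaker, text in transcript_lines:
        if speaker == "agent":
            return any(greeting in text.lower() for greeting in STANDARD_GREETINGS)
    return False

# Example: a short transcript represented as (speaker, text) pairs.
example = [
    ("agent", "Good morning, thank you for calling."),
    ("customer", "Hi, I need help with my bill."),
]
print(agent_greeted_customer(example))  # True
```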
When designing form questions, apply the following best practices:
- Focus on transcript-only evidence: Ask only questions that can be verified through text. Avoid tone- or attitude-based language.
- Use measurable and binary structures: Frame questions so the AI can select Yes/No or a clear multiple-choice answer.
- Write in complete sentences: Avoid shorthand or internal labels. Clear language improves model comprehension.
- Define boundaries and examples: Clarify what counts as compliance and what does not. Provide example phrases.
- Eliminate subjectivity: Focus on actions, not feelings (for example, “Did the agent acknowledge the issue?” rather than “Was the agent empathetic?”).
- Include help text examples: Add transcript snippets to show acceptable and unacceptable answers.
- Keep questions focused: Each question should measure only one behavior (see the sketch after this list).
- Use consistent language: Standardize terminology; always use “agent” and “customer” across the form.
- Provide edge case guidance: Define exceptions so the model does not misinterpret partial compliance.
- Align questions with business goals: Each question should tie directly to customer experience, compliance, or efficiency objectives.
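As a rough aid for reviewing draft questions against these practices, the sketch below flags subjective wording and compound questions. The word lists and function are hypothetical examples for illustration only, not a Genesys Cloud feature.

```python
# Illustrative draft-question review helper; the word lists are examples only.
SUBJECTIVE_TERMS = {"friendly", "empathetic", "polite", "nice", "sound"}
COMPOUND_MARKERS = {" and ", " as well as "}

def review_question(question):
    """Return a list of warnings for a draft evaluation question."""
    warnings = []
    lowered = question.lower()
    if any(term in lowered for term in SUBJECTIVE_TERMS):
        warnings.append("Uses subjective wording; rephrase around observable actions.")
    if any(marker in lowered for marker in COMPOUND_MARKERS):
        warnings.append("May measure more than one behavior; consider splitting it.")
    return warnings

print(review_question("Did the agent sound friendly and resolve the issue?"))
```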
Help text provides essential context for both AI and human evaluators.
When writing help text:
- Define the purpose of the question and expected behavior.
- Describe what qualifies as a Yes or No.
- Include examples of acceptable and unacceptable phrases.
- Keep it concise (3–5 sentences or ≤500 characters).
- Provide short transcript examples for clarity.
Example:
Question: Did the agent verify the customer’s identity before providing account-specific support?
Help Text: The agent must confirm at least one credential (account number, date of birth, or phone number).
Yes: The agent requested and confirmed a credential.
No: The agent provided information without verifying identity.
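For review purposes, the same question and help text can be represented as a small data structure and checked against the length guidance above. The field names below are illustrative only and do not match the actual Genesys Cloud evaluation form schema.

```python
# Illustrative structure only; field names are not the Genesys Cloud form schema.
question = {
    "text": "Did the agent verify the customer's identity before providing account-specific support?",
    "answer_type": "yes_no",
    "help_text": (
        "The agent must confirm at least one credential "
        "(account number, date of birth, or phone number). "
        "Yes: the agent requested and confirmed a credential. "
        "No: the agent provided information without verifying identity."
    ),
}

# Keep help text within the recommended limit of roughly 500 characters.
assert len(question["help_text"]) <= 500, "Help text exceeds the recommended length."
```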
The following sample questions show how transcript-based questions and help text can be paired:

| Question | Answer Type | Help Text Example |
|---|---|---|
| Did the agent greet the customer at the start of the conversation? | Yes / No | The agent must start with a standard greeting such as “hi,” “hello,” or “good morning.” |
| Did the agent verify the customer’s identity before discussing account details? | Yes / No | The agent must confirm at least one credential (account number, date of birth, or phone number). |
| Did the agent acknowledge the customer’s issue before moving to resolution? | Yes / No | The agent should use acknowledgment phrases such as “I’m sorry,” “I understand,” or “I know this must be frustrating.” |
| Did the agent confirm resolution or next steps before closing the conversation? | Yes / No | The agent must summarize next steps or confirm that the issue is resolved. |
The following errors can occur during AI scoring, along with how to resolve them:

| Error | When It Occurs | How to Resolve |
|---|---|---|
| Rate limit error | The daily maximum of 50 AI scoring requests per agent is reached. | Space out evaluations, avoid unnecessary retries, or request a quota increase (see the sketch after this table). |
| Duplicate evaluation | The same form is resubmitted for the same interaction. | Use different evaluator IDs or forms for second reviews. |
| Processing failure | The model cannot process vague or unsupported question formats. | Simplify phrasing, use transcript-based wording, and reprocess after correction. |
| Low confidence | The model cannot determine an answer due to ambiguity. | Refine questions and add examples or clarifying help text. |
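When evaluations are submitted programmatically, the rate limit above is easiest to respect by spacing out requests and stopping, rather than retrying, once the quota is reached. The sketch below shows that pattern with the Python requests library; the URL, payload, and status handling are placeholders, not the exact Genesys Cloud API.

```python
# Generic client-side pattern for respecting a daily scoring quota.
# The URL and payload are placeholders, not actual Genesys Cloud endpoints.
import time
import requests

def submit_ai_scoring_request(session, url, payload):
    """Submit one scoring request; return the response, or None if rate limited."""
    response = session.post(url, json=payload, timeout=30)
    if response.status_code == 429:
        # Daily quota reached: retrying immediately only wastes requests.
        # Defer remaining work until the quota resets or an increase is granted.
        return None
    response.raise_for_status()
    return response

def submit_batch(session, url, payloads, delay_seconds=2.0):
    """Space out requests and stop cleanly when the quota is exhausted."""
    results = []
    for payload in payloads:
        result = submit_ai_scoring_request(session, url, payload)
        if result is None:
            break  # stop and resume later rather than retrying
        results.append(result)
        time.sleep(delay_seconds)  # space out evaluations
    return results
```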
To maintain form quality and AI scoring accuracy:
- Review AI scoring results monthly to identify low-confidence questions (see the sketch after this list).
- Refine help text or question phrasing based on scoring feedback.
- Align human evaluator training with AI-defined criteria.
- Add domain-specific variations (for example, healthcare, retail) as needed.
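A monthly review can start from exported scoring results: count how often each question comes back with low confidence and flag the worst offenders for rewording. The record format below is a hypothetical export, not a Genesys Cloud API response.

```python
# Illustrative monthly review; the record format is a hypothetical export.
from collections import defaultdict

def flag_low_confidence_questions(records, threshold=0.6, min_rate=0.2):
    """Return question IDs whose share of low-confidence answers exceeds min_rate."""
    totals = defaultdict(int)
    low = defaultdict(int)
    for record in records:
        question_id = record["question_id"]
        totals[question_id] += 1
        if record["confidence"] < threshold:
            low[question_id] += 1
    return [
        qid for qid, count in totals.items()
        if low[qid] / count >= min_rate
    ]

records = [
    {"question_id": "greeting", "confidence": 0.9},
    {"question_id": "identity_check", "confidence": 0.4},
    {"question_id": "identity_check", "confidence": 0.5},
]
print(flag_low_confidence_questions(records))  # ['identity_check']
```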
Common questions that come up when designing forms for AI scoring include:
- How should I design greeting questions for AI scoring?
- How can I include identity verification checks in AI-scored forms?
- How do I ensure AI accurately scores interruptions?
- Should I include questions about using the customer’s name?
- How can I measure empathy without using subjective language?
- What’s the best way to confirm that the agent provided a resolution?
- How should escalation be evaluated by AI?
- How do I handle compliance or disclosure questions in AI scoring?
- How can I design a question to handle dead air or silence?
- How should I confirm that the agent closed the conversation properly?