With Architect’s speech recognition feature, you can designate words and phrases associated with any flow, so that callers can select the options they want verbally rather than using the telephone keypad to navigate the IVR. When you configure a flow in Architect, you can define one of the following options to present to callers who enter a menu:

  • Specify the numeric key a caller presses to select the schedule or menu item
  • Set speech recognition terms to interpret a caller’s verbal selection
  • Provide both numeric and verbal options, giving callers the choice of using the keypad or verbally communicating which menu they want to reach

For information on the languages in which Architect supports automatic runtime data playback, speech recognition, and text-to-speech, see Genesys Cloud supported languages.

In addition to speech recognition terms, you can define parameters such as the minimum confidence level and the timeouts that apply if the speech engine cannot determine the caller’s selection. You can also manage custom attributes and record notes about configuration settings and changes. If you have the appropriate rights, you can override these default settings for any menu. All configuration settings trickle down to the menu levels unless overridden.
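
The trickle-down behavior can be pictured with a minimal Python sketch. It is a conceptual model only, not an Architect or Genesys Cloud API: the `SpeechSettings` class, the `effective_settings` helper, the field names, and the numeric values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


# Hypothetical model: none of these names come from Architect.
@dataclass
class SpeechSettings:
    min_confidence: Optional[int] = None          # assumed percent scale, 0-100
    no_input_timeout_s: Optional[float] = None    # assumed unit: seconds


def effective_settings(flow_defaults: SpeechSettings,
                       *overrides: SpeechSettings) -> SpeechSettings:
    """Start from the flow defaults; each deeper menu level replaces
    only the values it explicitly sets (anything left as None is inherited)."""
    resolved = SpeechSettings(**vars(flow_defaults))
    for level in overrides:
        for f in fields(SpeechSettings):
            value = getattr(level, f.name)
            if value is not None:
                setattr(resolved, f.name, value)
    return resolved


# Illustrative values only.
flow_defaults = SpeechSettings(min_confidence=50, no_input_timeout_s=7.0)
billing_menu = SpeechSettings(min_confidence=70)  # overrides one setting

resolved = effective_settings(flow_defaults, billing_menu)
print(resolved)
```

In this sketch the menu inherits the timeout from the flow and replaces only the confidence level, which mirrors the idea that a setting applies at lower levels until something overrides it.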

For every flow, you can enter keywords and phrases a caller might use to verbally indicate the location to which they want to transfer. You can choose from a list of languages and enter multiple words and phrases, which is particularly useful if callers use different terminology for the same destination. For example, callers who want to reach the customer service department might say “customer service”, “customer support”, “support”, “product assistance”, and so on.
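
As a rough illustration of several phrases pointing to one destination, the following Python sketch maps each term to a destination name. It is not how Architect stores or matches terms: the `SPEECH_TERMS` data, the `sales` destination, and the helper functions are hypothetical, and the lookup is a simple exact match, whereas a real speech engine scores what it hears.

```python
# Hypothetical data and helper names; not an Architect API.
SPEECH_TERMS = {
    "customer_service": [
        "customer service",
        "customer support",
        "support",
        "product assistance",
    ],
    "sales": ["sales"],  # invented second destination, for contrast only
}


def normalize(utterance: str) -> str:
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(utterance.lower().split())


def build_lookup(terms: dict) -> dict:
    """Invert destination -> phrases into phrase -> destination."""
    lookup = {}
    for destination, phrases in terms.items():
        for phrase in phrases:
            lookup[normalize(phrase)] = destination
    return lookup


def route(utterance: str, lookup: dict):
    """Return the destination for a recognized phrase, or None if unknown."""
    return lookup.get(normalize(utterance))


LOOKUP = build_lookup(SPEECH_TERMS)
print(route("Customer  Support", LOOKUP))
print(route("billing", LOOKUP))
```

Adding another synonym for a destination is then a matter of appending one phrase to its list, which is the same idea as adding a term to the list in the flow.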

All configuration settings trickle down to the submenus below unless overridden. You can add speech terms for any of the flow’s supported languages, and you can add, edit, and delete terms from the list. If you have the appropriate rights granted by your administrator, you can override the speech recognition default settings at the schedule or menu level. For example, you can:

  • Adjust the minimum confidence level, or the lowest score the phrase must receive to achieve a successful match.
  • Set the maximum time to wait after the caller provides a valid speech input and stops talking before the system presents a positive match.
  • Choose the length of time to wait after the caller stops talking, when the verbal selection is invalid, before the system indicates that no match is found.
  • Specify the length of time allotted to the caller to make a verbal selection before the system times out and either prompts the caller to make a selection or disconnects the call.
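
To make the role of the minimum confidence level concrete, here is a minimal Python sketch of how a recognition result might be sorted into a match, no match, or no input. It is an assumption-laden model, not the speech engine’s actual logic: the `Recognition` class, the `classify` function, the 0 to 100 score scale, and the outcome labels are all hypothetical. The waiting periods themselves are not simulated; only the outcome of the caller staying silent is represented.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical model of one recognition attempt; not an Architect API.
@dataclass
class Recognition:
    phrase: Optional[str]  # None: caller said nothing before the timeout
    confidence: int        # engine score on an assumed 0-100 scale


def classify(result: Recognition, known_phrases: set, min_confidence: int) -> str:
    """Return 'match', 'no_match', or 'no_input' for one attempt."""
    if result.phrase is None:
        return "no_input"
    # A phrase counts only if it is a configured term AND its score
    # reaches the minimum confidence level.
    if result.phrase in known_phrases and result.confidence >= min_confidence:
        return "match"
    return "no_match"


terms = {"customer service", "support"}
print(classify(Recognition("support", 80), terms, min_confidence=50))
print(classify(Recognition("support", 30), terms, min_confidence=50))
print(classify(Recognition(None, 0), terms, min_confidence=50))
```

Raising `min_confidence` in this model turns borderline results into `no_match`, which is the trade-off to weigh when you adjust the setting: fewer wrong transfers, but more callers asked to repeat themselves.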

The Company Directory speech recognition setting enables the company directory for the entire flow, or just for the starting menu or task. This option is enabled by default and directs the system to ask callers to say the name of the person or department they wish to transfer to. You must also modify the main greeting prompt to include this option for the caller. You can re-record the prompt in the main menu’s Initial Greeting area. If your prompts are professionally recorded, obtain an updated recording from the recording company and upload it.

Work with speech recognition

To help familiarize yourself with speech recognition tasks, review the following pages:

| Article | Description |
| --- | --- |
| Specify speech recognition settings | Specify default settings for the configuration’s speech recognition behavior. |
| Override default speech recognition settings | In menu actions, if you have the appropriate rights, you can override default speech recognition settings as defined in the call flow’s Settings area. |
| Enter speech recognition terms | Enter keywords and phrases a caller might use to verbally indicate the location to which they want to transfer. |
| DTMF and speech recognition settings in menu choices | When an action is added to a menu choice, Architect provides additional settings, such as DTMF and speech recognition, that are not available when the action is used in a task sequence. |