Audio Connector overview

Audio Connector is a mechanism and generic protocol to provide a bi-directional and near real-time stream of voice interactions from the Genesys Cloud platform to a third-party voice bot provider and back. Audio Connector enables partners and customers to extend the open Genesys Cloud platform with their own voice-based bot services and fulfill bi-directional audio streaming needs such as active voice biometrics. For more information about how to implement an Audio Connector connection, see the AudioHook Protocol Specification in the Genesys Cloud Developer Center. For information about how to showcase a simple service that implements the Genesys AudioConnector Protocol, see the Genesys AudioConnector Sample Service GitHub repository.

You can use Audio Connector to stream near real-time conversational audio and metadata to customers and partners, analyze and process the streamed audio and data, and stream it back to the Genesys Cloud platform.

To troubleshoot your AudioHook protocol-based integration, subscribe to AudioHook-related operational events. For more information about operational events, see the Operational Event Catalog in the Genesys Cloud Developer Center. For more information about how to set up the Operational Console, see Troubleshoot using the Genesys Cloud Operational Console.

Notes:

Genesys Cloud supports up to five Audio Connector integrations. To request more than five integrations for your organization, submit an idea with your use case for further evaluation in the Genesys Cloud Ideas Portal.
Audio Connector supports one bi-directional stream. Also, the following considerations apply to the streaming session:
- Only the external audio channel is sent from the client to the server. For more information, see AudioHook features in the Genesys Cloud Developer Center.
- The Call Audio Connector action in Architect forks the voice stream, sends it to the configured URL, and then pauses the flow execution until the bi-directional stream ends. Either the caller or the middleware Audio Connector server can end the stream.
- The bi-directional streaming session is active only in the IVR channel. It does not transfer to an agent.
- All standard Architect transfer functions and behavior apply if the call gets transferred to an agent after the end of the bi-directional streaming session.
Mapping metadata through the Call Audio Connector action in Architect allows you to further steer the call after the end of the streaming session. For more information, see How can I pass output data from the third-party bot to the Architect flow when using the Call Audio Connector action?
Genesys Cloud charges for the duration of the connection, but not for the time during a client-initiated pause.
Genesys Cloud does not charge additional fair use minutes for voice transcription while you use Audio Connector.
Create Audio Connector servers in the same region as your Genesys Cloud organization, or in a nearby region, to keep latency to a minimum. Because the audio stream is sent to the third-party URL over the internet, the distance from your Genesys Cloud organization to the Audio Connector server might affect the call quality.
Audio Connector is not supported under BYOC Premises.
Audio Connector is not supported for premises-based Edge (LDM).
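
The note above about passing data back to the Architect flow can be illustrated with a short sketch. It assumes that the server ends the session with a JSON "disconnect" message and that output data travels in an "outputVariables" object. The function name, the field names, and the "completed" reason are assumptions made for illustration; confirm the actual message layout in the AudioHook Protocol Specification before you rely on it.

```python
import json


def build_disconnect(session_id, seq, clientseq, output_variables):
    """Build a server-initiated 'disconnect' message that carries output data.

    Hypothetical helper: the message layout and the 'outputVariables' key
    are assumptions, not a confirmed part of the protocol.
    """
    # Convert every key and value to a string so that the flow receives
    # plain text values (a conservative choice made for this sketch).
    outputs = {str(k): str(v) for k, v in output_variables.items()}
    return json.dumps({
        "version": "2",          # assumed protocol version string
        "type": "disconnect",
        "seq": seq,              # sequence number of this server message
        "clientseq": clientseq,  # last client sequence number seen
        "id": session_id,
        "parameters": {"reason": "completed", "outputVariables": outputs},
    })
```

In this sketch, a bot that recognized a billing request could call build_disconnect(session_id, seq, clientseq, {"intent": "billing"}), and the flow could then branch on the mapped value after the streaming session ends.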

Architecture overview

Figure: Audio Connector architecture overview

Conversational session management features

Handled by
Conversational session management feature	Audio Connector	Partner Cloud
Start connection
Silence detection
Send audio
Receive audio
Generic SSML (Speech Synthesis Markup Language) support in TTS output – play voice files
Hang-up detection
Barge-in
DTMF detection
Punctuation detection
Close connection
Switch between speech models
Support for custom TTS
Fallback

When you use Audio Connector, the voice call audio streams to and from the Genesys Cloud platform over a secure WebSocket connection. The Call Audio Connector action in Architect routes the voice interaction to a third-party voice bot and starts streaming audio to it. Your platform analyzes the audio and can perform an action or run a process through third-party applications according to your organization’s goals and needs. Audio Connector then streams the audio back to Genesys Cloud.
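
The exchange described above can be sketched from the server side. This minimal example assumes that control messages arrive as JSON text frames, that audio arrives as binary frames, and that the server acknowledges "open", "ping", and "close" with "opened", "pong", and "closed". The Session class, the handler names, and all field names are illustrative assumptions rather than the documented API, and the WebSocket transport itself is left out. See the AudioHook Protocol Specification for the authoritative message definitions.

```python
import json

PROTOCOL_VERSION = "2"  # assumed version string; check the specification


class Session:
    """Tracks sequence numbers for one streaming session (hypothetical helper)."""

    def __init__(self):
        self.server_seq = 0   # last sequence number this server sent
        self.client_seq = 0   # last sequence number received from the client
        self.audio_bytes = 0  # running total of received audio payload

    def reply(self, session_id, msg_type, parameters=None):
        """Build a server message that acknowledges the latest client message."""
        self.server_seq += 1
        return {
            "version": PROTOCOL_VERSION,
            "type": msg_type,
            "seq": self.server_seq,
            "clientseq": self.client_seq,
            "id": session_id,
            "parameters": parameters or {},
        }


def handle_text_frame(session, raw):
    """Return the reply for one JSON control message, or None if none is needed."""
    msg = json.loads(raw)
    session.client_seq = msg["seq"]
    kind = msg["type"]
    if kind == "open":
        offered = msg["parameters"].get("media", [])
        # Accept the first offered media format; a real service would
        # choose deliberately among the offers.
        return session.reply(
            msg["id"], "opened", {"startPaused": False, "media": offered[:1]}
        )
    if kind == "ping":
        return session.reply(msg["id"], "pong")
    if kind == "close":
        return session.reply(msg["id"], "closed")
    return None  # other message types are ignored in this sketch


def handle_binary_frame(session, payload):
    """Binary frames carry audio; this sketch only counts the bytes."""
    session.audio_bytes += len(payload)
```

A real service would pass the received audio to its bot engine and write synthesized audio back on the same connection. The sketch covers only the bookkeeping around the session.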

For more information about limitations, see Limits > AudioHook and Limits > Voice Bots in the Genesys Cloud Developer Center.

Example use case

When you stream Genesys Cloud interactions through an Audio Connector server, you can monitor and process interactions for active voice biometrics, transcription, and Agent Assist. Your Audio Connector server can pause, resume, and end the streaming. You can enable multiple streams and use cases to run in parallel. An arbitrary number of streams to third parties means that any partner can build a solution that can listen in on calls to add value. You can use the audio stream returned from the Audio Connector server for TTS or call steering via Architect.
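
The pause, resume, and end actions mentioned above imply a small set of session states. The following sketch models them as a state machine that a server could use to reject invalid actions, such as resuming a stream that has already ended. The state names, the action names, and the apply_action helper are hypothetical and do not come from the protocol.

```python
from enum import Enum


class StreamState(Enum):
    STREAMING = "streaming"
    PAUSED = "paused"
    ENDED = "ended"


# Allowed transitions for the three server actions named above.
TRANSITIONS = {
    (StreamState.STREAMING, "pause"): StreamState.PAUSED,
    (StreamState.PAUSED, "resume"): StreamState.STREAMING,
    (StreamState.STREAMING, "end"): StreamState.ENDED,
    (StreamState.PAUSED, "end"): StreamState.ENDED,
}


def apply_action(state, action):
    """Return the next state, or raise ValueError for an invalid action."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} while {state.value}") from None
```

Keeping the transitions in a single table makes the allowed behavior easy to review and to extend if a service needs more states.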

Use cases include:

Active voice biometrics; for example, verbal recognition of passwords
Language work for non-native Genesys Cloud features
Routing voice interactions to your preferred bot provider

For more information, see About Audio Connector.