PureCloud local control fault tolerance


High-level architecture overview

The following diagram:

  • Illustrates the PureCloud architecture with telephony equipment at your sites (local control mode).
  • Highlights designed redundancy, but does not provide details. For more information about redundancy, see the tables below the diagram.
  • Does not include your customers (the consumers) who talk to agents. Consumers can connect using either the Public Switched Telephone Network (PSTN) and SIP trunks, or using video chat, text chat, or email through PureCloud services running in Amazon Web Services (AWS). For consumers who connect through the services in AWS, connectivity is similar to the agent connectivity described below.

[Diagram: PureCloud local control fault tolerance]

The following tables identify the callouts in the diagram.

Note: The tables below include estimated recovery times for various types of failures. These estimates are not guaranteed, and recovery could take longer if multiple systems fail simultaneously.

PureCloud AWS-hosted items

Callout Name Description
0

Amazon Elastic Load Balancers (ELBs)

Amazon ELBs are a core component of PureCloud. ELBs are clusters of servers that run in multiple availability zones. They load balance HTTP requests across multiple availability zones and backend Elastic Compute Cloud (EC2) instances. For more information about EC2, see callout 6: PureCloud microservices running on Amazon Elastic Compute Cloud (EC2).

When an ELB detects that a backend instance is either at capacity or has failed, it routes traffic to other instances within seconds to compensate. ELBs front both PureCloud’s public APIs and its backend instances.

1

Amazon CloudFront content delivery network (CDN)

All PureCloud static content (HTML, images, JavaScript, and so on) is served from the Amazon CloudFront CDN. Amazon CloudFront uses Amazon’s highly reliable infrastructure. Because CloudFront edge locations are distributed, users are automatically routed to the closest available location as network conditions require.

Origin requests from the edge locations to AWS origin servers (for example, Amazon EC2 and Amazon S3) travel over network paths that Amazon constantly monitors and optimizes for both availability and performance.

2

Amazon Availability Zones (AZs)

Amazon AZs consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities. AZs offer the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than is possible from a single data center. Dedicated fiber lines connect AZs so that normal connectivity has very low latency and AZ outages are detectable in seconds.

3

Cache clusters

PureCloud uses multiple cache clusters to cache data from databases and external services, which keeps PureCloud responsive. The nodes in each cluster are spread across multiple AZs so that the loss of a single AZ does not bring down the cluster. If any node in the cluster fails, then the other nodes take over the failed node’s work in seconds. If a complete cache cluster fails, then the cluster can be redeployed and begins caching data again from the data’s home of record in minutes.

4

Amazon Simple Storage Service (S3)

Amazon S3 provides durable infrastructure to store important data and is designed for durability of 99.999999999% of objects. Your data is redundantly stored across multiple facilities and multiple devices in each facility. Amazon S3 is designed for 99.99% availability of objects over a year and is backed by the Amazon S3 service level agreement, which supports the reliability of the service.

5

Database clusters

PureCloud uses multiple database clusters to store data. The nodes in each cluster are spread across multiple AZs so that the loss of one AZ does not bring down the cluster. If one node in the cluster fails, then the other nodes take over the work of the failed node in seconds, using the replicated data. If a complete database cluster fails, then the cluster can be redeployed and repopulated within hours from the hourly backups stored in S3.

6

PureCloud microservices running on Amazon Elastic Compute Cloud (EC2)

PureCloud is composed of more than 100 microservices. Each microservice instance runs on its own EC2 instance, and each microservice has at least two instances running in separate AZs. Amazon ELBs load balance requests across the microservice instances, detect failed instances within seconds, and route requests to the remaining instances. The EC2 instances are also in Amazon Auto Scaling groups, which automatically scale up the number of instances to meet demand and replace failed instances with new healthy instances in minutes.

7

Amazon DynamoDB

In addition to the database clusters, PureCloud makes heavy use of DynamoDB. DynamoDB is highly available, with automatic and synchronous data replication across three facilities in a region. This replication helps protect your data against individual machine failures and even facility-level failures. All PureCloud DynamoDB tables are backed up hourly to S3 so that data can be restored within hours in the event of a catastrophic failure.
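
The availability and durability figures quoted in the table above are easier to compare when converted to everyday units. The following Python sketch is illustrative arithmetic only. The 99.99% and 99.999999999% values are the S3 design targets from callout 4; the 99% single-instance availability, the object count, and all function names are hypothetical values chosen for the example, and the redundancy formula assumes that instances fail independently, which real systems only approximate.

```python
# Illustrative arithmetic only. These are design targets, not guarantees.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes per year at a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def expected_objects_lost_per_year(object_count: int, durability: float) -> float:
    """Average number of objects lost per year at a given annual durability."""
    return object_count * (1.0 - durability)


def redundant_availability(single: float, copies: int) -> float:
    """Availability when at least one of `copies` instances must be up.

    Assumes independent failures, which is an idealization.
    """
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    # 99.99% availability allows roughly 52.56 minutes of downtime per year.
    print(round(downtime_minutes_per_year(0.9999), 2))

    # At 99.999999999% durability, 10,000,000 stored objects lose on average
    # one object about every 10,000 years.
    lost = expected_objects_lost_per_year(10_000_000, 0.99999999999)
    print(round(1 / lost))

    # Hypothetical: two instances in separate AZs, each 99% available.
    print(round(redundant_availability(0.99, 2), 6))
```

The last calculation shows why each microservice runs at least two instances in separate AZs: under the independence assumption, two modestly reliable instances together are far more available than either one alone.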

Customer data center-hosted items

Callout Name Description
8

Customer Internet service providers (ISPs)

Because PureCloud is a cloud service running in AWS, it is important that you use highly available ISPs for your Internet connectivity. Having multiple Internet connections spread across multiple ISPs is recommended.

9

Agent computers

The PureCloud agent user interface can run completely within a web browser for a zero-install deployment, which allows agents to switch desks quickly or switch computers in the event of computer hardware failure or mechanical desk failure. The diagram above does not include supervisor and administrator computers, but supervisors and administrators use the same user interface and the same fault tolerance descriptions apply to them.

A

Agent phones

Agents can use stand-alone SIP phones or a SIP softphone that runs on their computers. If phone hardware fails, an agent can swap in a different physical phone and select the new phone in the user interface. Agent phones register with two Edge devices (a primary and a backup). If the primary device fails, then agent phones can switch to the backup device in seconds.

B

PureCloud Edge devices

PureCloud Edge devices form the telephony backbone of PureCloud and are responsible for all call audio and signaling. Edge devices can still perform basic call control, ACD routing, and IVR functions even when their connection to AWS is down. If an Edge device fails, then all active calls on the device terminate, but other Edge devices continue to operate and take over the load of the failed device.

C

Customer SIP trunks

In the local control model, you are responsible for your own SIP trunks. For redundancy, use multiple SIP trunks from multiple vendors. SIP trunks must handle Edge device outages and load balance across multiple Edge devices.

D

Customer sites and data centers

Distribute agents, Edge devices, phones, and agent computers across multiple geographically separate sites so that a power failure or natural disaster at one site does not affect the other sites.
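
The primary and backup registration described for agent phones (callout A) and the requirement that SIP trunks load balance across Edge devices (callout C) can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not how PureCloud or any SIP trunk provider implements failover. The `EdgeDevice` class, the `healthy` flag (a stand-in for a real liveness check), the function names, and the round-robin policy are all hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List, Optional


@dataclass
class EdgeDevice:
    name: str      # hypothetical label, not a real device identifier
    healthy: bool  # stand-in for a real liveness check


def pick_registration(primary: EdgeDevice, backup: EdgeDevice) -> Optional[EdgeDevice]:
    """Prefer the primary Edge; fall back to the backup if the primary is down."""
    if primary.healthy:
        return primary
    if backup.healthy:
        return backup
    return None  # neither Edge is reachable


def distribute_calls(edges: List[EdgeDevice], call_ids: List[str]) -> Dict[str, str]:
    """Assign calls round-robin across healthy Edge devices only."""
    healthy = [edge for edge in edges if edge.healthy]
    if not healthy:
        raise RuntimeError("no healthy Edge device available")
    rotation = cycle(healthy)
    return {call_id: next(rotation).name for call_id in call_ids}


if __name__ == "__main__":
    edge_a = EdgeDevice("edge-a", healthy=False)  # simulated outage
    edge_b = EdgeDevice("edge-b", healthy=True)
    edge_c = EdgeDevice("edge-c", healthy=True)

    print(pick_registration(edge_a, edge_b).name)  # edge-b
    print(distribute_calls([edge_a, edge_b, edge_c], ["call-1", "call-2", "call-3"]))
    # {'call-1': 'edge-b', 'call-2': 'edge-c', 'call-3': 'edge-b'}
```

The sketch covers only new registrations and new calls. As the table notes, calls that are active on an Edge device when it fails are terminated.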