We expect that most scenarios where MRCP is used with the Voicegain ASR will be on Edge (on-prem). The main reason is that the MRCP protocol was not designed for use over the public Internet:
- The MRCP server uses multiple ports
- It uses RTP, which:
  - is susceptible to packet loss because it runs over UDP
  - needs large port ranges, which may require 1:1 NAT on a router rather than port forwarding
- Encryption is not supported in many MRCP servers - specifically, the MRCP server used by Voicegain does not support encryption.
We offer 3 solutions to this:
- Use of a local MRCP server with a proxy to ASR in the Cloud - for more see here.
- Docker-compose setup of MRCP+ASR - this is a custom solution, which has to be configured for each deployment. If you are interested, contact us at firstname.lastname@example.org
- Edge deployment of a complete MRCP+ASR setup. Its advantages over docker-compose are:
  - You can manage it from Voicegain Cloud.
  - You can use Voicegain STT Web APIs if desired.
  - Tools such as GREG are supported.
  - You can monitor it using Grafana.
  - Billing is usage-based (docker-compose only supports port-based licenses).
Two scenarios for deploying MRCP to Edge Kubernetes
There are two basic options for deploying MRCP to Kubernetes on Edge:
- Deploy N independent instances (each instance being an independent Kubernetes cluster) of a standard MRCP Edge configuration containing MRCP, ASR, and any other services. Each instance will consist of:
  - On hardware - a single server with Kubernetes installed
  - On Cloud - one or two compute instances, together forming a Kubernetes cluster
- Deploy a single, multi-node Kubernetes cluster that hosts multiple MRCP servers, ASR instances, and GPUs.
The advantage of the first option is simplicity of maintenance, and we would suggest it for bare-metal deployments, in particular if your IT team does not have much expertise in managing Kubernetes clusters.
We would suggest the second option for Cloud deployments with managed Kubernetes, e.g. Google GKE. The second option simplifies monitoring (only a single instance has to be monitored), gives more flexibility in configuration, and can allow for better resource utilization, better HA behavior, etc.
Ports and protocols
The MRCP server can be accessed on the following ports and protocols:
- MRCP v1:
  - 1554 TCP: for RTSP
  - 5001-5xxx UDP: for RTP
- MRCP v2:
  - 8060 TCP and UDP: for SIP - we strongly suggest using SIP over TCP. With UDP, the traffic from our MRCP server to the client sometimes gets blocked and the session is not established.
  - 1544 TCP: for the MRCP v2 control channel (MRCP v2 does not use RTSP)
  - 5001-5xxx UDP: for RTP
Generally, the requests will be made to the IPs of the Kubernetes nodes that host the MRCP servers, except for cases where the network is NAT'ed, see below.
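As a rough illustration, the TCP ports listed above can be probed from the MRCP client host before going live. The following is a minimal sketch, not a Voicegain tool: the function names are hypothetical, and it covers only the TCP ports, because a plain socket connect cannot tell whether the UDP RTP range is reachable.

```python
import socket
from typing import Dict

# TCP ports from the list above. The labels are illustrative only.
MRCP_TCP_PORTS: Dict[str, int] = {
    "1554/tcp (MRCP v1)": 1554,
    "8060/tcp (MRCP v2, SIP)": 8060,
    "1544/tcp (MRCP v2)": 1544,
}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host: str, timeout: float = 2.0) -> Dict[str, bool]:
    """Probe every MRCP TCP port on one Kubernetes node (hypothetical helper)."""
    return {
        label: tcp_port_open(host, port, timeout)
        for label, port in MRCP_TCP_PORTS.items()
    }
```

Calling `check_node("192.0.2.10")` (a placeholder address) returns a mapping from each port label to `True` or `False`. A `False` only means that the TCP handshake did not complete from the machine running the check. It does not say whether the cause is a firewall, NAT, or a stopped service.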
Normally, all requests go directly to the IPs of the Kubernetes nodes that host the MRCP servers. In some scenarios, however, the MRCP client may be on a host whose route to the Kubernetes nodes passes through a NAT device. In that case, we will need to create a custom Edge deployment configuration which has the correct ext-ip plugged in. Here is a guide explaining how to determine the ext-ip for the MRCP-IVR-Proxy; the same ideas apply to this use case.
Typically, MRCP clients (for example, most VXML platforms) achieve High Availability (HA) by specifying the IPs of multiple MRCP servers, to which requests are then made in a round-robin fashion. If a server does not respond to an INVITE, it is skipped. With this approach, if an MRCP server dies, only the sessions active on that server at the time of the crash are lost.
This is directly supported by Voicegain Edge deployments of MRCP. Usually, 2 or more nodes host the MRCP service, and there are 2 or more instances of the ASR service.
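The client-side round-robin behavior described above can be sketched as follows. This is an illustration of the general idea only, not code from Voicegain or from any particular VXML platform. The class name and the `is_responsive` callback are hypothetical; in a real client the callback would stand for sending a SIP INVITE and waiting for a response.

```python
from typing import Callable, List, Optional


class RoundRobinSelector:
    """Pick MRCP servers in round-robin order, skipping unresponsive ones."""

    def __init__(self, servers: List[str], is_responsive: Callable[[str], bool]):
        if not servers:
            raise ValueError("at least one MRCP server address is required")
        self._servers = list(servers)
        # Hypothetical probe: stands in for "server answered the INVITE".
        self._is_responsive = is_responsive
        self._next = 0  # index of the server to try first on the next pick

    def pick(self) -> Optional[str]:
        """Return the next responsive server, or None if none respond."""
        n = len(self._servers)
        for offset in range(n):
            idx = (self._next + offset) % n
            server = self._servers[idx]
            if self._is_responsive(server):
                # Start the following pick from the server after this one.
                self._next = (idx + 1) % n
                return server
        return None
```

With servers `a`, `b`, and `c`, and `b` not responding, successive calls to `pick()` alternate between `a` and `c`. New sessions keep being established on the healthy servers, while sessions that were already running on `b` when it failed are lost, as described above.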