[UPDATE: this article now uses OkHttp3 version 4 - the previous version of the article referenced OkHttp3 version 3.]
Request
The transcription request must be sent to https://api.voicegain.ai/v1/asr/transcribe/async
The body of the request is:
{
  "sessions": [
    {
      "asyncMode": "REAL-TIME",
      "websocket": {
        "useSTOMP": true,
        "name": "My-Broadcast",
        "label": "My-Broadcast-today",
        "minimumDelay": 100,
        "ifExists": "KILL"
      }
    }
  ],
  "audio": {
    "source": { "stream": { "protocol": "WEBSOCKET" } },
    "format": "L16",
    "channel": "mono",
    "rate": 16000,
    "capture": false
  },
  "settings": {
    "asr": {
      "noInputTimeout": 59999,
      "completeTimeout": 0
    }
  }
}
The broadcast websocket specific parameters are:
- useSTOMP - the broadcast websocket must use the STOMP protocol
- name - the name of the websocket; a websocket definition with this name must already exist (you can create it, e.g., using the Web Console)
- label - the label for the generated transcript in the broadcast archive
- minimumDelay - a minimum delay greater than 0 lets the server handle hypothesis rewrites; a very low value of minimumDelay results in a lot of corrections happening on the client side
- ifExists - determines what to do if some other transcription session is already using this websocket for broadcasting; possible values are KILL, FAIL, TEST
Some notes about the audio streaming websocket:
- the protocol must be "WEBSOCKET"
- channel must be "mono"
- capture may be set to true for debugging - if set to true, the response will include the uuid of the captured audio, which can then be retrieved using this web method: GET https://api.voicegain.ai/v1/data/{uuid}/file
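As a rough illustration, the request above could be assembled in Java 11+ with java.net.http. This is only a sketch under stated assumptions: the use of POST, the Bearer-token Authorization header, and the class and method names (BroadcastRequestSketch, buildBody, buildRequest) are not taken from this article and should be checked against the Voicegain API documentation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; class and method names are illustrative only.
class BroadcastRequestSketch {
    static final String ENDPOINT = "https://api.voicegain.ai/v1/asr/transcribe/async";

    // Builds the JSON body shown above for a given websocket name and label.
    // Simplification: name and label are inserted as-is, so they must not
    // contain characters that need JSON escaping (quotes, backslashes).
    static String buildBody(String name, String label) {
        return "{"
            + "\"sessions\":[{\"asyncMode\":\"REAL-TIME\",\"websocket\":{"
            + "\"useSTOMP\":true,"
            + "\"name\":\"" + name + "\","
            + "\"label\":\"" + label + "\","
            + "\"minimumDelay\":100,\"ifExists\":\"KILL\"}}],"
            + "\"audio\":{\"source\":{\"stream\":{\"protocol\":\"WEBSOCKET\"}},"
            + "\"format\":\"L16\",\"channel\":\"mono\",\"rate\":16000,\"capture\":false},"
            + "\"settings\":{\"asr\":{\"noInputTimeout\":59999,\"completeTimeout\":0}}"
            + "}";
    }

    // Assumption: the endpoint expects POST with a Bearer token;
    // neither detail is stated in this article.
    static HttpRequest buildRequest(String token, String body) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + token)
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }
}
```

The resulting request could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the response body read as a string.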
Response
An example response:
{
  "sessions": [
    {
      "sessionId": "0-0kfrdm3561ujwshczv51fownlfm5",
      "asyncMode": "REAL-TIME",
      "websocket": {
        "name": "My-Broadcast",
        "url": "wss://cc.voicegain.ai/nats-websocket/port"
      }
    }
  ],
  "audio": {
    "stream": {
      "websocketUrl": "wss://api.voicegain.ai/v1/0/socket/e5b22dc5-2a45-4525-9f12-55d4bd190e15"
    }
  }
}
Two websocket urls are returned:
- audio.stream.websocketUrl - this is used to stream the audio to the recognizer. Audio must be streamed in binary format. The format is the one specified in the initial request, and it must be mono. For available audio formats see here.
- sessions[].websocket.url and sessions[].websocket.name - in this case these are largely informational, as the CC-App viewer will typically be used to view the live transcript.
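To make the use of the response concrete, here is a minimal sketch of pulling the URLs out of the response body using only the Java standard library. The class and method names are hypothetical, and the regular expression is a deliberate simplification that stands in for a real JSON parser (it ignores nesting and escaped quotes); in a real client a JSON library would be the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of any Voicegain library.
class ResponseUrlSketch {
    // Returns the first string value found for the given key, or null if the
    // key is absent. Simplification: matches "key": "value" anywhere in the
    // text, without regard to where the key sits in the JSON structure.
    static String firstStringValue(String json, String key) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }
}
```

For the example response, firstStringValue(body, "websocketUrl") would give the audio streaming URL and firstStringValue(body, "url") the broadcast websocket URL.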
Using javax.websocket
In Java (using javax.websocket), the connection to the audio stream websocket can be established as follows:
private void connectToWebSocket() {
    WebSocketContainer container = ContainerProvider.getWebSocketContainer();
    try {
        // "this" is the annotated client endpoint; audioStreamWebsocketUrl is a java.net.URI
        container.connectToServer(this, audioStreamWebsocketUrl);
    } catch (DeploymentException | IOException ex) {
        ex.printStackTrace();
    }
}
The data is then sent over the wss session (note - the audio has to be sent as binary messages, not as text messages):
session.getBasicRemote().sendBinary(bb);
Using OkHttp3
Alternatively, you can use the OkHttp3 v4 library, which works on both Java and Android. Once you have defined the listener, opening the connection is simple:
// open audio streaming websocket
OkHttpClient audioClient = new OkHttpClient.Builder()
    .readTimeout(0, TimeUnit.MILLISECONDS)
    .build();
Request audioRequest = new Request.Builder()
    .url(wssUrlStr)
    .build();
MyListener audioListener = new MyListener("audio");
WebSocket audioWebSocket = audioClient.newWebSocket(audioRequest, audioListener);
Sending binary data is also very simple, e.g.:
boolean send(WebSocket ws, byte[] array, int offset, int length) {
    ByteBuffer bb = ByteBuffer.wrap(array, offset, length);
    try {
        // send() returns false if the message could not be enqueued,
        // e.g. because the websocket is closing or already closed
        return ws.send(ByteString.of(bb));
    } catch (IllegalStateException e) {
        System.out.println("Assuming the other side closed the Websocket");
        return false;
    }
}
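When the audio comes from a buffer or file rather than a live source, it has to be cut into pieces before being passed to a send method like the one above. The following is a minimal sketch of that chunking for the L16, 16000 Hz mono format used in the request. The class and method names and the 100 ms chunk duration are assumptions chosen for illustration, not values prescribed by the Voicegain API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and chunk duration are illustrative only.
class AudioChunkSketch {
    // Number of bytes covering the given duration of 16-bit (L16) mono audio:
    // 2 bytes per sample, one channel.
    static int chunkBytes(int sampleRate, int millis) {
        return sampleRate * 2 * millis / 1000;
    }

    // Splits [0, totalLength) into {offset, length} pairs of at most
    // chunkSize bytes; the last pair may be shorter.
    static List<int[]> chunkRanges(int totalLength, int chunkSize) {
        List<int[]> ranges = new ArrayList<>();
        for (int offset = 0; offset < totalLength; offset += chunkSize) {
            ranges.add(new int[] { offset, Math.min(chunkSize, totalLength - offset) });
        }
        return ranges;
    }
}
```

With a 16000 Hz rate, chunkBytes(16000, 100) is 3200 bytes. Each range r can then be passed on as send(ws, audio, r[0], r[1]). Pausing for roughly the chunk duration between sends, so that pre-recorded audio arrives at about real-time speed, is a further assumption rather than a documented requirement.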
Example Code
See this code example for more details on how to use websockets with the Voicegain API. Note that the example covers a use case where the results are streamed back via websocket rather than being broadcast via websocket.