This article shows how to use the synchronous /asr/transcribe method to transcribe inline audio data into one or more transcribed utterances.
Note that the longer the audio, the less likely it is that multiple alternatives will be generated.
The maximum audio length allowed by this synchronous web API is 60 seconds. For longer audio you will need to use the async transcribe method, e.g. as described here.
All examples assume a bash shell. On Windows you can use Git Bash, or you can install Linux.
NOTE: For production use you will likely need the async APIs. The synchronous API has too many limitations on what it can do.
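Because of the 60-second limit, it can help to estimate the length of a WAV file before sending it. The following is a minimal sketch, not part of the Voicegain tooling: the function names wav_duration_seconds and fits_sync_limit are hypothetical, and the code assumes a canonical 44-byte PCM WAV header, in which the byte rate is stored as a little-endian 32-bit value at offset 28. For files with extra header chunks the result is only an estimate.

```shell
#!/usr/bin/env bash
# Sketch: estimate WAV duration from the header (assumes a canonical 44-byte PCM header).

# Print the duration in whole seconds, rounded up.
wav_duration_seconds() {
  local file="$1" byte_rate size
  # Read the 4 bytes of the byte-rate field as unsigned decimals (endianness-safe).
  set -- $(od -A n -t u1 -j 28 -N 4 "$file")
  [ "$#" -eq 4 ] || return 1
  byte_rate=$(( $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 ))
  [ "$byte_rate" -gt 0 ] || return 1
  size=$(wc -c < "$file")
  # Audio bytes = file size minus the assumed 44-byte header; round up to a full second.
  echo $(( (size - 44 + byte_rate - 1) / byte_rate ))
}

# Succeed only if the audio is no longer than the 60-second synchronous limit.
fits_sync_limit() {
  local secs
  secs="$(wav_duration_seconds "$1")" || return 1
  [ "$secs" -le 60 ]
}
```

For example, running fits_sync_limit 3sec.wav || echo "use the async API" would print the message only for files that are too long or whose header could not be read.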
Input JSON
The simplest input JSON contains only the base64-encoded audio. The following sample command generates such JSON with the content of the 3sec.wav file inlined (you can download the sample audio from here: https://s3.us-east-2.amazonaws.com/files.public.voicegain.ai/3sec.wav):
echo "{ \"audio\": { \"source\": { \"inline\": \"$(base64 -w0 3sec.wav)\" } } }" > body.json
Note that the above assumes the 3sec.wav file contains a proper RIFF header. If the file were a raw audio file (without headers), you would need to add the format, rate, and channel parameters as described here.
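The RIFF header assumption can be checked before the request body is built. The sketch below is illustrative only: has_riff_header and make_body are hypothetical helper names, and the check looks only at the first 12 bytes of the file ("RIFF", a 4-byte size, then "WAVE"). It pipes base64 output through tr instead of using -w0, so that it also works where the base64 tool does not support that option.

```shell
#!/usr/bin/env bash
# Sketch: verify the RIFF/WAVE header, then write the request body with the inlined audio.

# Succeed if bytes 1-4 are "RIFF" and bytes 9-12 are "WAVE".
has_riff_header() {
  local hex
  hex="$(od -A n -t x1 -N 12 "$1" | tr -d ' \n')"
  case "$hex" in
    52494646????????57415645) return 0 ;;  # "RIFF" + any 4 size bytes + "WAVE"
    *) return 1 ;;
  esac
}

# Write { "audio": { "source": { "inline": "<base64>" } } } to the output file.
make_body() {
  local wav="$1" out="$2"
  if ! has_riff_header "$wav"; then
    echo "error: $wav has no RIFF/WAVE header; format, rate, and channel parameters are needed" >&2
    return 1
  fi
  printf '{ "audio": { "source": { "inline": "%s" } } }\n' "$(base64 < "$wav" | tr -d '\n')" > "$out"
}
```

With these helpers, make_body 3sec.wav body.json produces the same body.json as the echo command above, and refuses to do so for a headerless raw file.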
Invocation using curl
The web API can be invoked using the following curl POST command (note that you may sometimes have to add the -k option to ignore SSL certificate errors). The JWT token can be obtained as described here.
curl -i \
  -H "Content-Type: application/json" \
  -H 'Accept: application/json' \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NixsInR5cCI6IkpXVCJ9.eyJhdWQiOixodHRwczovL2FwaS5hc2NhbGxvbi5pby9xcGkvdjEvcmVjb2duaXplIiwic3ViIjoiNWY0ZGViNWItZGUzOC00ZmE3LTg0MjYtY2M5M2Y3YzJhMjJmIn0.RaXAyjoj7Bgz_1batCa5LYmTRqkFxvRKv6_BOGCO70w" \
  -d @body.json \
  --verbose \
  https://api.voicegain.ai/v1/asr/transcribe
Response
If successful, you will get a response like this one:
{
  "session": {
    "sessionId": "0-0kfsabprp07zgi16t9vkxeaq5u46"
  },
  "result": {
    "status": "MATCH",
    "lastEvent": "RECOGNITION-COMPLETE",
    "alternatives": [{
      "utterance": "she had no doubt in the world of its being a very fine day",
      "confidence": 0.9973621368408203
    }]
  }
}
A status equal to MATCH indicates that recognition was successful.
The alternatives field will contain a list of possible recognition hypotheses, each consisting of an utterance and its confidence.
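The status and the first utterance can be pulled out of a saved response with standard shell tools. This is a dependency-free sketch under stated assumptions: first_string_field is a hypothetical helper, response.json is a hypothetical file holding the response body, and the pattern only handles simple string values without escaped quotes, as in the sample above. A real JSON parser is the more robust choice for anything beyond a quick check.

```shell
#!/usr/bin/env bash
# Sketch: read string fields from a saved response (assumes no escaped quotes in values).

# Print the value of the first occurrence of a string field, e.g. status or utterance.
first_string_field() {
  local field="$1" file="$2"
  grep -o "\"${field}\" *: *\"[^\"]*\"" "$file" \
    | head -n 1 \
    | sed -e 's/^[^:]*: *"//' -e 's/"$//'
}

# Print the first utterance only if recognition succeeded (status is MATCH).
print_first_utterance() {
  local file="$1"
  [ "$(first_string_field status "$file")" = "MATCH" ] || return 1
  first_string_field utterance "$file"
}
```

Because alternatives are listed in the response as an array, the first match of the utterance field corresponds to the first alternative in that array.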