This article shows how to use the synchronous /asr/transcribe method to transcribe inline audio data into one or more alternative transcribed utterances.
Note that the longer the audio, the less likely it is that multiple alternatives will be generated.
The maximum length of audio allowed for this synchronous web API is 60 seconds. For longer audio you will need to use the async transcribe method, e.g. as described here.
All examples assume a bash shell. On Windows you can use Git Bash or install Linux.
Input JSON
The simplest input JSON contains only the audio input encoded as base64. A sample command that generates such JSON with the content of the 3sec.wav file inlined is:
echo "{ \"audio\" : { \"source\" : { \"inline\" : \"$(base64 -w0 3sec.wav)\" } } }" > body.json
Note that the above assumes that the 3sec.wav file contains a proper RIFF header. If the file were a raw audio file (without headers), you would need to add format, rate, and channel parameters as described here.
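If you build request bodies for several files, the one-liner above can be wrapped in a small function. The following is only a sketch: the function name make_body is hypothetical, and it assumes GNU base64 (the -w0 option used above) and a WAV file with a RIFF header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a request body with the given audio file inlined as base64.
# Assumes GNU coreutils base64 (the -w0 flag disables line wrapping).
make_body() {
  local wav="$1" out="${2:-body.json}"
  # Fail early instead of writing a body with an empty "inline" field.
  [ -r "$wav" ] || { echo "cannot read $wav" >&2; return 1; }
  printf '{ "audio": { "source": { "inline": "%s" } } }\n' "$(base64 -w0 "$wav")" > "$out"
}
```

For example, make_body 3sec.wav body.json should produce the same kind of body.json as the echo command.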
Invocation using curl
The web API can be invoked using the following curl POST command (note that you may sometimes have to add the -k option to ignore SSL certificate errors). The JWT token can be obtained as described here.
curl -i \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NixsInR5cCI6IkpXVCJ9.eyJhdWQiOixodHRwczovL2FwaS5hc2NhbGxvbi5pby9xcGkvdjEvcmVjb2duaXplIiwic3ViIjoiNWY0ZGViNWItZGUzOC00ZmE3LTg0MjYtY2M5M2Y3YzJhMjJmIn0.RaXAyjoj7Bgz_1batCa5LYmTRqkFxvRKv6_BOGCO70w" \
  -d @body.json \
  --verbose \
  https://api.voicegain.ai/v1/asr/transcribe
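When calling the API from a script, it can be convenient to keep the token out of the command itself and to save the response to a file. The sketch below is one possible wrapper, not an official client: the function name transcribe, the VOICEGAIN_JWT environment variable, and the response.json file name are all assumptions, as is treating HTTP 200 as the success code.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the curl call shown above.
# VOICEGAIN_JWT is an assumed variable name holding your JWT token.
transcribe() {
  local body="${1:-body.json}" out="${2:-response.json}" code
  [ -n "${VOICEGAIN_JWT:-}" ] || { echo "VOICEGAIN_JWT is not set" >&2; return 2; }
  # -s: no progress output, -o: save the response body, -w: print only the HTTP status code
  code=$(curl -s -o "$out" -w '%{http_code}' \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer $VOICEGAIN_JWT" \
    -d @"$body" \
    https://api.voicegain.ai/v1/asr/transcribe) || return 1
  echo "HTTP $code"
  # Assumption: a successful call returns HTTP 200.
  [ "$code" = "200" ]
}
```

After export VOICEGAIN_JWT=... you would call transcribe body.json response.json and then inspect response.json.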
Response
If successful you will get a response like this one:
{
  "session": {
    "sessionId": "0-0kfsabprp07zgi16t9vkxeaq5u46"
  },
  "result": {
    "status": "MATCH",
    "lastEvent": "RECOGNITION-COMPLETE",
    "alternatives": [{
      "utterance": "she had no doubt in the world of its being a very fine day",
      "confidence": 0.9973621368408203
    }]
  }
}
A status of MATCH indicates that recognition was successful.
The alternatives field contains a list of possible recognition hypotheses, each consisting of an utterance and its confidence.
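To use the result in a script, the status and the first alternative can be pulled out of a saved response. This is a minimal sketch that assumes jq is installed (it is not part of the API) and that the response was saved to a file; the function name top_utterance and the default file name response.json are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first alternative from a saved response.
# Assumes jq is installed and the response has the shape shown above.
top_utterance() {
  local resp="${1:-response.json}" status
  status=$(jq -r '.result.status' "$resp") || return 1
  if [ "$status" != "MATCH" ]; then
    echo "no match (status: $status)" >&2
    return 1
  fi
  # Print confidence and utterance of the first alternative, tab-separated.
  jq -r '.result.alternatives[0] | "\(.confidence)\t\(.utterance)"' "$resp"
}
```

The function returns a non-zero exit code when the status is anything other than MATCH, so it can be used directly in an if statement.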