Overview
We will assume that the audio to be transcribed is available via a URL, e.g., hosted on AWS S3.
The process will consist of 3 steps:
- make the async transcription request in OFF-LINE mode
- wait for the offline transcription task to finish
- retrieve the result of transcription
Sample python script
The Python code that accomplishes this can be found here:
To run the code, you need to put your JWT in the indicated place:
JWT = "<Your JWT here>"
You also need to provide the URL where the source audio to be transcribed is located:
audio_url = "https://s3.us-east-2.amazonaws.com/files.public.voicegain.ai/3sec.wav"
We suggest you run the script once with the audio file we provide to verify that the script works for you. If it works correctly, the transcript you will see will be:
She had no doubt in the world of it's being a very fine day.
As the code runs, it polls about every 5 seconds to check whether the transcription has finished.
The intermediate results may contain output like this:
{"session":
{"sessionId": "O-0-0kgv84axk082w25jzjddro0m6tzj", "asyncMode": "OFF-LINE"},
"result": {"final": false},
"responseType": "AsyncResultFull",
"progress": {"phase": "PROCESSING", "audioStartTime": 0}
}
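The polling step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the sample script itself: `fetch_status` is a hypothetical callable that issues the poll request and returns the decoded JSON, and the only field the sketch relies on is `result.final`, as it appears in the response above.

```python
import time
from typing import Any, Callable, Dict


def is_final(response: Dict[str, Any]) -> bool:
    """Return True once the poll response reports result.final == true."""
    result = response.get("result") or {}
    return bool(result.get("final", False))


def wait_for_final(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_sec: float = 5.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the result is final, then return that final response.

    fetch_status is a hypothetical helper supplied by the caller; it is
    assumed to perform the poll request and return the parsed JSON body.
    """
    for _ in range(max_polls):
        response = fetch_status()
        if is_final(response):
            return response
        # Not finished yet: wait before asking again.
        sleep(interval_sec)
    raise TimeoutError("transcription did not finish within max_polls polls")
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays; in normal use the default `time.sleep` is enough.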
The final result response will be much larger and will contain fields like:
"final": true
"responseType": "AsyncResultFull"
"words": [ ... ]
"transcript": " ... "
You can use the transcript from this final result, or you can use the transcript retrieved with the
GET https://api.voicegain.ai/v1/asr/transcribe/{sessionId}/transcript?format=text
request at the end of the Python script.
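As a small illustration, the transcript URL can be assembled from the sessionId returned for the request. The helper name `transcript_url` and the `API_BASE` constant are made up for this sketch; only the URL pattern itself comes from the GET request shown above.

```python
from urllib.parse import quote, urlencode

# Base path taken from the GET request shown in this article.
API_BASE = "https://api.voicegain.ai/v1"


def transcript_url(session_id: str, fmt: str = "text") -> str:
    """Build the transcript retrieval URL for a given sessionId.

    transcript_url is a hypothetical helper, not part of any Voicegain SDK.
    """
    path = f"{API_BASE}/asr/transcribe/{quote(session_id, safe='')}/transcript"
    return f"{path}?{urlencode({'format': fmt})}"
```

The resulting URL can then be requested with any HTTP client, using the same JWT as the other requests.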
This transcript will be available for retrieval for the amount of time specified by the poll.persist value (in milliseconds) in the initial request:
"poll": {
"afterlife": 60000,
"persist": 86400000
},
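Because both values are in milliseconds, the numbers above correspond to 60 seconds for afterlife and 24 hours for persist. A minimal sketch of building this object from readable durations could look like this; `poll_settings` is a hypothetical helper name, and only the two field names come from the request fragment above.

```python
from datetime import timedelta
from typing import Dict


def poll_settings(afterlife: timedelta, persist: timedelta) -> Dict[str, int]:
    """Build the "poll" object with both durations converted to milliseconds."""
    one_ms = timedelta(milliseconds=1)
    return {
        "afterlife": afterlife // one_ms,  # whole milliseconds
        "persist": persist // one_ms,
    }


# Reproduces the values used in the sample request.
settings = poll_settings(afterlife=timedelta(minutes=1), persist=timedelta(days=1))
```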
Processing multiple files
Note: if you have multiple files to process, modify the script to submit all of them up front, and then monitor the transcription progress in the second part of the script. This way, all files will be submitted to the offline processing queue and transcribed in parallel.
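One possible shape for such a modified script is sketched below. It assumes two hypothetical callables supplied by you: `submit(audio_url)`, which makes the async OFF-LINE request and returns the sessionId, and `fetch_status(session_id)`, which polls that session and returns the decoded JSON. Neither is part of the sample script; they stand in for whatever request code you already have.

```python
import time
from typing import Any, Callable, Dict, Iterable


def transcribe_all(
    audio_urls: Iterable[str],
    submit: Callable[[str], str],
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_sec: float = 5.0,
    max_rounds: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Dict[str, Any]]:
    """Submit every file first, then poll all sessions until each is final.

    Returns a mapping of audio URL to its final response.
    """
    # Part 1: submit everything up front so all files enter the queue.
    pending = {submit(url): url for url in audio_urls}  # sessionId -> URL
    finished: Dict[str, Dict[str, Any]] = {}

    # Part 2: monitor all outstanding sessions together.
    for _ in range(max_rounds):
        for session_id in list(pending):
            response = fetch_status(session_id)
            if (response.get("result") or {}).get("final"):
                finished[pending.pop(session_id)] = response
        if not pending:
            return finished
        sleep(interval_sec)
    raise TimeoutError(f"{len(pending)} session(s) did not finish in time")
```

Polling all sessions in one loop keeps the script single-threaded; the parallelism comes from the offline queue rather than from the client.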
Polling interval
You do not need to poll as often as every 5 seconds, as the example does, but we suggest polling at least as often as the specified poll.afterlife time (in milliseconds); this way polling will always be done in memory.
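A simple way to apply this rule is to cap the polling interval at some fraction of poll.afterlife. In the sketch below, the function name and the 0.5 safety margin are assumptions chosen for illustration; the only constraint taken from this article is that the interval should not exceed the afterlife time.

```python
def poll_interval_sec(
    afterlife_ms: int,
    preferred_sec: float = 5.0,
    safety_margin: float = 0.5,
) -> float:
    """Return a polling interval in seconds, capped below poll.afterlife.

    safety_margin is an assumed fraction of afterlife used as the upper
    bound, leaving headroom for request latency.
    """
    if afterlife_ms <= 0:
        raise ValueError("afterlife_ms must be positive")
    if not 0 < safety_margin <= 1:
        raise ValueError("safety_margin must be in (0, 1]")
    upper_bound_sec = afterlife_ms / 1000.0 * safety_margin
    return min(preferred_sec, upper_bound_sec)
```

With the sample value of 60000 ms, a preferred interval of 5 seconds is kept as-is, while a preferred interval of 45 seconds would be reduced to 30 seconds.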