Here is an overview of the 12 languages supported in Voicegain Speech-to-Text.
Language | Offline (batch) transcription | Real-time (streaming) transcription | Punctuation and formatting | Availability | Status |
en (English - US and UK combined) | yes | yes | both | current | production |
es (Spanish - focus on Latin America) | yes | yes | both | current | beta prod |
hi (Hindi) | yes | no | no | current | beta prod |
de (German) | yes | no | no | current | alpha prod |
pt (Portuguese - focus on Brazil) | yes | no | no | deploy to prod upon request | alpha |
pl (Polish) | yes | no | no | deploy to prod upon request | alpha |
nl (Dutch) | yes | no | no | deploy to prod upon request | alpha |
ko (Korean) | yes | no | no | deploy to prod upon request | alpha |
uk (Ukrainian) | yes | no | no | deploy to prod upon request | alpha |
fr (French - both Quebec and Parisian) | yes | no | no | deploy to prod upon request | alpha |
ar (Arabic) | yes | no | no | deploy to prod upon request | alpha |
it (Italian) | yes | no | no | have data ready, can train upon request | ready to train |
What does "deploy to prod upon request" availability mean?
It means that the language model is not yet enabled on production and we have to enable it for you. Email us at support@voicegain.ai and we should have the offline version on prod within a day or two. The real-time version will take 1 to 2 weeks.
What does "alpha" status mean?
Alpha (early access) models differ from full-featured production models in the following ways:
- They are not good at rejecting background noise, music, etc.
- The vocabulary may be limited: they may not be good at recognizing names of products, people, places, etc. Generally the vocabulary covers the core everyday vocabulary of a given language.
- They will not be good at recognizing heavy or unusual accents.
- Punctuation and capitalization are not available.
- Formatting of digits, times, dates, and currencies is not available.
- For languages that do not use the Latin alphabet, there could be occasional glitches in the characters in the transcript.
- Initially most of these models are available in offline/batch mode only. We are working on training the real-time/streaming models.
As alpha models are trained on additional data, their accuracy will improve. We are also working on punctuation, capitalization, and formatting for each of these models.
Upon request we can quickly (2 to 4 weeks) improve the accuracy of the alpha models, as well as add punctuation and formatting.
Choosing a language from API
To use a language other than the default English, the easiest way is to include it in the body of the API request under settings.asr, e.g. to use Spanish:
{
  "sessions" : [...],
  "audio" : {...},
  "settings" : {
    "asr" : {
      "languages" : ["es"],
      ...
    }
    ...
  }
}
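The same request body can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_asr_request is hypothetical, and the empty sessions and audio values are placeholders for the parts elided in the example above. Only the settings.asr.languages field is taken from this article.

```python
import json


def build_asr_request(language, sessions, audio):
    """Return a request body that selects `language` via settings.asr.languages.

    `sessions` and `audio` are passed through unchanged; their contents are
    not covered here and must be filled in for a real request.
    """
    return {
        "sessions": sessions,
        "audio": audio,
        "settings": {
            "asr": {
                # A list containing the language code, as in the Spanish example.
                "languages": [language],
            }
        },
    }


# Placeholder values: an empty sessions list and an empty audio object.
body = build_asr_request("es", sessions=[], audio={})
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with json.dumps and sent as the body of the transcription request you are already using.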
Languages available only in OFFLINE mode (Whisper model)
We also offer a Whisper model through the Voicegain API. It is available only in OFFLINE mode and supports the languages listed below. Note that when deployed on-prem, the Whisper model has higher resource requirements than the native Voicegain model.
"ar" - Arabic
"de" - German
"en" - English
"es" - Spanish
"fr" - French
"hi" - Hindi
"ko" - Korean
"nl" - Dutch
"pl" - Polish
"pt" - Portuguese
"it" - Italian
"uk" - Ukrainian
"af" - Afrikaans
"am" - Amharic
"as" - Assamese
"az" - Azerbaijani
"ba" - Bashkir
"be" - Belarusian
"bg" - Bulgarian
"bn" - Bengali
"bo" - Tibetan
"br" - Breton
"bs" - Bosnian
"ca" - Catalan
"cs" - Czech
"cy" - Welsh
"da" - Danish
"el" - Greek
"et" - Estonian
"eu" - Basque
"fa" - Persian
"fi" - Finnish
"fo" - Faroese
"gl" - Galician
"gu" - Gujarati
"ha" - Hausa
"he" - Hebrew
"hr" - Croatian
"ht" - Haitian
"hu" - Hungarian
"hy" - Armenian
"id" - Indonesian
"is" - Icelandic
"ja" - Japanese
"jw" - Javanese
"ka" - Georgian
"kk" - Kazakh
"km" - Khmer
"kn" - Kannada
"la" - Latin
"lb" - Luxembourgish
"ln" - Lingala
"lo" - Lao
"lt" - Lithuanian
"lv" - Latvian
"mg" - Malagasy
"mi" - Maori
"mk" - Macedonian
"ml" - Malayalam
"mn" - Mongolian
"mr" - Marathi
"ms" - Malay
"mt" - Maltese
"my" - Burmese
"ne" - Nepali
"nn" - Norwegian Nynorsk
"no" - Norwegian
"oc" - Occitan
"pa" - Punjabi
"ps" - Pashto
"ro" - Romanian
"ru" - Russian
"sa" - Sanskrit
"sd" - Sindhi
"si" - Sinhala
"sk" - Slovak
"sl" - Slovenian
"sn" - Shona
"so" - Somali
"sq" - Albanian
"sr" - Serbian
"su" - Sundanese
"sv" - Swedish
"sw" - Swahili
"ta" - Tamil
"te" - Telugu
"tg" - Tajik
"th" - Thai
"tk" - Turkmen
"tl" - Tagalog
"tr" - Turkish
"tt" - Tatar
"ur" - Urdu
"uz" - Uzbek
"vi" - Vietnamese
"yi" - Yiddish
"yo" - Yoruba
"zh" - Chinese
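To decide which option applies to a given language code, the information in the overview table and the Whisper list can be encoded in a small lookup. This is an illustrative Python sketch only: the set names and the function transcription_options are hypothetical, the Whisper-only set is a small sample of the list above rather than the full list, and the availability notes (for example "deploy to prod upon request") are not modeled.

```python
# Codes with real-time (streaming) support in the overview table.
REALTIME_CODES = {"en", "es"}

# Codes with a native offline (batch) model in the overview table.
# "it" is excluded because its native model is listed as "ready to train".
NATIVE_OFFLINE_CODES = {
    "en", "es", "hi", "de", "pt", "pl", "nl", "ko", "uk", "fr", "ar",
}

# A SAMPLE of codes that appear only in the Whisper list (not the full list).
WHISPER_ONLY_SAMPLE = {"ja", "ru", "zh", "tr", "sv"}

# The Whisper list also contains every language from the overview table.
WHISPER_CODES = NATIVE_OFFLINE_CODES | {"it"} | WHISPER_ONLY_SAMPLE


def transcription_options(code):
    """Return the options for `code`, in a fixed order; empty if unknown."""
    options = []
    if code in REALTIME_CODES:
        options.append("native real-time")
    if code in NATIVE_OFFLINE_CODES:
        options.append("native offline")
    if code in WHISPER_CODES:
        options.append("whisper offline")
    return options


for code in ("es", "de", "it", "ja"):
    print(code, transcription_options(code))
```

An empty result means the code is not in the sets defined here; because the Whisper set is only a sample, extend it from the list above before relying on the lookup.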
Do not see a language that you need?
Since our language models are created exclusively with End-to-End Deep Learning, we can perform transfer learning from one language to another and quickly support new languages and dialects to better meet your use case. Don’t see your language listed above? Contact us at support@voicegain.ai, as new languages and dialects are released frequently.