Languages supported in Voicegain Speech-to-Text – Voicegain

Here is an overview of the 12 languages supported in Voicegain Speech-to-Text.

Language	Offline (batch) transcription	Real-time (streaming) transcription	Punctuation and formatting	Availability	Status
en (English - US and UK combined)	yes	yes	both	current	production
es (Spanish - focus on Latin America)	yes	yes	both	current	production
hi (Hindi)	yes	no	no	current	beta prod
de (German)	yes	no	no	current	alpha prod
pt (Portuguese - focus on Brazil)	yes	no	no	deploy to prod upon request	alpha
pl (Polish)	yes	no	no	deploy to prod upon request	alpha
nl (Dutch)	yes	no	no	deploy to prod upon request	alpha
ko (Korean)	yes	no	no	deploy to prod upon request	alpha
uk (Ukrainian)	yes	no	no	deploy to prod upon request	alpha
fr (French - both Quebec and Parisian)	yes	no	no	deploy to prod upon request	alpha
ar (Arabic)	yes	no	no	deploy to prod upon request	alpha
it (Italian)	yes	no	no	have data ready, can train upon request	ready to train

What does "deploy to prod upon request" availability mean?

It means that the language model exists but still has to be enabled on production. Please email us at support@voicegain.ai and we should have the offline version on prod within a day or two. The real-time version will take 1 to 2 weeks.

What does "alpha" status mean?

The Alpha early access models differ from full-featured production models in the following ways:

They are not good at rejecting background noise, music, etc.
Their vocabulary may be limited: they may not be good at recognizing names of products, people, places, etc. Generally the vocabulary is the core everyday vocabulary of a given language.
They will not be good at recognizing heavy or unusual accents.
Punctuation and capitalization are not available.
Formatting of digits, times, dates, and currencies is not available.
For languages that do not use the Latin alphabet, there could be occasional glitches in the characters in the transcript.
Initially, most of these models are available in offline/batch mode only. We are working on training the real-time/streaming models.

As the alpha models are trained on additional data, their accuracy will improve. We are also working on punctuation, capitalization, and formatting for each of those models.

Upon request we can quickly (in 2 to 4 weeks) improve the accuracy of the alpha models, as well as add punctuation and formatting.

Choosing a language from API

To use a language other than the default English, the easiest way is to include it in the body of the API request under settings.asr, e.g. to use Spanish:

{
  "sessions" : [...],
  "audio" : {...},
  "settings" : {
    "asr" : {
      "languages" : ["es"],
      ...
    }
    ...
  }
}
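
As a rough illustration of the request shape above, the following Python sketch sets the language on an existing request body. The helper name `with_language` is hypothetical and not part of the Voicegain API. The contents of `sessions` and `audio` are elided in the example above, so the caller is assumed to supply them.

```python
import copy


def with_language(request_body: dict, language: str) -> dict:
    """Return a copy of request_body with settings.asr.languages set to [language].

    Hypothetical helper: it only edits the dictionary and sends nothing.
    """
    body = copy.deepcopy(request_body)  # leave the caller's dict untouched
    # Create settings and settings.asr if they are missing, keep them otherwise.
    asr = body.setdefault("settings", {}).setdefault("asr", {})
    asr["languages"] = [language]
    return body
```

Calling `with_language(body, "es")` returns a body whose `settings.asr.languages` is `["es"]` and leaves any other entries under `settings.asr` in place.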

Languages available only in OFFLINE mode (Whisper model)

We also offer a Whisper model via the Voicegain API. It is available only in OFFLINE mode and supports the languages listed below. Note that when deployed on-prem, the Whisper model has higher resource requirements than the native Voicegain models.

"ar" - Arabic
"de" - German
"en" - English
"es" - Spanish
"fr" - French
"hi" - Hindi
"ko" - Korean
"nl" - Dutch
"pl" - Polish
"pt" - Portuguese
"it" - Italian
"uk" - Ukrainian
"af" - Afrikaans
"am" - Amharic
"as" - Assamese
"az" - Azerbaijani
"ba" - Bashkir
"be" - Belarusian
"bg" - Bulgarian
"bn" - Bengali
"bo" - Tibetan
"br" - Breton
"bs" - Bosnian
"ca" - Catalan
"cs" - Czech
"cy" - Welsh
"da" - Danish
"el" - Greek
"et" - Estonian
"eu" - Basque
"fa" - Persian
"fi" - Finnish
"fo" - Faroese
"gl" - Galician
"gu" - Gujarati
"ha" - Hausa
"he" - Hebrew
"hr" - Croatian
"ht" - Haitian
"hu" - Hungarian
"hy" - Armenian
"id" - Indonesian
"is" - Icelandic
"ja" - Japanese
"jw" - Javanese
"ka" - Georgian
"kk" - Kazakh
"km" - Khmer
"kn" - Kannada
"la" - Latin
"lb" - Luxembourgish
"ln" - Lingala
"lo" - Lao
"lt" - Lithuanian
"lv" - Latvian
"mg" - Malagasy
"mi" - Maori
"mk" - Macedonian
"ml" - Malayalam
"mn" - Mongolian
"mr" - Marathi
"ms" - Malay
"mt" - Maltese
"my" - Burmese
"ne" - Nepali
"nn" - Norwegian Nynorsk
"no" - Norwegian
"oc" - Occitan
"pa" - Punjabi
"ps" - Pashto
"ro" - Romanian
"ru" - Russian
"sa" - Sanskrit
"sd" - Sindhi
"si" - Sinhala
"sk" - Slovak
"sl" - Slovenian
"sn" - Shona
"so" - Somali
"sq" - Albanian
"sr" - Serbian
"su" - Sundanese
"sv" - Swedish
"sw" - Swahili
"ta" - Tamil
"te" - Telugu
"tg" - Tajik
"th" - Thai
"tk" - Turkmen
"tl" - Tagalog
"tr" - Turkish
"tt" - Tatar
"ur" - Urdu
"uz" - Uzbek
"vi" - Vietnamese
"yi" - Yiddish
"yo" - Yoruba
"zh" - Chinese
"yue" - Yue (Cantonese)

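A client may want to check a language code before submitting a request. The sketch below is one possible way to do that in Python, using only the codes listed above and the real-time column of the overview table. The names `WHISPER_LANGUAGES`, `REALTIME_LANGUAGES`, `available_via_whisper`, and `supports_realtime` are hypothetical and not part of the Voicegain API. The sets are a snapshot of this page and may go out of date.

```python
# Codes copied from the Whisper list above (offline mode only).
WHISPER_LANGUAGES = set("""
ar de en es fr hi ko nl pl pt it uk
af am as az ba be bg bn bo br bs ca cs cy da el et eu
fa fi fo gl gu ha he hr ht hu hy id is ja jw ka kk km kn
la lb ln lo lt lv mg mi mk ml mn mr ms mt my ne nn no oc
pa ps ro ru sa sd si sk sl sn so sq sr su sv sw
ta te tg th tk tl tr tt ur uz vi yi yo zh yue
""".split())

# Languages marked "yes" for real-time (streaming) in the overview table.
REALTIME_LANGUAGES = {"en", "es"}


def available_via_whisper(code: str) -> bool:
    """True if the code appears in the Whisper (offline-only) list."""
    return code in WHISPER_LANGUAGES


def supports_realtime(code: str) -> bool:
    """True if the overview table lists real-time transcription for the code."""
    return code in REALTIME_LANGUAGES
```

For instance, `available_via_whisper("ja")` is true while `supports_realtime("ja")` is false, so Japanese audio would have to be submitted in offline mode.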
Do not see a language that you need?

Since our language models are created exclusively with End-to-End Deep Learning, we can perform transfer learning from one language to another and quickly support new languages and dialects to better meet your use case. Don’t see your language listed above? Contact us at support@voicegain.ai, as new languages and dialects are released frequently.
