Spoken Language Support

This section documents Dragonfly’s support for spoken languages.

Languages with speech recognition engine support

Speech recognition engines supported by Dragonfly have a set spoken language. This language can be checked via the engine.language property, which returns an ISO 639-1 code (e.g. “en”):

from dragonfly import get_engine
engine = get_engine()

# Print the engine language.
print("Engine language: {}".format(engine.language))

Each speech recognition engine supported by Dragonfly supports one or more spoken languages. These are listed below, with citations where available.

It is worth noting that Dragonfly’s use of ISO 639-1 language codes means that no distinction is made between variants of languages. For example, U.S. English and U.K. English will both yield "en" and be treated as the same language, even though there are some differences.
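To illustrate why language variants collapse into one code, here is a minimal sketch that reduces a locale-style tag to its two-letter primary subtag. The helper name primary_language_code is hypothetical and is not part of Dragonfly; it only demonstrates the effect described above, not how any engine back-end actually derives its language.

```python
def primary_language_code(tag):
    """Return the lowercase primary subtag of a locale-style tag.

    Hypothetical helper for illustration only; not part of Dragonfly.
    """
    # Accept both "en-US" and "en_US" style separators.
    normalized = tag.replace("_", "-")
    # The primary subtag is everything before the first separator.
    return normalized.split("-")[0].lower()


# U.S. and U.K. English reduce to the same primary subtag.
print(primary_language_code("en-US"))
print(primary_language_code("en_GB"))
```

Grammars that need to tell variants apart would therefore have to rely on something other than the two-letter code.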

Languages supported by Dragon

The following languages are supported by Dragon Professional Individual version 15 [1]:

  • English (multiple variants)
  • Dutch
  • French
  • German
  • Italian
  • Spanish

Please check the linked Nuance knowledgebase page for the languages supported by other versions and editions of Dragon.

Languages supported by Windows Speech Recognition

The following languages are supported by Windows Speech Recognition (WSR) as of 2016 [2]:

  • English (U.S.) (*)
  • English (U.K.)
  • Chinese (Simplified) (*)
  • Chinese (Traditional)
  • French (France)
  • German (Germany)
  • Japanese
  • Spanish (Spain)

(*) Successfully tested.

Microsoft does not appear to be documenting the languages available for WSR any more, which is why the provided citation for this section is an archive.org link. Currently, the best way to find out if your language is supported is to look for available speech models in the Windows language settings: Settings > Time & Language > Language.

Languages supported by CMU Pocket Sphinx

The CMU Pocket Sphinx engine documentation page has a section on spoken language support. CMU Pocket Sphinx models and dictionaries are available from SourceForge for the following languages [3]:

  • English (U.S.) (*)
  • English (Indian)
  • Catalan
  • Chinese (Mandarin) (*)
  • Dutch
  • French
  • German
  • Greek
  • Hindi
  • Italian
  • Kazakh
  • Portuguese
  • Russian (*)
  • Spanish

(*) Successfully tested.

English (U.S.) is the default language used by the CMU Pocket Sphinx engine.

Languages supported by Kaldi

The following languages are supported by the Kaldi engine back-end:

  • English (U.S.)

It is possible for Kaldi to support other languages in the future. This requires finding decent models for other languages and making minor modifications to enable their use by the Kaldi Active Grammar library.

You can request to have your language supported by opening a new issue or by contacting David Zurow (@daanzu) directly.

Languages with built-in grammar support

Dragonfly’s Integer, IntegerRef and Digits classes have support for multiple spoken languages. Each supported language has a sub-package under dragonfly.language. The current engine language will be used to load the language-specific content classes in these sub-packages.

This functionality is optional. Languages other than those listed below can still be used if the speech recognition engine supports them.

The following languages are supported:

  • Arabic - “ar”
  • Dutch - “nl”
  • English - “en”
  • German - “de”
  • Indonesian - “id”
  • Malaysian - “ms”

English has additional time, date and character related classes.
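The lookup from engine language to language sub-package can be pictured with the following minimal sketch. It assumes that each sub-package under dragonfly.language is named after its ISO 639-1 code; that naming, the helper name language_subpackage and the constant BUILTIN_LANGUAGE_CODES are illustrative assumptions, and Dragonfly's actual loader may work differently.

```python
# Codes taken from the list of languages with built-in grammar support.
BUILTIN_LANGUAGE_CODES = {"ar", "nl", "en", "de", "id", "ms"}


def language_subpackage(code):
    """Return the assumed sub-package path for a language code, or None.

    Hypothetical helper: assumes each sub-package under
    dragonfly.language is named after its ISO 639-1 code.
    """
    code = code.lower()
    if code in BUILTIN_LANGUAGE_CODES:
        return "dragonfly.language." + code
    # No built-in Integer/Digits content is listed for this language.
    return None


print(language_subpackage("de"))
print(language_subpackage("fr"))
```

A None result here corresponds to a language that an engine may still recognise, but for which the built-in number classes have no language-specific content.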

Language classes reference

ShortIntegerRef

ShortIntegerRef is a modified version of IntegerRef which allows for greater flexibility in the way that numbers may be pronounced, allowing for words like “hundred” to be dropped. This may be particularly useful when navigating files by line or page number.

Some examples of allowed pronunciations:

Pronunciation                  Result
one                            1
ten                            10
twenty three                   23
two three                      23
seventy                        70
seven zero                     70
hundred                        100
one oh three                   103
hundred three                  103
one twenty seven               127
one two seven                  127
one hundred twenty seven       127
seven hundred                  700
thousand                       1000
seventeen hundred              1700
seventeen hundred fifty three  1753
seventeen fifty three          1753
one seven five three           1753
seventeen five three           1753
four thousand                  4000

The class is used in the same way as IntegerRef: add it to a rule as an extra, for example:

ShortIntegerRef("name", 0, 1000),

References

[1] https://nuance.custhelp.com/app/answers/detail/a_id/6280/kw/Dragon%20NaturallySpeaking%20languages%20supported/related/1
[2] https://web.archive.org/web/20160501101405/http://www.microsoft.com:80/enable/products/windowsvista/speech.aspx
[3] https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/