Preparing your dictionary and language model for Shout
For decoding, three binary files are needed: a lexical tree file, a language model file and an acoustic model file. The acoustic models need to be trained by Shout. The language model and lexical tree are created using the applications shout_dct2lextree and shout_lm2bin.
Preparing a dictionary
The application shout_dct2lextree needs two input files, a phone list and a pronunciation dictionary, and outputs a binary 'lexical tree' file. The phone list file contains the complete list of phones. Its first two lines give the number of phone models and the number of non-speech models, and the model names follow. It uses the following syntax:
- "Number of phones:" [number of phone models]
- "Number of SIL's:" [number of non-speech models]
- One non-speech model name per line (times the specified number of non-speech models)
- One phone model name per line (times the specified number of phone models)
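As an illustration of the syntax above, here is a minimal Python sketch that builds the contents of a phone list file. The function name format_phone_list and the example model names are hypothetical, and the single space after each colon is an assumption; check the result against a phone list that Shout is known to accept.

```python
def format_phone_list(phones, non_speech):
    """Return phone list file contents in the layout described above.

    Assumption: one space separates each header label from its number.
    """
    lines = [
        f"Number of phones: {len(phones)}",
        f"Number of SIL's: {len(non_speech)}",
    ]
    lines.extend(non_speech)  # non-speech model names come first
    lines.extend(phones)      # phone model names follow
    return "\n".join(lines) + "\n"


# Hypothetical example: three phone models and one non-speech model.
example_phone_list = format_phone_list(["AA", "B", "K"], ["SIL"])
```

The returned string can be written to a file and used as the phone list input for shout_dct2lextree.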
The pronunciation dictionary (DCT) contains one word per line, followed by its pronunciation as a string of phone or non-speech model names separated by one or more spaces. Make sure that the first two lines in your dictionary are as follows:
- <s> SIL
- </s> SIL
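The dictionary rules above can be checked before running shout_dct2lextree. The following Python sketch is not part of Shout; the names parse_dictionary and check_dictionary are hypothetical, and skipping blank lines is an assumption made for convenience.

```python
# The two entries that must open the dictionary, as (word, models) pairs.
REQUIRED_FIRST_LINES = [("<s>", ["SIL"]), ("</s>", ["SIL"])]


def parse_dictionary(text):
    """Split each line into a word and its list of model names."""
    entries = []
    for line in text.splitlines():
        fields = line.split()  # splits on one or more whitespace characters
        if not fields:
            continue  # assumption: blank lines are ignored
        entries.append((fields[0], fields[1:]))
    return entries


def check_dictionary(text, known_models):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    entries = parse_dictionary(text)
    if entries[:2] != REQUIRED_FIRST_LINES:
        problems.append("first two lines must be '<s> SIL' and '</s> SIL'")
    for word, models in entries:
        if not models:
            problems.append(f"{word}: no pronunciation given")
        for model in models:
            if model not in known_models:
                problems.append(f"{word}: unknown model {model!r}")
    return problems
```

Here known_models would be the set of phone and non-speech model names from the phone list file, so that every name used in a pronunciation is also defined there.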