Shout_lm2bin Class Reference
The main class of the shout_lm2bin application. It stores an ARPA language model in a LanguageModel object and writes it to disk.

Public Member Functions
Shout_lm2bin (char *lexName, char *lmName, char *binLmName)
~Shout_lm2bin ()
Protected Member Functions
int getNumberOfValidEntries (char *word1, char *word2, FILE *file)
void sortWords (int numberOfItems, int *wordList, int *sortList)
Protected Attributes
MemMappedFile * memMap
int numberOfWords
char ** vocabulary
StringLookup * wordTree
Detailed Description
The main class of the shout_lm2bin application. It stores an ARPA language model in a LanguageModel object and writes it to disk.
Constructor & Destructor Documentation
Shout_lm2bin::Shout_lm2bin (char *lexName, char *lmName, char *binLmName)
The main function of the shout_lm2bin application calls this constructor, which does all of the work.
The lexical tree is needed to obtain the wordID for each word in the ARPA model. Each ID is looked up with a single getWordID() call (the References list names this method as StringLookup::getWordID(), the type of the wordTree attribute).
After the lexical tree is loaded, the ARPA model is read and its uni-, bi- and trigram probabilities and backoff values are translated into LanguageModel format.
Finally, the binary language model is written to file in a format that LanguageModel can load.
References LMEntryType_1::backoff, LanguageModel::bi_Hash, LanguageModel::bi_lmData, LanguageModel::bi_tableLength, TrainHash::fillMapping(), TrainHash::finalizeHash(), LanguageModel::four_Hash, LanguageModel::four_lmData, LanguageModel::four_tableLength, WriteFileLittleBigEndian::freadEndianSafe(), WriteFileLittleBigEndian::fwriteEndianSafe(), TrainHash::getIndex(), StringLookup::getWordID(), TrainHash::initialiseMapping(), memMap, numberOfWords, LMEntryType_1::p, StringFunctions::RightTrim(), MemMappedFile::setWritePermission(), StringFunctions::splitList(), TrainHash::storeHash(), LanguageModel::tri_Hash, LanguageModel::tri_lmData, LanguageModel::tri_tableLength, LanguageModel::uni_lmData, LanguageModel::uni_tableLength, vocabulary, and wordTree.
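The translation step described above can be illustrated with a minimal sketch. It relies only on the general ARPA convention that each n-gram line holds a log10 probability, the n words, and an optional backoff weight. Vocabulary, NGramEntry and parseArpaLine are hypothetical names introduced for this illustration and are not part of the Shout API; the map stands in for the word-ID lookup, and rejecting lines with unknown words is an assumption, since this page does not say how the constructor treats them.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the word-ID lookup; NOT the Shout StringLookup API.
using Vocabulary = std::unordered_map<std::string, int>;

// One parsed n-gram entry (hypothetical layout, not LMEntryType_1).
struct NGramEntry {
    float logProb = 0.0f;      // log10 probability from the ARPA line
    std::vector<int> wordIds;  // one ID per word of the n-gram
    float backoff = 0.0f;      // stays 0 when the line has no backoff field
};

// Parses "logprob w1 ... wN [backoff]" for an n-gram of the given order.
// Returns false (and leaves 'out' untouched) if the line is malformed or
// contains a word that is missing from the vocabulary (an assumption).
bool parseArpaLine(const std::string &line, int order,
                   const Vocabulary &vocab, NGramEntry &out) {
    assert(order >= 1);
    std::istringstream in(line);
    NGramEntry entry;
    if (!(in >> entry.logProb)) return false;

    for (int i = 0; i < order; ++i) {
        std::string word;
        if (!(in >> word)) return false;       // too few words on the line
        Vocabulary::const_iterator it = vocab.find(word);
        if (it == vocab.end()) return false;   // word has no ID
        entry.wordIds.push_back(it->second);
    }

    float backoff = 0.0f;
    if (in >> backoff) entry.backoff = backoff;  // backoff field is optional
    out = entry;
    return true;
}
```

A caller would read the model file line by line, pass the order of the current n-gram section, and store each accepted entry in the table for that order.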

Shout_lm2bin::~Shout_lm2bin ()
The destructor is empty.
Member Function Documentation
int Shout_lm2bin::getNumberOfValidEntries (char *word1, char *word2, FILE *file) [protected]
void Shout_lm2bin::sortWords (int numberOfItems, int *wordList, int *sortList) [protected]
References numberOfWords.
Member Data Documentation
MemMappedFile* Shout_lm2bin::memMap [protected]
Referenced by Shout_lm2bin().
int Shout_lm2bin::numberOfWords [protected]
Referenced by Shout_lm2bin(), and sortWords().
char** Shout_lm2bin::vocabulary [protected]
Referenced by Shout_lm2bin().
StringLookup* Shout_lm2bin::wordTree [protected]
Referenced by Shout_lm2bin().