_control_block | |
_control_block_lmla | |
Adapt_AM | Main class of the shout_adapt_am application |
Adapt_AM_TreeNode | This class represents a single cluster in the SMAPLR adaptation method. Each cluster is stored as a node of a tree, hence the name Adapt_AM_TreeNode. Each node has access to a number of Gaussians of the acoustic models. The class creates an adaptation matrix using MAPLR adaptation and passes its matrix as M-matrix to its children |
Adapt_Segmenter | |
AnalysisSettings | This type is used to keep track of analysis administration |
ArticulatoryStream | |
AvgEnergy | |
CommandParameterType | |
DataStats | This structure contains the name of an acoustic model and the number of samples that are used during training (per cluster) |
DecoderSettings | |
Edge | This structure stores the edges (connections) of the graph |
FastCompressedTree | |
FeatureEnergy | |
FeatureExtraction | This class handles all feature extraction issues |
FeaturePool | FeaturePool objects contain a feature vector file in memory |
FeaturePool_Multistream_file | |
FeaturePoolInfo | |
FFTReal | The FFT procedures used by shout are written by Laurent de Soras |
FFTReal::FFTReal::BitReversedLUT | The FFT procedures used by shout are written by Laurent de Soras |
FFTReal::FFTReal::TrigoLUT | The FFT procedures used by shout are written by Laurent de Soras |
Gaussian | Using Gaussian objects, single Gaussians can be trained and used |
Hash | Handles minimal perfect hashing |
IO_Speaker_Segmenter | The AM adaptation application does not need any speaker clustering information itself; the prepare adaptation application handles clustering. However, the adaptation application still needs to read and write the speaker clustering information from and to disk. For this purpose, this small class provides only read and write methods |
LanguageModel | This class contains all language model functionality |
LanguageModel_Segmenter | |
LatticeNode | |
LexicalNode | A lexical node is a single node of the lexical tree or Pronunciation Prefix Tree (PPT). It contains token, AM and word information |
LexicalNodeList | The LexicalNodeList is needed for administration of all currently active nodes |
LexicalTree | This class contains all lexical tree functionality and also almost all decoder steps |
LMEntryType_1 | This structure-type is part of the LanguageModel data structure |
LMEntryType_2 | This structure-type is part of the LanguageModel data structure |
LMEntryType_3 | This structure-type is part of the LanguageModel data structure |
LMEntryType_4 | This structure-type is part of the LanguageModel data structure |
LMLAGlobalListType | This type is used to make a global cache table for Language Model Look-Ahead (LMLA) |
LMListSearch_2 | |
LMListSearch_3 | |
MemMappedFile | |
MergeTrainSet | |
MixGaussian | Handles mixture sets of Gaussians |
MixtureSet | This structure contains all data for one single Gaussian mixture state set. This includes the PDF and state transition probabilities |
ModelStats | This structure contains a summary of the data in an acoustic model stored as a PhoneModel object |
MultiMixGaussian | |
NBest | This class will translate recognition administration (a bunch of WLRType objects) into an N-Best list |
NBestList | Used for N-Best/lattice calculations |
NBestType | Used for N-Best calculations |
NormalizeAM | |
NormData | |
PhoneFileReader | |
PhoneModel | Handles likelihood calculation of phones given an observation sequence |
PLRType | The PLRType (Phone Link Record type) is the structure that contains the phone history information for a single word stored in a WLR (of struct WLRType) |
SearchStatistics | This structure keeps a record of different kinds of statistics during decoding. It may be switched on and off by the compiler switch SEARCH_STATISTICS_ON |
SegmentationAdmin | |
SegmentationList | Linked list that defines a segmentation of the feature vector pool |
Segmenter | This class can determine which cluster a certain stream of feature vectors is closest to. The class Train_Segmenter can train these clusters |
Shout_Cluster | This class is the base class for the shout_cluster application. It segments an audio file based on the speaker clustering procedure from the Train_Speaker_Segmenter class |
Shout_dct2lextree | This class will translate a plain text pronunciation dictionary into a PPT stored in LexicalTree format. This class is used by the application shout_dct2lextree |
Shout_lm2bin | With this class (main class of the application shout_lm2bin) an ARPA language model can be stored in a LanguageModel object and written to disk |
Shout_MakeTrainSet | This class is the base class for the shout_maketrainset application. It creates a new AM trainset given a Master Label File and an output directory |
Shout_Preprocess | |
Shout_UpdateVersion | |
Shout_VTLN | |
ShoutConfig | |
ShoutMergeAm | |
ShoutOnline | |
ShoutPrepareAdapt | This class is the main class of shout_prepare_adapt |
ShoutSegment | |
ShoutTrainFinish | |
ShoutTrainFinishSAT | |
ShoutTrainMMI | |
ShoutTrainModel | |
SpeakerRecognition | |
SpkrecStats | |
StringLookup | |
Thread_LMLACalculation_Data | |
Thread_online_Data | |
Thread_pdfCalculation_Data | |
Thread_Train_Data | |
Thread_Train_Data_Cluster | |
TokenType | The PhoneModel class defines the phone models, the LexicalTree class defines the tree structure, and the TokenType struct is the glue between them. LexicalTree is responsible for the token flow between phones and for language model look-ahead, while PhoneModel objects fill the likelihood variables and decide whether a token may be passed within the phone model |
Train_Segmenter | This class can 'train' new speaker clusters. Determining the clusters can be called training because the phone (SIL) training procedures are used. In fact, the procedure runs on the target data (not on a training set), so it is processing rather than training in the strict sense. See our Spring 2006 NIST Rich Transcription Speaker Diarization paper for more information. This is meant to be the simplest form of the HMM-based merging diarization method. Overloaded classes may be created for testing new algorithms |
TrainGaussian | |
TrainHash | This class handles creation of new Hash functions |
TrainPhoneModel | This class is used to train the acoustic models |
UniqueIDList | Linked list that stores a list of unique IDs |
Vector | This class handles basic vector calculations that are needed for the system |
VertexSet | This structure can store a set of vertices |
WaveHeaderType | |
Whisper | Whisper is the top-level class of the decoder |
WLRList | |
WLRTracker | The WLRTracker type is created to track the single correct recognition obtained during forced alignment. It can be used to check when the correct path was left because of (for example) pruning |
WLRType | The WLRType (Word Link Record) is the structure that contains word history information for tokens (of struct TokenType) |
WLRTypeList | |
WordStringNode | A WordStringNode is a node from the tree-structure that contains the text versions of all words in a vocabulary. Organising the words in a tree makes look-up faster. This is mainly important when converting ARPA language models to Shout language models, but also when parsing application arguments |
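
The WordStringNode entry above describes storing vocabulary words in a tree to speed up look-up. As a rough illustration of that idea only, the sketch below implements a character-level prefix tree that maps words to integer IDs. The names `WordTrieNode`, `insertWord` and `lookupWord` are hypothetical and are not part of the Shout API; the actual WordStringNode layout may differ.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch of a word look-up tree; NOT the Shout WordStringNode class.
struct WordTrieNode {
    int wordId = -1;  // -1 means no vocabulary word ends at this node
    std::map<char, std::unique_ptr<WordTrieNode>> children;
};

// Store `word` with the given ID, creating nodes along the path as needed.
void insertWord(WordTrieNode &root, const std::string &word, int id) {
    WordTrieNode *node = &root;
    for (char c : word) {
        std::unique_ptr<WordTrieNode> &child = node->children[c];
        if (!child) {
            child = std::make_unique<WordTrieNode>();
        }
        node = child.get();
    }
    node->wordId = id;
}

// Return the ID stored for `word`, or -1 if the word is not in the tree.
int lookupWord(const WordTrieNode &root, const std::string &word) {
    const WordTrieNode *node = &root;
    for (char c : word) {
        auto it = node->children.find(c);
        if (it == node->children.end()) {
            return -1;
        }
        node = it->second.get();
    }
    return node->wordId;
}
```

A look-up in such a tree takes a number of steps proportional to the length of the word rather than to the size of the vocabulary, which is the kind of saving that matters when every word of an ARPA language model has to be resolved against the vocabulary.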