Class List

Here are the classes, structs, unions and interfaces with brief descriptions:
_control_block
_control_block_lmla
Adapt_AM: Main class of the shout_adapt_am application
Adapt_AM_TreeNode: This class represents a single cluster in the SMAPLR adaptation method. Each cluster is structured as a node of a tree, hence the name Adapt_AM_TreeNode. Each node has access to a number of Gaussians of the acoustic models. The class creates an adaptation matrix using MAPLR adaptation and passes this matrix as the M-matrix to its children
Adapt_Segmenter
AnalysisSettings: This type is used to keep track of analysis administration
ArticulatoryStream
AvgEnergy
CommandParameterType
DataStats: This structure contains the name of an acoustic model and the number of samples that are used during training (per cluster)
DecoderSettings
Edge: This structure stores the edges (connections) of the graph
FastCompressedTree
FeatureEnergy
FeatureExtraction: This class handles all feature extraction issues
FeaturePool: FeaturePool objects contain a feature vector file in memory
FeaturePool_Multistream_file
FeaturePoolInfo
FFTReal: The FFT procedures used by shout are written by Laurent de Soras
FFTReal::FFTReal::BitReversedLUT: The FFT procedures used by shout are written by Laurent de Soras
FFTReal::FFTReal::TrigoLUT: The FFT procedures used by shout are written by Laurent de Soras
Gaussian: Using Gaussian objects, single Gaussians can be trained and used
Hash: Handles minimal perfect hashing
IO_Speaker_Segmenter: The AM adaptation application does not need any speaker clustering information; the prepare adaptation application handles clustering. However, the adaptation application still needs to read and write the speaker clustering information from and to disk. This small class was created for that purpose and provides only read and write methods
LanguageModel: This class contains all language model functionality
LanguageModel_Segmenter
LatticeNode
LexicalNode: A lexical node is a single node of the lexical tree or Pronunciation Prefix Tree (PPT). It contains token, AM and word information
LexicalNodeList: The LexicalNodeList is needed for administration of all currently active nodes
LexicalTree: This class contains all lexical tree functionality and also almost all decoder steps
LMEntryType_1: This structure-type is part of the LanguageModel data structure
LMEntryType_2: This structure-type is part of the LanguageModel data structure
LMEntryType_3: This structure-type is part of the LanguageModel data structure
LMEntryType_4: This structure-type is part of the LanguageModel data structure
LMLAGlobalListType: This type is used to make a global cache table for Language Model Look-Ahead (LMLA)
LMListSearch_2
LMListSearch_3
MemMappedFile
MergeTrainSet
MixGaussian: Handles mixture sets of Gaussians
MixtureSet: This structure contains all data for one single Gaussian mixture state set. This includes the PDF and state transition probabilities
ModelStats: This structure contains a summary of the data in an acoustic model stored as a PhoneModel object
MultiMixGaussian
NBest: This class will translate recognition administration (a bunch of WLRType objects) into an N-Best list
NBestList: Used for N-Best/lattice calculations
NBestType: Used for N-Best calculations
NormalizeAM
NormData
PhoneFileReader
PhoneModel: Handles likelihood calculation of phones given an observation sequence
PLRType: The PLRType (Phone Link Record Type) is the structure that contains the phone history information for a single word stored in a WLR (of struct WLRType)
SearchStatistics: This structure keeps a record of different kinds of statistics during decoding. It can be switched on and off with the compiler switch SEARCH_STATISTICS_ON
SegmentationAdmin
SegmentationList: Linked list that defines a segmentation of the feature vector pool
Segmenter: This class can determine which cluster a certain stream of feature vectors is closest to. The class Train_Segmenter can train these clusters
Shout_Cluster: This class is the base class for the shout_cluster application. It segments an audio file based on the speaker clustering procedure from the Train_Speaker_Segmenter class
Shout_dct2lextree: This class will translate a plain text pronunciation dictionary into a PPT stored in LexicalTree format. This class is used by the application shout_dct2lextree
Shout_lm2bin: With this class (main class of the application shout_lm2bin) an ARPA language model can be stored in a LanguageModel object and written to disk
Shout_MakeTrainSet: This class is the base class for the shout_maketrainset application. It creates a new AM trainset given a Master Label File and an output directory
Shout_Preprocess
Shout_UpdateVersion
Shout_VTLN
ShoutConfig
ShoutMergeAm
ShoutOnline
ShoutPrepareAdapt: This class is the main class of shout_prepare_adapt
ShoutSegment
ShoutTrainFinish
ShoutTrainFinishSAT
ShoutTrainMMI
ShoutTrainModel
SpeakerRecognition
SpkrecStats
StringLookup
Thread_LMLACalculation_Data
Thread_online_Data
Thread_pdfCalculation_Data
Thread_Train_Data
Thread_Train_Data_Cluster
TokenType: The PhoneModel class defines the phone models, the LexicalTree class defines the tree structure, and the TokenType struct is the glue between them. LexicalTree is responsible for the token flow between phones and for language model look-ahead, while PhoneModel objects fill the likelihood variables and decide whether a token may be passed within the phone model
Train_Segmenter: This class can 'train' new speaker clusters. Determining the clusters could be called training because the phone (SIL) training procedures are used. In fact, training is done on the target data (not on a training set), so it is processing rather than training. See our Spring 2006 NIST Rich Transcription Speaker Diarization paper for more information. This is meant to be the simplest form of the HMM-based merging diarization method. Overloaded classes may be created for testing new algorithms
TrainGaussian
TrainHash: This class handles creation of new Hash functions
TrainPhoneModel: With help of this class, the acoustic models are trained
UniqueIDList: Linked list that stores a list of unique IDs
Vector: This class handles basic vector calculations that are needed for the system
VertexSet: This structure can store a set of vertices
WaveHeaderType
Whisper: Whisper is the top-level class of the decoder
WLRList
WLRTracker: The WLRTracker type is created to track the single correct recognition obtained during forced alignment. It can be used to check when the correct path was left because of (for example) pruning
WLRType: The WLRType (Word Link Record) is the structure that contains word history information for tokens (of struct TokenType)
WLRTypeList
WordStringNode: A WordStringNode is a node from the tree structure that contains the text versions of all words in a vocabulary. Organising the words in a tree makes look-up faster. This is mainly important when converting ARPA language models to Shout language models, but also when parsing application arguments
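
The idea behind a word tree such as the one WordStringNode belongs to can be sketched with a simple character trie. The sketch below is an illustration only, not the Shout implementation: the names WordTreeNode, insertWord and lookupWord are hypothetical, and the use of std::map for the children is an assumption made for brevity. It shows why look-up cost depends on the length of the word rather than on the size of the vocabulary.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical sketch: these names are illustrative and are not taken
// from the Shout sources.
struct WordTreeNode {
    int wordId = -1;  // -1 means that no word ends at this node
    std::map<char, std::unique_ptr<WordTreeNode>> children;
};

// Store `word` under `id`, creating nodes along the path as needed.
void insertWord(WordTreeNode &root, const std::string &word, int id) {
    assert(id >= 0);  // negative IDs are reserved for "not found"
    WordTreeNode *node = &root;
    for (char c : word) {
        std::unique_ptr<WordTreeNode> &child = node->children[c];
        if (!child) {
            child = std::make_unique<WordTreeNode>();
        }
        node = child.get();
    }
    node->wordId = id;
}

// Return the ID stored for `word`, or -1 if the word is not in the tree.
int lookupWord(const WordTreeNode &root, const std::string &word) {
    const WordTreeNode *node = &root;
    for (char c : word) {
        auto it = node->children.find(c);
        if (it == node->children.end()) {
            return -1;  // path breaks off: word is unknown
        }
        node = it->second.get();
    }
    return node->wordId;
}
```

Each look-up follows one child link per character, so words that share a prefix (for example "shout" and "show") share the nodes of that prefix, and an unknown word is rejected as soon as its path leaves the tree.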