TrainPhoneModel Class Reference

With the help of this class, the acoustic models are trained.

Public Member Functions
	TrainPhoneModel (const char *n, int contextLeft, int contextRight, bool isSil, int dim, FeaturePoolInfo *infoBlock=NULL)
	TrainPhoneModel (MixGaussian *gmm, double trans, const char *name)
	TrainPhoneModel (TrainPhoneModel *model1, TrainPhoneModel *model2, int maxGaussians=-1)
	TrainPhoneModel (TrainPhoneModel *model1, TrainPhoneModel *model2, double rate)
	TrainPhoneModel (TrainPhoneModel *orgModel, int shiftLeftRight=0)
	TrainPhoneModel (FILE *inFile, int dim, FeaturePoolInfo *infoBlock=NULL)
	~TrainPhoneModel ()
void	trainMMI (FILE *fileEnum, FILE *fileDenom)
void	doNotuseBordersForTraining (bool useBordersNot)
int	maxNrOfGaussians ()
double	getTrainSilP (int useLabel=-1, int useSegmentation=-1, FeaturePool *usePool=NULL)
void	adapt_setAcTrain (int useLabel=-1, int useSegmentation=-1, FeaturePool *usePool=NULL)
bool	readModel (FILE *inFile)
void	writeSAT (FILE *outFile)
void	appendSAT (FILE *outFile)
double	finishSAT ()
void	setDecisionMatrix (int numberOfModels, int numberOfRules, int *dMatrix)
void	setTrainingData (FeaturePool *fp, int segmentationID, int labelID, int guestID=-1, int tSilP=100, int tSilMax=-1)
double	train (int maxGaussians, bool isSil, bool neverPrune=false, Vector **trainDiscr=NULL, Vector *trainDiscrMask=NULL, PhoneModel *doSAT=NULL, bool doFastTraining=false)
double	getCoSim (TrainPhoneModel *t1, TrainPhoneModel *t2)
double	getKLDistance (TrainPhoneModel *t2)
double	getNormDistance ()
void	startCount ()
void	stopCount ()
void	count (Vector *observation)
int	getDominantGaussian ()
void	addCountedGaussians (TrainPhoneModel *source, int nmbr)
void	moveModelGaussians (TrainPhoneModel *model, double factor)
void	addGaussian (Vector *v)
void	normalize ()
void	setMaxGaussians (int maxGaussians)
double	getClusterP (Vector *observation)
void	fillDistanceArray (int *distA)
Protected Member Functions
double	baumWelch (int trainWhat, PhoneModel *doSat=NULL)
double	viterbi (int trainWhat)
double	getSilP (int useLabel, int useSegmentation, FeaturePool *usePool)
Protected Attributes
int	trainSilP
int	trainSilMax
int *	decision_Matrix
int	decision_numberOfModels
int	decision_numberOfRules
int	totalLength
FeaturePool *	trainingPool
int	trainingSegment
int	trainingLabel
int	guestTrainingLabel
FeaturePoolInfo *	channelInfoBlock
bool	trainWithoutBorders

Detailed Description

With the help of this class, the acoustic models are trained.

Before training the HMM, training samples need to be added to the system, using the addTrainingSample() method. Once all samples are added the HMM may be trained with the train() method. It is possible to either receive and send the model over a socket connection (also see the Socket_Server and Socket_Client classes) with receiveModel() and sendModel() or write the model to disk with writeModel(). The viterbi() method is used to determine the best path for one training sample. During use of the models (see the PhoneModel class), the viterbi token passing, a special form of viterbi, is used.

Constructor & Destructor Documentation

TrainPhoneModel::TrainPhoneModel	(	const char *	n,
		int	contextLeft,
		int	contextRight,
		bool	isSil,
		int	dim,
		FeaturePoolInfo *	infoBlock = `NULL`
	)

This constructor is used to create an empty training model. The name (input parameter) of the phone is stored in the statistics structure. The number of gaussians per state is set to one.

References channelInfoBlock, decision_Matrix, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, guestTrainingLabel, ModelStats::isSil, ModelStats::likelihood, ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

TrainPhoneModel::TrainPhoneModel	(	MixGaussian *	gmm,
		double	trans,
		const char *	name
	)

This constructor is used to create a new PhoneModel object with only one state (SIL), outgoing transition probability trans and GMM gmm. (used by Train_Speaker_Segmenter)

References channelInfoBlock, decision_Matrix, MixGaussian::dim(), PhoneModel::dimensions, ModelStats::frameMeanLikelihood, MixGaussian::getNumberOfGaussians(), guestTrainingLabel, PhoneModel::isSil, ModelStats::isSil, ModelStats::likelihood, FastMath::log(), ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

TrainPhoneModel::TrainPhoneModel	(	TrainPhoneModel *	model1,
		TrainPhoneModel *	model2,
		int	maxGaussians = `-1`
	)

This constructor merges two models. It does not train them: to train the merged model, first call setTrainingData() with a guestID and then call train(). Note that this implementation merges only the first context of SIL phones.

References channelInfoBlock, decision_Matrix, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, guestTrainingLabel, PhoneModel::isSil, ModelStats::isSil, ModelStats::likelihood, FastMath::log(), ModelStats::maxNrOfContexts, maxNrOfGaussians(), PhoneModel::mixtureSetData, ModelStats::name, MixGaussian::normalizeWeights(), ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

TrainPhoneModel::TrainPhoneModel	(	TrainPhoneModel *	model1,
		TrainPhoneModel *	model2,
		double	rate
	)

This constructor merges two models. It does not train them: to train the merged model, first call setTrainingData() with a guestID and then call train(). Note that this implementation merges only the first context of SIL phones. The number of Gaussians in the two models must be the same. In fact, it only works for models that are adapted from the same UBM.

References channelInfoBlock, decision_Matrix, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, guestTrainingLabel, ModelStats::isSil, PhoneModel::isSil, ModelStats::likelihood, ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.
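
The description of the rate-based merge does not say how the two models are combined. Since both models must have the same number of Gaussians and come from the same UBM, one plausible reading is a component-wise interpolation. The sketch below shows that reading only. The Component struct, the interpolate() function, and the choice that rate weights the first model are all assumptions made for illustration, not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one Gaussian of a mixture (diagonal covariance).
struct Component {
    double weight;
    std::vector<double> mean;
    std::vector<double> variance;
};

// Interpolate two mixtures component by component.
// Assumption: rate is the weight given to 'a', (1 - rate) the weight given to 'b'.
std::vector<Component> interpolate(const std::vector<Component>& a,
                                   const std::vector<Component>& b,
                                   double rate)
{
    assert(a.size() == b.size());       // same number of Gaussians required
    assert(rate >= 0.0 && rate <= 1.0);

    std::vector<Component> out(a.size());
    double weightSum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].mean.size() == b[i].mean.size());
        out[i].weight = rate * a[i].weight + (1.0 - rate) * b[i].weight;
        weightSum += out[i].weight;
        for (std::size_t d = 0; d < a[i].mean.size(); ++d) {
            out[i].mean.push_back(rate * a[i].mean[d] + (1.0 - rate) * b[i].mean[d]);
            out[i].variance.push_back(rate * a[i].variance[d] + (1.0 - rate) * b[i].variance[d]);
        }
    }
    // Renormalise so that the mixture weights sum to one.
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i].weight /= weightSum;
    return out;
}
```

Pairing component i of one model with component i of the other only makes sense when both were adapted from the same starting mixture, which matches the restriction stated above.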

TrainPhoneModel::TrainPhoneModel	(	TrainPhoneModel *	org,
		int	shiftLeftRight = `0`
	)

The constructor copies the settings of the original TrainPhoneModel. This constructor is used for creating cluster-based acoustic models.

References channelInfoBlock, decision_Matrix, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, guestTrainingLabel, ModelStats::isSil, PhoneModel::isSil, ModelStats::likelihood, ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, MixGaussian::shiftBestGaussian(), MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, totalLength, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

TrainPhoneModel::TrainPhoneModel	(	FILE *	inFile,
		int	dim,
		FeaturePoolInfo *	infoBlock = `NULL`
	)

The constructor only initialises some variables. Not very interesting. The parameters are only added to make inheritance from the PhoneModel class possible.

References channelInfoBlock, decision_Matrix, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, WriteFileLittleBigEndian::freadEndianSafe(), guestTrainingLabel, PhoneModel::isSil, ModelStats::isSil, ModelStats::likelihood, ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, PhoneModel::silRinglastPos, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, PhoneModel::timeStamp, trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

TrainPhoneModel::~TrainPhoneModel ( )

The destructor is responsible for deleting the entire training pool.

Member Function Documentation

void TrainPhoneModel::adapt_setAcTrain	(	int	useLabel = `-1`,
		int	useSegmentation = `-1`,
		FeaturePool *	usePool = `NULL`
	)

Performs the adaptation training run. This run will set all accumulators. Other segments, labels or pools can only be used for SIL models.

References PhoneModel::adapt_setAcumulators(), baumWelch(), PhoneModel::isSil, trainingLabel, trainingPool, and trainingSegment.

Referenced by ShoutPrepareAdapt::ShoutPrepareAdapt().

void TrainPhoneModel::addCountedGaussians	(	TrainPhoneModel *	source,
		int	nmbr
	)

Adds the best nmbr Gaussians from the model source to this model. The best Gaussians are determined during earlier counting by count(). This operates on mixture set 0: the model is expected to be SIL without context (used by Train_Speaker_Segmenter).

References MixGaussian::addCountedGaussians(), PhoneModel::mixtureSetData, and MixtureSet::state.

void TrainPhoneModel::addGaussian ( Vector * v )

Todo:

References MixGaussian::addGaussian(), PhoneModel::mixtureSetData, and MixtureSet::state.

void TrainPhoneModel::appendSAT ( FILE * outFile )

Load training parameters of a model from disc.

References MixGaussian::appendSAT(), PhoneModel::mixtureSetData, ModelStats::nrOfContexts, MixtureSet::state, and PhoneModel::statistics.

Referenced by ShoutTrainFinishSAT::ShoutTrainFinishSAT().

double TrainPhoneModel::baumWelch	(	int	trainWhat,
		PhoneModel *	doSat = `NULL`
	)			`[protected]`

This method has the same function as the method viterbi(): to train the acoustic models. The algorithm used in this case is Baum-Welch, whereas viterbi() uses Viterbi.

This method runs the Baum-Welch algorithm on all training samples from the pool. The total likelihood (the product of all likelihoods) is returned. The higher this value, the better the training samples match the HMM. When the train parameter is set to true, the number of transitions into each state is stored and the MixGaussian::train() methods are called. The transition statistics are used to determine new state transition values, and MixGaussian::trainFinish() is called to finish the state likelihood training.

The Baum-Welch algorithm used in this method is as follows: For every observation (in one training sample), all state transitions into each state are calculated. This is done in a forward and a backward pass. The probability of each path is calculated. The transition weights are calculated according to these probabilities by counting the number of transitions into a state and the number of transitions into that state using a specific route (multiplied by the path probability). When all samples are handled, those two numbers are divided, and the result is the transition probability. The state likelihoods are trained with the observations from a specific path that are mapped onto a state, weighted by the path probability. For the procedure to calculate the state likelihoods from these observation sequences, see the TrainMixGaussian::train() method.

Literature: the HTK-BOOK, page 130-132.
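
The forward and backward passes described above can be sketched for a small left-to-right HMM in which every state has only a self transition and a transition to the next state. This is a minimal illustration under stated assumptions, not the method's actual code: the SoftCounts struct, the forwardBackward() function and the emission matrix b are hypothetical names, the probabilities are kept in the linear domain (real training code would normally use logarithms or scaling to avoid underflow), and the transition update uses the common textbook form of expected transitions out of a state.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Soft counts produced by one forward-backward pass (hypothetical struct).
struct SoftCounts {
    double likelihood;          // P(observations | model)
    Matrix gamma;               // gamma[t][s]: P(state s at time t | observations)
    std::vector<double> selfP;  // re-estimated self-transition probability per state
};

// b[t][s]: emission probability of observation t in state s.
// aSelf[s], aNext[s]: transition probabilities out of state s.
SoftCounts forwardBackward(const Matrix& b,
                           const std::vector<double>& aSelf,
                           const std::vector<double>& aNext)
{
    const int T = static_cast<int>(b.size());
    assert(T > 0);
    const int N = static_cast<int>(b[0].size());
    assert(N > 0 && T >= N);

    Matrix alpha(T, std::vector<double>(N, 0.0));
    Matrix beta(T, std::vector<double>(N, 0.0));

    // Forward pass: every path starts in the first state.
    alpha[0][0] = b[0][0];
    for (int t = 1; t < T; ++t)
        for (int s = 0; s < N; ++s) {
            double in = alpha[t - 1][s] * aSelf[s];
            if (s > 0) in += alpha[t - 1][s - 1] * aNext[s - 1];
            alpha[t][s] = in * b[t][s];
        }

    // Backward pass: every path ends in the last state.
    beta[T - 1][N - 1] = 1.0;
    for (int t = T - 2; t >= 0; --t)
        for (int s = 0; s < N; ++s) {
            double out = aSelf[s] * b[t + 1][s] * beta[t + 1][s];
            if (s + 1 < N) out += aNext[s] * b[t + 1][s + 1] * beta[t + 1][s + 1];
            beta[t][s] = out;
        }

    SoftCounts r;
    r.likelihood = alpha[T - 1][N - 1];
    assert(r.likelihood > 0.0);

    // State occupancy: weight of each observation for each state.
    r.gamma.assign(T, std::vector<double>(N, 0.0));
    for (int t = 0; t < T; ++t)
        for (int s = 0; s < N; ++s)
            r.gamma[t][s] = alpha[t][s] * beta[t][s] / r.likelihood;

    // Expected transition counts, then their ratio.
    r.selfP = aSelf;
    for (int s = 0; s < N; ++s) {
        double stay = 0.0, move = 0.0;
        for (int t = 0; t + 1 < T; ++t) {
            stay += alpha[t][s] * aSelf[s] * b[t + 1][s] * beta[t + 1][s];
            if (s + 1 < N)
                move += alpha[t][s] * aNext[s] * b[t + 1][s + 1] * beta[t + 1][s + 1];
        }
        if (stay + move > 0.0) r.selfP[s] = stay / (stay + move);
    }
    return r;
}
```

The gamma values play the role of the path-probability weights mentioned above: each observation contributes to every state in proportion to gamma[t][s] instead of being assigned to a single state.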

References ModelStats::frameMeanLikelihood, FeaturePool::getCurSegmentLen(), FeaturePool::getCurSegmentVector(), FeaturePool::getFirstVectorFirstSegment(), FeaturePool::getFirstVectorNextSegment(), MixGaussian::getP(), FeaturePool::getSegmentID(), FastMath::log(), PhoneModel::mixtureSetData, ModelStats::nrOfContexts, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, MixGaussian::train(), MixGaussian::trainFinish(), trainingLabel, trainingPool, trainingSegment, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.

Referenced by adapt_setAcTrain(), and train().

void TrainPhoneModel::count ( Vector * observation )

Counts the importance of each Gaussian in the model given the feature Vector observation. This operates on mixture set 0: the model is expected to be SIL without context (used by Train_Speaker_Segmenter).

References MixGaussian::count(), PhoneModel::mixtureSetData, and MixtureSet::state.

Referenced by train().

void TrainPhoneModel::doNotuseBordersForTraining ( bool useBordersNot )

Todo: docs

References trainWithoutBorders.

Referenced by Train_Segmenter::getOverlap(), Train_Segmenter::loadClusters(), and Train_Segmenter::Train_Segmenter().

void TrainPhoneModel::fillDistanceArray ( int * distA )

Fills the distance array: an array of size numberOfGaussians (mixtureSetData[0]). The two most distant gaussians are in the first and last spot. The others are sorted in between.

References MixGaussian::fillDistanceArray(), PhoneModel::mixtureSetData, and MixtureSet::state.

double TrainPhoneModel::finishSAT ( )

Finishes the training for SAT.

References PhoneModel::mixtureSetData, ModelStats::nrOfContexts, MixtureSet::state, PhoneModel::statistics, and MixGaussian::trainFinish().

Referenced by ShoutTrainFinishSAT::ShoutTrainFinishSAT().

double TrainPhoneModel::getClusterP ( Vector * observation )

Returns the probability of the mixtureSet 0 on the input vector.

References MixGaussian::getP(), PhoneModel::mixtureSetData, and MixtureSet::state.

double TrainPhoneModel::getCoSim	(	TrainPhoneModel *	t1,
		TrainPhoneModel *	t2
	)

References MixGaussian::getCoSim(), PhoneModel::mixtureSetData, and MixtureSet::state.

int TrainPhoneModel::getDominantGaussian ( )

Determines the dominant Gaussian in the model...

References MixGaussian::getBestCount(), PhoneModel::mixtureSetData, and MixtureSet::state.

double TrainPhoneModel::getKLDistance ( TrainPhoneModel * t2 )

References MixGaussian::getKLDistance(), PhoneModel::mixtureSetData, and MixtureSet::state.

Referenced by ShoutPrepareAdapt::ShoutPrepareAdapt().

double TrainPhoneModel::getNormDistance ( )

References MixGaussian::getNormDistance(), PhoneModel::mixtureSetData, and MixtureSet::state.

double TrainPhoneModel::getSilP	(	int	useLabel,
		int	useSegmentation,
		FeaturePool *	usePool
	)			`[protected]`

References FeaturePool::getFirstVectorFirstSegment(), FeaturePool::getFirstVectorNextSegment(), MixGaussian::getLogP(), FeaturePool::getNextVector(), FeaturePool::getSegmentID(), PhoneModel::mixtureSetData, MixtureSet::state, PhoneModel::stateMix_1, and testP.

Referenced by getTrainSilP().

double TrainPhoneModel::getTrainSilP	(	int	useLabel = `-1`,
		int	useSegmentation = `-1`,
		FeaturePool *	usePool = `NULL`
	)

Used by the train() method to calculate the current score if the phone is a SIL phone. Effectively the score is the MixGaussian probability (MixGaussian::getP()) of all training samples. It is possible to use an alternative training pool to calculate the score.
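
As a rough illustration of such a score, the sketch below evaluates a diagonal-covariance Gaussian mixture on a set of frames and averages the log probability per frame. The DiagGaussian struct and both functions are hypothetical names introduced here, and averaging per frame is an assumption (suggested only by the reference to ModelStats::frameMeanLikelihood); the actual method may combine the per-frame values differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one mixture component with diagonal covariance.
struct DiagGaussian {
    double weight;
    std::vector<double> mean;
    std::vector<double> variance;
};

// Probability density of one feature vector under the mixture.
double mixtureP(const std::vector<DiagGaussian>& gmm, const std::vector<double>& x)
{
    const double TWO_PI = 6.283185307179586;
    double p = 0.0;
    for (std::size_t g = 0; g < gmm.size(); ++g) {
        assert(gmm[g].mean.size() == x.size());
        assert(gmm[g].variance.size() == x.size());
        double logDensity = 0.0;
        for (std::size_t d = 0; d < x.size(); ++d) {
            const double diff = x[d] - gmm[g].mean[d];
            logDensity -= 0.5 * (std::log(TWO_PI * gmm[g].variance[d])
                                 + diff * diff / gmm[g].variance[d]);
        }
        p += gmm[g].weight * std::exp(logDensity);
    }
    return p;
}

// Mean log probability per frame over all training frames.
double meanLogLikelihood(const std::vector<DiagGaussian>& gmm,
                         const std::vector<std::vector<double> >& frames)
{
    assert(!frames.empty());
    double sum = 0.0;
    for (std::size_t t = 0; t < frames.size(); ++t)
        sum += std::log(mixtureP(gmm, frames[t]));
    return sum / static_cast<double>(frames.size());
}
```

Because a SIL model has a single state, no alignment is needed and every frame can be scored directly against the one mixture.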

References ModelStats::frameMeanLikelihood, getSilP(), PhoneModel::statistics, trainingLabel, trainingPool, and trainingSegment.

Referenced by Train_Segmenter::getMergeModelScore(), Train_Segmenter::getOverlap(), Adapt_Segmenter::proceedMerge(), ShoutSegment::ShoutSegment(), thread_train_cluster(), train(), and Train_Segmenter::trainModel().

int TrainPhoneModel::maxNrOfGaussians ( )

This method retrieves the number of Gaussians of each state and returns the maximum value.

References MixGaussian::getNumberOfGaussians(), PhoneModel::mixtureSetData, ModelStats::nrOfContexts, MixtureSet::state, and PhoneModel::statistics.

Referenced by setMaxGaussians(), train(), Train_Segmenter::trainClusters(), and TrainPhoneModel().

void TrainPhoneModel::moveModelGaussians	(	TrainPhoneModel *	model,
		double	factor
	)

Todo: docs

References PhoneModel::mixtureSetData, MixGaussian::moveModelGaussians(), and MixtureSet::state.

void TrainPhoneModel::normalize ( )

References Gaussian::getMean(), Vector::getValue(), Gaussian::getVariance(), PhoneModel::mixtureSetData, ModelStats::nrOfContexts, Vector::setValue(), PhoneModel::statistics, Gaussian::train(), Gaussian::trainFinish(), and MixtureSet::transitionP_toNext.

bool TrainPhoneModel::readModel ( FILE * inFile )

readModel reads a model from disc. Especially needed for MultiMixGaussian PDFs

References channelInfoBlock, PhoneModel::dimensions, ModelStats::frameMeanLikelihood, WriteFileLittleBigEndian::freadEndianSafe(), PhoneModel::isSil, ModelStats::isSil, ModelStats::likelihood, ModelStats::maxNrOfContexts, PhoneModel::mixtureSetData, ModelStats::name, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, FeaturePoolInfo::numberOfChannels, MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, and PhoneModel::statistics.

Referenced by Train_Segmenter::loadClusters(), and Train_Segmenter::Train_Segmenter().

void TrainPhoneModel::setDecisionMatrix	(	int	numberOfModels,
		int	numberOfRules,
		int *	dMatrix
	)

This method will set a new decision-rule matrix. Each row in this matrix contains all phones that are part of a particular tree-based clustering rule.

References decision_Matrix, decision_numberOfModels, and decision_numberOfRules.

Referenced by ShoutTrainModel::ShoutTrainModel().

void TrainPhoneModel::setMaxGaussians ( int maxGaussians )

Set the maximum number of gaussians...

References PhoneModel::isSil, maxNrOfGaussians(), PhoneModel::mixtureSetData, MixGaussian::normalizeWeights(), and MixtureSet::state.

void TrainPhoneModel::setTrainingData	(	FeaturePool *	fp,
		int	trainID,
		int	labelID,
		int	guestID = `-1`,
		int	tSilP = `100`,
		int	tSilMax = `-1`
	)

The FeaturePool contains all training data. Deleting the training pool once it is no longer needed is the responsibility of the user, because one pool may be used by multiple models.

References guestTrainingLabel, trainingLabel, trainingPool, trainingSegment, trainSilMax, and trainSilP.

Referenced by Adapt_AM::Adapt_AM(), ShoutSegment::doSAD_noGarbage(), Train_Segmenter::getMergeModelScore(), Train_Segmenter::getOverlap(), Train_Segmenter::loadClusters(), Train_Segmenter::mergeModels(), Adapt_Segmenter::proceedMerge(), ShoutPrepareAdapt::ShoutPrepareAdapt(), ShoutSegment::ShoutSegment(), ShoutTrainModel::ShoutTrainModel(), Train_Segmenter::Train_Segmenter(), Train_Segmenter::trainIteration(), and Train_Segmenter::trainModel().

void TrainPhoneModel::startCount ( )

Starts a counting run for determining the most important Gaussians of this model. This operates on mixture set 0: the model is expected to be SIL without context (used by Train_Speaker_Segmenter).

References PhoneModel::mixtureSetData, MixGaussian::startCount(), and MixtureSet::state.

void TrainPhoneModel::stopCount ( )

Stops the counting run for determining the most important Gaussians of this model. This operates on mixture set 0: the model is expected to be SIL without context (used by Train_Speaker_Segmenter).

References PhoneModel::mixtureSetData, MixtureSet::state, and MixGaussian::stopCount().

double TrainPhoneModel::train	(	int	maxGaussians,
		bool	isS,
		bool	neverPrune = `false`,
		Vector **	trainDiscr = `NULL`,
		Vector *	trainDiscrMask = `NULL`,
		PhoneModel *	doSat = `NULL`,
		bool	doFastTraining = `false`
	)

This method is responsible for the training iteration. First it checks if the HMM is a SIL HMM or a regular one. SIL HMMs only have one state and only this one state is trained. Apart from this difference, both types of HMMs are trained as follows:

First it is checked if an existing HMM is already loaded. If not, an empty HMM is created. Then, in a loop, the system calculates the total likelihood on all training samples by calling viterbi() with its training parameter set to true. When the result is more than MINIMUM_TRAIN_IMPROVEMENT better than the last result, another viterbi iteration is performed. Otherwise, it is checked whether more Gaussians may be added to each state. This is allowed as long as the number of Gaussians has not been reduced during the previous iterations (a Gaussian may be pruned when not enough training samples represent it) and the maximum number of Gaussians has not yet been reached. The MixGaussian::splitBestGaussian() method is used to split the Gaussians. After splitting, the entire procedure is repeated. If splitting was not permitted, the training run has finished.

When the training run is finished, the training sample pool is deleted. This means that it is not possible to train the system again, without providing new samples (but who would want to?)
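
The control flow of the training loop described above can be sketched independently of the model classes. In the sketch below, TrainingHooks and trainLoop() are hypothetical names, the three callbacks stand in for the real training pass, Gaussian count and split operation, and the improvement threshold is treated as an absolute difference, which is an assumption about how MINIMUM_TRAIN_IMPROVEMENT is applied.

```cpp
#include <functional>

// Callbacks that stand in for the model operations (all hypothetical).
struct TrainingHooks {
    std::function<double()> runIteration;     // one training pass, returns the total likelihood
    std::function<int()> numberOfGaussians;   // current number of Gaussians per state
    std::function<void()> splitBestGaussian;  // add one Gaussian by splitting
};

// Iterate until the score stops improving, then split and repeat.
// Stops when splitting is no longer permitted and returns the last score.
double trainLoop(const TrainingHooks& h, int maxGaussians, double minImprovement)
{
    double last = h.runIteration();
    int peakGaussians = h.numberOfGaussians();
    for (;;) {
        const double current = h.runIteration();
        const bool improved = (current - last) > minImprovement;
        last = current;
        if (improved)
            continue;  // keep iterating with the same number of Gaussians

        const int n = h.numberOfGaussians();
        const bool pruned = n < peakGaussians;  // a Gaussian was dropped earlier
        if (pruned || n >= maxGaussians)
            break;     // splitting not permitted: training is finished

        h.splitBestGaussian();
        peakGaussians = h.numberOfGaussians();
    }
    return last;
}
```

Keeping the stopping rule separate from the model code makes it easy to see that training ends for exactly two reasons: the Gaussian limit is reached, or pruning has shown that the data cannot support more Gaussians.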

References baumWelch(), count(), SegmentationAdmin::curSeg, decision_Matrix, decision_numberOfModels, decision_numberOfRules, SegmentationList::firstFrame, FeaturePool::getCurSegmentLen(), FeaturePool::getCurSegmentStart(), FeaturePool::getFirstVectorFirstSegment(), FeaturePool::getFirstVectorNextSegment(), Gaussian::getLogP(), FeaturePool::getNextVector(), MixGaussian::getNumberOfGaussians(), FeaturePool::getSegmentID(), getTrainSilP(), Vector::getValue(), guestTrainingLabel, SegmentationList::ID, ModelStats::isSil, PhoneModel::isSil, SegmentationList::lastFrame, ModelStats::likelihood, FastMath::log(), ModelStats::maxNrOfContexts, maxNrOfGaussians(), MINPASSES, PhoneModel::mixtureSetData, Vector::multiplyVector(), ModelStats::name, SegmentationList::next, ModelStats::nrOfContexts, ModelStats::nrOfGaussians, ModelStats::nrOfTrainOcc, SegmentationAdmin::prevSeg, MixGaussian::splitAllGaussians(), MixGaussian::splitBestGaussian(), MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, Gaussian::train(), MixGaussian::train(), Gaussian::trainFinish(), MixGaussian::trainFinish(), trainingLabel, trainingPool, trainingSegment, trainSilMax, trainSilP, trainWithoutBorders, MixtureSet::transitionP_toNext, MixtureSet::transitionP_toSelf, and viterbi().

Referenced by Train_Segmenter::getMergeModelScore(), Train_Segmenter::getOverlap(), Train_Segmenter::mergeModels(), Adapt_Segmenter::proceedMerge(), ShoutPrepareAdapt::ShoutPrepareAdapt(), ShoutSegment::ShoutSegment(), ShoutTrainModel::ShoutTrainModel(), thread_train(), thread_train_cluster(), Train_Segmenter::trainModel(), and ShoutSegment::trainModel().

void TrainPhoneModel::trainMMI	(	FILE *	fileEnum,
		FILE *	fileDenom
	)

This method reads training accumulators from file and performs MMI training.

References PhoneModel::mixtureSetData, ModelStats::nrOfContexts, MixtureSet::state, PhoneModel::statistics, and MixGaussian::trainMMI().

Referenced by ShoutTrainMMI::ShoutTrainMMI().

double TrainPhoneModel::viterbi ( int trainWhat ) [protected]

This method has the same function as the method baumWelch(): to train the acoustic models. The algorithm used in this case is Viterbi, whereas baumWelch() uses Baum-Welch.

This method runs the Viterbi algorithm on all training samples from the pool. The total likelihood (the product of all likelihoods) is returned. The higher this value, the better the training samples match the HMM. When the train parameter is set to true, the number of transitions into each state is stored and the MixGaussian::train() methods are called. The transition statistics are used to determine new state transition values, and MixGaussian::trainFinish() is called to finish the state likelihood training.

The Viterbi algorithm used in this method is as follows: For every observation (in one training sample), the most probable state transition into each state is calculated. Basically, this means comparing two transitions (for example, state 2 may get input from state 1 or from itself). The transition chosen is stored in the variable transPath. After all observations are handled, the best path is calculated by starting at the final state at time observationLength and 'walking' back in the transPath matrix. The transition weights are calculated directly from the 'best path' transitions by counting the number of transitions into a state and the number of transitions into that state using a specific route. When all samples are handled, those two numbers are divided, and the result is the transition probability. The state likelihoods are trained with those observations from the best path that are mapped onto a state. For the procedure to calculate the state likelihoods from these observation sequences, see the TrainMixGaussian::train() method.

Literature: Frederick Jelinek, "Statistical Methods for Speech Recognition": page 22 for the viterbi training procedure and page 30 for transition weight calculation.
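
The best-path search and backtracking described above can be sketched for a small left-to-right HMM. This is an illustrative sketch under stated assumptions, not the method's actual code: Alignment, viterbiAlign() and reestimateSelfP() are hypothetical names, emission scores are passed in as a precomputed matrix of log probabilities, the exit transition of the last state is ignored, and the transition update counts transitions out of each state, which is the common textbook form.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Result of one Viterbi alignment (hypothetical struct).
struct Alignment {
    double logLikelihood;   // log probability of the best path
    std::vector<int> path;  // state index for every observation
};

// logB[t][s]: log emission probability of observation t in state s.
// logSelf[s], logNext[s]: log transition probabilities out of state s.
Alignment viterbiAlign(const std::vector<std::vector<double> >& logB,
                       const std::vector<double>& logSelf,
                       const std::vector<double>& logNext)
{
    const double NEG_INF = -std::numeric_limits<double>::infinity();
    const int T = static_cast<int>(logB.size());
    assert(T > 0);
    const int N = static_cast<int>(logB[0].size());
    assert(N > 0 && T >= N);

    std::vector<std::vector<double> > score(T, std::vector<double>(N, NEG_INF));
    // transPath[t][s]: state at time t-1 on the best path into state s at time t.
    std::vector<std::vector<int> > transPath(T, std::vector<int>(N, 0));

    score[0][0] = logB[0][0];  // every path starts in the first state
    for (int t = 1; t < T; ++t)
        for (int s = 0; s < N; ++s) {
            const double stay = score[t - 1][s] + logSelf[s];
            const double move = (s > 0) ? score[t - 1][s - 1] + logNext[s - 1] : NEG_INF;
            if (move > stay) {
                score[t][s] = move + logB[t][s];
                transPath[t][s] = s - 1;
            } else {
                score[t][s] = stay + logB[t][s];
                transPath[t][s] = s;
            }
        }

    // Walk back from the final state at the last observation.
    Alignment result;
    result.logLikelihood = score[T - 1][N - 1];
    result.path.assign(T, 0);
    int s = N - 1;
    for (int t = T - 1; t >= 0; --t) {
        result.path[t] = s;
        if (t > 0) s = transPath[t][s];
    }
    return result;
}

// Self-transition probability per state: (s to s) / (all transitions out of s).
std::vector<double> reestimateSelfP(const std::vector<int>& path, int numStates)
{
    std::vector<double> self(numStates, 0.0), total(numStates, 0.0), p(numStates, 0.0);
    for (std::size_t t = 1; t < path.size(); ++t) {
        total[path[t - 1]] += 1.0;
        if (path[t] == path[t - 1]) self[path[t - 1]] += 1.0;
    }
    for (int s = 0; s < numStates; ++s)
        if (total[s] > 0.0) p[s] = self[s] / total[s];
    return p;
}
```

Unlike the Baum-Welch variant, each observation is assigned to exactly one state here, so the counts are whole numbers and the state likelihoods are trained only on the observations that the best path maps onto that state.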

References FeaturePool::addSegment(), ModelStats::frameMeanLikelihood, FeaturePool::getCurSegmentLen(), FeaturePool::getCurSegmentStart(), FeaturePool::getCurSegmentVector(), FeaturePool::getFirstVectorFirstSegment(), FeaturePool::getFirstVectorNextSegment(), MixGaussian::getLogP(), FeaturePool::getSegmentID(), FastMath::log(), PhoneModel::mixtureSetData, ModelStats::nrOfContexts, ModelStats::nrOfTrainOcc, FeaturePool::resetSegmentation(), MixtureSet::state, PhoneModel::stateMix_1, PhoneModel::stateMix_2, PhoneModel::stateMix_3, PhoneModel::statistics, MixGaussian::train(), MixGaussian::trainFinish(), trainingLabel, trainingPool, trainingSegment, MixtureSet::transitionP_toNext, and MixtureSet::transitionP_toSelf.