// doc/dependencies.dox
// Copyright 2009-2011 Microsoft Corporation
// 2013 Johns Hopkins University (author: Daniel Povey)
// See ../../COPYING for clarification regarding multiple authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
// WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
// MERCHANTABLITY OR NON-INFRINGEMENT.
// See the Apache 2 License for the specific language governing permissions and
// limitations under the License.
/**
\page dependencies Software required to install and run Kaldi
\section dependencies_environment Computing environment
- We expect that you will run Kaldi is a cluster of
Linux machines running Sun GridEngine (SGE). This is open-source software and widely used.
- We expect that the cluster will have access to shared directories based on NFS or something
similar.
- We have started a separate project called Kluster
that shows you how to create such a cluster on Amazon's EC2. Most of the scripts should be suitable
for a locally hosted cluster based on Debian; you can also investigate
Rocks.
- You can run Kaldi on just one machine, without GridEngine or NFS, but of course it will be slower.
- You should be able to run Kaldi on most types of Linux machine; it has also been tested
on Darwin (Apple's version of BSD) and on Cygwin.
- Kaldi's scripts have been written in such a way that if you replace SGE with a similar mechanism
with different syntax (such as Tork), it should be relatively easy to get it to work.
- In the past Kaldi has been compiled on Windows; however, the example scripts will not
work there, and we are not actively maintaining the Windows compatibility of the code or the
Windows build scripts. Help with this would be appreciated.
\section dependencies_packages Software packages required
This is a non-exhaustive list of some of the packages you need in order to install Kaldi.
- Subversion (svn): this is needed to download Kaldi and other software that it depends on.
- wget is required for the installation of some non-Kaldi components described below
- The example scripts require standard UNIX utilities such as bash,
perl, awk, grep, and make.
It can also be helpful if you have an ATLAS linear-algebra package installed on your system. Most
systems already have this (You can also search the packages in linux for installation by simple commands like
"yum search atlas" or "apt-cache search libatlas");
the best approach is to ignore this requirement for now and see if you have problems when you install Kaldi.
\section dependencies_installed Software packages installed by Kaldi
The following tools and libraries come with installation scripts in
the tools/ directory so you won't have to install them yourself (note: this is a non-exhaustive list).
- OpenFst: we compile against this and use it heavily.
- IRSTLM: this a language modeling toolkit. Some of the example scripts require it but
it is not tightly integrated with Kaldi; we can convert any Arpa format
language model to an FST.
- The IRSTLM build process requires automake, aclocal, and libtoolize
(the corresponding packages are automake and libtool).
- Note: some of the example scripts now use SRILM; we make it easy to install
that, although you still have to register online to download it.
- sph2pipe: this is for converting sph format files into other formats such
as wav. It's needed for the example scripts that use LDC data.
- sclite: this is for scoring and is not necessary as we have our own, simple
scoring program (compute-wer.cc).
- ATLAS, the linear algebra library. This is only needed for the headers; in
typical setups we expect that ATLAS will be on your system. However, if it not
already on your system you can compile ATLAS as long as your machine does not
have CPU throttling enabled.
- CLAPACK, the linear algebra library (we download the headers).
This is useful only on systems where you don't have ATLAS and are
instead compiling with CLAPACK.
*/