Difference between revisions of "Software"

Revision as of 21:01, 29 August 2014

This page provides software grouped by application.

ASR Engines	Release/update	Actively Developed	Corpora Training-Recipes	Reproducible Results	Licence	Platforms	Language	VAD	Acoustic features	Feature normalization/compensation	Acoustic models	Model adaptation/compensation	decoding techniques	training techniques	Hardware Optimization	Online ASR	Links	Forums/Mail-Lists	Online Repository	Extensions
HTK	1993-2009 (3.4.1)	No	AURORA2 (purch.), AURORA3 (purch.), AURORA4 (WSJ0), CHIME-1, CHIME-2-I, REVERB	ETSI-AFE-AURORA2 paper (see AURORA2 purch.)	limited [1]	Windows, Linux, OSX	C	Yes	MFCC, PLP	VTLN, CMS	GMM (Full Cov.), Tied-Mix, Streams	HLDA, MLLR (w/ reg. trees), CMLLR (w/ adaptive training), MAP	alignment, N-best, lattice rescoring	Baum-Welch	No	Yes	Website Book (need registration)	mail-lists (low activity)	No	official, ATK, Uncertainty Decoding
Sphinx4	1986-2011 (4.1.0)	Yes	AURORA4 (WSJ0)		limited Copyright, allows modif.	Windows, Linux, OSX	Java	Yes	MFCC, PLP	CMN, Mel-Spectrum subtraction	GMM, Streams	MLLR, MAP	alignment, N-best, lattice rescoring	Baum-Welch	No	Yes	Website paper	mail-lists forums	Github
Kaldi	2009-* (continuous updates)	Yes	AURORA4 (WSJ0), CHIME-2	Weniger2014-REVERB Paper Code	Apache 2.0	Windows (not maintained as of 2014), Linux, OSX	C++	Yes	MFCC, PLP	VTLN, CMVN	GMM (Full Cov.), SGMM, DNN	HLDA, STC, MLLT, MLLR, CMLLR (w/ reg. trees), Exponential transform	Uses OpenFST, alignment, N-best, lattice rescoring	Baum-Welch, MMI (boosted), MC, feature-based, sequence training	BLAS, LAPACK, GPU (for DNNs)	Yes	Website paper	mail-lists forums	SVN
Spraak	2012 (1.1)	No	AURORA4		Academic/commercial	Windows (limited), Linux	C, Python	Yes	MFCC, PLP	VTLN, CMN, MIDA, MDT Techniques	GMM, Tied-Mix, Exemplar based	CMLLR	alignment, N-best, lattice rescoring, parallel lattices	Baum-Welch	No	unclear	Website paper	mail-lists forums	SVN (needs registration)

Speaker identification and verification

Speech enhancement and separation

Other applications

Contribute software

To contribute new software, please

create an account and login
go to the wiki page above corresponding to your application; if it does not exist yet, you may create it
click on the "Edit" link at the top of the page and add a new section for your software (software is ordered by year of the latest version)
click on the "Save page" link at the bottom of the page to save your modifications

Please make sure to provide the following information:

name of the software and year of the latest version
authors, institution, contact information
link to the software, ideally including a short demo, and to the external libraries needed
short description (functionalities, inputs and outputs, programming language, operating system, license, etc) and link to a paper/report describing the software, if any
whether scripts for running well-known baselines (Aurora-2, Aurora-4, Switchboard, CHiME, etc.) are included or must be written by the user

To save storage space, please do not upload the software to this wiki. Instead, link to it in a public repository (e.g., Bitbucket, GitHub, SourceForge) or at a stable URL on your institution's website. If this is not possible, please contact the resources sharing working group.

@@ Line 1: / Line 1: @@
 This page provides software grouped by application.
-== [[Automatic speech recognition]] ==
+{| class="wikitable sortable" style="font-size:72%; border:gray solid 1px; text-align:center; width:auto; table-layout:fixed;"
+|-
-'''Kaldi'''
+!style="width: 40px" rowspan="2" class="unsortable"|ASR Engines
+!scope="col" width="40px" | Release/update
-Available at sourceforge [http://kaldi.sourceforge.net/ here]
+!scope="col" width="40px" | Actively Developed
+!scope="col" width="40px" | Corpora Training-Recipes
+!scope="col" width="40px" | Reproducible Results
-'''CMUSphinx'''
+!scope="col" width="40px" | Licence
+!scope="col" width="40px" | Platforms
-Available at sourceforge [http://cmusphinx.sourceforge.net/ here]
+!scope="col" width="40px" | Language
+!scope="col" width="40px" | VAD
+!scope="col" width="40px" | Acoustic features
-'''Hidden Markov Model Toolkit (HTK)'''
+!scope="col" width="40px" | Feature normalization/compensation
+!scope="col" width="40px" | Acoustic models
-Available from the Cambridge University [http://htk.eng.cam.ac.uk/ here] (you need to register to download)
+!scope="col" width="40px" | Model adaptation/compensation
+!scope="col" width="40px" | decoding techniques
-''Resources related to robustness''
+!scope="col" width="40px" | training techniques
+!scope="col" width="40px" | Hardware Optimization
-*Scripts available for various robust ASR Corpora, see the [[Datasets#Automatic_speech_recognition|Datasets section]]
+!scope="col" width="40px" | Online ASR
+!scope="col" width="40px" | Links
-*The [http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html voicebox] MATLAB toolbox allows writing and reading feature vectors in HTK format, thus making possible custom robust front ends.
+!scope="col" width="40px" | Forums/Mail-Lists
+!scope="col" width="40px" | Online Repository
-*Patches to perform Uncertainty Decoding and Modified Imputation available [http://www.astudillo.com/ramon/research/stft-up/ here]
+!scope="col" width="40px" | Extensions
+|-
+!HTK
+|1993-2009 (3.4.1)
+|{{no|No}}
+|[http://catalog.elra.info/product_info.php?cPath=37_40&products_id=693 AURORA2 (purch.)] [http://catalog.elra.info/index.php?cPath=37_40 AURORA3 (purch.)], [http://www.keithv.com/software/htk/ AURORA4 (WSJ0)], [http://spandh.dcs.shef.ac.uk/projects/chime/PCC/data/pcchome.tar.gz CHIME-1], [ftp://ftp.dcs.shef.ac.uk/share/spandh/chime_challenge/grid/eval_tools_grid.tgz CHIME-2-I], [http://reverb2014.dereverberation.com/tools/REVERB_TOOLS_FOR_ASR_ver2.0.tgz REVERB]
+|ETSI-AFE-AURORA2 [http://aurora.hsnr.de/download/Aurora2_afe_v1_1.pdf paper] (see AURORA2 purch.)
+|{{no|limited [http://htk.eng.cam.ac.uk/docs/license.shtml]}}
+|{{yes|Windows, Linux, OSX}}
+|C
+|{{yes|Yes}}
+|MFCC, PLP
+|VTLN, CMS
+|GMM (Full Cov.), Tied-Mix, Streams
+|HLDA, MLLR (w/ reg. trees), CMLLR (w/ adaptive training), MAP
+|alignment, N-best, lattice rescoring
+|Baum-Welch
+|{{no|No}}
+|{{yes|Yes}}
+|[http://htk.eng.cam.ac.uk/download.shtml Website ] [http://htk.eng.cam.ac.uk/docs/docs.shtml Book] (need registration)
+|[http://htk.eng.cam.ac.uk/mailing/subscribe_mail.shtml mail-lists] (low activity)
+|{{no|No}}
+|[http://htk.eng.cam.ac.uk/extensions/index.shtml official], [http://htk.eng.cam.ac.uk/develop/atk.shtml ATK], [https://github.com/ramon-astudillo/custom_fe Uncertainty Decoding]
+|-
+!Sphinx4
+|1986-2011 (4.1.0)
+|{{yes|Yes}}
+|[http://www.keithv.com/software/sphinx/ AURORA4 (WSJ0)]
+|
+|{{some|limited [https://raw.githubusercontent.com/cmusphinx/sphinx4/master/license.terms Copyright, allows modif.]}}
+|{{yes|Windows, Linux, OSX}}
+|Java
+|{{yes|Yes}}
+|MFCC, PLP
+|CMN, Mel-Spectrum subtraction
+|GMM, Streams
+|MLLR, MAP
+|alignment, N-best, lattice rescoring
+|Baum-Welch
+|{{no|No}}
+|{{yes|Yes}}
+|[http://cmusphinx.sourceforge.net/ Website]
+[http://www.researchgate.net/publication/228770826_Sphinx-4_A_flexible_open_source_framework_for_speech_recognition/file/79e4150c20aeb37c52.pdf paper]
+|[http://sourceforge.net/p/cmusphinx/mailman/ mail-lists] [http://sourceforge.net/p/cmusphinx/discussion/ forums]
+|{{yes|[https://github.com/cmusphinx/sphinx4 Github]}}
+|
+|-
+!Kaldi
+|2009-* (continuous updates)
+|{{yes|Yes}}
+|[http://kaldi.sourceforge.net/data_prep.html AURORA4 (WSJ0)], CHIME-2
+|Weniger2014-REVERB [http://reverb2014.dereverberation.com/workshop/reverb2014-papers/1569884459.pdf Paper] [http://www.mmk.ei.tum.de/~wen/REVERB_2014/kaldi_baseline.tar.gz  Code]
+|{{Yes|Apache 2.0}}
+|{{some|Windows (not maintained as of 2014), Linux, OSX}}
+|C++
+|{{yes|Yes}}
+|MFCC, PLP
+|VTLN, CMVN
+|GMM (Full Cov.), SGMM, DNN
+|HLDA, STC, MLLT, MLLR, CMLLR (w/ reg. trees), Exponential transform
+|Uses OpenFST, alignment, N-best, lattice rescoring
+|Baum-Welch, MMI (boosted), MC, feature-based, sequence training
+|{{yes|BLAS, LAPACK, GPU (for DNNs)}}
+|{{yes|Yes}}
+|[http://kaldi.sourceforge.net/about.html Website] [http://homepages.inf.ed.ac.uk/aghoshal/pubs/asru11-kaldi.pdf paper]
+|[http://sourceforge.net/p/kaldi/mailman/kaldi-users/ mail-lists] [http://sourceforge.net/p/kaldi/discussion/ forums]
+|{{yes|[http://kaldi.sourceforge.net/install.html SVN]}}
+|
+|-
+!Spraak
+|2012 (1.1)
+|{{no|No}}
+|[http://www.spraak.org/documentation/doxygen/doc/html/spr__mdt__example.html AURORA4]
+|
+|{{some|[http://www.spraak.org/obtaining-spraak/license Academic/commercial]}}
+|{{some|Windows ([http://www.spraak.org/obtaining-spraak/system-requirements limited]), Linux}}
+|C, Python
+|{{yes|Yes}}
+|MFCC, PLP
+|VTLN,CMN, [http://www.spraak.org/documentation/doxygen/doc/html/spr__tut__mida.html MIDA], [http://www.spraak.org/documentation/doxygen/doc/html/spr__mdt__intro.html MDT Techniques]
+|GMM, Tied-Mix, Exemplar based
+|CMLLR
+|alignment, N-best, lattice rescoring, parallel lattices
+|Baum-Welch
+|{{no|No}}
+|{{some|unclear}}
+|[http://www.spraak.org/ Website] [http://www.esat.kuleuven.be/psi/spraak/cgi-bin/get_file.cgi?/wambacq/interspeech08/is2008_spraak_v3.pdf paper]
+|[http://www.spraak.org/mailing-lists mail-lists] [http://sourceforge.net/p/kaldi/discussion/ forums]
+|{{some|[http://www.spraak.org/documentation/doxygen/doc/html/spr__svn.html SVN (needs registration)]}}
+|
+|}
 == [[Speaker identification and verification]] ==
