Difference between revisions of "Software"
From rosp
Revision as of 22:03, 2 September 2014
This page provides software grouped by application.
Automatic Speech Recognition
ASR Engines | Release/update | Actively Developed | Corpora Training-Recipes | Reproducible Results | Licence | Platforms | Language | VAD | Acoustic features | Feature normalization/compensation | Acoustic models | Model adaptation/compensation | Decoding techniques | Training techniques | Hardware Optimization | Online ASR | Links | Forums/Mail-Lists | Online Repository | Extensions |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
HTK | 1993-2009 (3.4.1) | No | AURORA2 (purch.), AURORA3 (purch.), AURORA4 (WSJ0), CHIME-1, CHIME-2-I, REVERB | ETSI-AFE-AURORA2 paper (see AURORA2 purch.) | limited [1] | Windows, Linux, OSX | C | Yes | MFCC, PLP | VTLN, CMS | GMM (Full Cov.), Tied-Mix, Streams | HLDA, MLLR (w/ reg. trees), CMLLR (w/ adaptive training), MAP | alignment, N-best, lattice rescoring | Baum-Welch | No | Yes | Website, Book (needs registration) | mail-lists (low activity) | No | official, ATK, Uncertainty Decoding |
CMU Sphinx | 1986-* (Sphinx 4.1.0, pocketsphinx 0.8) | Yes | AURORA4 (WSJ0) | | limited (Copyright, allows modif.) | Sphinx4 (Windows, Linux, OSX), pocketsphinx (Raspberry Pi, iPhone, Android) | Sphinx4 (Java), pocketsphinx (C) | Yes | MFCC, PLP | CMN, Mel-Spectrum subtraction | GMM, Streams | MLLR, MAP | alignment, N-best, lattice rescoring | Baum-Welch | No | Yes | Website | mail-lists, forums | GitHub | |
Kaldi | 2009-* (continuous updates) | Yes | AURORA4 (WSJ0), CHIME-2 | Weniger2014-REVERB Paper Code | Apache 2.0 | Windows (not maintained as of 2014), Linux, OSX | C++ | Yes | MFCC, PLP | VTLN, CMVN | GMM (Full Cov.), SGMM, DNN | HLDA, STC, MLLT, MLLR, CMLLR (w/ reg. trees), Exponential transform | Uses OpenFST, alignment, N-best, lattice rescoring | Baum-Welch, MMI (boosted), MC, feature-based, sequence training | BLAS, LAPACK, GPU (for DNNs) | Yes | Website, paper | mail-lists, forums | SVN | |
Spraak | 2012 (1.1) | No | AURORA4 | | Academic/commercial | Windows (limited), Linux | C, Python | Yes | MFCC, PLP | VTLN, CMN, MIDA, MDT Techniques | GMM, Tied-Mix, Exemplar based | CMLLR | alignment, N-best, lattice rescoring, parallel lattices | Baum-Welch | No | unclear | Website, paper | mail-lists, forums | SVN (needs registration) | |
Speaker identification and verification
Speech enhancement and separation
Other applications
Contribute software
To contribute new software, please:
- create an account and log in
- go to the wiki page above corresponding to your application; if it does not exist yet, you may create it
- click on the "Edit" link at the top of the page and add a new section for your software (software is ordered by year of the latest version)
- click on the "Save page" link at the bottom of the page to save your modifications
Please make sure to provide the following information:
- name of the software and year of the latest version
- authors, institution, contact information
- link to the software, ideally including a short demo, and to the external libraries needed
- short description (functionalities, inputs and outputs, programming language, operating system, license, etc.) and link to a paper/report describing the software, if any
- whether recipes for well-known baselines (Aurora-2, Aurora-4, Switchboard, CHiME, etc.) are included or must be wrapped by the user
To save storage space, please do not upload the software to this wiki. Instead, link to it in a public repository (e.g., Bitbucket, GitHub, SourceForge) or at a stable URL on the website of your institution. If this is not possible, please contact the resources sharing working group.
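
As an illustration only, the information requested above could be laid out in a wiki section along the following lines. This is a sketch, not a prescribed format: the heading level, the field labels, and every placeholder in angle brackets are assumptions and should be adapted to the conventions of the page being edited.

```
=== <Software name> (<year of latest version>) ===
* '''Authors:''' <names>, <institution>, <contact information>
* '''Link:''' [<URL of public repository or stable institutional page> software], [<URL of demo> demo]
* '''External libraries:''' <libraries needed, with links>
* '''Description:''' <functionalities, inputs and outputs, programming language, operating system, license>
* '''Paper/report:''' [<URL of paper or report> <short citation>]
* '''Baselines:''' <e.g. Aurora-4 recipe included, or requires wrapping by the user>
```

Each placeholder corresponds to one item in the list of required information, so an entry filled in this way covers everything requested without uploading any files to the wiki.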