Difference between revisions of "Datasets"

From rosp
Line 45: Line 45:
 
|{{some|3}}
 
|{{some|3}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://spandh.dcs.shef.ac.uk/projects/shatrweb/ download] [mailto:g.brown@dcs.shef.ac.uk email] [http://spandh.dcs.shef.ac.uk/projects/shatrweb/papers/ioa94.html paper]
 
|[http://spandh.dcs.shef.ac.uk/projects/shatrweb/ download] [mailto:g.brown@dcs.shef.ac.uk email] [http://spandh.dcs.shef.ac.uk/projects/shatrweb/papers/ioa94.html paper]
 
|0.6
 
|0.6
Line 72: Line 72:
 
|{{yes|4}}
 
|{{yes|4}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[https://www.ll.mit.edu/mission/cybersec/HLT/corpora/SpeechCorpora.html download] [mailto:jpc@ll.mit.edu email]
 
|[https://www.ll.mit.edu/mission/cybersec/HLT/corpora/SpeechCorpora.html download] [mailto:jpc@ll.mit.edu email]
 
|{{dunno}}
 
|{{dunno}}
Line 99: Line 99:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/RWCP-SP96.html download] [mailto:src@nii.ac.jp email] [http://scitation.aip.org/content/asa/journal/jasa/100/4/10.1121/1.416338 paper]
 
|[http://research.nii.ac.jp/src/en/RWCP-SP96.html download] [mailto:src@nii.ac.jp email] [http://scitation.aip.org/content/asa/journal/jasa/100/4/10.1121/1.416338 paper]
 
|10
 
|10
Line 126: Line 126:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free given TIDigits
+
|{{some|free given TIDigits}}
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://www.isca-speech.org/archive_open/asr2000/asr0_181.html paper]
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://www.isca-speech.org/archive_open/asr2000/asr0_181.html paper]
 
|33
 
|33
Line 153: Line 153:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|7400 $
+
|{{no|7400 $}}
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=SPINE purchase] [mailto:jdwright@ldc.upenn.edu email] [http://dl.acm.org/citation.cfm?id=1289199 paper]
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=SPINE purchase] [mailto:jdwright@ldc.upenn.edu email] [http://dl.acm.org/citation.cfm?id=1289199 paper]
 
|{{dunno}}
 
|{{dunno}}
Line 180: Line 180:
 
|{{yes|4}}
 
|{{yes|4}}
 
|{{no}}
 
|{{no}}
|1000 €
+
|{{some|1000 €}}
 
|[http://catalog.elra.info/index.php?cPath=37_40 purchase] [http://aurora.hsnr.de/aurora-3/reports.html papers]
 
|[http://catalog.elra.info/index.php?cPath=37_40 purchase] [http://aurora.hsnr.de/aurora-3/reports.html papers]
 
|{{dunno}}
 
|{{dunno}}
Line 207: Line 207:
 
|{{no|1}}
 
|{{no|1}}
 
|{{yes|3}}
 
|{{yes|3}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/RWCP-SP01.html download] [mailto:src@nii.ac.jp email] [http://id.nii.ac.jp/1001/00057420/ paper]
 
|[http://research.nii.ac.jp/src/en/RWCP-SP01.html download] [mailto:src@nii.ac.jp email] [http://id.nii.ac.jp/1001/00057420/ paper]
 
|3.5
 
|3.5
Line 234: Line 234:
 
|{{yes|30}}
 
|{{yes|30}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/RWCP-SSD.html download] [mailto:s-nakamura@is.naist.jp email] [http://www.lrec-conf.org/proceedings/lrec2000/html/summary/356.htm paper]
 
|[http://research.nii.ac.jp/src/en/RWCP-SSD.html download] [mailto:s-nakamura@is.naist.jp email] [http://www.lrec-conf.org/proceedings/lrec2000/html/summary/356.htm paper]
 
|{{dunno}}
 
|{{dunno}}
Line 261: Line 261:
 
|{{yes|4}}
 
|{{yes|4}}
 
|{{no}}
 
|{{no}}
|39000 - 182000 k€ per lang
+
|{{no|39000 - 182000 k€ per lang}}
 
|[http://catalog.elra.info/search.php purchase] [http://www.lrec-conf.org/proceedings/lrec2000/html/summary/373.htm paper]
 
|[http://catalog.elra.info/search.php purchase] [http://www.lrec-conf.org/proceedings/lrec2000/html/summary/373.htm paper]
 
|{{dunno}}
 
|{{dunno}}
Line 288: Line 288:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free given WSJ0
+
|{{some|free given WSJ0}}
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://aurora.hsnr.de/aurora-4/reports.html paper]
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://aurora.hsnr.de/aurora-4/reports.html paper]
 
|{{dunno}}
 
|{{dunno}}
Line 315: Line 315:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|525 $
+
|{{some|525 $}}
 
|[https://catalog.ldc.upenn.edu/LDC2002S04 purchase] [http://perso.limsi.fr/lamel/icslp94ted.pdf paper]
 
|[https://catalog.ldc.upenn.edu/LDC2002S04 purchase] [http://perso.limsi.fr/lamel/icslp94ted.pdf paper]
 
|47
 
|47
Line 342: Line 342:
 
|{{no|1}}
 
|{{no|1}}
 
|{{some|1}}
 
|{{some|1}}
|free
+
|{{yes|free}}
 
|[http://www.clemson.edu/ces/speech/cuave.htm download] [mailto:ksampat@clemson.edu email] [http://asp.eurasipjournals.com/content/2002/11/208541 paper]
 
|[http://www.clemson.edu/ces/speech/cuave.htm download] [mailto:ksampat@clemson.edu email] [http://asp.eurasipjournals.com/content/2002/11/208541 paper]
 
|3
 
|3
Line 369: Line 369:
 
|{{yes|6 - 8}}
 
|{{yes|6 - 8}}
 
|{{no}}
 
|{{no}}
|25000 $
+
|{{no|25000 $}}
 
|[http://crss.utdallas.edu/ purchase] [mailto:john.hansen@utdallas.edu email] [http://www.isca-speech.org/archive/eurospeech_2001/e01_2023.html paper]
 
|[http://crss.utdallas.edu/ purchase] [mailto:john.hansen@utdallas.edu email] [http://www.isca-speech.org/archive/eurospeech_2001/e01_2023.html paper]
 
|286
 
|286
Line 396: Line 396:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/CENSREC-1.html download]  [mailto:s-nakamura@is.naist.jp email] [http://ir.nul.nagoya-u.ac.jp/jspui/bitstream/2237/15046/1/425.pdf paper]
 
|[http://research.nii.ac.jp/src/en/CENSREC-1.html download]  [mailto:s-nakamura@is.naist.jp email] [http://ir.nul.nagoya-u.ac.jp/jspui/bitstream/2237/15046/1/425.pdf paper]
 
|
 
|
Line 423: Line 423:
 
|{{yes|7}}
 
|{{yes|7}}
 
|{{yes|4}}
 
|{{yes|4}}
|free
+
|{{yes|free}}
 
|[http://www.isle.illinois.edu/sst/AVICAR/ download] [mailto:jhasegaw@illinois.edu email] [http://www.isca-speech.org/archive/interspeech_2004/i04_2489.html paper]
 
|[http://www.isle.illinois.edu/sst/AVICAR/ download] [mailto:jhasegaw@illinois.edu email] [http://www.isca-speech.org/archive/interspeech_2004/i04_2489.html paper]
 
|29
 
|29
Line 450: Line 450:
 
|{{yes|16}}
 
|{{yes|16}}
 
|{{yes|3}}
 
|{{yes|3}}
|free
+
|{{yes|free}}
 
|[http://www.idiap.ch/dataset/av16-3/ download] [mailto:odobez@idiap.ch email] [http://publications.idiap.ch/index.php/publications/show/353 paper]
 
|[http://www.idiap.ch/dataset/av16-3/ download] [mailto:odobez@idiap.ch email] [http://publications.idiap.ch/index.php/publications/show/353 paper]
 
|1.5
 
|1.5
Line 477: Line 477:
 
|{{yes|6}}
 
|{{yes|6}}
 
|{{no}}
 
|{{no}}
|2800 $
+
|{{no|2800 $}}
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=ICSI purchase] [mailto:mrcontact@icsi.berkeley.edu email] [http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1198793 paper]
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=ICSI purchase] [mailto:mrcontact@icsi.berkeley.edu email] [http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1198793 paper]
 
|72
 
|72
Line 504: Line 504:
 
|{{yes|7}}
 
|{{yes|7}}
 
|{{no}}
 
|{{no}}
|5500 $
+
|{{no|5500 $}}
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=NIST%20Meeting purchase] [mailto:john.garofolo@nist.gov email] [http://www.lrec-conf.org/proceedings/lrec2004/summaries/137.htm paper]
 
|[https://catalog.ldc.upenn.edu/search?q%5Bname_cont%5D=NIST%20Meeting purchase] [mailto:john.garofolo@nist.gov email] [http://www.lrec-conf.org/proceedings/lrec2004/summaries/137.htm paper]
 
|15
 
|15
Line 531: Line 531:
 
|{{yes|79 - 147}}
 
|{{yes|79 - 147}}
 
|{{yes|6 - 9}}
 
|{{yes|6 - 9}}
|3500 €
+
|{{no|3500 €}}
 
|[http://catalog.elra.info/search.php purchase] [mailto:choukri@elda.org email] [http://link.springer.com/article/10.1007%2Fs10579-007-9054-4 paper]
 
|[http://catalog.elra.info/search.php purchase] [mailto:choukri@elda.org email] [http://link.springer.com/article/10.1007%2Fs10579-007-9054-4 paper]
 
|{{dunno}}
 
|{{dunno}}
Line 558: Line 558:
 
|{{some|3}}
 
|{{some|3}}
 
|{{no}}
 
|{{no}}
|75000 € per lang
+
|{{no|75000 € per lang}}
 
|[http://catalog.elra.info/search.php purchase] [mailto:diskra@appen.com email] [http://www.lrec-conf.org/proceedings/lrec2002/sumarios/177.htm paper]
 
|[http://catalog.elra.info/search.php purchase] [mailto:diskra@appen.com email] [http://www.lrec-conf.org/proceedings/lrec2002/sumarios/177.htm paper]
 
|{{dunno}}
 
|{{dunno}}
Line 585: Line 585:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/CENSREC-2.html download] [mailto:src@nii.ac.jp email] [http://www.isca-speech.org/archive/interspeech_2006/i06_1726.html paper]
 
|[http://research.nii.ac.jp/src/en/CENSREC-2.html download] [mailto:src@nii.ac.jp email] [http://www.isca-speech.org/archive/interspeech_2006/i06_1726.html paper]
 
|{{dunno}}
 
|{{dunno}}
Line 612: Line 612:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|21000 ¥
+
|{{some|21000 ¥}}
 
|[http://research.nii.ac.jp/src/en/CENSREC-3.html purchase] [mailto:src@nii.ac.jp email] [http://ir.nul.nagoya-u.ac.jp/jspui/bitstream/2237/15050/1/429.pdf paper]
 
|[http://research.nii.ac.jp/src/en/CENSREC-3.html purchase] [mailto:src@nii.ac.jp email] [http://ir.nul.nagoya-u.ac.jp/jspui/bitstream/2237/15050/1/429.pdf paper]
 
|{{dunno}}
 
|{{dunno}}
Line 639: Line 639:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free given TIDigits
+
|{{some|free given TIDigits}}
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://aurora.hsnr.de/aurora-5/reports.html paper]
 
|[http://aurora.hsnr.de/download.html download] [mailto:hans-guenter.hirsch@hs-niederrhein.de email] [http://aurora.hsnr.de/aurora-5/reports.html paper]
 
|{{dunno}}
 
|{{dunno}}
Line 666: Line 666:
 
|{{yes|16}}
 
|{{yes|16}}
 
|{{yes|6}}
 
|{{yes|6}}
|free
+
|{{yes|free}}
 
|[http://groups.inf.ed.ac.uk/ami/ download] [mailto:amicorpus@amiproject.org email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=4538700 paper]
 
|[http://groups.inf.ed.ac.uk/ami/ download] [mailto:amicorpus@amiproject.org email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=4538700 paper]
 
|{{dunno}}
 
|{{dunno}}
Line 693: Line 693:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[mailto:m.cooke@ikerbasque.org email] [http://www.sciencedirect.com/science/article/pii/S0885230809000205 paper]
 
|[mailto:m.cooke@ikerbasque.org email] [http://www.sciencedirect.com/science/article/pii/S0885230809000205 paper]
 
|8.8
 
|8.8
Line 720: Line 720:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|50 €
+
|{{some|50 €}}
 
|[http://catalog.elra.info/product_info.php?products_id=1088&language=en purchase] [mailto:segura@ugr.es email] [http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/WebHome/HIWIRE_db_description_paper.pdf paper]
 
|[http://catalog.elra.info/product_info.php?products_id=1088&language=en purchase] [mailto:segura@ugr.es email] [http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/WebHome/HIWIRE_db_description_paper.pdf paper]
 
|21
 
|21
Line 747: Line 747:
 
|{{yes|5}}
 
|{{yes|5}}
 
|{{yes|2}}
 
|{{yes|2}}
|25000 $
+
|{{no|25000 $}}
 
|[http://crss.utdallas.edu/ download] [mailto:john.hansen@utdallas.edu email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=4290175 paper]
 
|[http://crss.utdallas.edu/ download] [mailto:john.hansen@utdallas.edu email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=4290175 paper]
 
|40
 
|40
Line 774: Line 774:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://sisec2011.wiki.irisa.fr/tiki-index.php?page=Underdetermined+speech+and+music+mixtures download] [mailto:araki.shoko@lab.ntt.co.jp email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|[http://sisec2011.wiki.irisa.fr/tiki-index.php?page=Underdetermined+speech+and+music+mixtures download] [mailto:araki.shoko@lab.ntt.co.jp email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|0.3
 
|0.3
Line 801: Line 801:
 
|{{yes|8 - 40}}
 
|{{yes|8 - 40}}
 
|{{no}}
 
|{{no}}
|1500 $
+
|{{some|1500 $}}
 
|[https://catalog.ldc.upenn.edu/LDC2014S03 purchase] [mailto:mike.lincoln@quoratetechnology.com email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=1566470 paper] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6639033 paper]
 
|[https://catalog.ldc.upenn.edu/LDC2014S03 purchase] [mailto:mike.lincoln@quoratetechnology.com email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=1566470 paper] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6639033 paper]
 
|{{dunno}}
 
|{{dunno}}
Line 828: Line 828:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/CENSREC-4.html download] [mailto:src@nii.ac.jp email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/468.html paper]
 
|[http://research.nii.ac.jp/src/en/CENSREC-4.html download] [mailto:src@nii.ac.jp email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/468.html paper]
 
|{{dunno}}
 
|{{dunno}}
Line 855: Line 855:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://research.nii.ac.jp/src/en/CENSREC-4.html download] [mailto:src@nii.ac.jp email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/468.html paper]
 
|[http://research.nii.ac.jp/src/en/CENSREC-4.html download] [mailto:src@nii.ac.jp email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/468.html paper]
 
|{{dunno}}
 
|{{dunno}}
Line 882: Line 882:
 
|{{yes|16}}
 
|{{yes|16}}
 
|{{yes|2}}
 
|{{yes|2}}
|free
+
|{{yes|free}}
 
|[http://shine.fbk.eu/resources/dicit-acoustic-woz-data download] [mailto:omologo@fbk.eu email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/584.html paper]
 
|[http://shine.fbk.eu/resources/dicit-acoustic-woz-data download] [mailto:omologo@fbk.eu email] [http://www.lrec-conf.org/proceedings/lrec2008/summaries/584.html paper]
 
|1
 
|1
Line 909: Line 909:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://sisec2008.wiki.irisa.fr/tiki-index.php?page=Head-geometry%20mixtures%20of%20two%20speech%20sources%20in%20real%20environments,%20impinging%20from%20many%20directions download] [mailto:hendrik.kayser@uni-oldenburg.de email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|[http://sisec2008.wiki.irisa.fr/tiki-index.php?page=Head-geometry%20mixtures%20of%20two%20speech%20sources%20in%20real%20environments,%20impinging%20from%20many%20directions download] [mailto:hendrik.kayser@uni-oldenburg.de email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|1.9
 
|1.9
Line 936: Line 936:
 
|{{yes|20}}
 
|{{yes|20}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://melodi.ee.washington.edu/cosine/ download] [mailto:cosine@melodi.ee.washington.edu email] [http://www.sciencedirect.com/science/article/pii/S0885230811000143 paper]
 
|[http://melodi.ee.washington.edu/cosine/ download] [mailto:cosine@melodi.ee.washington.edu email] [http://www.sciencedirect.com/science/article/pii/S0885230811000143 paper]
 
|11
 
|11
Line 963: Line 963:
 
|{{yes|2 - 4}}
 
|{{yes|2 - 4}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise download] [mailto:ito.nobutaka@lab.ntt.co.jp email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|[http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise download] [mailto:ito.nobutaka@lab.ntt.co.jp email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|0.3
 
|0.3
Line 990: Line 990:
 
|{{yes|2 - 4}}
 
|{{yes|2 - 4}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Determined+convolutive+mixtures+under+dynamic+conditions download] [mailto:francesco.nesta@gmail.com email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|[http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Determined+convolutive+mixtures+under+dynamic+conditions download] [mailto:francesco.nesta@gmail.com email] [http://www.sciencedirect.com/science/article/pii/S0165168411003604 paper]
 
|0.2
 
|0.2
Line 1,017: Line 1,017:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://spandh.dcs.shef.ac.uk/chime_challenge/chime2_task1.html download] [mailto:emmanuel.vincent@inria.fr email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6637622 paper]
 
|[http://spandh.dcs.shef.ac.uk/chime_challenge/chime2_task1.html download] [mailto:emmanuel.vincent@inria.fr email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6637622 paper]
 
|12
 
|12
Line 1,044: Line 1,044:
 
|{{some|2}}
 
|{{some|2}}
 
|{{no}}
 
|{{no}}
|free given WSJ0
+
|{{some|free given WSJ0}}
 
|[http://spandh.dcs.shef.ac.uk/chime_challenge/chime2_task2.html download] [mailto:francesco.nesta@gmail.com email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6637622 paper]
 
|[http://spandh.dcs.shef.ac.uk/chime_challenge/chime2_task2.html download] [mailto:francesco.nesta@gmail.com email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6637622 paper]
 
|33
 
|33
Line 1,098: Line 1,098:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|3500 $
+
|{{no|3500 $}}
 
|[https://catalog.ldc.upenn.edu/LDC2013S04 purchase] [mailto:strassel@ldc.upenn.edu email]
 
|[https://catalog.ldc.upenn.edu/LDC2013S04 purchase] [mailto:strassel@ldc.upenn.edu email]
 
|108
 
|108
Line 1,125: Line 1,125:
 
|{{no|1}}
 
|{{no|1}}
 
|{{no}}
 
|{{no}}
|7000 $
+
|{{no|7000 $}}
 
|[https://catalog.ldc.upenn.edu/LDC2013S02 purchase] [mailto:strassel@ldc.upenn.edu email]
 
|[https://catalog.ldc.upenn.edu/LDC2013S02 purchase] [mailto:strassel@ldc.upenn.edu email]
 
|234
 
|234
Line 1,152: Line 1,152:
 
|{{yes|8}}
 
|{{yes|8}}
 
|{{no}}
 
|{{no}}
|free given WSJCAM0
+
|{{some|free given WSJCAM0}}
 
|[http://reverb2014.dereverberation.com/ purchase] [mailto:REVERB-challenge@lab.ntt.co.jp email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6701894 paper]
 
|[http://reverb2014.dereverberation.com/ purchase] [mailto:REVERB-challenge@lab.ntt.co.jp email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6701894 paper]
 
|25
 
|25
Line 1,179: Line 1,179:
 
|{{yes|40}}
 
|{{yes|40}}
 
|{{no}}
 
|{{no}}
|free
+
|{{yes|free}}
 
|[http://shine.fbk.eu/resources/dirha-ii-simulated-corpus download] [mailto:mravanelli@fbk.eu email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6843271 paper]
 
|[http://shine.fbk.eu/resources/dirha-ii-simulated-corpus download] [mailto:mravanelli@fbk.eu email] [http://ieeexplore.ieee.org/xpl/login.jsp?arnumber=6843271 paper]
 
|1.3
 
|1.3

Revision as of 17:56, 8 August 2014

This page lists datasets with detailed attributes and links to the corresponding research results (papers, numerical results, output transcriptions, intermediate data, etc.). Each dataset may be used for one or more applications, such as automatic speech recognition, speaker identification and verification, source localization, or speech enhancement and separation.

Disclaimer: Only publicly available datasets with a total duration longer than 5 min are listed.

Datasets
General attributes: release; scenario; total duration (h); sampling rate (kHz); degraded channels; cameras; cost (acad); links
Speech: speech duration (h); unique speakers; language; unique words (k); speaking style; simultaneous speakers; speaker overlap
Channel: channel type; radiation; speaker location; speaker movements
Noise: noise type
Ground truth: speech signal; speaker location, orientation; words; nonverbal traits; noise events
ShATR 1994 meeting 0.6 48 3 no free download email paper 0.6 5 UK English 1 colloquial 5 multiple conversations reverb human quasi-fixed head meeting headset yes yes no yes
LLSEC 1996 conversation 1.4 16 4 no free download email ? 12 N/S N/S read, colloquial 2 conversation reverb human quasi-fixed head hallway, restaurant (scenarized) no yes no no no
RWCP Spoken Dialog Corpus 1996 - 1997 conversation 10 16 2 no free download email paper 10 39 Japanese ? colloquial 1 or 2 conversation reverb (low) human quasi-fixed head stationary background no no yes no no
Aurora-2 2000 public spaces 33 8 - 16 1 no free given TIDigits download email paper 33 214 US English 0.01 digits 1 no simulated phone human N/S no various real environments original N/S yes no yes
SPINE1, SPINE2 2000 - 2001 military 38 16 2 no 7400 $ purchase email paper ? 100 US English 1 command, colloquial 1 or 2 no simulated radio human quasi-fixed head military no no yes no no
Aurora-3 (subset of SpeechDat-Car) 2000 - 2003 car ? 16 4 no 1000 € purchase papers ? ? Finnish, German, Spanish, Danish, Italian ? digits, command, read, spontaneous 1 no reverb human quasi-fixed head car headset no yes no no
RWCP Meeting Speech Corpus 2001 meeting 3.5 16 - 48 1 3 free download email paper 3.5 ? Japanese ? colloquial 1 to 5 meeting reverb (low) human quasi-fixed head stationary background headset no yes no no
RWCP Real Environment Speech and Acoustic Database 2001 domestic, office ? 16 - 48 30 no free download email paper ? 5 Japanese ? read 1 no real rir, reverb loudspeaker various no, pivoting arm stationary background original yes yes no yes
SpeechDat-Car 2001 - 2011 car ? 16 4 no 39000 - 182000 k€ per lang purchase paper ? 300 per lang various ? digits, command, read, spontaneous 1 no reverb human quasi-fixed head car headset no yes no no
Aurora-4 2002 public spaces ? 8 - 16 1 no free given WSJ0 download email paper ? 101 US English 10 read 1 no simulated phone human N/S no various real environments original N/S yes no yes
TED 2002 seminar 47 16 1 no 525 $ purchase paper 47 188 English (mostly non-native) ? lecture 1 or more seminar reverb human quasi-fixed head stationary background lapel no partial no no
CUAVE 2002 cocktail party 3 44 1 1 free download email paper 3 36 US English 0.01 digits 1 or 2 full reverb human quasi-fixed head stationary background no no yes no no
CU-Move Microphone Array Data 2002 - 2011 car 286 44 6 - 8 no 25000 $ purchase email paper 286 172 US English 12 digits, command, read, dialogue 1 no reverb human quasi-fixed head car no no yes no no
CENSREC-1 (Aurora-2J) 2003 public spaces ? 8 1 no free download email paper 214 Japanese 0.01 digits 1 no simulated phone human N/S no various real environments original N/S yes no yes
AVICAR 2004 car 29 16 7 4 free download email paper 29 86 US English, non-native English 1 read 1 no reverb human quasi-fixed head car no no yes no no
AV16.3 2004 meeting 1.5 16 16 3 free download email paper 1.5 12 N/S N/S colloquial 1 to 3 full reverb human various walk stationary background no yes no no no
ICSI Meeting Corpus 2004 meeting 72 16 6 no 2800 $ purchase email paper 72 53 US English 13 meeting 3 to 10 meeting reverb human quasi-fixed head stationary background headset, lapel no yes yes no
NIST Meeting Pilot Corpus Speech 2004 meeting 15 16 7 no 5500 $ purchase email paper 15 61 US English 6 meeting 3 to 9 meeting reverb human various walk stationary background headset, lapel no yes no no
CHIL Meetings 2004 - 2007 seminar, meeting 60 44 79 - 147 6 - 9 3500 € purchase email paper ? ? non-native English ? seminar, meeting 3 to 20 seminar, meeting reverb human quasi-fixed head meeting (scenarized) headset yes yes yes no
SPEECON 2004 - 2011 public space, domestic, office, car ? 16 3 no 75000 € per lang purchase email paper ? 600 per lang various ? command, read, spontaneous 1 no reverb human quasi-fixed head various real environments headset no yes no no
CENSREC-2 2005 car ? 16 1 no free download email paper ? 214 Japanese 0.01 digits 1 no reverb human quasi-fixed head car headset no yes no no
CENSREC-3 2005 car ? 16 1 no 21000 ¥ purchase email paper ? 311 Japanese 0.05 read 1 no reverb human quasi-fixed head car headset no yes no no
Aurora-5 2006 public spaces, domestic, office, car ? 8 1 no free given TIDigits download email paper ? 225 US English 0.01 digits 1 no no, simulated rir, real rir loudspeaker N/S no various real environments original no yes no yes
AMI 2006 meeting 100 16 16 6 free download email paper ? 189 UK English 8 meeting 4 (18% overlap) meeting reverb human quasi-fixed head stationary background headset, lapel yes yes yes no
PASCAL SSC 2006 cocktail party 8.8 25 1 no free email paper 8.8 34 UK English 0.05 command 2 full no human N/S no no original N/S yes no no
HIWIRE 2007 airplane 21 16 1 no 50 € purchase email paper 21 81 non-native English 0.1 command 1 no no human N/S head airplane original N/S yes no no
UT-Drive 2007 car 40 25 5 2 25000 $ download email paper 40 25 US English 2.4 command, dialogue 1 to 2 conversation reverb human quasi-fixed head car headset (low quality) no partial no no
SASSEC, SiSEC underdetermined 2007 - 2011 cocktail party 0.3 16 2 no free download email paper 0.3 16 N/S N/S read 3 or 4 full simulated rir, real rir, reverb no, loudspeaker fixed no no original, spatial image yes no no no
MC-WSJ-AV, PASCAL SSC2, 2012_MMA, REVERB RealData 2007 - 2014 cocktail party 10 16 8 - 40 no 1500 $ purchase email paper paper ? 45 UK English 10 read 1 or 2 full reverb human various walk stationary background headset, lapel yes yes no no
CENSREC-4 (Simulated) 2008 public spaces, domestic, office, car ? 16 1 no free download email paper ? 214 Japanese 0.01 digits 1 no real rir dummy fixed no various real environments original no yes no yes
CENSREC-4 (Real) 2008 public spaces, domestic, office, car ? 16 1 no free download email paper ? 10 Japanese 0.01 digits 1 no reverb human quasi-fixed head various real environments headset no yes no yes
DICIT 2008 domestic 6 48 16 2 free download email paper 1 ? Italian ? command 4 no reverb human various walk domestic (scenarized) headset, tv yes yes no yes
SiSEC head-geometry 2008 cocktail party 1.9 16 2 no free download email paper 1.9 ? N/S N/S read 2 full real rir loudspeaker various no no original, spatial image yes no no no
COSINE 2009 conversation 38 48 20 no free download email paper 11 91 US English, non-native English 5 colloquial 2 to 7 conversation reverb human various walk various real environments headset, throat mic no yes no no
SiSEC real-world noise 2010 public spaces 0.3 16 2 - 4 no free download email paper 0.3 6 N/S N/S read 1 or 3 full no, reverb (other room) loudspeaker various no various real environments original, spatial image yes no no no
SiSEC dynamic 2010 - 2011 cocktail party 0.2 16 2 - 4 no free download email paper 0.2 ? N/S N/S read many but only 2 simultaneous full reverb loudspeaker various simulated no original, spatial image yes no no no
CHiME 1, CHiME 2 Grid 2011 - 2012 domestic 70 16 - 48 2 no free download email paper 12 34 UK English 0.05 command 1 no real rir dummy quasi-fixed simulated head domestic yes yes yes no no
CHiME 2 WSJ0 2012 domestic 78 16 2 no free given WSJ0 download email paper 33 101 US English 11 read 1 no real rir dummy fixed no domestic yes yes yes no no
ETAPE 2012 TV/radio debates, outdoor interviews... 42 16 1 1 ? email paper 32 347 French 16 colloquial 1 or more (up to 10% overlap) conversation reverb (some) human quasi-fixed head various real environments no N/S yes no yes
GALE (Chinese broadcast conversation) 2013 TV conversation 120 16 1 no 3500 $ purchase email 108 ? Mandarin ? colloquial 1 or more conversation no human quasi-fixed head no no N/S yes no no
GALE (Arabic broadcast conversation) 2013 TV conversation 251 16 1 no 7000 $ purchase email 234 ? Arabic ? colloquial 1 or more conversation no human quasi-fixed head no no N/S yes no no
REVERB SimData 2013 domestic, office 25 16 8 no free given WSJCAM0 purchase email paper 25 130 UK English 10 read 1 no real rir loudspeaker fixed no stationary background original, spatial image yes yes no yes
DIRHA 2014 domestic 3.8 48 40 no free download email paper 1.3 30 Italian, German, Greek, Portuguese various various 1 or more simulated real rir loudspeaker various no domestic (sum of events) yes yes yes no yes
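
As a rough illustration of how the table could be queried once its rows are transcribed into records, here is a minimal Python sketch. The `Dataset` class, its field names, and the `free_multichannel` helper are hypothetical names introduced for this example; only a handful of rows and a subset of the columns are copied from the table, and missing values ("?") are stored as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    """One table row, reduced to a few columns (field names are illustrative)."""
    name: str
    release: str
    scenario: str
    total_hours: Optional[float]  # None where the table shows "?"
    channels: str                 # kept as text: some rows give a range, e.g. "2 - 4"
    cost: str                     # academic cost, copied verbatim from the table


# A few rows copied from the table; the remaining rows are omitted here.
ROWS = [
    Dataset("ShATR", "1994", "meeting", 0.6, "3", "free"),
    Dataset("Aurora-2", "2000", "public spaces", 33.0, "1", "free given TIDigits"),
    Dataset("ICSI Meeting Corpus", "2004", "meeting", 72.0, "6", "2800 $"),
    Dataset("CENSREC-2", "2005", "car", None, "1", "free"),
    Dataset("AMI", "2006", "meeting", 100.0, "16", "free"),
    Dataset("DIRHA", "2014", "domestic", 3.8, "40", "free"),
]


def min_channels(dataset: Dataset) -> int:
    """Return the lower bound of the 'degraded channels' cell ("2 - 4" gives 2)."""
    return int(dataset.channels.split("-")[0])


def free_multichannel(rows: List[Dataset], at_least: int = 2) -> List[str]:
    """Names of datasets that are unconditionally free and have enough channels."""
    return [d.name for d in rows
            if d.cost == "free" and min_channels(d) >= at_least]


if __name__ == "__main__":
    print(free_multichannel(ROWS))
```

Keeping the channel count as text avoids guessing how ranges such as "79 - 147" should be collapsed; the helper only uses the lower bound, which is a design choice of this sketch rather than a convention of the table.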

Automatic speech recognition

1st CHiME Challenge (2011)

Artificially distorted version of the small-vocabulary GRID audio-visual corpus (audio only). Binaural reverberated speech with the speaker situated in front of the microphones. Additive household noises impinging from different directions. Clean-training, noisy-training, development and evaluation sets are available, see

Jon Barker, E. Vincent, N. Ma, H. Christensen, P. Green, "The PASCAL CHiME speech separation and recognition challenge", Computer Speech & Language, Volume 27, Issue 3, May 2013, Pages 621-633.

Available from Computer Speech and Language here

Corpus available here (no cost)

Resources

  • Training recipe of the challenge for HTK here.

Baselines

  • See the paper above for results for a wide range of techniques.


AURORA 5 (2007)

Artificially distorted version of the TI-DIGITS digit corpus. Two kinds of sets: additive noise only, and additive noise plus reverberation. Variable SNR range. Various mixed training sets, no evaluation set, see

G. Hirsch "Aurora-5 Experimental Framework for the Performance Evaluation of Speech Recognition in Case of a Hands-free Speech Input in Noisy Environments", Niederrhein University of Applied Sciences, 2007.

Paper available online here (no cost)

Corpus available from LDC here

Resources

  • Training recipe for HTK is provided with the corpora.

Baselines

  • Reproducible baseline: The above cited paper includes a baseline for the ETSI Advanced Front-End.
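
To make the notion of additive noise at a given SNR concrete, the following is a minimal Python sketch of the generic definition: the noise is scaled so that 10·log10(P_speech / P_noise) equals the target SNR, then added to the speech. It is not the Aurora-5 tooling, which defines its own way of measuring levels (see the paper cited above); the function names `power`, `mix_at_snr` and `snr_db` are hypothetical and the signals are plain lists of samples.

```python
import math
import random


def power(signal):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` relative to `speech`, then add it.

    Returns (mixture, scaled_noise). Assumes equal lengths and non-zero power.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # Required noise power is P_speech / 10^(SNR/10); gain is applied to amplitude.
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB between two equal-length sequences."""
    return 10 * math.log10(power(speech) / power(noise))


if __name__ == "__main__":
    random.seed(0)
    speech = [random.gauss(0.0, 1.0) for _ in range(16000)]  # stand-in for speech
    noise = [random.gauss(0.0, 0.3) for _ in range(16000)]   # stand-in for noise
    mixture, scaled = mix_at_snr(speech, noise, 5.0)
    print(round(snr_db(speech, scaled), 3))
```

The reverberant sets would additionally convolve the speech with a room impulse response before mixing; that step is left out of this sketch.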


AURORA 4 (2002)

Artificially distorted version of the 5K-word Wall Street Journal corpus (WSJ0). Stationary and non-stationary noises added. Second set of recordings with a distant, mismatched microphone. Clean-training, mixed-training, noisy-training and test sets available. No evaluation set, see

G. Hirsch "Experimental Framework for the Performance Evaluation of Speech Recognition Front-ends on a Large Vocabulary Task", ETSI STQ Aurora DSR Working Group, 2002.

Paper available with the corpus.

Corpora available from ELRA here and here

Resources

  • Training recipe for HTK available here. Note that this recipe is for the Wall Street Journal corpus (WSJ0), which is the clean-speech version of AURORA 4. Small changes are needed in the feature extraction scripts to account for the different file extensions.
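
The file-name change mentioned in the recipe note could be handled along the following lines. This is only a sketch under stated assumptions: it presumes the feature extraction step reads an HTK script file with one "source target" pair per line (the format HCopy accepts via `-S`), and the extensions `.wv1` and `.wav` used in the usage example are placeholders, not the actual AURORA 4 file names. The function `retarget_extension` is a hypothetical helper, not part of the recipe.

```python
from pathlib import PurePosixPath


def retarget_extension(scp_lines, old_ext, new_ext):
    """Rewrite the source-file extension in 'source target' script lines.

    Lines that do not consist of exactly two whitespace-separated fields,
    or whose source file has a different extension, are returned unchanged.
    """
    rewritten = []
    for line in scp_lines:
        fields = line.split()
        if len(fields) != 2:
            rewritten.append(line)
            continue
        source, target = fields
        path = PurePosixPath(source)
        if path.suffix == old_ext:
            source = str(path.with_suffix(new_ext))
        rewritten.append(f"{source} {target}")
    return rewritten


if __name__ == "__main__":
    # Placeholder paths and extensions, for illustration only.
    lines = ["wsj0/si_tr_s/011/011c0201.wv1 feat/011c0201.mfc"]
    for line in retarget_extension(lines, ".wv1", ".wav"):
        print(line)
```

Rewriting the script file, rather than renaming the audio files, leaves the distributed corpus untouched.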

Speaker identification and verification

Speech enhancement and separation

Other applications

Contribute a dataset

To contribute a new dataset, please

  • create an account and log in
  • go to the wiki page above corresponding to your application; if it does not exist yet, you may create it
  • click on the "Edit" link at the top of the page and add a new section for your dataset (the datasets are ordered by year of collection)
  • click on the "Save page" link at the bottom of the page to save your modifications

Please make sure to provide the following information:

  • name of the dataset and year of collection
  • authors, institution, contact information
  • link to the dataset and to side resources (lexicon, language model, etc)
  • short description (nature of the data, license, etc) and link to a paper/report describing the dataset, if any
  • at least 1 research result obtained for this dataset (see below)

We currently cannot provide storage space for large datasets. Please upload the dataset at a stable URL on the website of your institution or elsewhere and provide its URL only. If this is not possible, please contact the resources sharing working group.

Contribute a research result

To contribute a new research result, please

  • create an account and log in
  • go to the wiki page and the section corresponding to the dataset for which this result was obtained
  • click on the "Edit" link on the right of the section header and add a new item for your result
  • click on the "Save page" link at the bottom of the page to save your modifications

Please make sure to provide the following information:

  • authors, paper/report title, means of publication
  • link to the pdf of the paper
  • link to derived data (output transcriptions, intermediary data, etc)
  • code and instructions to reproduce the experiments (if available)

To save storage space, please do not upload the paper to this wiki. Instead, link to it wherever possible on your institutional archive, another public archive (e.g., arXiv) or the publisher's website (e.g., IEEE Xplore).

We currently cannot provide storage space for large datasets. Please upload the derived data at a stable URL on the website of your institution or elsewhere and provide its URL only. If this is not possible, please contact the resources sharing working group.