Aug 30, 2023
Chen, Gang; Neubauer, Juergen; Garellek, Marc; Samlan, Robin; Gerratt, Bruce R.; Kreiman, Jody; Alwan, Abeer, 2017, "UCLA High-Speed Laryngeal Video and Audio", https://hdl.handle.net/11272.1/AB2/OWLHMG, Linguistic Data Consortium
UCLA High-Speed Laryngeal Video and Audio was developed by UCLA Speech Processing and Auditory Perception Laboratory and is comprised of high-speed laryngeal video recordings of the vocal folds and synchronized audio recordings from nine subjects collected between April 2012 and April 2013. Speakers were asked to sustain the vowel /i/ for approxima... This Dataset is harvested from our partners. Clicking the link will take you directly to the archival source of the data.
Aug 30, 2023
Luqman, Hamzah; Mahmoud, Sabri; Awaida, Sameh, 2016, "KAFD: Arabic Font Database", https://hdl.handle.net/11272.1/AB2/A0JPYM, Linguistic Data Consortium
KAFD: Arabic Font Database was developed by King Fahd University of Petroleum & Minerals and Qassim University. It is comprised of approximately 2.5 million scanned Arabic printed pages in a variety of fonts, sizes and resolutions along with corresponding transcripts. KAFD was designed for research in Arabic text recognition. Data The...
Aug 30, 2023
Abdulaziz, Azhar; Kepuska, Veton, 2017, "Noisy TIMIT Speech", https://hdl.handle.net/11272.1/AB2/FFFXT2, Linguistic Data Consortium
Noisy TIMIT Speech was developed by the Florida Institute of Technology and contains approximately 322 hours of speech from the TIMIT Acoustic-Phonetic Continuous Speech Corpus (LDC93S1) modified with different additive noise levels. Only the audio has been modified; the original arrangement of the TIMIT corpus is still as described by...
Aug 30, 2023
Mahmoud, Sabri; Ahmad, Irfan; Al-Khatib, Wasfi; Alshayeb, Mohammad; Parvez, Mohammad; Märgner, Volker; Fink, Gernot, 2015, "KHATT: Handwritten Arabic Text", https://hdl.handle.net/11272.1/AB2/PL0DHA, Linguistic Data Consortium
KHATT: Handwritten Arabic Text was developed by King Fahd University of Petroleum & Minerals, Technical University of Dortmund and Braunschweig University of Technology. It is comprised of scanned Arabic handwriting from 1,000 distinct male and female writers representing diverse countries, age groups, handedness and education levels....
Aug 30, 2023
Morris, Amanda; Strassel, Stephanie; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan G., 2019, "HAVIC MED Progress Test -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/QYTBMD, Linguistic Data Consortium
HAVIC MED Progress Test – Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 3,650 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and related technologies, LDC developed, in collaboration with NIST (the National Institute of Stan...
Aug 30, 2023
Ferraro, Francis; Thomas, Max; Gormley, Matthew R.; Wolfe, Travis; Harman, Craig; Van Durme, Benjamin, 2018, "Concretely Annotated English Gigaword", https://hdl.handle.net/11272.1/AB2/NQCDFR, Linguistic Data Consortium
Concretely Annotated English Gigaword was developed by Johns Hopkins University’s Human Language Technology Center of Excellence (JHU). It adds multiple kinds and instances of automatically-generated syntactic, semantic and coreference annotations to English Gigaword Fifth Edition (LDC2011T07). Concrete is a schema for representing structured, hier...
Aug 30, 2023
Ferraro, Francis; Thomas, Max; Wolfe, Travis; Gormley, Matthew R.; Harman, Craig; Van Durme, Benjamin, 2018, "Concretely Annotated New York Times", https://hdl.handle.net/11272.1/AB2/VA98GM, Linguistic Data Consortium
Concretely Annotated New York Times was developed by Johns Hopkins University’s Human Language Technology Center of Excellence. It adds multiple kinds and instances of automatically-generated syntactic, semantic and coreference annotations to The New York Times Annotated Corpus (LDC2008T19). Concrete is a schema for representing struct...
Aug 30, 2023
Vincent, Emmanuel; Barker, Jon; Watanabe, Shinji; Le Roux, Jonathan; Nesta, Francesco; Matassoni, Marco, 2017, "CHiME2 WSJ0", https://hdl.handle.net/11272.1/AB2/IUB8PD, Linguistic Data Consortium
CHiME2 WSJ0 was developed as part of The 2nd CHiME Speech Separation and Recognition Challenge and contains approximately 166 hours of English speech from a noisy living room environment. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-world environments. CHiME2 WSJ0 reflects the medium vocabulary track o...
Aug 30, 2023
Barker, Jon; Marxer, Ricard; Vincent, Emmanuel; Watanabe, Shinji, 2017, "CHiME3", https://hdl.handle.net/11272.1/AB2/HGHM4U, Linguistic Data Consortium
CHiME3 was developed as part of The 3rd CHiME Speech Separation and Recognition Challenge and contains approximately 342 hours of English speech and transcripts from noisy environments and 50 hours of noisy environment audio. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-world environments....
Aug 30, 2023
Tracey, Jennifer; Lee, Haejoong; Strassel, Stephanie, 2017, "BOLT English Discussion Forums", https://hdl.handle.net/11272.1/AB2/VDFID2, Linguistic Data Consortium
BOLT English Discussion Forums was developed by the Linguistic Data Consortium (LDC) and consists of 830,440 discussion forum threads in English harvested from the Internet using a combination of manual and automatic processes. The DARPA BOLT (Broad Operational Language Translation) program developed machine translation and information retrieval fo...