This dataverse contains LDC membership data for Harvard University affiliates.
1 to 6 of 6 Results
Aug 2, 2016
Barbosa, Sonia, 2016, "Linguistic Data Consortium Harvard Membership - General Information", https://doi.org/10.7910/DVN/WL1DFP, Harvard Dataverse, V1
Membership Years: 1993 (Not-for-Profit, Standard); 1994 (Not-for-Profit, Standard); 1995 (Not-for-Profit, Standard); 1996 (Not-for-Profit, Standard); 1997 (Not-for-Profit, Standard); 1998 (Not-for-Profit, Standard); 1999 (Not-for-Profit, Standard); 2000 (Not-for-Profit, Standard); 2001 (Not-for-Profit, Standard); 2002 (Not-for-Profit, Standard); 2003 (Not-for...
Aug 2, 2016
LDC, 2016, "Continuous Speech Recognition Corpus - Disc 1 of 1", https://doi.org/10.7910/DVN/P0PUTV, Harvard Dataverse, V1
The third ARPA Continuous Speech Recognition (CSR) Benchmark Speech Test Collection is a three CD-ROM set that contains complete development test and evaluation test suites for speaker-independent, large-vocabulary speech recognition systems. The development and evaluation tests share a common structure, consisting of two core test components ("hub...
Aug 2, 2016
LDC, 2016, "CSR-II (WSJ1) Sennheiser Discs 1 - 3", https://doi.org/10.7910/DVN/OVXSNR, Harvard Dataverse, V1
The complete WSJ1 corpus contains approximately 78,000 training utterances (73 hours of speech), 4,000 of which are the result of spontaneous dictation by journalists with varying degrees of experience in dictation. The corpus contains approximately 8,200 conventional development test utterances (eight hours of speech), 6,800 of which are from spon...
Aug 2, 2016
Garofolo, John; Fiscus, Jonathan; Fisher, William; Pallett, David, 2016, "CSR-IV HUB4", https://doi.org/10.7910/DVN/BT8CTN, Harvard Dataverse, V1
This set of CD-ROMs contains all of the speech data provided to sites participating in the DARPA CSR November 1995 HUB4 (Radio) Broadcast News tests. The data consists of digitized waveforms of MarketPlace (tm) business news radio shows provided by KUSC through an agreement with the Linguistic Data Consortium and detailed transcriptions of those br...
Aug 2, 2016
Fiscus, Jonathan; Garofolo, John; Pallett, David, 2016, "CSR-IV HUB3", https://doi.org/10.7910/DVN/DACJZB, Harvard Dataverse, V1
This set of CD-ROMs contains all of the speech data provided to sites participating in the DARPA CSR November 1995 HUB3 Multi-Microphone tests. The data consists of digitized waveforms collected with eight different microphones simultaneously from 40 subjects reading 15-sentence articles drawn from various North American business news publications....
Aug 2, 2016
Garofolo, John; Graff, David; Paul, Doug; Pallett, David, 2016, "CSR-I (WSJ0) Other Discs 1 - 2", https://doi.org/10.7910/DVN/ZVU9HF, Harvard Dataverse, V1
LDC93S6A - Complete CSR-I corpus; LDC93S6B - CSR-I Sennheiser speech; LDC93S6C - CSR-I other speech. During 1991, the DARPA Spoken Language Program initiated efforts to build a new corpus to support research on large-vocabulary Continuous Speech Recognition (CSR) systems. The first two CSR Corpora consist primarily of read speech with texts drawn from...