Danish Legislative Speech Corpus (doi:10.7910/DVN/PNCBKF)

Document Description

Citation

Title:

Danish Legislative Speech Corpus

Identification Number:

doi:10.7910/DVN/PNCBKF

Distributor:

Harvard Dataverse

Date of Distribution:

2021-12-13

Version:

2

Bibliographic Citation:

Hjorth, Frederik, 2021, "Danish Legislative Speech Corpus", https://doi.org/10.7910/DVN/PNCBKF, Harvard Dataverse, V2

Study Description

Citation

Title:

Danish Legislative Speech Corpus

Identification Number:

doi:10.7910/DVN/PNCBKF

Authoring Entity:

Hjorth, Frederik (University of Copenhagen)

Date of Production:

2024-10-15

Distributor:

Harvard Dataverse

Access Authority:

Hjorth, Frederik

Depositor:

Hjorth, Frederik

Date of Deposit:

2021-12-13

Holdings Information:

https://doi.org/10.7910/DVN/PNCBKF

Study Scope

Keywords:

Social Sciences

Abstract:

This dataset contains text and speaker data for around 1.7 million snippets of legislative speech from Denmark's Parliament, Folketinget. The data set draws in part on the ParlSpeech V2 data set (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/L4OAKN) and in part on Folketinget's publicly available XML transcripts (https://www.ft.dk/da/dokumenter/aabne_data). To homogenize the lengths of text units, longer speeches are broken down into snippets of 3-5 sentences each.

Version 2 of the data extends the coverage to late 2024. While the first version is retained here for archival purposes, most users will probably want the newest version (i.e. "paradf-v2").
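The abstract does not say how speeches were actually split into snippets, so the following is only an illustrative sketch of one way a 3-5 sentence rule could work. The function names (`split_sentences`, `chunk_sentences`), the regex-based sentence splitter, and the even-split strategy are all assumptions made for illustration; they are not taken from the corpus code.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption): break after '.', '!' or '?'
    when followed by whitespace. Real Danish text with abbreviations or
    dates would need a proper tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences, max_len=5):
    """Group consecutive sentences into snippets.

    Speeches of up to max_len sentences are kept whole. Longer speeches
    are cut into the fewest possible snippets, with sizes spread as
    evenly as possible; with max_len=5 every snippet then holds 3 to 5
    sentences.
    """
    n = len(sentences)
    if n == 0:
        return []
    if n <= max_len:
        return [sentences]
    k = -(-n // max_len)          # ceil(n / max_len) snippets
    base, extra = divmod(n, k)    # the first `extra` snippets get one more
    chunks, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        chunks.append(sentences[start:start + size])
        start += size
    return chunks


# Hypothetical seven-sentence speech, used only to show the grouping.
speech = "A one. B two! C three? D four. E five. F six. G seven."
snippets = [" ".join(c) for c in chunk_sentences(split_sentences(speech))]
```

Spreading sentences evenly, rather than cutting greedily every five sentences, avoids leaving a trailing snippet of one or two sentences, which fits the stated aim of homogenizing unit lengths.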

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Other Study-Related Materials

Label:

paradf-v2.rds

Text:

Version 2 of the corpus, extending the data to late 2024.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

paradf.rds

Text:

Version 1 of the corpus, retained for archival purposes. File containing text snippets, date, and speaker name.

Notes:

application/octet-stream