Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication Data for: A new database for Italian parliamentary speeches. Introducing the ItaParlCorpus dataset |
Identification Number: |
doi:10.7910/DVN/RCKMDA |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2025-03-23 |
Version: |
1 |
Bibliographic Citation: |
Cova, Joshua, 2025, "Replication Data for: A new database for Italian parliamentary speeches. Introducing the ItaParlCorpus dataset", https://doi.org/10.7910/DVN/RCKMDA, Harvard Dataverse, V1 |
Citation |
|
Authoring Entity: |
Cova, Joshua (Max Planck Institute for the Study of Societies) |
Distributor: |
Harvard Dataverse |
Access Authority: |
Cova, Joshua |
Depositor: |
Cova, Joshua |
Date of Deposit: |
2025-02-16 |
Holdings Information: |
https://doi.org/10.7910/DVN/RCKMDA |
Study Scope |
|
Keywords: |
Social Sciences, Parliamentary data, Italian politics, Text as data, Italy, Parliament, Political parties, Research methods, Text analysis |
Abstract: |
A common challenge in studying Italian parliamentary discourse is the lack of accessible, machine-readable, and systematized parliamentary data. To address this gap, this article introduces the ItaParlCorpus dataset, a new, annotated, machine-readable collection of Italian parliamentary plenary speeches from the Camera dei Deputati, the lower house of Parliament, spanning 1948 to 2022. The dataset encompasses 470 million words and 2.4 million speeches delivered by 5,830 unique speakers representing 77 different political parties. The files are designed for easy processing and analysis in widely used programming languages, and they include metadata such as speaker identification and party affiliation. This opens up opportunities for in-depth analyses of a variety of topics related to parliamentary behavior, elite rhetoric, and the salience of political themes, including how these vary across party families and over time. |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
Cova J (2025). A new database for Italian parliamentary speeches: introducing the ItaParlCorpus dataset. Italian Political Science Review/Rivista Italiana di Scienza Politica 1–10. https://doi.org/10.1017/ipo.2025.6 |
Identification Number: |
10.1017/ipo.2025.6 |
Bibliographic Citation: |
Cova J (2025). A new database for Italian parliamentary speeches: introducing the ItaParlCorpus dataset. Italian Political Science Review/Rivista Italiana di Scienza Politica 1–10. https://doi.org/10.1017/ipo.2025.6 |
Label: |
read_me_itaparlcorpus.txt |
Text: |
ReadMe file |
Notes: |
text/plain |
Label: |
replication_code_itaparlcorpus.R |
Text: |
Replication code for Figures 3-6 |
Notes: |
type/x-r-syntax |