Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction (doi:10.7910/DVN/Y5INRM)

Document Description
Citation
Title:	Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction
Identification Number:	doi:10.7910/DVN/Y5INRM
Distributor:	Harvard Dataverse
Date of Distribution:	2023-02-22
Version:	1
Bibliographic Citation:	Häffner, Sonja; Hofer, Martin; Nagl, Maximilian; Walterskirchen, Julian, 2023, "Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction", https://doi.org/10.7910/DVN/Y5INRM, Harvard Dataverse, V1
Study Description
Citation
Title:	Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction
Identification Number:	doi:10.7910/DVN/Y5INRM
Authoring Entity:	Häffner, Sonja (Universität der Bundeswehr München)
	Hofer, Martin (Universität der Bundeswehr München)
	Nagl, Maximilian (Universität Regensburg)
	Walterskirchen, Julian (Universität der Bundeswehr München)
Producer:	Political Analysis
Distributor:	Harvard Dataverse
Access Authority:	Walterskirchen, Julian
Depositor:	Häffner, Sonja
Date of Deposit:	2022-12-27
Holdings Information:	https://doi.org/10.7910/DVN/Y5INRM
Study Scope
Keywords:	Social Sciences, natural language processing, objective dictionaries, deep learning, transformers, conflict dynamics
Abstract:	Recent advancements in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are more difficult to interpret and computationally expensive. Therefore, we propose an approach to dictionary creation that carefully balances the trade-off between complexity and interpretability. This approach combines a deep neural network architecture with techniques to improve model explainability to automatically build a domain-specific dictionary. As an illustrative use case of our approach, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports and match them with conflict event data. This corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network to build the dictionary. In order to evaluate our approach, we compare our results to state-of-the-art deep learning language models, text-scaling methods, as well as standard, non-specialized, and conflict event dictionary approaches. We are able to show that our approach outperforms other approaches while retaining interpretability.
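Illustrative sketch (not part of the deposited replication code): the abstract describes extracting weighted words from a trained network via sensitivity analysis and collecting them into a dictionary. The Python example below shows one simple way such a step could look, under loud assumptions. The names `toy_model`, `occlusion_dictionary`, and `score`, the example sentences, and all weights are hypothetical. The "model" is a hand-written stand-in rather than a neural network, and the sensitivity measure is a basic occlusion variant (remove a word, record the change in output), which may differ from the procedure the authors use.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def toy_model(tokens: List[str]) -> float:
    """Hypothetical stand-in for a trained network: tokens -> intensity score."""
    weights = {"attack": 2.0, "killed": 3.0, "ceasefire": -1.5}
    return sum(weights.get(t, 0.0) for t in tokens)


def occlusion_dictionary(
    docs: List[str], model: Callable[[List[str]], float]
) -> Dict[str, float]:
    """Weight each word by the average drop in model output when it is removed."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for doc in docs:
        tokens = doc.lower().split()
        base = model(tokens)
        for word in set(tokens):
            # Occlude every occurrence of the word and re-score the document.
            reduced = [t for t in tokens if t != word]
            totals[word] += base - model(reduced)
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals}


def score(text: str, dictionary: Dict[str, float]) -> float:
    """Apply the dictionary: sum the weights of the words found in the text."""
    return sum(dictionary.get(t, 0.0) for t in text.lower().split())


# Made-up example sentences, not taken from the CrisisWatch corpus.
docs = ["Rebels killed in attack", "Ceasefire holds after attack"]
dictionary = occlusion_dictionary(docs, toy_model)
new_score = score("attack killed", dictionary)
```

Once built, the dictionary can be applied without the model, which is the interpretability benefit the abstract points to: every score is a plain sum of inspectable word weights.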
Methodology and Processing
Sources Statement
Data Access
Other Study Description Materials
Related Publications
Citation
Bibliographic Citation:	Forthcoming, Political Analysis
Other Study-Related Materials
Label:	bert.zip
Text:	Folder containing data and code required for the bert model
Notes:	application/zip
Other Study-Related Materials
Label:	code_guide.png
Text:	Flow-chart visualizing the replication process
Notes:	image/png
Other Study-Related Materials
Label:	code.zip
Text:	Folder containing all code for the replication
Notes:	application/zip
Other Study-Related Materials
Label:	crisisgroup_copyright.txt
Text:	Copyright and Trademark Notice by the International Crisis Group
Notes:	text/plain
Other Study-Related Materials
Label:	data.zip
Text:	Folder containing the raw input data, data from the analysis and output data
Notes:	application/zip
Other Study-Related Materials
Label:	README.md
Text:	Instructions regarding the replication
Notes:	text/markdown
Other Study-Related Materials
Label:	replication_env.yaml
Text:	File to set up the environment
Notes:	application/octet-stream
Other Study-Related Materials
Label:	requirements2.txt
Notes:	text/plain
Other Study-Related Materials
Label:	requirements.txt
Notes:	text/plain
Other Study-Related Materials
Label:	results.zip
Text:	Folder containing all data-generated figures
Notes:	application/zip