Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction (doi:10.7910/DVN/Y5INRM)

Document Description

Citation

Title:

Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction

Identification Number:

doi:10.7910/DVN/Y5INRM

Distributor:

Harvard Dataverse

Date of Distribution:

2023-02-22

Version:

1

Bibliographic Citation:

Häffner, Sonja; Hofer, Martin; Nagl, Maximilian; Walterskirchen, Julian, 2023, "Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction", https://doi.org/10.7910/DVN/Y5INRM, Harvard Dataverse, V1

Study Description

Citation

Title:

Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction

Identification Number:

doi:10.7910/DVN/Y5INRM

Authoring Entity:

Häffner, Sonja (Universität der Bundeswehr München)

Hofer, Martin (Universität der Bundeswehr München)

Nagl, Maximilian (Universität Regensburg)

Walterskirchen, Julian (Universität der Bundeswehr München)

Producer:

Political Analysis

Distributor:

Harvard Dataverse

Access Authority:

Walterskirchen, Julian

Depositor:

Häffner, Sonja

Date of Deposit:

2022-12-27

Holdings Information:

https://doi.org/10.7910/DVN/Y5INRM

Study Scope

Keywords:

Social Sciences, natural language processing, objective dictionaries, deep learning, transformers, conflict dynamics

Abstract:

Recent advances in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are harder to interpret and computationally more expensive. We therefore propose an approach to dictionary creation that balances the trade-off between complexity and interpretability. The approach combines a deep neural network architecture with techniques that improve model explainability in order to build a domain-specific dictionary automatically. As an illustrative use case, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports matched with conflict event data. The corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports published between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network and build the dictionary. To evaluate our approach, we compare our results with state-of-the-art deep learning language models, text-scaling methods, and standard, non-specialized, and conflict event dictionary approaches. We show that our approach outperforms these alternatives while retaining interpretability.
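
The idea of extracting weighted words through sensitivity analysis can be illustrated with a minimal sketch. This is not the authors' replication code (see code.zip and README.md for that). The vocabulary, the weights, the tanh scoring function standing in for a trained network, and all function names below are hypothetical and chosen only for illustration. Sensitivities are estimated here with a simple finite-difference perturbation of each word count.

```python
import math

# Toy vocabulary and hypothetical "trained" weights (illustrative values only).
VOCAB = ["attack", "killed", "ceasefire", "talks"]
WEIGHTS = [0.9, 1.2, -0.7, -0.4]


def bag_of_words(text):
    """Count how often each vocabulary word occurs in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def model(x):
    """Stand-in for a trained network: a smooth nonlinear score of word counts."""
    return math.tanh(sum(w * xi for w, xi in zip(WEIGHTS, x)))


def sensitivity(x, eps=1e-4):
    """Central finite-difference estimate of d(model)/d(x_i) for each word."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((model(up) - model(down)) / (2 * eps))
    return grads


def build_dictionary(texts):
    """Average each word's sensitivity over a corpus to get a word -> weight map."""
    totals = [0.0] * len(VOCAB)
    for text in texts:
        for i, g in enumerate(sensitivity(bag_of_words(text))):
            totals[i] += g
    return {word: total / len(texts) for word, total in zip(VOCAB, totals)}


def score(text, dictionary):
    """Score a text by summing the dictionary weights of its tokens."""
    return sum(dictionary.get(tok, 0.0) for tok in text.lower().split())


corpus = ["attack killed dozens", "ceasefire talks resume", "talks fail after attack"]
conflict_dictionary = build_dictionary(corpus)
```

In this toy setting, words that push the score up receive positive dictionary weights and words that push it down receive negative ones, so score("attack killed", conflict_dictionary) comes out higher than score("ceasefire talks", conflict_dictionary). The resulting dictionary can be applied to new text without rerunning the model, which is the interpretability benefit the abstract describes.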

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Bibliographic Citation:

Forthcoming, Political Analysis

Other Study-Related Materials

Label:

bert.zip

Text:

Folder containing the data and code required for the BERT model

Notes:

application/zip

Other Study-Related Materials

Label:

code_guide.png

Text:

Flowchart visualizing the replication process

Notes:

image/png

Other Study-Related Materials

Label:

code.zip

Text:

Folder containing all code for the replication

Notes:

application/zip

Other Study-Related Materials

Label:

crisisgroup_copyright.txt

Text:

Copyright and Trademark Notice by the International Crisis Group

Notes:

text/plain

Other Study-Related Materials

Label:

data.zip

Text:

Folder containing the raw input data, intermediate data from the analysis, and output data

Notes:

application/zip

Other Study-Related Materials

Label:

README.md

Text:

Instructions regarding the replication

Notes:

text/markdown

Other Study-Related Materials

Label:

replication_env.yaml

Text:

File to set up the environment

Notes:

application/octet-stream

Other Study-Related Materials

Label:

requirements2.txt

Notes:

text/plain

Other Study-Related Materials

Label:

requirements.txt

Notes:

text/plain

Other Study-Related Materials

Label:

results.zip

Text:

Folder containing all data-generated figures

Notes:

application/zip