Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction |
Identification Number: |
doi:10.7910/DVN/Y5INRM |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2023-02-22 |
Version: |
1 |
Bibliographic Citation: |
Häffner, Sonja; Hofer, Martin; Nagl, Maximilian; Walterskirchen, Julian, 2023, "Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction", https://doi.org/10.7910/DVN/Y5INRM, Harvard Dataverse, V1 |
Citation |
|
Title: |
Replication Data for: Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction |
Identification Number: |
doi:10.7910/DVN/Y5INRM |
Authoring Entity: |
Häffner, Sonja (Universität der Bundeswehr München) |
Hofer, Martin (Universität der Bundeswehr München) |
Nagl, Maximilian (Universität Regensburg) |
Walterskirchen, Julian (Universität der Bundeswehr München) |
Producer: |
Political Analysis |
Distributor: |
Harvard Dataverse |
Access Authority: |
Walterskirchen, Julian |
Depositor: |
Häffner, Sonja |
Date of Deposit: |
2022-12-27 |
Holdings Information: |
https://doi.org/10.7910/DVN/Y5INRM |
Study Scope |
|
Keywords: |
Social Sciences, natural language processing, objective dictionaries, deep learning, transformers, conflict dynamics |
Abstract: |
Recent advancements in natural language processing (NLP) methods have significantly improved their performance. However, more complex NLP models are more difficult to interpret and computationally expensive. Therefore, we propose an approach to dictionary creation that carefully balances the trade-off between complexity and interpretability. This approach combines a deep neural network architecture with techniques to improve model explainability to automatically build a domain-specific dictionary. As an illustrative use case of our approach, we create an objective dictionary that can infer conflict intensity from text data. We train the neural networks on a corpus of conflict reports and match them with conflict event data. This corpus consists of over 14,000 expert-written International Crisis Group (ICG) CrisisWatch reports between 2003 and 2021. Sensitivity analysis is used to extract the weighted words from the neural network to build the dictionary. In order to evaluate our approach, we compare our results to state-of-the-art deep learning language models, text-scaling methods, as well as standard, non-specialized, and conflict event dictionary approaches. We are able to show that our approach outperforms other approaches while retaining interpretability. |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Bibliographic Citation: |
Forthcoming, Political Analysis |
Label: |
bert.zip |
Text: |
Folder containing data and code required for the bert model |
Notes: |
application/zip |
Label: |
code_guide.png |
Text: |
Flow-chart visualizing the replication process |
Notes: |
image/png |
Label: |
code.zip |
Text: |
Folder containing all code for the replication |
Notes: |
application/zip |
Label: |
crisisgroup_copyright.txt |
Text: |
Copyright and Trademark Notice by the International Crisis Group |
Notes: |
text/plain |
Label: |
data.zip |
Text: |
Folder containing the raw input data, data from the analysis and output data |
Notes: |
application/zip |
Label: |
README.md |
Text: |
Instructions regarding the replication |
Notes: |
text/markdown |
Label: |
replication_env.yaml |
Text: |
File to set up the environment |
Notes: |
application/octet-stream |
Label: |
requirements2.txt |
Notes: |
text/plain |
Label: |
requirements.txt |
Notes: |
text/plain |
Label: |
results.zip |
Text: |
Folder containing all data-generated figures |
Notes: |
application/zip |