Replication data for: Explaining Rare Events in International Relations (doi:10.7910/DVN/RNSU7V)

Document Description

Citation

Title:

Replication data for: Explaining Rare Events in International Relations

Identification Number:

doi:10.7910/DVN/RNSU7V

Distributor:

Harvard Dataverse

Date of Distribution:

2007-11-28

Version:

4

Bibliographic Citation:

King, Gary; Zeng, Langche, 2007, "Replication data for: Explaining Rare Events in International Relations", https://doi.org/10.7910/DVN/RNSU7V, Harvard Dataverse, V4, UNF:3:vyct3c8fMCdWOdp03NUhaA== [fileUNF]

Study Description

Citation

Title:

Replication data for: Explaining Rare Events in International Relations

Identification Number:

doi:10.7910/DVN/RNSU7V

Authoring Entity:

King, Gary (Harvard University)

Zeng, Langche (UC San Diego)

Date of Production:

2001

Distributor:

Harvard Dataverse

Date of Deposit:

2006

Date of Distribution:

2001

Holdings Information:

https://doi.org/10.7910/DVN/RNSU7V

Study Scope

Keywords:

Social Sciences

Abstract:

Some of the most important phenomena in international conflict are coded as "rare events data," binary dependent variables with dozens to thousands of times fewer events, such as wars, coups, etc., than "nonevents". Unfortunately, rare events data are difficult to explain and predict, a problem that seems to have at least two sources. First, and most importantly, the data collection strategies used in international conflict are grossly inefficient. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (e.g., wars) and a tiny fraction of nonevents (peace). This enables scholars to save as much as 99% of their (non-fixed) data collection costs, or to collect much more meaningful explanatory variables. Second, logistic regression, and other commonly used statistical procedures, can underestimate the probability of rare events. We introduce some corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. We also provide easy-to-use methods and software that link these two results, enabling both types of corrections to work simultaneously.
You may also be interested in the companion methods article to this one, Logistic Regression in Rare Events Data (http://gking.harvard.edu/files/gking/files/0s.pdf), as well as related work, Inference in Case-Control Studies (http://gking.harvard.edu/files/abs/1s-enc-abs.shtml).

See also: International Conflict (http://gking.harvard.edu/category/research-interests/applications/international-conflict), Rare Events (http://gking.harvard.edu/category/research-interests/methods/rare-events)
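
The sampling design described in the abstract (keep every event, keep only a small random fraction of nonevents) can be sketched as follows. This is a minimal illustration, not the authors' software: the function names, the `(y, x)` row layout, and the seed are hypothetical, and the intercept adjustment shown is the standard prior correction for event-enriched samples, which requires an outside estimate `tau` of the population fraction of events. The codebook itself does not state this formula.

```python
import math
import random


def subsample_nonevents(rows, keep_fraction, seed=0):
    """Keep every event (y == 1) and a random fraction of nonevents (y == 0).

    rows: list of (y, x) pairs (hypothetical layout for illustration).
    keep_fraction: share of nonevents to retain, between 0 and 1.
    """
    rng = random.Random(seed)
    return [(y, x) for (y, x) in rows if y == 1 or rng.random() < keep_fraction]


def prior_corrected_intercept(beta0_hat, tau, ybar):
    """Adjust a logit intercept estimated on an event-enriched sample.

    beta0_hat: intercept estimated on the subsample.
    tau: assumed fraction of events in the population.
    ybar: fraction of events in the subsample.
    """
    if not (0 < tau < 1 and 0 < ybar < 1):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    return beta0_hat - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))


def logistic(z):
    """Inverse logit, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))
```

When the subsample happens to have the same event share as the population (`tau == ybar`), the adjustment is zero; the more heavily nonevents are thinned out, the larger the downward shift in the intercept. Slope coefficients are left unchanged by this particular correction.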

Methodology and Processing

Sources Statement

Data Access

Notes:

This dataset is made available without information on how it can be used. You should communicate with the Contact(s) specified before use.

Other Study Description Materials

Related Publications

Citation

Title:

Explaining Rare Events in International Relations

Bibliographic Citation:

King, Gary; Zeng, Langche, 2001, "Explaining Rare Events in International Relations," International Organization, 55, 3, 693-715. Link to article: http://j.mp/iHyh68

File Description--f103955

File: model.tab

  • Number of cases: 303814

  • No. of variables per record: 13

  • Type of File: text/tab-separated-values

Notes:

UNF:3:8p3E5xbsx0MQeREYNgGrjQ==

Replication of the full sample model in table 1
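
A quick way to confirm that a downloaded copy matches the counts listed above is to read the file and compare its dimensions with the codebook. The sketch below uses only the Python standard library; it assumes the tab-separated file begins with a header row of variable names, which this codebook does not state, and the function name and file path are hypothetical.

```python
import csv


def summarize_tab(path):
    """Return (variable_names, n_cases) for a tab-separated file.

    Assumes the first row is a header of variable names; blank rows
    are not counted as cases.
    """
    with open(path, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")
        header = next(reader)
        n_cases = sum(1 for row in reader if row)
    return header, n_cases


# Hypothetical usage, assuming model.tab is in the working directory:
# names, n_cases = summarize_tab("model.tab")
# The codebook lists 13 variables and 303814 cases for this file.
```

If the file turns out to have no header row, the first data row would be returned as the variable names and the case count would be one short, so the comparison against the codebook also serves as a check on that assumption.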

File Description--f109212

File: SF.tab

  • Number of cases: 7190

  • No. of variables per record: 16

  • Type of File: text/tab-separated-values

Notes:

UNF:3:WtZaHZxhnJA+1nAQCxEcLg==

Data for replication of results in the state failure data

Variable Description

List of Variables:

Variables

Y

f103955 Location:

Variable Format: numeric

Notes: UNF:3:7DQktaYQaT1wZpwmfCRGSA==

X1

f103955 Location:

Variable Format: numeric

Notes: UNF:3:vGV9Xr0MKSDbvPA4awn2yg==

X2

f103955 Location:

Variable Format: numeric

Notes: UNF:3:enwuSvT1i37kiSqHDrFqNA==

X3

f103955 Location:

Variable Format: numeric

Notes: UNF:3:C8J7jH73rBthyYA2c4iylA==

X4

f103955 Location:

Variable Format: numeric

Notes: UNF:3:pwWteDhwp513md+yfWO9rg==

X5

f103955 Location:

Variable Format: numeric

Notes: UNF:3:Rh+9tvDAfLVCKJmbST8iWA==

X6

f103955 Location:

Variable Format: numeric

Notes: UNF:3:SJR3APKpatO5glc0iF/1Ig==

X7

f103955 Location:

Variable Format: numeric

Notes: UNF:3:uZKeSlOF4+wpd0AJSMR5Dw==

X8

f103955 Location:

Variable Format: numeric

Notes: UNF:3:p9oTgXV64B8cgOKh5FzKIg==

X9

f103955 Location:

Variable Format: numeric

Notes: UNF:3:DXxkAExCbO0Av/0M5R9TZQ==

X10

f103955 Location:

Variable Format: numeric

Notes: UNF:3:4ydGWHEG/GfT/fkWd5zjnw==

X11

f103955 Location:

Variable Format: numeric

Notes: UNF:3:dzfw+jVpO2O1KSAhyjrFlA==

YEAR

f103955 Location:

Variable Format: numeric

Notes: UNF:3:/ZwamE05B3hoQvo1ROZZnQ==

V1

f109212 Location:

Variable Format: numeric

Notes: UNF:3:4DvdDLBs0AgLwpdiOipNtw==

V2

f109212 Location:

Variable Format: numeric

Notes: UNF:3:dbFBhp6VOBcDPUsx19oq5A==

V3

f109212 Location:

Variable Format: numeric

Notes: UNF:3:n7OqHc4+mT9EsBFx2AHsQQ==

V4

f109212 Location:

Variable Format: numeric

Notes: UNF:3:cz8zYKuOM77PdXTskgLOKQ==

V5

f109212 Location:

Variable Format: numeric

Notes: UNF:3:fTRT4P2W+i8/mz1CG49nNQ==

V6

f109212 Location:

Variable Format: numeric

Notes: UNF:3:lNwYrd8sTlSbphSam6ihWQ==

V7

f109212 Location:

Variable Format: numeric

Notes: UNF:3:D3nWP1fT5SpABqAQ8o0Ajw==

V8

f109212 Location:

Variable Format: numeric

Notes: UNF:3:rFGXzon2ci3ixhsYKgg7cA==

V9

f109212 Location:

Variable Format: numeric

Notes: UNF:3:sAEEza6vIi/AWkOE1iHjZw==

V10

f109212 Location:

Variable Format: numeric

Notes: UNF:3:lreeBbj3TPWwctpIoD2Ctg==

V11

f109212 Location:

Variable Format: numeric

Notes: UNF:3:T1DFUYr8PyLee8Tzy0fbFw==

V12

f109212 Location:

Variable Format: numeric

Notes: UNF:3:x8fA+wGeedON031PACi2ew==

V13

f109212 Location:

Variable Format: numeric

Notes: UNF:3:Dd793N3CFj/OGuonW9J0DA==

V14

f109212 Location:

Variable Format: numeric

Notes: UNF:3:Rl4NCWoPrNVu7BKHeqg+vg==

V15

f109212 Location:

Variable Format: numeric

Notes: UNF:3:Rs8orduxYy4b3TBdNPQwBg==

V16

f109212 Location:

Variable Format: numeric

Notes: UNF:3:Hi0O05XUq73/0TR2L4nQHA==

Other Study-Related Materials

Label:

Huth.dat

Text:

File for replication of results in table 2, ASCII format

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

Huth.out

Text:

Output file for replication of results in table 2, Gauss format

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

Huth1.prg

Text:

File for replication of results in table 2, Gauss code

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

LogisticRegressionArticle.pdf

Text:

Article related to this study: Logistic Regression in Rare Events Data

Notes:

application/pdf

Other Study-Related Materials

Label:

Model.do

Text:

File for replication of the full sample model in table 1, Stata code

Notes:

text/x-stata-syntax; charset=US-ASCII

Other Study-Related Materials

Label:

model.dta

Text:

Data file replicating the full sample model in table 1, in Stata format

Notes:

application/x-stata

Other Study-Related Materials

Label:

Model.Log

Text:

Output for replication of the full sample model in table 1

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

Readme

Text:

Detailed information on data and documentation files for this study

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

SF.dat

Text:

File for replication of results in the state failure data, ASCII format

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

SF.out

Text:

File for replication of results in the state failure data, Gauss output file

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

SF.prg

Text:

Program file for replication of results in the state failure data, Gauss code file

Notes:

text/plain; charset=US-ASCII