Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication data for: Reverse Engineering Chinese Censorship: Randomized Experimentation and Participant Observation |
Identification Number: |
doi:10.7910/DVN/26212 |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2014-05-27 |
Version: |
5 |
Bibliographic Citation: |
King, Gary; Pan, Jennifer; Roberts, Margaret, E., 2014, "Replication data for: Reverse Engineering Chinese Censorship: Randomized Experimentation and Participant Observation", https://doi.org/10.7910/DVN/26212, Harvard Dataverse, V5, UNF:5:K/LGmB0vjskGYBobxbT+8g== [fileUNF] |
Citation |
|
Title: |
Replication data for: Reverse Engineering Chinese Censorship: Randomized Experimentation and Participant Observation |
Identification Number: |
doi:10.7910/DVN/26212 |
Authoring Entity: |
King, Gary (Harvard University) |
Pan, Jennifer (Harvard University) |
|
Roberts, Margaret, E. (Harvard University) |
|
Distributor: |
Harvard Dataverse |
Access Authority: |
Gary King |
Date of Deposit: |
2014-05-27 |
Date of Distribution: |
2014 |
Holdings Information: |
https://doi.org/10.7910/DVN/26212 |
Study Scope |
|
Keywords: |
Social Sciences |
Abstract: |
Chinese government censorship of social media constitutes the largest coordinated selective suppression of human communication in recorded history. Although existing research on the subject has revealed a great deal, it is based on passive, observational methods, with well-known inferential limitations. For example, these methods can reveal nothing about censorship that occurs before submissions are posted, such as via automated review, which we show is used at two-thirds of all social media sites. We offer two approaches to overcome these limitations. For causal inferences, we conduct the first large-scale experimental study of censorship by creating accounts on numerous social media sites spread throughout the country, submitting different randomly assigned types of social media texts, and detecting from a network of computers all over the world which types are censored. Then, for descriptive inferences, we supplement the current uncertain practice of conducting anonymous interviews with secret informants with participant observation: we set up our own social media site in China, contract with Chinese firms to install the same censoring technologies as their existing sites, and -- with direct access to their software, documentation, and even customer service help desk support -- reverse engineer how it all works. Our results offer the first rigorous experimental support for the recent hypothesis that criticism of the state, its leaders, and their policies is routinely published, whereas posts about real-world events with collective action potential are censored. We also extend the hypothesis by showing that it applies even to accusations of corruption by high-level officials and massive online-only protests, neither of which is censored. We also reveal for the first time the inner workings of the process of automated review, and as a result are able to reconcile conflicting accounts of keyword-based content filtering in the academic literature.
We show that the Chinese government tolerates surprising levels of diversity in automated review technology, but still ensures a uniform outcome by post hoc censorship using huge numbers of human coders.
See also: Automated Text Analysis (http://gking.harvard.edu/category/research-interests/applications/automated-text-analysis) |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Notes: |
This dataset is made available without information on how it can be used. You should communicate with the Contact(s) specified before use. |
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
King, Gary, Jennifer Pan, and Margaret E Roberts. 2014. “Reverse-Engineering Censorship in China: Randomized Experimentation and Participant Observation.” Science 345 (6199): 1-10. Link to article: http://j.mp/1KbwkJJ |
File Description--f2468914 |
|
File: AiWeiwei_obs_replicate.tab |
|
|
|
Notes: |
UNF:5:uM/zT66PuAYki4wE68RN5Q== |
File Description--f2468912 |
|
File: PotalaPalace_obs_replicate.tab |
|
|
|
Notes: |
UNF:5:le1jxO0+PWc5SJuetcg4Jw== |
File Description--f2468915 |
|
File: results_all_replication.tab |
|
|
|
Notes: |
UNF:5:MFe4uwNcUrPHl8fw4U70dg== |
File Description--f2468916 |
|
File: reviewed_replication.tab |
|
|
|
Notes: |
UNF:5:/UNS7/lpOUXwpTv1NMC9jQ== |
File Description--f2468913 |
|
File: UyghurLongName_obs_replicate.tab |
|
|
|
Notes: |
UNF:5:AoUfMrFlEfK8rzRZ2j6ehg== |
File Description--f2468911 |
|
File: XJPDumplingcensor.tab |
|
|
|
Notes: |
UNF:5:bb6SKx4GOaZuEPwNeJtlxA== |
File Description--f2468909 |
|
File: XJPDumplingnotcensor.tab |
|
|
|
Notes: |
UNF:5:JVHl8NaXJjoJuyqB3ixlpw== |
File Description--f2468910 |
|
File: XJPDumplingupdown.tab |
|
|
|
Notes: |
UNF:5:h6HLeKPWrqz0BZ7O3yUg5g== |
List of Variables: | |
Variables |
|
f2468914 Location: |
Variable Format: character Notes: UNF:5:plgonvqSLUdKaOtYU8ukkw== |
f2468914 Location: |
Variable Format: numeric Notes: UNF:5:chGhvaKw2EfVhPwTCa3bMg== |
f2468912 Location: |
Variable Format: character Notes: UNF:5:8bYiipvQJ0qTcuLBO/O5sw== |
f2468912 Location: |
Variable Format: numeric Notes: UNF:5:fS+ZVdxpSo3Vpl+AkykgMQ== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:AN/EIv6ToyG0xQTAigKcyg== |
f2468915 Location: |
Variable Format: numeric Notes: UNF:5:ReCTuybTRePGEZChaMdNBQ== |
f2468915 Location: |
Variable Format: numeric Notes: UNF:5:pDDSvZwbQbW4QAtSPOPS2g== |
f2468915 Location: |
Variable Format: numeric Notes: UNF:5:x65PYCXz0F3M5xlFk+22Gw== |
f2468915 Location: |
Variable Format: numeric Notes: UNF:5:PQWzv2Lmy9d3sh7BLWxzyQ== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:i8Gzs+gzp8LSqh2jK/Xq1g== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:738DeqSSyqj3Cr02RYeL+g== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:2zpvMoW6rwx0bZRzJ3L/Tg== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:CRnDbUrKjeF5UL+Xmu1sbg== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:I7J6pFyZlED4FaThiahGXA== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:HjAoTNx8DbOWLVz2DfwY7w== |
f2468915 Location: |
Variable Format: character Notes: UNF:5:Ka5+sDfqoz6cL/BC6wUS/w== |
f2468916 Location: |
Variable Format: numeric Notes: UNF:5:BJh0puHcOc0Pv6aAdq2hNQ== |
f2468916 Location: |
Variable Format: numeric Notes: UNF:5:PcB72iUaG+JZEuY52qCuRw== |
f2468916 Location: |
Variable Format: character Notes: UNF:5:zHTkK++hTSx46DKgzZgl6g== |
f2468913 Location: |
Variable Format: character Notes: UNF:5:iZ6EnYtC7C8PTPmhPDJyLA== |
f2468913 Location: |
Variable Format: numeric Notes: UNF:5:hNIPKUbg9l/VEvfr6J9BYQ== |
f2468911 Location: |
Variable Format: numeric Notes: UNF:5:u9SsxxcOYWwqTYldbrd7vQ== |
f2468911 Location: |
Variable Format: numeric Notes: UNF:5:9VWQU1oDj+F+jxr19P8nXw== |
f2468911 Location: |
Variable Format: numeric Notes: UNF:5:cSvsnO893nU/UtuyQzIyfw== |
f2468911 Location: |
Variable Format: numeric Notes: UNF:5:Mc626WkNbv9XpCvYzbl0eg== |
f2468911 Location: |
Variable Format: character Notes: UNF:5:GAHxcQmOkM5DU1ZkUw6Scw== |
f2468909 Location: |
Variable Format: numeric Notes: UNF:5:u9SsxxcOYWwqTYldbrd7vQ== |
f2468909 Location: |
Variable Format: numeric Notes: UNF:5:w5CI+zVnlvL56lwX5C1HLw== |
f2468909 Location: |
Variable Format: character Notes: UNF:5:1hQkmj8O54EWajajSFRuWQ== |
f2468909 Location: |
Variable Format: numeric Notes: UNF:5:CQGeuJ2LSi32AbK2NZYhXA== |
f2468909 Location: |
Variable Format: character Notes: UNF:5:6QXPn7PXisgpwA6OurRzcw== |
f2468910 Location: |
Variable Format: character Notes: UNF:5:MTdCQ8eV7vutkz62ckySlw== |
f2468910 Location: |
Variable Format: numeric Notes: UNF:5:mT9/x77HMo4uwqFgk9fM1A== |
Label: |
100urls.txt |
Text: |
A list of all URLs we used in our article to run our experiment |
Notes: |
text/plain |
Label: |
AiWeiwei_obs_replicate.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
PotalaPalace_obs_replicate.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
readme_replication.txt |
Text: |
Readme file describing how to use the data |
Notes: |
text/plain; charset=US-ASCII |
Label: |
replication.R |
Notes: |
text/plain; charset=US-ASCII |
Label: |
replication.R~ |
Notes: |
text/plain; charset=US-ASCII |
Label: |
replication_script.R |
Notes: |
text/plain; charset=US-ASCII |
Label: |
results_all_replication.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
reviewed_replication.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
UyghurLongName_obs_replicate.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
XJPDumplingcensor.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
XJPDumplingnotcensor.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
XJPDumplingupdown.csv |
Notes: |
text/plain; charset=US-ASCII |
Label: |
XJPDumpling_textanalysis.R |
Notes: |
text/plain; charset=US-ASCII |
Label: |
XJPDumpling_textanalysis.R~ |
Notes: |
text/plain; charset=US-ASCII |