Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication Data for: Differentially Private Survey Research |
Identification Number: |
doi:10.7910/DVN/X4Y2FL |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2023-12-19 |
Version: |
1 |
Bibliographic Citation: |
Evans, Georgina; King, Gary; Smith, Adam; Thakurta, Abhradeep, 2023, "Replication Data for: Differentially Private Survey Research", https://doi.org/10.7910/DVN/X4Y2FL, Harvard Dataverse, V1, UNF:6:1hQlAh8RGzLi+kKnI82oXw== [fileUNF] |
Citation |
|
Title: |
Replication Data for: Differentially Private Survey Research |
Identification Number: |
doi:10.7910/DVN/X4Y2FL |
Authoring Entity: |
Evans, Georgina (Harvard University) |
King, Gary (Harvard University) |
Smith, Adam (Boston University) |
Thakurta, Abhradeep (University of California Santa Cruz) |
Producer: |
Georgina Evans |
Distributor: |
Harvard Dataverse |
Access Authority: |
Evans, Georgina |
Depositor: |
Evans, Georgina |
Date of Deposit: |
2022-08-29 |
Holdings Information: |
https://doi.org/10.7910/DVN/X4Y2FL |
Study Scope |
|
Keywords: |
Social Sciences, Privacy, Statistics, Inference |
Abstract: |
Survey researchers have long protected the privacy of respondents via de-identification (removing names and other directly identifying information) before sharing data. Although these procedures help, recent research demonstrates that they fail to protect respondents from intentional re-identification attacks, a problem that threatens to undermine vast survey enterprises in academia, government, and industry. This is especially a problem in political science because political beliefs are not merely the subject of our scholarship; they represent some of the most important information respondents want to keep private. We confirm the problem in practice by re-identifying individuals from a survey about a controversial referendum declaring life beginning at conception. We build on the concept of “differential privacy” to offer new data sharing procedures with mathematical guarantees for protecting respondent privacy and statistical validity guarantees for social scientists analyzing differentially private data. The cost of these new procedures is larger standard errors, which can be overcome with somewhat larger sample sizes. |
Notes: |
This dataset underwent an independent verification process, complying with the AJPS Verification Policy updated June 2023, that replicated the tables and figures in the primary article. For the supplementary materials, verification confirmed only that the code executes successfully. The verification process was carried out by the Odum Institute for Research in Social Science at the University of North Carolina at Chapel Hill. The associated article has been awarded the Open Materials Badge. Learn more about the Open Practice Badges from the Center for Open Science (https://osf.io/tvyxz/wiki/home/). |
Methodology and Processing |
|
Sources Statement |
|
Data Sources: |
Rosenfeld, Bryn; Imai, Kosuke; Shapiro, Jacob, 2015, "Replication Data for: An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions", https://doi.org/10.7910/DVN/29911, Harvard Dataverse, V3, UNF:5:wfSfR7xnbL9XigVosud4zA== [fileUNF] |
Data Access |
|
Disclaimer: |
The American Journal of Political Science and the Odum Institute for Research in Social Science are not responsible for the accuracy or quality of data uploaded within the AJPS Dataverse, for the use of those data, or for interpretations or conclusions based on their use. |
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
Evans, Georgina, Gary King, Adam D. Smith, and Abhradeep Thakurta. [date]. "Differentially Private Survey Research." American Journal of Political Science, Forthcoming. http://ajps.org/ |
Bibliographic Citation: |
Evans, Georgina, Gary King, Adam D. Smith, and Abhradeep Thakurta. [date]. "Differentially Private Survey Research." American Journal of Political Science, Forthcoming. http://ajps.org/ |
File Description--f7673753 |
|
File: k_sims.tab |
|
|
|
Notes: |
UNF:6:khzDUos3LKNK++SA0qos2g== |
File Description--f7673746 |
|
File: main_sims.tab |
|
|
|
Notes: |
UNF:6:ybdf76nGv0zkZnO8zgTgPA== |
List of Variables: |
|
Variables |
|
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 1.2720189369362973; Max. 1.73668069093032; Mean 1.5027557072832878; StDev 0.07683358665682588 Variable Format: numeric Notes: UNF:6:GKN3YY2uNDsFiGZWUSkBHg== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 0.31969283513689173; Max. 2.1259068620668202; Mean 1.2466164126791646; StDev 0.22221282330510794 Variable Format: numeric Notes: UNF:6:yJIllsz96IZ/8fyrYeI5hQ== |
f7673753 Location: |
Summary Statistics: Valid 796.0; Min. -0.19429541441521395; Max. 5.679337874311988; Mean 1.5270863495760985; StDev 0.4967055137536442 Variable Format: numeric Notes: UNF:6:6E/KCxvA7Utjhd3mbs+qNg== |
f7673753 Location: |
Summary Statistics: Valid 796.0; Min. 0.206101091406152; Max. 25.862257672408422; Mean 0.522523773934766; StDev 0.98766294931149 Variable Format: numeric Notes: UNF:6:w3xcf1mQ9e4AG9+ug3p1/Q== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 0.6788424860121531; Max. 2.5343270594144687; Mean 1.4828276768855337; StDev 0.21934761771782632 Variable Format: numeric Notes: UNF:6:S0m2BS9P8369gA9Od3US8A== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 0.01640300148457335; Max. 0.4071612050104414; Mean 0.05125408519622568; StDev 0.045879262059710504 Variable Format: numeric Notes: UNF:6:AeCgCpaHo5CnPLSCQclkAQ== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 1.5; Max. 3.0; Mean 2.53125; StDev 0.6183153158278016 Variable Format: numeric Notes: UNF:6:+59sRB2TkDe5CI8P94mzGQ== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 5000.0; Max. 5000.0; Mean 5000.0; StDev 0.0 Variable Format: numeric Notes: UNF:6:JyJP7JOyP61F787Y6DyZTA== |
f7673753 Location: |
Summary Statistics: Valid 800.0; Min. 92.0; Max. 212.0; Mean 122.0; StDev 51.994030657663366 Variable Format: numeric Notes: UNF:6:hhZ9KH/3POhXWJ3FuoKY+A== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 0.9271491035473647; Max. 2.166432466885211; Mean 1.487662551637857; StDev 0.1621077666802211 Variable Format: numeric Notes: UNF:6:DaAk1XB78OrNABzNCbuunQ== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 0.34355057155357593; Max. 1.9453736404322763; Mean 1.2693089130887847; StDev 0.2215281691873661 Variable Format: numeric Notes: UNF:6:gmh1EHh88Zg6spUS+tdBbQ== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 0.15125494273371956; Max. 0.3043607087676973; Mean 0.20741768677634173; StDev 0.028521218308317123 Variable Format: numeric Notes: UNF:6:Ou0OXSxt+8Pm3XARjqPZDA== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. -0.21938981837435853; Max. 4.19173586661004; Mean 1.5227978795523087; StDev 0.41882022328105667 Variable Format: numeric Notes: UNF:6:aBdesdIF+waLdxaDDpqDaA== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 0.15889221856021768; Max. 6.723271544477865; Mean 0.40696236188501206; StDev 0.3047595822764976 Variable Format: numeric Notes: UNF:6:gDTXC90HjG7YmnejDXLM1Q== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 0.6478957658974533; Max. 2.598813202844112; Mean 1.496106607240939; StDev 0.24124068894675973 Variable Format: numeric Notes: UNF:6:H2epMSQ6JntFaG3Bz2EnPg== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 3.5; Max. 8.0; Mean 4.516129032258072; StDev 1.3277703231781692 Variable Format: numeric Notes: UNF:6:lIseoamlehdkVDwiKBbwXA== |
f7673746 Location: |
Summary Statistics: Valid 1550.0; Min. 1000.0; Max. 1000.0; Mean 1000.0; StDev 0.0 Variable Format: numeric Notes: UNF:6:jtmgwiRlZO+snpnlcFD/1g== |
f7673746 Location: |
Summary Statistics: Valid 600.0; Min. 0.025096825477179112; Max. 0.16255296044189924; Mean 0.056179799476229024; StDev 0.02331268246773068 Variable Format: numeric Notes: UNF:6:wwO1f0u9hpU4OMsu0dQtBg== |
Label: |
fig2a.R |
Notes: |
type/x-r-syntax |
Label: |
fig2b.R |
Notes: |
type/x-r-syntax |
Label: |
fig3.R |
Notes: |
type/x-r-syntax |
Label: |
fig4.R |
Notes: |
type/x-r-syntax |
Label: |
README |
Notes: |
text/plain; charset=US-ASCII |
Label: |
RUN_ALL.R |
Notes: |
type/x-r-syntax |
Label: |
run_simulations.R |
Notes: |
type/x-r-syntax |
Label: |
run_simulations_k.R |
Notes: |
type/x-r-syntax |
Label: |
simulation_functions.R |
Notes: |
type/x-r-syntax |
Label: |
dp_hier_hist.R |
Notes: |
type/x-r-syntax |
Label: |
em_code.R |
Notes: |
type/x-r-syntax |
Label: |
logit_regression.R |
Notes: |
type/x-r-syntax |
Label: |
poisson_regression.R |
Notes: |
type/x-r-syntax |
Label: |
util_functions.R |
Notes: |
type/x-r-syntax |