Replication Data for: Case Studies in Public Interest Technology (doi:10.7910/DVN/D2A279)

Document Description

Citation

Title:

Replication Data for: Case Studies in Public Interest Technology

Identification Number:

doi:10.7910/DVN/D2A279

Distributor:

Harvard Dataverse

Date of Distribution:

2022-02-14

Version:

1

Bibliographic Citation:

Zang, Jinyan; Sweeney, Latanya, 2022, "Replication Data for: Case Studies in Public Interest Technology", https://doi.org/10.7910/DVN/D2A279, Harvard Dataverse, V1, UNF:6:XACoJaGEmZk/OdiANhaNYw== [fileUNF]

Study Description

Citation

Title:

Replication Data for: Case Studies in Public Interest Technology

Identification Number:

doi:10.7910/DVN/D2A279

Authoring Entity:

Zang, Jinyan (Harvard University)

Sweeney, Latanya (Harvard University)

Producer:

Department of Government

Grant Number:

1730326

Distributor:

Harvard Dataverse

Distributor:

Department of Government

Access Authority:

Wall, Thom

Depositor:

Zang, Jinyan

Date of Deposit:

2021-05-11

Holdings Information:

https://doi.org/10.7910/DVN/D2A279

Study Scope

Keywords:

Computer and Information Science, Social Sciences, Facebook, SSN, Contact Tracing, Public Interest Technology

Topic Classification:

Harvard University, Department of Government

Abstract:

Case Studies in Public Interest Technology

Today, there are many ways in which digital technologies adversely impact the public interest, including the spread of misinformation online, the loss of privacy, and the threat of algorithmic discrimination. Public interest technology is an emerging field that seeks to use cross-disciplinary techniques to research and address these issues in order to advance the public interest.

For this dissertation, I present three case studies of public interest tech research projects, each of which focuses on a different technology and relevant public interest. In Chapter 2, I research how Facebook’s advertising algorithms can discriminate by race and ethnicity. In Chapter 3, I test how the predictability of Social Security Number (SSN) assignment based on easily accessible data about Americans presents a risk of identity theft. In Chapter 4, I demonstrate how TraceFi, a Wi-Fi based collocation detection technology, can be deployed for COVID-19 contact tracing.

I propose adapting Lawrence Lessig’s pathetic dot model as the “Three Forces Model of Public Interest Tech” to understand the current dysfunctional state of relationships between technology, society, and the public interest, where the public interest is often affected as an output of technology but not fully considered as an input. The three forces of the law, norms, and market can affect a given technology or vice versa, which in turn affects the public interest. For different combinations of technologies and public interests, the amount of force exerted by the law, norms, or market could differ, and so could the degree of feedback between the technology and each of the forces.

Since the normative goal of public interest tech as a field is ultimately to advance the public interest, the goal state of the Three Forces Model demonstrates how the public interest can be an input for the law, norms, and market in how they affect a technology’s design and usage, which would in turn affect the public interest. Stakeholders relevant to each of the forces can consider the public interest as a priority in how they interact with a technology and its designer.

In Chapter 5, I present how we can apply the Three Forces Model of Public Interest Tech to each case study to describe the current state and the ideal goal state.

In order to respond effectively to the many ways in which digital technologies have adversely impacted the public interest, we need a “whole-society” strategy that coordinates our laws, norms, and markets in how they interact with our technologies to prioritize the public interest. As public interest technologists, we need to work across disciplines to advance the public interest.

Let’s get started.

Date of Collection:

2020-01-01 to 2021-01-31

2013-01-01 to 2013-12-31

2020-07-01 to 2020-10-02

Unit of Analysis:

individuals

Kind of Data:

program source code

Kind of Data:

experimental data

Kind of Data:

event/transaction data

Methodology and Processing

Sources Statement

Notes:

This study was deposited under the Data-PASS standard deposit terms. A copy of the usage agreement is included in the file section of this study.

Data Access

Restrictions:

The data archived in the Harvard Government Dissertation Dataverse are restricted for use for five years post deposit date. I will use these data solely for the purposes stated in my application to use data, detailed in a written research proposal.

Citation Requirement:

I will include a bibliographic citation acknowledging the use of these data in any publication or presentation in which these data are used. Such citations will appear in footnotes or in the reference section of any such manuscript. I understand the guideline in "How to Cite This Dataset" described in the Summary of this study.

Conditions:

The data are available without additional conditions other than those stated in the "Restrictions" Terms of Use above.

Notes:

This dataset is made available under a Creative Commons CC0 license with the following additional/modified terms and conditions:

Embargoed for 5 years from the publication date

Other Study Description Materials

Related Publications

Citation

Title:

A copy of this dissertation is available here: https://about.proquest.com/en/dissertations/

File Description--f4639964

File: Chapter_2_FB_Data_Analysis.tab

  • Number of cases: 35

  • No. of variables per record: 6

  • Type of File: text/tab-separated-values

Notes:

UNF:6:90iEmnU8e36HmDyiBzaINQ==

Chapter 2: Analysis of Facebook data and creation of figures

File Description--f4639979

File: Chapter_3_SSN_dmf_eab.tab

  • Number of cases: 262012

  • No. of variables per record: 12

  • Type of File: text/tab-separated-values

Notes:

UNF:6:ysI87WSKSACZjyQmigSf0g==

Chapter 3: Death Master File (1989 - 2011)

File Description--f4639977

File: Chapter_3_SSN_gn_index_and_an_freq_bounds_per_state.tab

  • Number of cases: 51

  • No. of variables per record: 13

  • Type of File: text/tab-separated-values

Notes:

UNF:6:9bKiU6r5J/RKJMhIxBginQ==

Chapter 3: GNAN index values used to convert SSN to SSN index values

File Description--f4639973

File: Chapter_3_SSN_Predictive Accuracy by State Weighted by Births - Post 1995.tab

  • Number of cases: 51

  • No. of variables per record: 7

  • Type of File: text/tab-separated-values

Notes:

UNF:6:r3LZbDrj/DK4xLA6hF2mpQ==

Chapter 3: Predictive Accuracy by State Weighted by Births - Post 1995

File Description--f4639974

File: Chapter_3_SSN_Predictive Accuracy by State Weighted by Births.tab

  • Number of cases: 51

  • No. of variables per record: 7

  • Type of File: text/tab-separated-values

Notes:

UNF:6:aKLKU0H6Ux1TZRUHspsTNg==

Chapter 3: Predictive Accuracy by State Weighted by Births

File Description--f4639978

File: Chapter_3_SSN_predictive accuracy by year of birth.tab

  • Number of cases: 23

  • No. of variables per record: 7

  • Type of File: text/tab-separated-values

Notes:

UNF:6:o6R00V47h+HjWrrKzSkafg==

Chapter 3: Predictive Accuracy by Year of Birth

File Description--f4639985

File: Chapter_4_TraceFi_fingerprints_colab_building_A.tab

  • Number of cases: 1046

  • No. of variables per record: 307

  • Type of File: text/tab-separated-values

Notes:

UNF:6:PjqRGZrPBss4wfvAt3WC2A==

Chapter 4: Building A Wi-Fi fingerprint data

File Description--f4639982

File: Chapter_4_TraceFi_fingerprints_colab_building_B.tab

  • Number of cases: 2081

  • No. of variables per record: 333

  • Type of File: text/tab-separated-values

Notes:

UNF:6:54MP5jQPceVWWyXeeQazAQ==

Chapter 4: Building B Wi-Fi fingerprint data

File Description--f4639983

File: Chapter_4_TraceFi_fingerprints_colab_building_C.tab

  • Number of cases: 249

  • No. of variables per record: 21

  • Type of File: text/tab-separated-values

Notes:

UNF:6:+nSPKnpEmutv/qQ2HdOccA==

Chapter 4: Building C Wi-Fi fingerprint data
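
The case and variable counts listed in the file descriptions above can serve as a quick integrity check after download. The following is a minimal sketch, not part of the deposit. It assumes each .tab file is tab-delimited with a single header row of variable names, so that "Number of cases" excludes the header; the function names and any local file path are hypothetical.

```python
import csv


def tab_shape(path):
    """Return (cases, variables) for a tab-delimited file.

    Assumes the first row is a header of variable names, so it is
    not counted as a case. This is an assumption about the export
    format, not something the codebook states.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, None)
        if header is None:
            return 0, 0  # empty file
        cases = sum(1 for row in reader if row)  # skip blank lines
        return cases, len(header)


def matches_codebook(path, expected_cases, expected_variables):
    """Compare a file's shape with the counts listed in the codebook."""
    return tab_shape(path) == (expected_cases, expected_variables)
```

If these assumptions hold, a call such as matches_codebook("Chapter_2_FB_Data_Analysis.tab", 35, 6) should succeed for an intact local copy of that file; a mismatch may indicate a truncated download or a different header convention.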

Other Study-Related Materials

Label:

Chapter_2_FB_data_Jan_2020.7z

Text:

Facebook ad platform data (Jan 2020) as compressed directory due to large # of files

Notes:

application/x-7z-compressed

Other Study-Related Materials

Label:

Chapter_2_FB_data_Jan_2021.7z

Text:

Facebook ad platform data (Jan 2021) as compressed directory due to large # of files

Notes:

application/x-7z-compressed

Other Study-Related Materials

Label:

Chapter_3_SSN_SSN_Index_Values_vs_DOB_Plots.7z

Text:

Chapter 3: Plots of SSN index values vs. DOB for each state as compressed directory

Notes:

application/x-7z-compressed

Other Study-Related Materials

Label:

Chapter_3_SSN_Study_Code.Rmd

Text:

Chapter 3: R notebook of code used in the study

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Chapter_4_TraceFi_TraceFi_Models.ipynb

Text:

Chapter 4: TraceFi model training and test results

Notes:

application/x-ipynb+json