Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication Data for: Synthetic Replacements for Human Survey Data? The Perils of Large Language Models |
Identification Number: |
doi:10.7910/DVN/VPN481 |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2024-04-02 |
Version: |
1 |
Bibliographic Citation: |
Bisbee, James, 2024, "Replication Data for: Synthetic Replacements for Human Survey Data? The Perils of Large Language Models", https://doi.org/10.7910/DVN/VPN481, Harvard Dataverse, V1 |
Citation |
|
Title: |
Replication Data for: Synthetic Replacements for Human Survey Data? The Perils of Large Language Models |
Identification Number: |
doi:10.7910/DVN/VPN481 |
Authoring Entity: |
Bisbee, James (Vanderbilt University) |
Producer: |
Political Analysis |
Distributor: |
Harvard Dataverse |
Access Authority: |
Bisbee, James |
Depositor: |
Bisbee, James |
Date of Deposit: |
2024-02-15 |
Holdings Information: |
https://doi.org/10.7910/DVN/VPN481 |
Study Scope |
|
Keywords: |
Social Sciences, ChatGPT, synthetic data, public opinion, research ethics |
Abstract: |
Large Language Models (LLMs) offer new research possibilities for social scientists, but their potential as “synthetic data” is still largely unknown. In this paper, we investigate how accurately the popular LLM ChatGPT can recover public opinion, prompting the LLM to adopt different “personas” and then provide feeling thermometer scores for 11 sociopolitical groups. The average scores generated by ChatGPT correspond closely to the averages in our baseline survey, the 2016–2020 American National Election Study. Nevertheless, sampling by ChatGPT is not reliable for statistical inference: there is less variation in responses than in the real surveys, and regression coefficients often differ significantly from equivalent estimates obtained using ANES data. We also document how the distribution of synthetic responses varies with minor changes in prompt wording, and we show how the same prompt yields significantly different results over a three-month period. Altogether, our findings raise serious concerns about the quality, reliability, and reproducibility of synthetic survey data generated by LLMs. |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
Forthcoming, Political Analysis |
Bibliographic Citation: |
Forthcoming, Political Analysis |
Label: |
PA_replication.zip |
Notes: |
application/zip |