Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses (doi:10.7910/DVN/FSK6NZ)

Document Description

Citation

Title:

Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses

Identification Number:

doi:10.7910/DVN/FSK6NZ

Distributor:

Harvard Dataverse

Date of Distribution:

2024-10-30

Version:

1

Bibliographic Citation:

Hobbs, William; Green, Jon, 2024, "Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses", https://doi.org/10.7910/DVN/FSK6NZ, Harvard Dataverse, V1

Study Description

Citation

Title:

Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses

Identification Number:

doi:10.7910/DVN/FSK6NZ

Authoring Entity:

Hobbs, William (Cornell University)

Green, Jon (Duke University)

Distributor:

Harvard Dataverse

Access Authority:

Hobbs, William

Depositor:

Code Ocean

Holdings Information:

https://doi.org/10.7910/DVN/FSK6NZ

Study Scope

Keywords:

Social Sciences

Abstract:

Article abstract: Past work on closed-ended survey responses demonstrates that inferring stable political attitudes requires separating signal from noise in “top of the head” answers to researchers’ questions. We outline a corresponding theory of the open-ended response, in which respondents make narrow, stand-in statements to convey more abstract, general attitudes. We then present a method designed to infer those attitudes. Our approach leverages co-variation with words used relatively frequently across respondents to infer what else they could have said without substantively changing what they meant – linking narrow themes to each other through associations with contextually prevalent words. This reflects the intuition that a respondent may use different specific statements at different points in time to convey similar meaning. We validate this approach using panel data in which respondents answer the same open-ended questions (concerning healthcare policy, most important problems, and evaluations of political parties) at multiple points in time, showing that our method’s output consistently exhibits higher within-subject correlations than hand-coding of narrow response categories, topic modeling, and large language model output. Finally, we show how large language models can be used to complement – but not, at present, substitute – our “implied word” method.
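
The replication code itself is contained in the capsule zip listed under Other Study-Related Materials; the snippet below is only a minimal, hypothetical sketch of the intuition described in the abstract. It represents each open-ended response by its association with contextually prevalent words (the "implied words"), then checks within-subject correlation of those scores across two panel waves. The toy responses, the one-third prevalence threshold, and all variable names are illustrative assumptions, not the authors' actual procedure.

# Minimal sketch of the "implied word" intuition from the abstract.
# Not the authors' replication code; toy data and thresholds are hypothetical.
import numpy as np

# Toy panel: the same three respondents answer the same question in two waves.
wave1 = ["costs are too high for families",
         "government should stay out of healthcare",
         "insurance premiums keep rising"]
wave2 = ["prescription prices are too high",
         "keep government out of my doctor choices",
         "premiums and deductibles keep going up"]

def tokenize(doc):
    return doc.lower().split()

# Shared vocabulary across both waves.
vocab = sorted({w for doc in wave1 + wave2 for w in tokenize(doc)})
index = {w: i for i, w in enumerate(vocab)}

def doc_term_matrix(docs):
    X = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for w in tokenize(doc):
            X[i, index[w]] += 1
    return X

X1, X2 = doc_term_matrix(wave1), doc_term_matrix(wave2)
X_all = np.vstack([X1, X2])

# "Contextually prevalent" words: here, hypothetically, words appearing in at
# least a third of all responses (the paper's actual criterion may differ).
doc_freq = (X_all > 0).mean(axis=0)
prevalent = doc_freq >= 1 / 3

# Word-by-prevalent-word association matrix from co-occurrence: words that
# tend to appear alongside the same prevalent words become near-interchangeable.
A = X_all.T @ X_all[:, prevalent]               # |vocab| x |prevalent|
A = A / np.maximum(A.sum(axis=1, keepdims=True), 1)

# Each response is scored by the prevalent words it "implies",
# not only the words it literally contains.
S1, S2 = X1 @ A, X2 @ A

# Within-subject (test-retest) correlation of each implied-word dimension.
for j in range(S1.shape[1]):
    if S1[:, j].std() > 0 and S2[:, j].std() > 0:
        r = np.corrcoef(S1[:, j], S2[:, j])[0, 1]
        print(f"dimension {j}: within-subject r = {r:.2f}")

Under this sketch, two responses that use different specific wording but co-occur with the same prevalent words receive similar scores, which is the property the abstract evaluates via within-subject correlations across panel waves.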

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Other Study-Related Materials

Label:

capsule-5767693.zip

Notes:

application/zip

Other Study-Related Materials

Label:

result-997ac6a3-8ccf-464b-a150-9648e5dc3614.zip

Notes:

application/zip