Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses (doi:10.7910/DVN/FSK6NZ)

Document Description

Citation

Title:

Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses

Identification Number:

doi:10.7910/DVN/FSK6NZ

Distributor:

Harvard Dataverse

Date of Distribution:

2024-10-30

Version:

1

Bibliographic Citation:

Hobbs, William; Green, Jon, 2024, "Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses", https://doi.org/10.7910/DVN/FSK6NZ, Harvard Dataverse, V1

Study Description

Citation

Title:

Replication materials for: Categorizing topics versus inferring attitudes: a theory and method for analyzing open-ended survey responses

Identification Number:

doi:10.7910/DVN/FSK6NZ

Authoring Entity:

Hobbs, William (Cornell University)

Green, Jon (Duke University)

Distributor:

Harvard Dataverse

Access Authority:

Hobbs, William

Depositor:

Code Ocean

Holdings Information:

https://doi.org/10.7910/DVN/FSK6NZ

Study Scope

Keywords:

Social Sciences

Abstract:

Article abstract: Past work on closed-ended survey responses demonstrates that inferring stable political attitudes requires separating signal from noise in “top of the head” answers to researchers’ questions. We outline a corresponding theory of the open-ended response, in which respondents make narrow, stand-in statements to convey more abstract, general attitudes. We then present a method designed to infer those attitudes. Our approach leverages co-variation with words used relatively frequently across respondents to infer what else they could have said without substantively changing what they meant – linking narrow themes to each other through associations with contextually prevalent words. This reflects the intuition that a respondent may use different specific statements at different points in time to convey similar meaning. We validate this approach using panel data in which respondents answer the same open-ended questions (concerning healthcare policy, most important problems, and evaluations of political parties) at multiple points in time, showing that our method’s output consistently exhibits higher within-subject correlations than hand-coding of narrow response categories, topic modeling, and large language model output. Finally, we show how large language models can be used to complement – but not, at present, substitute – our “implied word” method.
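
The replication code itself is contained in the capsule zip listed under Other Study-Related Materials; the snippet below is only a minimal, hypothetical sketch of the intuition described in the abstract. It represents each open-ended response by its association with contextually prevalent words (the "implied words"), then checks within-subject correlation of those scores across two panel waves. The toy responses, the one-third prevalence threshold, and all variable names are illustrative assumptions, not the authors' actual procedure.

# Minimal sketch of the "implied word" intuition from the abstract.
# Not the authors' replication code; toy data and thresholds are hypothetical.
import numpy as np

# Toy panel: the same three respondents answer the same question in two waves.
wave1 = ["costs are too high for families",
         "government should stay out of healthcare",
         "insurance premiums keep rising"]
wave2 = ["prescription prices are too high",
         "keep government out of my doctor choices",
         "premiums and deductibles keep going up"]

def tokenize(doc):
    return doc.lower().split()

# Shared vocabulary across both waves.
vocab = sorted({w for doc in wave1 + wave2 for w in tokenize(doc)})
index = {w: i for i, w in enumerate(vocab)}

def doc_term_matrix(docs):
    X = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for w in tokenize(doc):
            X[i, index[w]] += 1
    return X

X1, X2 = doc_term_matrix(wave1), doc_term_matrix(wave2)
X_all = np.vstack([X1, X2])

# "Contextually prevalent" words: here, hypothetically, words appearing in at
# least a third of all responses (the paper's actual criterion may differ).
doc_freq = (X_all > 0).mean(axis=0)
prevalent = doc_freq >= 1 / 3

# Word-by-prevalent-word association matrix from co-occurrence: words that
# tend to appear alongside the same prevalent words become near-interchangeable.
A = X_all.T @ X_all[:, prevalent]               # |vocab| x |prevalent|
A = A / np.maximum(A.sum(axis=1, keepdims=True), 1)

# Each response is scored by the prevalent words it "implies",
# not only the words it literally contains.
S1, S2 = X1 @ A, X2 @ A

# Within-subject (test-retest) correlation of each implied-word dimension.
for j in range(S1.shape[1]):
    if S1[:, j].std() > 0 and S2[:, j].std() > 0:
        r = np.corrcoef(S1[:, j], S2[:, j])[0, 1]
        print(f"dimension {j}: within-subject r = {r:.2f}")

Under this sketch, two responses that use different specific wording but co-occur with the same prevalent words receive similar scores, which is the property the abstract evaluates via within-subject correlations across panel waves.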

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Other Study-Related Materials

Label:

capsule-5767693.zip

Notes:

application/zip

Other Study-Related Materials

Label:

result-997ac6a3-8ccf-464b-a150-9648e5dc3614.zip

Notes:

application/zip