This repository contains replication data for the article: Griswold, Robbins, and Pollard (2025). "Stay Tuned: Improving Sentiment Analysis and Stance Detection Using Large Language Models". Political Analysis.


Project data include tweets collected from Twitter between Jan 1st, 2020 and Jan 31st, 2021, for a random subset of general Twitter users and for all members of Congress. Tweets were coded for stance concerning candidates in the 2020 US Presidential election. For more details, please see the referenced paper.

This repository also includes stance estimates modeled using several OpenAI Large Language Models. These estimates are included for researchers who are unable to reproduce these results using scripts from the code repository, since these models require proprietary access.

Code related to this project can be found at the following repository: https://github.com/maxgriswold/Stay-Tuned---Improving-Sentiment-Analysis-and-Stance-Detection-Using-Large-Language-Models

Tabular Data - 1.1 MB - 3 Variables, 5806 Observations - UNF:6:yz7wPH958bcDWmZhZzAOvQ==
Files from Yingjie Li, Tiberiu Sosea, Aditya Sawant, Ajith Jayaraman Nair, Diana Inkpen, and Cornelia Caragea. 2021. P-Stance: A Large Dataset for Stance Detection in Political Domain. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2355–2365, Online. Association for Computational Linguistics. Please cite this...
Tabular Data - 1.3 MB - 3 Variables, 6362 Observations - UNF:6:2XxnLaYaT/3UOaJEOZh5NA==
Files from Yingjie Li, Tiberiu Sosea, Aditya Sawant, Ajith Jayaraman Nair, Diana Inkpen, and Cornelia Caragea. 2021. P-Stance: A Large Dataset for Stance Detection in Political Domain. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2355–2365, Online. Association for Computational Linguistics. Please cite this...
Tabular Data - 2.7 KB - 2 Variables, 250 Observations - UNF:6:Bi+Ugnp/BSMnx5p6YRwVDQ==
Fold used in study for user data pertaining to subject "Biden".
Tabular Data - 65.6 KB - 2 Variables, 5508 Observations - UNF:6:kcZluOXqx2aGc+euEFkbcA==
Fold used in study for politician data pertaining to subject "Biden".
Tabular Data - 2.6 KB - 2 Variables, 250 Observations - UNF:6:BA5q9fImYIp1lfM6juw6AA==
Fold used in study for user data pertaining to subject "Trump".
Tabular Data - 186.8 KB - 2 Variables, 14934 Observations - UNF:6:uWqPoQU3Twe9YRMC01+RzA==
Fold used in study for politician data pertaining to subject "Trump".
Tabular Data - 69.8 KB - 3 Variables, 375 Observations - UNF:6:Uxo/q/iicdo/bC6vzg3T0A==
Files from Kornraphop Kawintiranon and Lisa Singh. 2021. Knowledge Enhanced Masked Language Model for Stance Detection. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4725–4735, Online. Association for Computational Linguistics. Please cite th...
Tabular Data - 162.5 KB - 3 Variables, 875 Observations - UNF:6:5yoy/WBKefH76CeqljSl3Q==
Files from Kornraphop Kawintiranon and Lisa Singh. 2021. Knowledge Enhanced Masked Language Model for Stance Detection. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4725–4735, Online. Association for Computational Linguistics. Please cite th...
Tabular Data - 79.8 KB - 4 Variables, 500 Observations - UNF:6:vW+aNXsP4jVMGQzxZLHw8g==
User tweets published between 9/20/2020 and 01/20/2021 that contain the word "Trump" or "Biden". Collected by the study team during 2022. This dataset was used to train stance models.
Tabular Data - 136.6 KB - 4 Variables, 753 Observations - UNF:6:qIxCyvLBz4uptu6w6won+A==
User tweets published between 9/20/2020 and 01/20/2021 that contain the word "Trump" or "Biden". Collected by the study team during 2022. This dataset was used to validate model results.