Stay Tuned Replication Data

This repository contains replication data for the article: Griswold, Robbins, and Pollard. (2025). "Stay Tuned: Improving Sentiment Analysis and Stance Detection Using Large Language Models". Political Analysis.

Project data include tweets collected from Twitter between Jan 1st, 2020 and Jan 31st, 2021, for a random subset of general Twitter users and for all members of Congress. Tweets were coded for stance toward candidates in the 2020 US Presidential election. For more details, please see the referenced paper.

This repository also includes stance estimates modeled using several OpenAI Large Language Models. These estimates are included for researchers who are unable to reproduce these results using scripts from the code repository, since these models require proprietary access.

Code related to this project can be found at the following repository: https://github.com/maxgriswold/Stay-Tuned---Improving-Sentiment-Analysis-and-Stance-Detection-Using-Large-Language-Models


raw_train_biden.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 1.1 MB - 3 Variables, 5806 Observations - UNF:6:yz7wPH958bcDWmZhZzAOvQ==
Files from Yingjie Li, Tiberiu Sosea, Aditya Sawant, Ajith Jayaraman Nair, Diana Inkpen, and Cornelia Caragea. 2021. P-Stance: A Large Dataset for Stance Detection in Political Domain. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2355–2365, Online. Association for Computational Linguistics. Please cite this...

raw_train_trump.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 1.3 MB - 3 Variables, 6362 Observations - UNF:6:2XxnLaYaT/3UOaJEOZh5NA==
Files from Yingjie Li, Tiberiu Sosea, Aditya Sawant, Ajith Jayaraman Nair, Diana Inkpen, and Cornelia Caragea. 2021. P-Stance: A Large Dataset for Stance Detection in Political Domain. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2355–2365, Online. Association for Computational Linguistics. Please cite this...

training_key_biden_handcode.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 2.7 KB - 2 Variables, 250 Observations - UNF:6:Bi+Ugnp/BSMnx5p6YRwVDQ==
Fold used in study for user data pertaining to subject "Biden".

training_key_biden.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 65.6 KB - 2 Variables, 5508 Observations - UNF:6:kcZluOXqx2aGc+euEFkbcA==
Fold used in study for politician data pertaining to subject "Biden".

training_key_trump_handcode.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 2.6 KB - 2 Variables, 250 Observations - UNF:6:BA5q9fImYIp1lfM6juw6AA==
Fold used in study for user data pertaining to subject "Trump".

training_key_trump.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 186.8 KB - 2 Variables, 14934 Observations - UNF:6:uWqPoQU3Twe9YRMC01+RzA==
Fold used in study for politician data pertaining to subject "Trump".

trump_stance_test_public.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 69.8 KB - 3 Variables, 375 Observations - UNF:6:Uxo/q/iicdo/bC6vzg3T0A==
Files from Kornraphop Kawintiranon and Lisa Singh. 2021. Knowledge Enhanced Masked Language Model for Stance Detection. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4725–4735, Online. Association for Computational Linguistics. Please cite th...

trump_stance_train_public.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 162.5 KB - 3 Variables, 875 Observations - UNF:6:5yoy/WBKefH76CeqljSl3Q==
Files from Kornraphop Kawintiranon and Lisa Singh. 2021. Knowledge Enhanced Masked Language Model for Stance Detection. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4725–4735, Online. Association for Computational Linguistics. Please cite th...

user_train_tweets.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 79.8 KB - 4 Variables, 500 Observations - UNF:6:vW+aNXsP4jVMGQzxZLHw8g==
User tweets published between 9/20/2020 and 01/20/2021 which contain the word "Trump" or "Biden". Collected by the study team across 2022. This dataset was used to train stance models.

user_val_tweets.tab
Jun 18, 2025 - Replication Data for: "Stay Tuned - Improving Sentiment Analysis and Stance Detection Using Large Language Models"
Tabular Data - 136.6 KB - 4 Variables, 753 Observations - UNF:6:qIxCyvLBz4uptu6w6won+A==
User tweets published between 9/20/2020 and 01/20/2021 which contain the word "Trump" or "Biden". Collected by the study team across 2022. This dataset was used to validate model results.
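
After downloading, it can be useful to confirm that each file matches the variable and observation counts listed above. The following is a minimal sketch, not part of the replication code. It assumes the .tab files are tab-delimited with a single header row of variable names; the helper names load_tab and check_file are hypothetical, and column names are not assumed because the listing does not state them.

```python
import csv
from pathlib import Path

# Observation counts copied from the file listing above (header row excluded).
EXPECTED_ROWS = {
    "raw_train_biden.tab": 5806,
    "raw_train_trump.tab": 6362,
    "training_key_biden_handcode.tab": 250,
    "training_key_biden.tab": 5508,
    "training_key_trump_handcode.tab": 250,
    "training_key_trump.tab": 14934,
    "trump_stance_test_public.tab": 375,
    "trump_stance_train_public.tab": 875,
    "user_train_tweets.tab": 500,
    "user_val_tweets.tab": 753,
}


def load_tab(path):
    """Read a tab-delimited file and return (header, rows).

    Assumption: the first line holds the variable names.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        # Skip completely empty lines, e.g. a trailing blank line.
        rows = [row for row in reader if row]
    return header, rows


def check_file(path):
    """Compare a downloaded file against the counts in the listing."""
    header, rows = load_tab(path)
    expected = EXPECTED_ROWS.get(Path(path).name)  # None for unlisted files
    return {
        "file": Path(path).name,
        "variables": len(header),
        "observations": len(rows),
        "expected_observations": expected,
        "matches_listing": len(rows) == expected,
    }
```

For example, check_file("user_train_tweets.tab") should report 4 variables and 500 observations if the download is intact. A mismatch may simply mean the file uses different quoting or line breaks inside tweet text than this sketch assumes, so treat a failed check as a prompt to inspect the file rather than as proof of corruption.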

