Census-block-level-TSGI (doi:10.7910/DVN/IAYJOC)

Document Description

Citation

Title:

Census-block-level-TSGI

Identification Number:

doi:10.7910/DVN/IAYJOC

Distributor:

Harvard Dataverse

Date of Distribution:

2023-11-27

Version:

3

Bibliographic Citation:

Fu, Xiaokang; Jain, Devika; Hayes, Jack, 2023, "Census-block-level-TSGI", https://doi.org/10.7910/DVN/IAYJOC, Harvard Dataverse, V3

Study Description

Citation

Title:

Census-block-level-TSGI

Identification Number:

doi:10.7910/DVN/IAYJOC

Authoring Entity:

Fu, Xiaokang (Harvard University)

Jain, Devika (Harvard University)

Hayes, Jack (Harvard University)

Distributor:

Harvard Dataverse

Access Authority:

Jain, Devika

Depositor:

Hayes, Jack

Date of Deposit:

2023-10-24

Holdings Information:

https://doi.org/10.7910/DVN/IAYJOC

Study Scope

Keywords:

Arts and Humanities, Earth and Environmental Sciences, Social Sciences, Geospatial, Big Data, Open Source, Social Media, Twitter

Abstract:

<p>Harvard CGA Geotweet Census Archive is a subset of the <a href="https://doi.org/10.7910/DVN/3NCMB6">Harvard CGA Geotweet Archive v2.0</a> enriched with nationwide census data. It contains the tweet and user identification records along with census variables and sentiment scores for more than 2 billion geo-tagged tweets from January 2012 to July 2023. The sentiment scores are derived from the BERT sentiment scores in the <a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/X2KJPC">Harvard CGA Geotweet Sentiment Archive</a>. Unlike the <a href="https://doi.org/10.7910/DVN/3NCMB6">Harvard CGA Geotweet Archive v2.0</a>, which cannot be shared publicly under <a href="https://developer.twitter.com/en/developer-terms/agreement-and-policy">Twitter's redistribution policy</a>, this dataset is available to the academic community at large. It could serve as cross-validation data for publications that used data from the <a href="https://doi.org/10.7910/DVN/3NCMB6">Harvard CGA Geotweet Archive v2.0</a>.</p>
<p>If you are interested in accessing this archive, please fill out our <a href="https://gis.harvard.edu/geotweet-request-form">Geotweet Request Form</a>. Before requesting or receiving Tweet IDs, requestors must agree to <a href="https://twitter.com/en/tos">Twitter's Terms of Service</a>, <a href="https://twitter.com/en/privacy">Twitter's Privacy Policy</a>, and <a href="https://developer.twitter.com/en/developer-terms/policy">Twitter's Developer Policy</a>. Geotweet ID data provided by the CGA can be used only for not-for-profit research and academic purposes. Recipients may not share CGA-provided Tweet IDs or content derived from them without written permission from the CGA.</p>
<p><strong>Citations:</strong></p>
<p>If you use the Geotweet Archive in your research, please reference it: "<a href="https://doi.org/10.7910/DVN/KTRIJP">Harvard CGA Geotweet IDs Archive</a>".</p>
========================================================
<p>Schema of Geotweet Census Archive</p>
<p><strong>Field name----TYPE----Description</strong></p>
<p><strong>day</strong>----TEXT----The date of the tweet (YYYY-MM-DD)</p>
<p><strong>GEOID20</strong>----TEXT----Census block GEOID</p>
<p><strong>tweet_count</strong>----INTEGER----Number of tweets in the census block</p>
<p><strong>user_count</strong>----INTEGER----Number of unique users in the census block</p>
<p><strong>avg_score</strong>----FLOAT----The average tweet sentiment score in the census block</p>
<p><strong>max_score</strong>----FLOAT----The maximum tweet sentiment score in the census block</p>
<p><strong>min_score</strong>----FLOAT----The minimum tweet sentiment score in the census block</p>
<p><strong>std_score</strong>----FLOAT----The standard deviation of tweet sentiment scores in the census block</p>
<p><strong>score_10q</strong>----FLOAT----The 10th percentile of tweet sentiment scores in the census block</p>
<p><strong>score_25q</strong>----FLOAT----The 25th percentile of tweet sentiment scores in the census block</p>
<p><strong>score_50q</strong>----FLOAT----The 50th percentile (median) of tweet sentiment scores in the census block</p>
<p><strong>score_75q</strong>----FLOAT----The 75th percentile of tweet sentiment scores in the census block</p>
<p><strong>score_90q</strong>----FLOAT----The 90th percentile of tweet sentiment scores in the census block</p>
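As an illustration of working with the schema above, the following is a minimal Python sketch, not part of the archive's own tooling. It assumes that GEOID20 follows the standard 15-digit census block layout (2-digit state, 3-digit county, 6-digit tract, 4-digit block) and that each row is one day and one block. The function names `split_geoid20` and `county_daily_mean` and the sample rows are hypothetical. The sketch rolls block-level rows up to county level, weighting avg_score by tweet_count so that busy blocks count proportionally.

```python
from collections import defaultdict


def split_geoid20(geoid: str) -> dict:
    """Split a 15-digit census block GEOID into its standard components.

    GEOID20 is stored as TEXT, so leading zeros in the state code survive.
    """
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError(f"expected a 15-digit block GEOID, got {geoid!r}")
    return {
        "state": geoid[:2],
        "county": geoid[2:5],
        "tract": geoid[5:11],
        "block": geoid[11:15],
    }


def county_daily_mean(rows):
    """Roll block-level rows up to (day, county) with a tweet-weighted mean.

    Each row is a dict carrying the schema fields day, GEOID20,
    tweet_count and avg_score. Returns {(day, county_fips): (tweets, mean)}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # key -> [tweets, weighted score sum]
    for row in rows:
        key = (row["day"], row["GEOID20"][:5])  # state + county FIPS prefix
        totals[key][0] += row["tweet_count"]
        totals[key][1] += row["tweet_count"] * row["avg_score"]
    return {key: (n, s / n) for key, (n, s) in totals.items() if n > 0}


# Made-up rows for illustration only; these values are not from the archive.
rows = [
    {"day": "2012-01-01", "GEOID20": "250173531012000", "tweet_count": 3, "avg_score": 0.5},
    {"day": "2012-01-01", "GEOID20": "250173532001001", "tweet_count": 1, "avg_score": 0.9},
]
result = county_daily_mean(rows)
```

In practice the rows could come from one of the parquet files, for example via `pandas.read_parquet`. Only tweet_count and avg_score combine this simply: user_count cannot be summed across blocks, because the same user may tweet in several blocks, and the standard deviation and percentile fields cannot be recovered exactly from per-block summaries.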

Methodology and Processing

Sources Statement

Data Access

Archive Where Study was Originally Stored:

<a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3NCMB6">Harvard CGA Geotweet Archive v2.0</a>

Access Authority:

<p>If you are interested in accessing this archive, please fill out our <a href="https://gis.harvard.edu/geotweet-request-form">Geotweet Request Form</a>.</p>

Citation Requirement:

<p>If you use the Geotweet Archive in your research, please reference it: "<a href="https://doi.org/10.7910/DVN/KTRIJP">Harvard Center for Geographic Analysis Geotweet IDs Archive</a>".</p>

Other Study Description Materials

Other Study-Related Materials

Label:

statistics-2012_day_GEOID20__no_topic.parquet

Text:

2012 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2013_day_GEOID20__no_topic_0.parquet

Text:

2013 daily tweet data enriched with TSGI and census-level geometry. File 1 of 4; the 2013 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2013_day_GEOID20__no_topic_1.parquet

Text:

2013 daily tweet data enriched with TSGI and census-level geometry. File 2 of 4; the 2013 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2013_day_GEOID20__no_topic_2.parquet

Text:

2013 daily tweet data enriched with TSGI and census-level geometry. File 3 of 4; the 2013 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2013_day_GEOID20__no_topic_3.parquet

Text:

2013 daily tweet data enriched with TSGI and census-level geometry. File 4 of 4; the 2013 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2014_day_GEOID20__no_topic_0.parquet

Text:

2014 daily tweet data enriched with TSGI and census-level geometry. File 1 of 4; the 2014 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2014_day_GEOID20__no_topic_1.parquet

Text:

2014 daily tweet data enriched with TSGI and census-level geometry. File 2 of 4; the 2014 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2014_day_GEOID20__no_topic_2.parquet

Text:

2014 daily tweet data enriched with TSGI and census-level geometry. File 3 of 4; the 2014 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2014_day_GEOID20__no_topic_3.parquet

Text:

2014 daily tweet data enriched with TSGI and census-level geometry. File 4 of 4; the 2014 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2015_day_GEOID20__no_topic_0.parquet

Text:

2015 daily tweet data enriched with TSGI and census-level geometry. File 1 of 4; the 2015 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2015_day_GEOID20__no_topic_1.parquet

Text:

2015 daily tweet data enriched with TSGI and census-level geometry. File 2 of 4; the 2015 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2015_day_GEOID20__no_topic_2.parquet

Text:

2015 daily tweet data enriched with TSGI and census-level geometry. File 3 of 4; the 2015 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2015_day_GEOID20__no_topic_3.parquet

Text:

2015 daily tweet data enriched with TSGI and census-level geometry. File 4 of 4; the 2015 data are split across four files because of their size.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2016_day_GEOID20__no_topic.parquet

Text:

2016 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2017_day_GEOID20__no_topic.parquet

Text:

2017 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2018_day_GEOID20__no_topic.parquet

Text:

2018 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2019_day_GEOID20__no_topic.parquet

Text:

2019 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2020_day_GEOID20__no_topic.parquet

Text:

2020 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2021_day_GEOID20__no_topic.parquet

Text:

2021 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2022_day_GEOID20__no_topic.parquet

Text:

2022 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream

Other Study-Related Materials

Label:

statistics-2023_day_GEOID20__no_topic.parquet

Text:

2023 daily tweet data enriched with TSGI and census-level geometry

Notes:

application/octet-stream
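
The file labels listed above follow a regular naming pattern, with 2013, 2014 and 2015 each split into four parts numbered 0 to 3. A minimal Python sketch that reproduces the labels for a given year is shown below. The helper name `expected_files` is hypothetical, and the pattern is inferred only from the labels in this codebook, so it should be checked against the actual dataset listing before use.

```python
# Years that appear above as four separate parts (suffixes _0 to _3).
SPLIT_YEARS = {2013, 2014, 2015}
PARTS = 4


def expected_files(year: int) -> list[str]:
    """Return the parquet file labels this codebook lists for one year."""
    base = f"statistics-{year}_day_GEOID20__no_topic"
    if year in SPLIT_YEARS:
        return [f"{base}_{i}.parquet" for i in range(PARTS)]
    return [f"{base}.parquet"]


# The listing covers 2012 through 2023 inclusive.
all_files = [name for year in range(2012, 2024) for name in expected_files(year)]
```

A list built this way can be compared against a local download directory to confirm that every part of a split year is present before the parts are concatenated.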