1 to 10 of 11 Results
Apr 14, 2025
Sood, Gaurav, 2023, "Top News: Story URLs and Text from News Feeds of Major National News Sites (2022 to 03/2025)", https://doi.org/10.7910/DVN/ZNAKK6, Harvard Dataverse, V12
Scripts at: https://github.com/notnews/top_news. We check the RSS feeds of major news sites (ABC, CBS, CNN, LA Times, NBC, NPR, NYT, Politico, ProPublica, USA Today, and WaPo), collect the article URLs, and parse the pages using newspaper3k and some custom scripts. To combine the usat_html parts, run: cat usat_split_* > usat_html_articles_03_25.tar.gz Related Da...
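For readers without a Unix shell, the `cat usat_split_* > usat_html_articles_03_25.tar.gz` step mentioned in this description can be approximated in Python. This is a minimal sketch, not the authors' script: the helper name `combine_parts` is hypothetical, and it assumes the split parts sit in one directory and that plain sorted-name order matches the order the shell glob would produce (true for typical `split` suffixes such as `aa`, `ab`, ...).

```python
import glob
import shutil
from pathlib import Path


def combine_parts(pattern: str, output_path: str) -> list[str]:
    """Concatenate all files matching `pattern`, in sorted name order, into `output_path`.

    Hypothetical helper: a byte-for-byte equivalent of `cat parts* > output`,
    assuming sorted file names give the intended part order.
    """
    target = Path(output_path).resolve()
    # Skip the output file itself in case it also matches the pattern.
    parts = sorted(p for p in glob.glob(pattern) if Path(p).resolve() != target)
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(target, "wb") as out_file:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part in chunks so large archives never sit fully in memory.
                shutil.copyfileobj(src, out_file)
    return parts


# Example call (file names taken from the description above):
# combine_parts("usat_split_*", "usat_html_articles_03_25.tar.gz")
```

The function returns the list of parts it used, which makes it easy to check that no part was missed before unpacking the archive.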
Aug 28, 2023
Sood, Gaurav; Laohaprapanon, Suriyan, 2018, "Not News: Provision of Apolitical News in British News Media", https://doi.org/10.7910/DVN/VZ8DB3, Harvard Dataverse, V3
URL level data (URL, source_name, date, predicted and training set labels) for 5,646,436 articles that underlie Not News: Provision of Apolitical News in British News Media. For more details, see: https://github.com/notnews/uk_not_news
Sep 6, 2022
Sood, Gaurav; Laohaprapanon, Suriyan, 2021, "naampy", https://doi.org/10.7910/DVN/WZGJBM, Harvard Dataverse, V3
Data underlying the Python package `naampy: Infer Sociodemographic Characteristics from Indian Names.` GitHub Link: https://github.com/appeler/naampy Here's another related package: pranaam: predict religion from name. Pranaam uses the Bihar Land Records data, plot-level land records (N= 41.87 million plots or 12.13 individuals/accounts across 35,6...
Sep 4, 2022
Sood, Gaurav, 2022, "Bihar Land Records (2022)", https://doi.org/10.7910/DVN/BI4KZS, Harvard Dataverse, V1
GitHub: https://github.com/in-rolls/bihar_land_records
Jun 4, 2021
Sood, Gaurav; Laohaprapanon, Suriyan, 2021, "Transaction Level Ration Data from Rajasthan (2021)", https://doi.org/10.7910/DVN/FIFZEX, Harvard Dataverse, V2
Transaction Level Ration Data from Rajasthan. Website: https://food.raj.nic.in/DistrictWiseCategoryDetails.aspx (scraped in 2021). GitHub: https://github.com/soodoku/ration
Aug 4, 2020
Sood, Gaurav, 2020, "Maxmind IP Geolocation Archival Data", https://doi.org/10.7910/DVN/RMZOEN, Harvard Dataverse, V3
Maxmind IP Geolocation Archival Data. Because of GDPR concerns, Maxmind doesn't provide historical data. We have used this data for historical studies of IP data for MTurk, etc., and it may well be useful elsewhere. Maxmind changed its db format from geolite to geolite2 and you will need to use its respective packages...
May 10, 2020
Sood, Gaurav; Laohaprapanon, Suriyan, 2018, "Category of content of unique domains in comScore data", https://doi.org/10.7910/DVN/DXSNFA, Harvard Dataverse, V2
Category of content of unique domains in comScore data using
Sep 4, 2019
Sood, Gaurav; Laohaprapanon, Suriyan, 2018, "DIME Race (1980--2014)", https://doi.org/10.7910/DVN/M5K7VR, Harvard Dataverse, V3, UNF:6:MIJQWSHoaIuOZwU/Lg0cqg== [fileUNF]
Race of people in DIME v2 data. DIME data: https://data.stanford.edu/dime Race imputation using: https://github.com/appeler/ethnicolr GitHub repo: https://github.com/appeler/dime_race
Feb 9, 2019
Sood, Gaurav, 2019, "Kerala English Electoral PDFs Google Vision OCR Output (Indian Electoral Rolls)", https://doi.org/10.7910/DVN/MQPPNC, Harvard Dataverse, V2
Google Vision OCR output of Kerala Electoral PDFs. For more about the project, see https://github.com/in-rolls. For the scripts behind the data, see https://github.com/in-rolls/google_vision_ocr. Check the log file in the repo for the metadata from the job. Google Vision OCR output for each pdf = PNG with bounding boxes, JSON with coords and tex...
Jun 10, 2018
Sood, Gaurav, 2018, "Street Smart: Learning from Randomly Sampled Images from Google Street View", https://doi.org/10.7910/DVN/L3HN0K, Harvard Dataverse, V2
Data behind https://github.com/geosensing/streetsmart