Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Top 10 News |
Identification Number: |
doi:10.7910/DVN/OTJMYQ |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2020-04-09 |
Version: |
1 |
Bibliographic Citation: |
Sood, Gaurav; Laohaprapanon, Suriyan, 2020, "Top 10 News", https://doi.org/10.7910/DVN/OTJMYQ, Harvard Dataverse, V1, UNF:6:jlKZoJmu6AlRH7bK3zv4ig== [fileUNF] |
Citation |
|
Title: |
Top 10 News |
Subtitle: |
Data from Home pages and Top 10 Lists on News Sites |
Identification Number: |
doi:10.7910/DVN/OTJMYQ |
Authoring Entity: |
Sood, Gaurav |
Laohaprapanon, Suriyan |
|
Distributor: |
Harvard Dataverse |
Access Authority: |
Sood, Gaurav |
Depositor: |
Sood, Gaurav |
Date of Deposit: |
2020-04-07 |
Holdings Information: |
https://doi.org/10.7910/DVN/OTJMYQ |
Study Scope |
|
Keywords: |
Social Sciences |
Abstract: |
We scraped and parsed the homepages, politics pages, and top 10 lists of prominent news sites for 2012 and 2016--2017. All of the collection took place in 2016--2017, so the 2012 data come exclusively from the Internet Archive. For 2016--2017, most of the data come from scraping live sites, but some of the data, for sites we realized too late that we wanted to scrape, also come from the Internet Archive. For additional details, see: https://github.com/not_news/top10 |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Notes: |
Data available only for research purposes. |
CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0) |
|
Other Study Description Materials |
|
File Description--f3798658 |
|
File: current-output-homepage.tab |
|
|
|
Notes: |
UNF:6:SCkcwV64HNZCi0UPIBBm7g== |
File Description--f3798630 |
|
File: current-output-politics-homepage.tab |
|
|
|
Notes: |
UNF:6:dj8/fs33o+M3mwpHQBy8Ow== |
File Description--f3798758 |
|
File: ia-output-politics-top10-text-all.tab |
|
|
|
Notes: |
UNF:6:0PC/X7C9sU15yVb3DpABmg== |
List of Variables: | |
Variables |
|
f3798758 Location: |
Summary Statistics: Valid 10181; Min. 20120701; Max. 20161006; Mean 20143088.693055693; StDev 19799.115771795714 Variable Format: numeric Notes: UNF:6:NCo/BJAQyHFrZyP0JDzPpg==
f3798758 Location: |
Summary Statistics: Valid 10181; Min. 106; Max. 235959; Mean 116759.85659561942; StDev 70303.2106729336 Variable Format: numeric Notes: UNF:6:OwVdXdmMlAUjuvETcst3+g==
f3798758 Location: |
Variable Format: character Notes: UNF:6:HzIuujeo8kUuEOY9tMvePA== |
f3798758 Location: |
Summary Statistics: Valid 10181; Min. 1; Max. 20; Mean 5.004321775856926; StDev 2.943639038667501 Variable Format: numeric Notes: UNF:6:oaqJ24dNixRLIlMJrAJXtw==
f3798758 Location: |
Variable Format: character Notes: UNF:6:ZmrZgrUcJbAMqDQ5SCzkaQ== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:CoB+I56RWcqW6G1Qx2Vtzg== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:VsGPVawEulNWOgpMo6fSLQ== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:wzoZgwkgTZxmoP3qI5ZIpA== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:bmzSyPtWZvXllC2vDjoUtA== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:6dGcwgiBaN5ikLh1pr8sOQ== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:iWaV1WkOHBSS2pIK2AqMLw== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:T7p42PUPcCGa6GqwKcHdPw== |
f3798758 Location: |
Variable Format: character Notes: UNF:6:OphcoOTjqsAEKZM3gD9iVg== |
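The variable names are not preserved in this listing, but the ranges of the first two numeric variables (20120701 to 20161006, and 106 to 235959) look like a capture date stored as a YYYYMMDD number and a capture time stored as an HHMMSS number with leading zeros dropped. That reading is an assumption, not something the record states. Under that assumption, a minimal Python sketch for combining the two into a timestamp could look like this; the function name `parse_capture` is hypothetical:

```python
from datetime import datetime


def parse_capture(date_num, time_num):
    """Combine a YYYYMMDD number and an HHMMSS number into a datetime.

    Assumes (not confirmed by the record) that the first numeric
    variable is a date and the second is a time of day.
    """
    date_str = f"{int(date_num):08d}"
    # Zero-pad to six digits so that, e.g., 106 is read as 00:01:06.
    time_str = f"{int(time_num):06d}"
    return datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")


# The extremes reported in the summary statistics:
print(parse_capture(20120701, 106))
print(parse_capture(20161006, 235959))
```

Values that do not form a valid date or time raise `ValueError`, which is a cheap way to check whether the assumed encoding actually holds for every row.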
Label: |
current-homepage-html.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
current-output-top10.csv |
Text: | |
Notes: |
text/csv |
Label: |
current-politics-homepage-html.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
current-top10-html.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-homepage-html-2007-2011.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-homepage-html-2012.tar.gz.partaa |
Text: | |
Notes: |
application/gzip |
Label: |
ia-homepage-html-2012.tar.gz.partab |
Text: | |
Notes: |
application/octet-stream |
Label: |
ia-homepage-html-2012.tar.gz.partac |
Text: | |
Notes: |
application/octet-stream |
Label: |
ia-news-top10-html.tar.gz.partaa |
Text: | |
Notes: |
application/gzip |
Label: |
ia-news-top10-html.tar.gz.partab |
Text: | |
Notes: |
application/octet-stream |
Label: |
ia-output-homepage-2012-2016-notext.csv.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-output-homepage-2012-text.csv.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-output-homepage-2016-text.csv.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-output-politics-homepage-2012-2016-notext.csv.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-output-top10-text-all.csv |
Text: | |
Notes: |
text/csv |
Label: |
ia-politics-html.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-politics-top10-html.tar.gz |
Text: | |
Notes: |
application/gzip |
Label: |
ia-top10-html.tar.gz |
Text: | |
Notes: |
application/gzip |
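Several archives in the list above are stored in pieces (for example `ia-homepage-html-2012.tar.gz.partaa`, `.partab`, `.partac`). The suffixes suggest the pieces were produced by a byte-wise splitter such as `split`, so concatenating them in alphabetical order should restore the original `.tar.gz`; the record does not confirm this, so treat it as an assumption. A minimal Python sketch, with hypothetical helper names `find_parts` and `join_parts`:

```python
import shutil
from pathlib import Path


def find_parts(directory, archive_name):
    """Return the pieces of archive_name in suffix order (partaa, partab, ...)."""
    return sorted(Path(directory).glob(archive_name + ".part*"))


def join_parts(part_paths, out_path):
    """Concatenate the pieces, in the given order, into a single file."""
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream in chunks so large pieces are not loaded into memory.
                shutil.copyfileobj(src, out)
    return Path(out_path)
```

For instance, `join_parts(find_parts(".", "ia-homepage-html-2012.tar.gz"), "ia-homepage-html-2012.tar.gz")` would rebuild that archive in the current directory, after which `tarfile.open(..., "r:gz")` from the standard library can list or extract its members.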