Closed Caption News Transcripts from the Internet Archive (2014--2023) (doi:10.7910/DVN/OAJJHI)

Document Description

Citation

Title:

Closed Caption News Transcripts from the Internet Archive (2014--2023)

Identification Number:

doi:10.7910/DVN/OAJJHI

Distributor:

Harvard Dataverse

Date of Distribution:

2022-12-14

Version:

3

Bibliographic Citation:

Sood, Gaurav; Laohaprapanon, Suriyan, 2022, "Closed Caption News Transcripts from the Internet Archive (2014--2023)", https://doi.org/10.7910/DVN/OAJJHI, Harvard Dataverse, V3, UNF:6:QDjFCPMzxIXpoNWCI+1eNQ== [fileUNF]

Study Description

Citation

Title:

Closed Caption News Transcripts from the Internet Archive (2014--2023)

Identification Number:

doi:10.7910/DVN/OAJJHI

Authoring Entity:

Sood, Gaurav

Laohaprapanon, Suriyan

Distributor:

Harvard Dataverse

Access Authority:

Sood, Gaurav

Depositor:

Sood, Gaurav

Date of Deposit:

2023-11-16

Holdings Information:

https://doi.org/10.7910/DVN/OAJJHI

Study Scope

Keywords:

Social Sciences

Abstract:

Closed Caption News Transcripts from the Internet Archive (2014--2023). The "nc" files (those with -nc in the file name) are ones where the commercials have been stripped out using the data from https://tvnews.stanford.edu/export/commercial. For the scripts underlying the data pull, see https://github.com/notnews/archive_news_cc
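Many of the files listed under Other Study-Related Materials carry a two-letter suffix (for example archive-cc-2014-all-nc.csv.gzaa, .gzab, ...). This record does not say how they were produced, but the naming resembles the output of the Unix split utility. Assuming that is the case, a minimal sketch for rebuilding one archive is to concatenate the parts in alphabetical order. The function name join_parts and the directory argument are hypothetical, introduced only for illustration.

```python
import shutil
from pathlib import Path


def join_parts(directory, base_name, out_path=None):
    """Concatenate base_name + 'aa', base_name + 'ab', ... into one file.

    Assumption: the parts are plain byte-wise chunks with two-letter
    lowercase suffixes (split-style). This is inferred from the file
    labels, not documented by the dataset.
    """
    directory = Path(directory)
    suffix_start = len(base_name)
    # Keep only files named exactly base_name plus two lowercase letters.
    parts = sorted(
        p for p in directory.iterdir()
        if p.name.startswith(base_name)
        and len(p.name) == suffix_start + 2
        and p.name[suffix_start:].isalpha()
        and p.name[suffix_start:].islower()
    )
    if not parts:
        raise FileNotFoundError(f"no parts found for {base_name!r} in {directory}")
    out_path = Path(out_path) if out_path else directory / base_name
    # Append each part, in sorted order, to the output file.
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_path
```

For instance, join_parts("downloads", "archive-cc-2014-all-nc.csv.gz") would write downloads/archive-cc-2014-all-nc.csv.gz, where "downloads" stands for whatever local directory holds the parts. The joined file still has to be decompressed with the tool matching its extension (gzip, xz, 7z, or tar).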

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

File Description--f7166474

File: search-2023.tab

  • Number of cases: 179797

  • No. of variables per record: 1

  • Type of File: text/tab-separated-values

Notes:

UNF:6:QDjFCPMzxIXpoNWCI+1eNQ==

Variable Description

List of Variables:

Variables

identifier

Location: f7166474

Variable Format: character

Notes: UNF:6:QDjFCPMzxIXpoNWCI+1eNQ==
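As described above, search-2023.tab is a tab-separated file with a single character variable named identifier. A minimal sketch for loading it follows; it assumes the file begins with a header row containing the variable name, which is not stated in this record, and the function name read_identifiers is hypothetical.

```python
import csv


def read_identifiers(path):
    """Return the values of the 'identifier' column as a list of strings.

    Assumption: the first row is a header that names the column
    'identifier'; adjust if the downloaded file has no header.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        return [row["identifier"] for row in reader]
```

If the file matches the description in this codebook, the returned list should contain 179797 entries.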

Other Study-Related Materials

Label:

archive-cc-2014-all-nc.csv.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2014-all-nc.csv.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2014-all-nc.csv.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2014-all-nc.csv.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2014.csv.xzaa

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2014.csv.xzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017-all-nc.csv.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2017.csv.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022-all-nc.csv.gzag

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2022.csv.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

archive-cc-2023-all-nc.csv.gz

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

archive-cc-2023.csv.gz

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

html-2014.7zaa

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2014.7zab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2014.7zac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2014.7zad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2014.7zae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2014.7zaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

html-2017.tar.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzag

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzah

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzai

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzaj

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2017.tar.gzak

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

html-2022.tar.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzae

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzaf

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzag

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzah

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzai

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzaj

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzak

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzal

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzam

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzan

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzao

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzap

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzaq

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzar

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzas

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzat

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2022.tar.gzau

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2023.tar.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

html-2023.tar.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2023.tar.gzac

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

html-2023.tar.gzad

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

meta-2017.tar.gzaa

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

meta-2017.tar.gzab

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

meta-2022.tar.gz

Text:

Notes:

application/gzip

Other Study-Related Materials

Label:

meta-2023.tar.gz

Text:

Notes:

application/gzip