This Dataverse contains word embeddings, word embedding vocabulary files, and sentence files for sentiment analysis, all created from Daily Nation (a Kenyan newspaper) article text from 1998 to 2019. The uncleaned article text is also included for each year and section of the newspaper. Embeddings are provided for text separated into individual years (e.g. 2005) and into groups of three years (e.g. 1998-2000). Each set of embeddings was created twice: once with the GloVe algorithm (file names beginning with "glove") and once with the word2vec algorithm (file names beginning with "daily nation"). See the GitHub repository https://github.com/emmapair/Kenyan-Embeddings-and-Sentiment for the analysis code.
Dec 7, 2021
Pair, Emma, 2021, "Daily Nation Article Text (Uncleaned) 1998-2019", https://doi.org/10.7910/DVN/MGVVBY, Harvard Dataverse, V1
This dataset contains article text scraped from the Daily Nation website, separated by year and subject (e.g. sports, news).
Dec 7, 2021
Pair, Emma, 2021, "Word2vec Word Embeddings (1 year)", https://doi.org/10.7910/DVN/MA5TNE, Harvard Dataverse, V1
These are word2vec word embeddings created from article text for each year (e.g. 1998, 1999) from the Daily Nation, a Kenyan newspaper, and the corresponding vocabulary files.
Dec 7, 2021
Pair, Emma, 2021, "Word2vec Word Embeddings (3 year)", https://doi.org/10.7910/DVN/SATUTC, Harvard Dataverse, V1
These are word2vec word embeddings created from article text grouped into three-year periods (e.g. 1998-2000, 2001-2003) from the Daily Nation, a Kenyan newspaper, and the corresponding vocabulary files.
Dec 7, 2021
Pair, Emma, 2021, "Glove Word Embeddings (3 year)", https://doi.org/10.7910/DVN/XQ9685, Harvard Dataverse, V1
These are GloVe word embeddings created from article text grouped into three-year periods (e.g. 1998-2000, 2001-2003) from the Daily Nation, a Kenyan newspaper, and the corresponding vocabulary files.
Dec 7, 2021
Pair, Emma, 2021, "Glove Word Embeddings (1 year)", https://doi.org/10.7910/DVN/05OJVF, Harvard Dataverse, V1
These are GloVe word embeddings created from article text for each year (e.g. 1998, 1999) from the Daily Nation, a Kenyan newspaper, and the corresponding vocabulary files.
Dec 7, 2021
Pair, Emma, 2021, "Kenyan Political Leader Sentences", https://doi.org/10.7910/DVN/WFIUIU, Harvard Dataverse, V1
These are sentences from Daily Nation article text, grouped into three-year periods. We extracted sentences containing male or female gendered nouns (female1 and male1) or the names of male or female Kenyan political leaders (female2 and male2), together with the two surrounding sentences.
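As a rough illustration of how the embedding files listed above might be used, the sketch below parses embeddings and compares two word vectors. It assumes the files are in the common plain-text layout (one word per line followed by its space-separated vector components, optionally preceded by a word2vec-style "count dimension" header line); the datasets themselves do not confirm this layout, so check a file before relying on it. The function names `parse_embeddings` and `cosine` and the file name in the comment are hypothetical.

```python
import math
from typing import Dict, Iterable, List


def parse_embeddings(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse 'word v1 v2 ... vn' lines into a word -> vector mapping.

    Assumption: plain-text GloVe/word2vec-style format. Lines with fewer
    than three fields (blank lines, or a 'count dim' header) are skipped,
    so this sketch would not handle one-dimensional vectors.
    """
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical usage with a downloaded file (the file name is an assumption):
# with open("glove_2005.txt", encoding="utf-8") as f:
#     vecs = parse_embeddings(f)
# similarity = cosine(vecs["president"], vecs["minister"])
```

Comparing the similarity of the same word pair across the per-year or three-year files is one simple way to look for shifts in usage over time. Embeddings trained separately are not in a shared vector space, so only within-file similarities are directly comparable.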