Replication Data for: Synthetically generated text for supervised text analysis (doi:10.7910/DVN/JJ5BBX)

Document Description

Citation

Title:

Replication Data for: Synthetically generated text for supervised text analysis

Identification Number:

doi:10.7910/DVN/JJ5BBX

Distributor:

Harvard Dataverse

Date of Distribution:

2024-11-15

Version:

1

Bibliographic Citation:

Halterman, Andrew, 2024, "Replication Data for: Synthetically generated text for supervised text analysis", https://doi.org/10.7910/DVN/JJ5BBX, Harvard Dataverse, V1, UNF:6:JJUrUpeMWFKHndQZmjKvEw== [fileUNF]

Study Description

Citation

Title:

Replication Data for: Synthetically generated text for supervised text analysis

Identification Number:

doi:10.7910/DVN/JJ5BBX

Authoring Entity:

Halterman, Andrew (Michigan State University)

Producer:

Political Analysis

Distributor:

Harvard Dataverse

Access Authority:

Halterman, Andrew

Depositor:

Halterman, Andrew

Date of Deposit:

2024-09-25

Holdings Information:

https://doi.org/10.7910/DVN/JJ5BBX

Study Scope

Keywords:

Social Sciences

Abstract:

Large language models are a powerful tool for conducting text analysis in political science, but using them to annotate text has several drawbacks, including high cost, limited reproducibility, and poor explainability. Traditional supervised text classifiers are fast and reproducible, but require expensive hand annotation, which is especially difficult for rare classes. This article proposes using LLMs to generate synthetic training data for training smaller, traditional supervised text models. Synthetic data can augment limited hand-annotated data or be used on its own to train a classifier with good performance and greatly reduced cost. I provide a conceptual overview of text generation, guidance on when researchers should prefer different techniques for generating synthetic text, a discussion of ethics, a simple technique for improving the quality of synthetic text, and an illustration of its limitations. I demonstrate the usefulness of synthetic training data through three validations: synthetic news articles describing police responses to communal violence in India for training an event detection system, a multilingual corpus of synthetic populist manifesto statements for training a sentence-level populism classifier, and synthetic tweets describing the fighting in Ukraine for improving a named entity system.

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Forthcoming, Political Analysis

Bibliographic Citation:

Forthcoming, Political Analysis

File Description--f10551855

File: synth_gujarat_2024-05-17.tab

  • Number of cases: 900

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:o1yVfXB2c0gFEn41HHqH+g==

File Description--f10551861

File: synth_gujarat_2024-05-24.tab

  • Number of cases: 780

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:aqyDEjVF3ucBtr+Ulx4fqw==

File Description--f10551866

File: TOI_active_learning_realistic_results.tab

  • Number of cases: 15000

  • No. of variables per record: 5

  • Type of File: text/tab-separated-values

Notes:

UNF:6:lK9pYarmAGs5dUmaGSTcsQ==

File Description--f10551856

File: TOI_active_learning_realistic_results_2024-08-30.tab

  • Number of cases: 15000

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:T3PDuksi+oYhlHqF7l4WTw==

File Description--f10551864

File: TOI_simple_results_2024-06-18.tab

  • Number of cases: 20400

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:mOEWqLvcz1OO4RTHNymJvw==

File Description--f10552042

File: real_vs_synth_ner.tab

  • Number of cases: 252

  • No. of variables per record: 4

  • Type of File: text/tab-separated-values

Notes:

UNF:6:FOpz1zA+ucO3K+sfppK6fA==
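
The case and variable counts listed for each .tab file above can be checked after download. The following is a minimal Python sketch, assuming each tab-separated file begins with a single header row of variable names (an assumption about the export, not something this codebook states). The function name tab_shape and the in-memory sample rows are hypothetical and only stand in for a real file.

```python
import csv
import io


def tab_shape(handle):
    """Return (number of cases, number of variables) for a tab-separated
    file whose first row is assumed to be a header of variable names."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Count non-empty data rows; quoted fields containing newlines are
    # handled by the csv module and still count as one case.
    n_cases = sum(1 for row in reader if row)
    return n_cases, len(header)


# Made-up rows using the variable names listed for real_vs_synth_ner.tab;
# the values are illustrative only and are not taken from the dataset.
sample = io.StringIO(
    "token_f1\tn\ttrain_file\tseed\n"
    "0.5\t50\texample_a\t1\n"
    "0.6\t100\texample_b\t2\n"
)
shape = tab_shape(sample)
print(shape)
```

For a downloaded file, the same function could be called on open(path, newline="", encoding="utf-8") and the result compared with the counts above, for example (252, 4) for real_vs_synth_ner.tab.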

Variable Description

List of Variables:

Variables

text

f10551855 Location:

Variable Format: character

Notes: UNF:6:78CTScnIupbV/iwRTN3F0g==

title

f10551855 Location:

Variable Format: character

Notes: UNF:6:gctIOeV7pWlbwIQjm88CiQ==

label

f10551855 Location:

Variable Format: character

Notes: UNF:6:1R3P7ynfEnxye2gFSONclg==

text

f10551861 Location:

Variable Format: character

Notes: UNF:6:RDYIwuD2LF0i1ZR8zgUfOA==

title

f10551861 Location:

Variable Format: character

Notes: UNF:6:OjPDJv6HEWDRTPvaVC4fPA==

label

f10551861 Location:

Variable Format: character

Notes: UNF:6:m4NJ/YD4elCDVlcPgdN+QA==

num_annot

f10551866 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 980.0; Mean 490.0; StDev 288.6270148540851

Variable Format: numeric

Notes: UNF:6:kqoT4F+MeRjkMynONRezJQ==

f1

f10551866 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 0.7710843373493976; Mean 0.4988129145868635; StDev 0.23284022831277068

Variable Format: numeric

Notes: UNF:6:X5X9be5xn0LUcArY6tFj1Q==

event_type

f10551866 Location:

Variable Format: character

Notes: UNF:6:OsTFQCjxH53pFaeVQW/Itg==

seed

f10551866 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 49.0; Mean 24.5; StDev 14.431350742704252

Variable Format: numeric

Notes: UNF:6:gSjV0bXJpCYbHHTbUBYEXw==

method

f10551866 Location:

Variable Format: character

Notes: UNF:6:0dU3gT/6AOWw+austuV1/A==

num_annot

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 980.0; Mean 490.0; StDev 288.6270148540851

Variable Format: numeric

Notes: UNF:6:kqoT4F+MeRjkMynONRezJQ==

f1

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 0.7631578947368421; Mean 0.5056345091694852; StDev 0.22581957122092397

Variable Format: numeric

Notes: UNF:6:FSbCFxrP0zvFu350oOcwCQ==

precision

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 1.0; Mean 0.7305913603284444; StDev 0.17460428030540925

Variable Format: numeric

Notes: UNF:6:QiLZIlN6PC5zFSOihgoYIg==

recall

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 0.79375; Mean 0.4188488816738817; StDev 0.21691624152795394

Variable Format: numeric

Notes: UNF:6:n88JmV8S76+/s45UnEsa7g==

accuracy

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.931469708302169; Max. 0.9983171278982798; Mean 0.9931895350286711; StDev 0.0037931706497809753

Variable Format: numeric

Notes: UNF:6:wnLIJhQfN6HL/PfmIKEgZA==

remaining_pos

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 139.0; Mean 34.02846666666669; StDev 30.183045860207248

Variable Format: numeric

Notes: UNF:6:tpPGcupHRTcZHAfz64mwkg==

perc_positives

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.005263157894736842; Max. 0.6666666666666666; Mean 0.15640469913651195; StDev 0.08109927876184747

Variable Format: numeric

Notes: UNF:6:Gmgy+P8bC4LX9RVmYrRF/Q==

event_type

f10551856 Location:

Variable Format: character

Notes: UNF:6:OsTFQCjxH53pFaeVQW/Itg==

class_weight

f10551856 Location:

Summary Statistics: Valid 0.0; Min. NaN; Max. NaN; Mean NaN; StDev NaN

Variable Format: numeric

Notes: UNF:6:GjAafq4oaAd+hOiWnQmCvQ==

seed

f10551856 Location:

Summary Statistics: Valid 15000.0; Min. 0.0; Max. 49.0; Mean 24.5; StDev 14.431350742704252

Variable Format: numeric

Notes: UNF:6:gSjV0bXJpCYbHHTbUBYEXw==

method

f10551856 Location:

Variable Format: character

Notes: UNF:6:0dU3gT/6AOWw+austuV1/A==

n_human

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. 0.0; Max. 1000.0; Mean 470.5882352941176; StDev 386.1894081819535

Variable Format: numeric

Notes: UNF:6:Ms8vXs/wAVL8xX4RAxRYEQ==

n_synth_arg

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. -1.0; Max. 500.0; Mean 164.5; StDev 186.29643865896256

Variable Format: numeric

Notes: UNF:6:4GMWskCDWDoK4G/Gg9s/xg==

seed

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. 0.0; Max. 49.0; Mean 24.5; StDev 14.43122340045245

Variable Format: numeric

Notes: UNF:6:cdDmDydaQplZrwSW0Lp5Og==

synth_type

f10551864 Location:

Variable Format: character

Notes: UNF:6:kMRvXsTRDJELlIL2So0ghw==

f1

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. 0.0; Max. 0.7220216606498194; Mean 0.37664428997884664; StDev 0.2096684957771509

Variable Format: numeric

Notes: UNF:6:0WgODARLHZLb+JfrrMbkMw==

perc_pos

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. 0.0; Max. 0.31; Mean 0.06542067316681392; StDev 0.06232873741633384

Variable Format: numeric

Notes: UNF:6:gEJTll7xbSZOAJOl//cFsw==

synth_n

f10551864 Location:

Summary Statistics: Valid 20400.0; Min. 0.0; Max. 1680.0; Mean 430.29411764705884; StDev 500.5471899352906

Variable Format: numeric

Notes: UNF:6:eJe+l0qZWSNGI5jkEW44yQ==

event_type

f10551864 Location:

Variable Format: character

Notes: UNF:6:q32hJL6/r5/fUQK72SvuIA==

traditional_augmentation

f10551864 Location:

Variable Format: character

Notes: UNF:6:KAv9S2dcKNPGVwdqSkcamA==

balanced_class_weight

f10551864 Location:

Variable Format: character

Notes: UNF:6:EEVgM2SxZLA1tT3mukKJww==

model

f10551864 Location:

Variable Format: character

Notes: UNF:6:qS8ULz43CDIXOygTm9JbkA==

token_f1

f10552042 Location:

Summary Statistics: Valid 252.0; Min. 0.0; Max. 0.7792553191489362; Mean 0.5133968216481106; StDev 0.20413850170125777

Variable Format: numeric

Notes: UNF:6:jZCU8FXT1c+f6WRwmCv95A==

n

f10552042 Location:

Summary Statistics: Valid 252.0; Min. 50.0; Max. 300.0; Mean 127.77777777777773; StDev 68.7989030868835

Variable Format: numeric

Notes: UNF:6:aCyG/DAq6fjqOWYT2Z62nQ==

train_file

f10552042 Location:

Variable Format: character

Notes: UNF:6:XsUQFWhx5J8/Smk8kyIRgg==

seed

f10552042 Location:

Summary Statistics: Valid 252.0; Min. 1.0; Max. 14.0; Mean 7.5; StDev 4.039151029097151

Variable Format: numeric

Notes: UNF:6:yWxTbHu/0pCTK1/Gleu8bw==
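
The Summary Statistics entries above (Valid, Min., Max., Mean, StDev) can be recomputed from a numeric column. The sketch below is one possible way to do so in Python and is not the procedure Dataverse itself uses. The function name summarize is hypothetical, and the example column is an assumption: it supposes that the seed variable in f10551866 takes each integer value from 0 to 49 equally often across its 15000 valid cases. Under that assumption, the sample standard deviation (n - 1 denominator) agrees with the StDev listed for that variable.

```python
import math
import statistics


def summarize(values):
    """Return Valid, Min., Max., Mean, and StDev for a numeric column.
    Missing values (None or NaN) are excluded from the valid count."""
    valid = [v for v in values if v is not None and not math.isnan(v)]
    nan = float("nan")
    if not valid:
        # Matches the shape of an all-missing column such as class_weight.
        return {"Valid": 0.0, "Min.": nan, "Max.": nan, "Mean": nan, "StDev": nan}
    return {
        "Valid": float(len(valid)),
        "Min.": min(valid),
        "Max.": max(valid),
        "Mean": statistics.fmean(valid),
        # Sample standard deviation (n - 1 denominator); undefined for n = 1.
        "StDev": statistics.stdev(valid) if len(valid) > 1 else nan,
    }


# Assumed stand-in for the seed column: seeds 0..49, each repeated 300 times.
seeds = [float(s) for s in range(50)] * 300
stats = summarize(seeds)
print(stats)
```

If a recomputed StDev differs from the listed value only slightly, it may be worth checking whether a population (n denominator) formula was used instead, for example statistics.pstdev.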

Other Study-Related Materials

Label:

README.txt

Notes:

text/plain

Other Study-Related Materials

Label:

requirements.txt

Notes:

text/plain

Other Study-Related Materials

Label:

README.md

Text:

Notes:

text/markdown

Other Study-Related Materials

Label:

raw_annotations--raw.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

sents.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

.DS_Store

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

.DS_Store

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

embedding_vis.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

generate_figures.pdf

Text:

Notes:

application/pdf

Other Study-Related Materials

Label:

generate_figures.Rmd

Text:

Notes:

text/x-r-notebook

Other Study-Related Materials

Label:

gen_synth_india.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

india_police_events_active_learning.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

india_police_events_experiment.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

theme_pub.R

Text:

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

IPE_events_active_error.png

Text:

Notes:

image/png

Other Study-Related Materials

Label:

IPE_synth_diff.png

Text:

Notes:

image/png

Other Study-Related Materials

Label:

new_ipe_fig_synth_both.png

Text:

Notes:

image/png

Other Study-Related Materials

Label:

new_ipe_fig_synth_unbalanced.png

Text:

Notes:

image/png

Other Study-Related Materials

Label:

new_ipe_fig_unbalanced.png

Text:

Notes:

image/png

Other Study-Related Materials

Label:

.DS_Store

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

README.md

Text:

Notes:

text/markdown

Other Study-Related Materials

Label:

.DS_Store

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

cmp_labeled_sentences_lang.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

cmp_labeled_synth_statements_2024_02_16.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

cmp_labeled_synth_statements_2024_02_18.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

gpt3_synth_all_cmp_2022-12-15.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

gpt3_synth_populism2.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

gpt3_synth_populism_neg2.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

manifesto_sent_level.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

manifesto_sent_level_for_prodigy.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

manifesto_sent_level_score_2023-02-05.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

manifesto_sent_level_score_mlm.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

populism_hand_validation.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

populism_validation.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fit_cmp_classifier_new.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

generate_cmp_statements.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

gen_synth_populist.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

get_top_ukip.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

hand_validation.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

synth_pop_classifier_setfit.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

theme_pub.R

Text:

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config_sentence_transformers.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config_setfit.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

model.safetensors

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

model_head.pkl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

modules.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

pytorch_model.bin

Text:

Notes:

application/macbinary

Other Study-Related Materials

Label:

README.md

Text:

Notes:

text/markdown

Other Study-Related Materials

Label:

sentencepiece.bpe.model

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

sentence_bert_config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

special_tokens_map.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

tokenizer.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

tokenizer_config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config_sentence_transformers.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

model_head.pkl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

modules.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

pytorch_model.bin

Text:

Notes:

application/macbinary

Other Study-Related Materials

Label:

README.md

Text:

Notes:

text/markdown

Other Study-Related Materials

Label:

sentencepiece.bpe.model

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

sentence_bert_config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

special_tokens_map.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

tokenizer.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

tokenizer_config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

README.md

Text:

Notes:

text/markdown

Other Study-Related Materials

Label:

tweet_text_partial.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_20_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_50_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.8_80_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_20_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_50_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.95_80_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_20_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_50_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.99_80_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_20_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_50_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_0.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_0.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_0.7_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_1.3_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_1.5_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_1.8_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-25_0.9_80_1_3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_0.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_0.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_0.7_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_1.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_1.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_1.8_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.8_50_1_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_0.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_0.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_0.7_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_1.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_1.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_1.8_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.95_50_1_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_0.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_0.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_0.7_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_1.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_1.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_1.8_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.99_50_1_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_0.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_0.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_0.7_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_1.3_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_1.5_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_1.8_1.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

fake_ukraine_tweets_2022-07-26_0.9_50_1_1.jsonl

Text:

Notes:

application/octet-stream
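
The fake_ukraine_tweets file names listed above follow a regular pattern: a date followed by four underscore-separated numbers. This codebook does not say what those numbers mean, so the Python sketch below only splits them apart and gives them the placeholder names field_1 to field_4, which are hypothetical. In the listing above, the first number takes the values 0.8, 0.9, 0.95, and 0.99, the second 20, 50, and 80, the third 0.3 to 1.8, and the fourth 1 or 3.

```python
import re

# Pattern inferred from the file names in this listing; the field names
# below are placeholders because the codebook does not document them.
PATTERN = re.compile(
    r"^fake_ukraine_tweets_(\d{4}-\d{2}-\d{2})_([\d.]+)_(\d+)_([\d.]+)_(\d+)\.jsonl$"
)


def parse_name(filename):
    """Split a fake_ukraine_tweets file name into its date and four numbers."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    date, first, second, third, fourth = match.groups()
    return {
        "date": date,
        "field_1": float(first),
        "field_2": int(second),
        "field_3": float(third),
        "field_4": int(fourth),
    }


parsed = parse_name("fake_ukraine_tweets_2022-07-25_0.8_20_0.3_3.jsonl")
print(parsed)
```

Parsing the names this way makes it possible to group the files by any one of the four numbers without opening them.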

Other Study-Related Materials

Label:

synth_detection_results_transformer.csv

Text:

Notes:

text/comma-separated-values

Other Study-Related Materials

Label:

ukr_synth_bad_both.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

ukr_weapon_real.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

ukr_weapon_synth_best3.jsonl

Text:

Notes:

application/octet-stream

Other Study-Related Materials

Label:

1_get_tweets.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

2_fine_tune_gpt2_twitter.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

3_generate_synth_tweets_gpt2.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

4_select_generation_parameters_fancy.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

5_fit_ner.py

Text:

Notes:

text/x-python-script

Other Study-Related Materials

Label:

6_real_vs_synth_ner_plots.Rmd

Notes:

text/x-r-notebook

Other Study-Related Materials

Label:

theme_pub.R

Text:

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

real_vs_synth_ner_perf.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

tweet_discrim_parameters_transformer.pdf

Text:

Notes:

application/pdf

Other Study-Related Materials

Label:

config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

generation_config.json

Text:

Notes:

application/json

Other Study-Related Materials

Label:

model.safetensors

Notes:

application/octet-stream

Other Study-Related Materials

Label:

training_args.bin

Text:

Notes:

application/macbinary