Climate Change and Environmental Issues Dataset from Ukrainian Telegram Channels

Version 1.1

Ustyianovych, Taras; Fedushko, Solomia, 2025, "Climate Change and Environmental Issues Dataset from Ukrainian Telegram Channels", https://doi.org/10.7910/DVN/NL06IX, Harvard Dataverse, V1, UNF:6:J1ItiZE2+g0wAeefVx90TQ== [fileUNF]

Dataset Metrics

69 Downloads

Description	Overview

This repository contains two datasets that were collected and processed as part of a study on public perception of environmental issues and climate change in Ukraine. The datasets are derived from Ukrainian Telegram news channels and include metadata, raw text, and user reactions to posts related to climate events and environmental topics. They are intended to support academic research on the relationship between public discourse, user sentiment, and climate indicators.

The datasets are located in the `data` folder, grouped by file extension: `csv` and `parquet`. If you read `climate_text_data_final` in CSV format, set the encoding to `utf-16`.

Datasets

climate_text_data_final

This dataset contains the raw text of Telegram posts, along with additional metadata. It provides a comprehensive view of the content and context of climate-related discussions. It can be joined with `final_reactions_data` on `channel_name` and `message_id`.

Key Features:
- Post ID: Unique identifier for each Telegram post.
- Channel Name: The name of the Telegram channel where the post was published.
- Text: The raw text of the Telegram post.
- Metadata: Includes timestamp, number of views, and number of forwards.

Purpose: This dataset is designed to support natural language processing (NLP) tasks such as topic modeling, named entity recognition, and sentiment analysis. It provides a foundation for understanding the themes and narratives surrounding climate change and environmental issues in the Ukrainian online information space.

final_reactions_data

This dataset contains user reactions to Telegram posts, represented as emoji counts. It provides a detailed view of how users engage with climate-related content.

Key Features:
- Post ID: Unique identifier for each Telegram post.
- Channel Name: The name of the Telegram channel where the post was published.
- Emoji Reactions: Columns holding the count of each emoji used to react to the post.
- Is NA: A boolean flag indicating whether the emoji reaction columns are all NaN or contain at least one non-NA value.

Purpose: This dataset enables researchers to analyze user sentiment and engagement with climate-related content. It can be used to identify patterns in public reactions to environmental issues and to assess the emotional tone of the discourse. The emojis can be grouped into categories to reduce dimensionality and to work with a combined representation, and statistics can then be computed for each emoji class to characterize user engagement patterns.

Research Context

The datasets were collected as part of a study aimed at understanding public attitudes toward environmental issues and exploring the relationship between public perception and climate indicators, especially during the full-scale Russian aggression against Ukraine. The study focused on Telegram channels because of their popularity and influence in Ukraine. The research objectives included:
- Developing a methodology for automated data collection from Ukrainian Telegram channels on climate-related topics.
- Conducting a comprehensive analysis of the collected data using natural language processing and statistical methods to identify key topics, trends, and patterns.
- Investigating the relationship between message characteristics and user reactions to determine factors influencing public perception of environmental issues.

The study analyzed content from seven influential Telegram news channels: DW Ukraine, BBC Ukrainian, Ukrayinska Pravda, Voice of America, Radio Liberty, Babel, and ZN.UA. These channels were selected based on their audience size, credibility, and regularity of coverage of environmental issues.

The data collection period spanned just over five years (01.01.2020 - 14.01.2025), allowing for an analysis of trends over time, including the impact of the Russian war in Ukraine on public discourse.

Ethical Considerations

The datasets do not contain any personally identifiable information (PII). However, we acknowledge that the data may contain sensitive content: some records describe war-related activities, destruction, harm, or other sensitive topics. We have made every effort to remain unbiased in collecting data from the selected channels and have not censored any content.

The dataset will undergo ethical clearance at Lviv Polytechnic National University to ensure compliance with ethical standards and guidelines for data collection, processing, and usage. This process aims to address potential concerns related to sensitive content and to ensure the responsible use of the dataset in academic research.

Recommendations for Ethical Use:
- Fairness and Bias: Evaluate results with fairness metrics to ensure that analyses are not biased or discriminatory.
- Transparency: Use tools for interpretability and explainability to ensure transparency in machine learning models and analyses.
- Monitoring: Implement machine learning monitoring to improve observability and awareness of system performance.
- Ethical Awareness: Be mindful of the potential for sensitive, distorted, or unfair content, particularly when analyzing topics related to war or conflict.

Data Collection Methodology

To identify relevant messages, we used an approach based on the `Aho-Corasick` algorithm, which searches for many patterns at once in time linear in the length of the text plus the number of matches. This was critical for processing large volumes of information.

A thematic dictionary was developed, containing key terms structured into five categories:
- Climate terms
- Environmental issues
- Natural resources
- Climate events
- Environmental initiatives

The algorithm was implemented in Python using the `telethon` library for collecting messages and the `pyahocorasick` library for building a finite state machine that matches all dictionary terms in a single pass over each message. As a result, 5,732 relevant messages related to climate change and environmental issues were identified and selected.

(2025-06-18)
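The keyword filtering described under Data Collection Methodology can be illustrated with a minimal sketch. This is not the study's code: it is a small pure-Python Aho-Corasick automaton standing in for `pyahocorasick`, and the dictionary terms, the lowercasing step, and the function names `build_automaton` and `find_matches` are assumptions made for illustration. Only the category names come from the description above.

```python
from collections import deque

def build_automaton(terms):
    """Build trie transitions, failure links, and outputs for (term, category) pairs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # (term, category) pairs recognized at each state
    for term, category in terms:
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append((term, category))
    # Breadth-first pass: compute failure links and inherit outputs from them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_matches(text, automaton):
    """Return (start_index, term, category) for every dictionary term found in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for end, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term, category in out[state]:
            matches.append((end - len(term) + 1, term, category))
    return matches

# Hypothetical English placeholder terms; the study's actual dictionary is not reproduced here.
TERMS = [
    ("climate change", "Climate terms"),
    ("climate", "Climate terms"),
    ("flood", "Climate events"),
]
automaton = build_automaton(TERMS)
hits = find_matches("Flooding follows climate change".lower(), automaton)
is_relevant = bool(hits)  # a message is kept if at least one term matches
```

The automaton is built once and reused for every message, so each message is scanned in a single pass regardless of how many dictionary terms there are. Whether the original pipeline normalized case or matched whole words only is not stated in the description.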
Subject	Earth and Environmental Sciences; Engineering; Computer and Information Science; Mathematical Sciences; Social Sciences
Keyword	climate change, environment, Digital platforms, Data analysis, computer science
Related Publication	Is Cited By: Ustianovych T., Fedushko S., Climate event dataset based on Ukrainian online information space. Information, communication, society 2025: ICS-2025 : Proceedings of the XIV International scientific conference, 22-24 May, 2025. Lviv : Lviv Polytechnic Publishing House, 2025. P. 73–74. ISBN 978-966-994-052-0
License/Data Use Agreement	CC0 1.0

1 to 5 of 5 Files

File	Format	Size	Published	Downloads	Checksum	Notes
climate_text_data_final.csv	Comma Separated Values	8.2 MB	Jun 18, 2025	24	MD5: 27017d5e78decc4ab5aa5e02daf28773	
climate_text_data_final.parquet	Unknown	3.7 MB	Jun 18, 2025	11	MD5: 98b23bae1caa1e38001538e3b67e037e	
final_reactions_data.parquet	Unknown	138.1 KB	Jun 18, 2025	12	MD5: 2366018155359c59bde52577032c4870	
final_reactions_data.tab	Tabular Data	660.9 KB	Jun 18, 2025	14	UNF:6:J1ItiZE2+g0wAeefVx90TQ==	75 Variables, 5724 Observations; original file format: Comma Separated Values
README-3.md	Markdown Text	7.4 KB	Jun 18, 2025	8	MD5: 9a6b1537ace2c0bba2d3c691be868107	
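The Description states that the text CSV must be read as `utf-16`, that the two tables join on `channel_name` and `message_id`, and that emoji columns can be grouped into categories. A minimal standard-library sketch of those three steps follows. The helper names, the emoji column names, and the category mapping are hypothetical; the encoding of the reactions file is not stated in the source, so it is left as a parameter.

```python
import csv

JOIN_KEYS = ("channel_name", "message_id")  # join columns named in the Description

def read_rows(path, encoding, delimiter=","):
    """Read a delimited text file into a list of dicts, one per row."""
    with open(path, encoding=encoding, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))

def join_posts_with_reactions(posts, reactions):
    """Left-join post rows with reaction rows on (channel_name, message_id)."""
    by_key = {tuple(row[k] for k in JOIN_KEYS): row for row in reactions}
    joined = []
    for post in posts:
        merged = dict(post)
        merged.update(by_key.get(tuple(post[k] for k in JOIN_KEYS), {}))
        joined.append(merged)
    return joined

def to_count(value):
    """Parse one emoji count cell; empty or NaN cells count as zero."""
    if value in (None, ""):
        return 0
    number = float(value)
    return 0 if number != number else int(number)  # NaN is the only value unequal to itself

def sum_by_category(row, categories):
    """Collapse emoji count columns into per-category totals (mapping is an assumption)."""
    return {
        name: sum(to_count(row.get(emoji)) for emoji in emojis)
        for name, emojis in categories.items()
    }
```

A call might look like `read_rows("data/csv/climate_text_data_final.csv", encoding="utf-16")`, where the path is an assumption based on the folder layout the Description mentions. With pandas, the same steps correspond to `read_csv(..., encoding="utf-16")` followed by `merge(..., on=["channel_name", "message_id"], how="left")`.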

Citation Metadata

Persistent Identifier	doi:10.7910/DVN/NL06IX
Publication Date	2025-06-18
Title	Climate Change and Environmental Issues Dataset from Ukrainian Telegram Channels
Author	Lviv Polytechnic National University; https://orcid.org/0000-0002-6323-7924; https://orcid.org/0000-0001-7548-5856
Point of Contact	Fedushko, Solomiia (Lviv Polytechnic National University)
Keyword	climate change https://www.eionet.europa.eu/gemet/en/concept/1471 (GEMET, https://www.eionet.europa.eu/gemet); environment https://www.eionet.europa.eu/gemet/en/concept/2944 (GEMET, https://www.eionet.europa.eu/gemet); Digital platforms https://vocabularies.unesco.org/browser/thesaurus/en/page/?uri=http%3A%2F%2Fvocabularies.unesco.org%2Fthesaurus%2Fconcept17162 (UNESCO Thesaurus, http://vocabularies.unesco.org/thesaurus/concept450); Data analysis https://vocabularies.unesco.org/browser/thesaurus/en/page/?uri=http%3A%2F%2Fvocabularies.unesco.org%2Fthesaurus%2Fconcept2214 (UNESCO Thesaurus, http://vocabularies.unesco.org/thesaurus/concept450); computer science https://vocabularies.unesco.org/browser/thesaurus/en/page/?uri=http%3A%2F%2Fvocabularies.unesco.org%2Fthesaurus%2Fconcept450 (UNESCO Thesaurus, http://vocabularies.unesco.org/thesaurus/concept450)
Related Publication	Is Cited By: Ustianovych T., Fedushko S., Climate event dataset based on Ukrainian online information space. Information, communication, society 2025: ICS-2025 : Proceedings of the XIV International scientific conference, 22-24 May, 2025. Lviv : Lviv Polytechnic Publishing House, 2025. P. 73–74. ISBN 978-966-994-052-0. https://ena.lpnu.ua/items/f20c3232-e67a-4f99-8502-80b05e9474f8
Language	English
Depositor	Fedushko, Solomiia
Deposit Date	2025-06-18

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Creative Commons CC0 1.0 Universal Public Domain Dedication. CC0 1.0
