CompSoSci Final Assignment

Why are Tweets about the War in Russia and Ukraine relevant?

The war in Ukraine has affected the political climate, the economy and people's sense of security all over the world. This makes it relevant to explore public opinion on the Russia-Ukraine situation.

Twitter as Database
The ongoing Russia-Ukraine war

Twitter is used by 229 million users daily and is therefore an extensive source of insight into public opinion on a trending matter. We use Twitter to access tweets containing hashtags and keywords related to the Russian-Ukrainian war, along with a variety of information about each tweet, for the purposes of text and network analysis.

Easy access to huge amounts of data

Quick identification of recent trends

Data format makes it easy to visualise network graphs

The Russian invasion of Ukraine has shaken many countries and has severely affected economies, families and political relationships across the world. As a result of Russia's endless war crimes, its threat to Ukraine's independence and its strained relationship with NATO, many countries have imposed sanctions on Russia, resulting in price increases and a massive decrease in gas deliveries from Russia to the world. More importantly, more than 14,000 Ukrainians have lost their lives and many more have become refugees in Europe.

Political relationships are changing

Sanctions against Russia affect economies worldwide

Ukrainians are in need of help from the rest of the world

Breaking down Trends into #Hashtags, Social Networks and Sentiments

#Hashtags

Has the usage of hashtags changed from February 21st 2022 to now?

Social Networks

Do strong communities within the different languages exist? Do the users gather in communities based on similar opinions?

Sentiments

Are there any shifts in sentiment and discourse from February 21st 2022 until now? Do people using different languages express different views on the war?

The quick and dirty on the dataset

More than a million tweets, lots of different languages and 54 days of war

The tweets were downloaded based on a dataset consisting of tweet IDs. The dataset is available in a GitHub repo, where the authors are also listed. The tweet IDs in the dataset were retrieved from Twitter using the Twitter Streaming API and cover tweets from each day in the period 21-02-2022 to 15-04-2022.

The tweet IDs were collected based on a range of relevant keywords (russia, ukraine, putin, zelensky, kyiv, etc.). The authors updated this list continuously during the data collection period as more keywords became relevant to the crisis.

Below is a complete list of the keywords and the date each was introduced. All keywords were translated to Russian as well as Ukrainian.

27th Feb: russia, ukraine, putin, zelensky, russian, ukranian, keiv, kyiv
1st Mar: kharkiv
3rd Mar: khorsan
4th Mar: zaporizhzhia, energodar

To make sure we had a uniform distribution of data over the whole period, we extracted 18,000 randomly chosen tweet IDs from the GitHub repo for each day.
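The per-day sampling described above could look roughly like the following. This is a minimal sketch, not the code used in the project: the function name `sample_ids_per_day`, the `ids_by_day` dictionary layout and the fixed seed are assumptions made for illustration, and the toy input uses made-up IDs.

```python
import random

def sample_ids_per_day(ids_by_day, k=18_000, seed=0):
    """Return up to k randomly chosen tweet IDs for each day."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = {}
    for day, ids in ids_by_day.items():
        # Sample without replacement; keep everything if a day has fewer than k IDs.
        sampled[day] = rng.sample(ids, min(k, len(ids)))
    return sampled

# Toy input with hypothetical IDs; k=3 stands in for 18,000.
toy = {
    "2022-02-21": list(range(100, 110)),
    "2022-02-22": list(range(200, 205)),
}
sample = sample_ids_per_day(toy, k=3)
print({day: len(ids) for day, ids in sample.items()})
```

Sampling the same number of IDs for every day is what keeps the daily volume flat, so later comparisons over time are not driven by how many IDs the source repo happened to contain on a given day.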

Because we reduced the data ourselves and the dataset authors chose the keywords, the following analysis is not a complete reflection of sentiment surrounding the Ukraine/Russia war on Twitter. However, it is still a fair indication of it.

After retrieving the tweet IDs, we used the Twitter API v2 to collect the corresponding tweets from Twitter and saved them all in a .csv file. Our dataset consists of 18,000 tweets for each of the 54 days and includes tweets from 651,363 users in total. As the majority of users do not allow tracking of geolocation, we were not able to save the countries/locations of the users and tweets as first intended. Instead, we tracked the language used in each tweet, and in total we found 32 languages.
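Collecting tweets from a list of IDs and writing them to a CSV file can be sketched as below. No real API call is made: `fetch_tweets` is a hypothetical stand-in that returns fake records, and the column names are illustrative. The batch size of 100 reflects our understanding of the per-request limit of the v2 tweet lookup endpoint and should be checked against the current Twitter documentation.

```python
import csv
import io

def batched(ids, size=100):
    """Yield consecutive chunks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def fetch_tweets(id_batch):
    """Hypothetical stand-in for the API request; returns fake records."""
    return [{"id": i, "text": f"tweet {i}", "lang": "en"} for i in id_batch]

def hydrate_to_csv(ids, out_file, fetch=fetch_tweets):
    """Fetch tweets batch by batch and write one CSV row per tweet."""
    writer = csv.DictWriter(out_file, fieldnames=["id", "text", "lang"])
    writer.writeheader()
    rows_written = 0
    for batch in batched(ids):
        for row in fetch(batch):
            writer.writerow(row)
            rows_written += 1
    return rows_written

buffer = io.StringIO()  # in-memory file so the sketch leaves nothing on disk
rows_written = hydrate_to_csv(list(range(1, 251)), buffer)
batch_sizes = [len(b) for b in batched(list(range(1, 251)))]
print(rows_written, batch_sizes)
```

In a real run, `fetch` would be replaced by a function that sends each batch to the API and `out_file` by a file opened with `open(path, "w", newline="")`.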

We saved the following 8 attributes for each tweet:

- text of tweet (“full_text”)
- hashtags (“entities.hashtags”)
- id of tweet (“id”)
- the name of the user of the tweet (“user.screen_name”)
- the user that the tweet is replying to (“in_reply_to_screen_name”)
- date of tweet upload (“created_at”)
- the language the tweet is written in (“lang”)
- the location of the tweet (if any) (“location”)
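The quoted attribute names above read like dotted paths into a nested tweet record. One way to flatten such a record into a single row is sketched here. The helper names `get_path` and `flatten` and the example record are assumptions for illustration, not the project's code, and the real response layout may differ.

```python
FIELDS = [
    "full_text", "entities.hashtags", "id", "user.screen_name",
    "in_reply_to_screen_name", "created_at", "lang", "location",
]

def get_path(record, dotted):
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    current = record
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def flatten(record):
    """Map each attribute name to its value in the nested record."""
    return {field: get_path(record, field) for field in FIELDS}

# Hypothetical tweet record; it has no location, like most tweets in the dataset.
example = {
    "id": 1,
    "full_text": "Stand with #Ukraine",
    "entities": {"hashtags": [{"text": "Ukraine"}]},
    "user": {"screen_name": "alice"},
    "in_reply_to_screen_name": None,
    "created_at": "2022-02-21",
    "lang": "en",
}
row = flatten(example)
print(row["user.screen_name"], row["location"])
```

Missing values come back as `None` rather than raising an error, which matters here because location is absent for most tweets.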

The full text and hashtags were saved to be used in our text and discourse analysis later on. We used the creation date and written language of the tweet to filter the tweets on time and language, respectively, when performing the text and network analysis, and the tweet id and parent author were used to generate several social network graphs during the analysis.
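As an illustration of how reply information can feed a network graph, the sketch below counts directed edges from the author of a tweet to the user being replied to. The function name `reply_edges`, the row layout and the usernames are hypothetical. The resulting weighted edge list could then be loaded into a graph library of choice.

```python
from collections import Counter

def reply_edges(rows):
    """Count directed edges (author -> user replied to), skipping non-replies."""
    edges = Counter()
    for row in rows:
        source = row.get("user.screen_name")
        target = row.get("in_reply_to_screen_name")
        if source and target:  # tweets that are not replies have no target
            edges[(source, target)] += 1
    return edges

# Toy rows with made-up usernames.
rows = [
    {"user.screen_name": "alice", "in_reply_to_screen_name": "bob"},
    {"user.screen_name": "alice", "in_reply_to_screen_name": "bob"},
    {"user.screen_name": "carol", "in_reply_to_screen_name": None},
    {"user.screen_name": "bob", "in_reply_to_screen_name": "alice"},
]
edges = reply_edges(rows)
print(dict(edges))
```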

After extracting all tweets, we performed multiple steps of data cleaning before using the data for analysis. The following major steps were performed on all tweets in our data-cleaning process:

Translation to English

Lowercasing

Stemming

Stopwords Removal

Filtering out emojis and other non-alphanumeric content

Hashtag detection

Tokenization of tweet texts
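Most of the cleaning steps listed above can be sketched with the standard library alone. This is a simplified illustration under several assumptions: translation is skipped, the stopword list is a tiny made-up sample, `naive_stem` is a crude stand-in for a real stemmer, and the order of the steps is chosen for convenience (hashtags are detected before the `#` character is filtered out) rather than taken from the project.

```python
import re

STOPWORDS = {"the", "is", "in", "a", "of", "and", "to"}  # tiny illustrative list

def naive_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def clean_tweet(text):
    """Lowercase, detect hashtags, filter symbols, tokenize, drop stopwords, stem."""
    text = text.lower()
    hashtags = re.findall(r"#(\w+)", text)    # detect hashtags before '#' is removed
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # drops emojis, punctuation and '#'
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return {"hashtags": hashtags, "tokens": [naive_stem(t) for t in tokens]}

result = clean_tweet("The world is standing with #Ukraine 🇺🇦!")
print(result)
# {'hashtags': ['ukraine'], 'tokens': ['world', 'stand', 'with', 'ukraine']}
```

Note that the character filter only keeps ASCII letters and digits, so it assumes the text has already been translated to English.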

Figure: saved tweet attributes (tweet text, hashtags, tweet ID, username of interactor, username of original tweet poster, date of tweet, language of tweet).

Tweet Volume for English, Ukrainian and Russian Tweets

It is easy to see that the volume of tweets written in English far exceeds the volume of tweets written in Ukrainian and Russian. This is expected, as English has far more speakers overall. However, it means that English hashtag trends and keywords may dominate any analysis where we do not split by language.
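A volume comparison like this one can be computed by counting tweets per language. The sketch below assumes each row carries a `lang` value and that English, Ukrainian and Russian are encoded as `en`, `uk` and `ru`. Both the row layout and the toy data are illustrative.

```python
from collections import Counter

def volume_by_language(rows, languages=("en", "uk", "ru")):
    """Count tweets per language, reporting 0 for languages with no tweets."""
    counts = Counter(row["lang"] for row in rows if row["lang"] in languages)
    return {lang: counts.get(lang, 0) for lang in languages}

# Toy rows; "de" is ignored because it is not one of the three languages.
rows = [{"lang": "en"}, {"lang": "en"}, {"lang": "uk"}, {"lang": "de"}]
volumes = volume_by_language(rows)
print(volumes)
# {'en': 2, 'uk': 1, 'ru': 0}
```

Running the same count on a per-language subset before the hashtag and keyword analysis is one way to keep English from dominating the results.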

Analysing discourse changes in tweets related to the Russian-Ukrainian war

Computational Social Science Spring 2022

Natasha Norsker, Alma Fazlagic, Simone von Mehren

Choose your analysis

Shift in Tweets over Time

Shift in Tweets based on Language
