Take a look at the wordclouds in this section, filtered by language, which highlight the most relevant words within each language on Twitter. The wordclouds are generated from TF-IDF scores in order to highlight not simply the most frequently used words, but the words most relevant to the specific language.
Explore the prominent words yourself or take a look at our analysis below.
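As a rough illustration of the TF-IDF weighting behind the wordclouds, the following Python sketch treats all tweets in one language as a single document and scores each word by its relative frequency multiplied by a measure of how rare the word is across languages. This is only a sketch under assumptions: the function name `tfidf_by_language`, the toy token lists and the smoothed IDF variant are illustrative choices, not necessarily what was used to produce the wordclouds.

```python
import math
from collections import Counter


def tfidf_by_language(tokens_by_language):
    """Return {language: {word: score}}, treating each language as one document.

    Assumed variant: tf = count / total tokens in the language,
    idf = ln((1 + N) / (1 + df)) + 1, where N is the number of languages
    and df is the number of languages in which the word occurs.
    """
    n_docs = len(tokens_by_language)
    counts = {lang: Counter(tokens) for lang, tokens in tokens_by_language.items()}

    # Document frequency: in how many languages does each word occur?
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())

    scores = {}
    for lang, counter in counts.items():
        total = sum(counter.values())
        scores[lang] = {
            word: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in counter.items()
        }
    return scores


# Hypothetical toy data: already tokenised and translated tweets per language.
toy_tokens = {
    "en": "ukraine russia nato gas nato".split(),
    "ru": "ukraine russia sanctions operation".split(),
    "uk": "ukraine russia freedom freedom children".split(),
}

scores = tfidf_by_language(toy_tokens)
top_en = max(scores["en"], key=scores["en"].get)
print(top_en, scores["en"])
```

With this weighting, a word that occurs in every language (such as "ukraine" in the toy data) is kept but weighted down relative to a word that is specific to one language (such as "nato" or "gas" here), even when both are equally frequent.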
Looking at the English wordcloud, we can get a sense of which words and subjects are discussed all over the world, as English is the language most often used to communicate across nationalities. Some of the most prominent words are "russia(n)", "ukraine(ian)", "zelensky", "putin", "biden" and "trump", which are all very general in terms of the war. In other words, the presence of these words could indicate that the political situation and the factual development of the war are what is discussed the most across nationalities. This seems sensible, as these are the ways in which most countries are affected by the war, and therefore this is probably what interests most people worldwide. Other words that represent more specific subjects include "gas", "oligarchs", "nato", "oil", "fight", "support" and "kyiv". Words such as "gas" and "oil" most likely reflect the consequences of the Western sanctions against Russia, which have caused a shortage of gas and oil supply as well as price increases. It is also highly debated to what extent NATO should help Ukraine, and the presence of "nato" could indicate that people are expressing their opinions on this matter online. Lastly, words like "please", "fight" and "kyiv" are included in the wordcloud. We cannot conclude anything about who is fighting for or supporting whom based on these stand-alone words. However, the presence of the word "kyiv" suggests that a majority of the users expressing themselves in English support Ukraine, as spelling Kyiv the Ukrainian way has become a symbol of support for the country during the war.
Looking at the Russian wordcloud, many of the words from the English wordcloud, such as "ukraine(ian)", "russia(n)", "zelensky" and "putin", recur here. However, some distinct words are present as well. These include "terrorist", "operation", "oppression", "belarus", "access", "sanctions" and many more. Once again, it is hard to say anything certain about the discourse and opinions on these subjects based on single words, but we can infer that subjects such as the development of the invasion of Ukraine and the effects of the sanctions against Russia are some of the main focuses of users tweeting in Russian.
One of the most distinctive and informative wordclouds in terms of sentiment and discourse is the Ukrainian one. Here we see what are probably hashtags, such as "stoprussia" and "standwithukraine", which of course express sympathy with Ukraine. Words such as "children" and "feminism" may refer to the war crimes committed by Russia, such as the killing of civilians (including the targeting of schools and kindergartens) and the rape of women, and words like "freedom", "please" and "world" may symbolise the Ukrainians' cry and hope for help from the rest of the world. It should be emphasised that these are just suggestions, and we cannot confidently conclude anything about the meanings behind the wordclouds based entirely on single words.
The word shift plot to the right displays the shift in word use between the first two weeks of the war and the first two weeks of April. Hover over the bars to the right to see the exact values of the word shifts!
To clarify what can be seen in the plot: the light colors represent words that are used less frequently in Ukrainian tweets than in Russian tweets, and the dark colors represent words that are used more frequently in Ukrainian tweets than in Russian tweets. Yellow indicates positive words, while blue indicates negative words.
As the light yellow color shows, the word 'russia' is used less often in tweets written in
Ukrainian than in tweets written in Russian. It can also be seen that it is counted as a positive word.
Since a tweet mentioning 'russia' is not necessarily intended to be positive, this might skew the overall
result of the word shift, so that the Ukrainian tweets are counted as more negative than they perhaps
should have been.
To fix this issue, we would either need contextual information in order to assign 'russia' and
'russian' our own context-dependent sentiment scores, or we would have to remove the words completely.
Both of these seemed to be subpar solutions, since the decrease in the use of 'russia' and 'russian' is
still a valuable insight, and creating context-specific sentiment scores for only these two words
would make the word shift unnecessarily complicated.
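To make the effect of a single positively scored word concrete, here is a minimal Python sketch of how per-word contributions can be computed in a weighted-average word shift: each word contributes (its score minus the reference average) times (its change in relative frequency). The text does not state which tool or sentiment lexicon produced the plot, so the function name `word_shift_contributions`, the toy counts and the scores on a 1 to 9 scale are all hypothetical, and Russian tweets are assumed to be the reference text.

```python
def word_shift_contributions(ref_counts, comp_counts, scores):
    """Per-word contribution to (comparison average - reference average).

    contribution(w) = (score(w) - reference average) * (p_comp(w) - p_ref(w)),
    where p is the relative frequency among words that have a sentiment score.
    """
    def relative_freq(counts):
        scored = {w: c for w, c in counts.items() if w in scores}
        total = sum(scored.values())
        return {w: c / total for w, c in scored.items()}

    p_ref = relative_freq(ref_counts)
    p_comp = relative_freq(comp_counts)
    ref_avg = sum(scores[w] * p for w, p in p_ref.items())

    words = set(p_ref) | set(p_comp)
    return {
        w: (scores[w] - ref_avg) * (p_comp.get(w, 0.0) - p_ref.get(w, 0.0))
        for w in words
    }


# Hypothetical sentiment scores (1 = very negative, 9 = very positive).
toy_scores = {"russia": 6.0, "glory": 7.5, "bomb": 2.0}
# Hypothetical word counts; Russian tweets are the reference text.
russian_counts = {"russia": 6, "glory": 1, "bomb": 3}
ukrainian_counts = {"russia": 3, "glory": 5, "bomb": 2}

contributions = word_shift_contributions(russian_counts, ukrainian_counts, toy_scores)
total_shift = sum(contributions.values())
print(contributions, total_shift)
```

In this toy example, 'russia' is scored above the reference average but is used less in the Ukrainian tweets, so its contribution is negative and pulls the Ukrainian average down, which is the skew described above. The contributions sum to the difference between the two average sentiment scores.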
It can be seen that Ukrainian tweets more often include the word 'glory'. Ukraine has a war slogan, 'Slava Ukraini', which translates to 'Glory to Ukraine' in English. The words 'win', 'victory' and 'enemy' are also used more frequently in Ukrainian tweets.
Russian tweets use the negative words 'bomb', 'operation', 'killed' and 'attacked' more often than Ukrainian tweets do. They also more often use positive words like 'russia', 'russian', 'like' and 'special', where, as previously stated, the words 'russia' and 'russian' are more ambiguous in terms of what their sentiment score should be.
Overall, Ukrainian tweets have a higher sentiment score than Russian tweets, although the shift in sentiment score is not large.
How do Twitter users interact in Ukrainian and Russian?
In the following network analysis we investigate the nature of how users interact in Ukrainian and Russian by looking at both clustering trends and community detection.
Ukrainian vs Random Network
Ukrainian Network Colored by Communities
Russian vs Random Network
Russian Network Colored by Communities
Above, the network graphs for both languages are displayed, both with and without community detection, alongside their random network counterparts.
Our analysis shows that in both cases, the clustering trend is actually more pronounced in the random network than in the original graph, which could indicate that users do not seem to stay around in the same threads responding to each other. This is especially likely because retweeting is such a big part of Twitter: if many people retweet the same tweet, a big cluster will not form around them, since they do not reply to each other. We can relate this to how the original networks compare visually to their random counterparts: in both cases, the number of edges is roughly the same in the original and random networks, but the number of nodes is higher in the random graph. This could explain why the original networks look more sparse than their random counterparts.
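The comparison with a random network can be sketched in a few lines of Python. The text does not say how clustering was measured or how the random counterpart was built, so this sketch makes two assumptions: clustering is the average local clustering coefficient of an undirected graph, and the random counterpart keeps the same nodes and the same number of edges but places the edges uniformly at random. The helper names `build_graph`, `average_clustering` and `random_counterpart` and the toy user names are hypothetical.

```python
import random
from itertools import combinations


def build_graph(edges):
    """Build an undirected graph as {node: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Average local clustering coefficient; nodes with fewer than two neighbours count as 0."""
    coefficients = []
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count the links that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


def random_counterpart(adj, seed=0):
    """Random graph on the same nodes with the same number of edges (an assumed null model)."""
    nodes = sorted(adj)
    n_edges = sum(len(neighbours) for neighbours in adj.values()) // 2
    rng = random.Random(seed)
    chosen = rng.sample(list(combinations(nodes, 2)), n_edges)
    rand = {node: set() for node in nodes}
    for u, v in chosen:
        rand[u].add(v)
        rand[v].add(u)
    return rand


# Five users retweet the same account but never interact with each other: a star.
star = build_graph([("news", f"user{i}") for i in range(1, 6)])
# Three users who all reply to each other: a closed triangle.
triangle = build_graph([("a", "b"), ("b", "c"), ("a", "c")])

print(average_clustering(star), average_clustering(triangle))
print(average_clustering(random_counterpart(star)))
```

The star has a clustering coefficient of zero because none of the retweeting users are linked to each other, so any triangle that appears by chance in the random counterpart already gives it a higher value.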
Furthermore, coloring the nodes based on the community partitions allows us to detect some cluster structures. As perhaps expected, most of the big clusters in both networks come from users retweeting and replying to bigger news stations or organizations. We do not typically see private individuals attract discussions of the same size as the news stations do.
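The text does not state which community detection algorithm produced the partitions. Many common methods search for a partition with high modularity, so as an illustration only, the Python sketch below scores a given partition of an undirected, unweighted graph by its modularity. The function name `modularity`, the toy edge list and the account names are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity Q = sum over communities of (L_c / m) - (d_c / (2m))^2.

    L_c is the number of edges inside community c, d_c is the summed degree
    of its nodes, and m is the total number of edges.
    """
    m = len(edges)
    intra_edges = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra_edges[cu] = intra_edges.get(cu, 0) + 1
    return sum(
        intra_edges.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two hypothetical news accounts, each with three users reacting to it,
# plus one interaction between the two accounts.
toy_edges = [
    ("news_a", "a1"), ("news_a", "a2"), ("news_a", "a3"),
    ("news_b", "b1"), ("news_b", "b2"), ("news_b", "b3"),
    ("news_a", "news_b"),
]

# Partition 1: one community per news account. Partition 2: everything together.
by_account = {node: ("A" if node in {"news_a", "a1", "a2", "a3"} else "B")
              for edge in toy_edges for node in edge}
all_together = {node: "all" for edge in toy_edges for node in edge}

q_split = modularity(toy_edges, by_account)
q_single = modularity(toy_edges, all_together)
print(q_split, q_single)
```

Grouping the users around the account they react to scores higher than putting everyone in one community, which matches the observation that the big clusters form around news stations and organizations.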
This clustering trend makes sense if we consider how users generally interact on Twitter. The structure we typically see in Twitter networks is that a certain tweet goes viral, and many users retweet, reply to and quote tweet it. These replies and quote tweets are then retweeted and replied to in turn, and so on, continuing in an almost exponential manner. This means that people do not form one big discussion forum in the comments of the original tweet, but rather spread the message in a very fast and "broad" way.
Furthermore, the connections we observe in these networks are not the same as the connections we see in a social network, for instance: the links do not correspond to friendships but rather to a reaction a user might have to something another user stated. Parent authors are not inclined to respond in the same way as on Reddit, for instance. This means we will not typically observe triadic closure in these Twitter networks, in contrast to a social network of friends, family and co-workers. Seeing as we are also investigating a very difficult and widespread topic, there is not the same sense of community as in communities on Reddit.