
Research Notebook

The 'Mixology' open research project probes opinions in times of crisis using a corpus collected via the Twitter API. Its second objective is to develop an original research tool that can also be reused to analyze headlines or media content (with computational linguistics and machine learning methods), in line with media studies and journalism studies.

Blog 8: Linguistic and quantitative processing of the ‘vaccination’ corpus (English, part.2)

6 January 2022


The analysis of bigrams and trigrams (with the R tidytext package) reminds us that the vaccination campaign is as much about health as about politics. In the subsection of the corpus relating to the United Kingdom, the names of Prime Minister Boris Johnson and Minister of Health Sajid Javid are among the most frequent occurrences. They are notably linked to the vote on a vaccination passport. However, health is also at the center of Twitter users’ concerns, particularly regarding the vaccines’ side effects. Tweets are thus shaped both by the news and by the well-being of the person (« sore arm », « serious illness », « freezing sick scared », « feel better soon »). These findings hold across the entire « vaccination » corpus.
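As a rough illustration of the counting behind these results, the sketch below tallies bigrams and trigrams in plain Python. It is not the project's code, which relies on the R tidytext package; the sample tweets, the simple regex tokenizer, and the `ngram_counts` helper are all assumptions made for the example, and no stop words are removed.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a tweet and keep alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(texts, n):
    """Count n-grams tweet by tweet, so that no n-gram spans two tweets."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Hypothetical sample tweets, not taken from the corpus.
sample_tweets = [
    "Sore arm after my booster jab",
    "Booster jab booked, feel better soon",
    "Got the booster jab today, sore arm already",
]

bigrams = ngram_counts(sample_tweets, 2)
trigrams = ngram_counts(sample_tweets, 3)
print(bigrams.most_common(2))
```

Counting within each tweet rather than over the concatenated corpus avoids spurious n-grams formed by the last word of one tweet and the first word of the next.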

 

 

In the 58,425 observations relating to all countries other than the United Kingdom (Luxembourg, France, the Netherlands, Belgium, Ireland, Germany and Switzerland, i.e. 20% of this first corpus), the topic of mandatory vaccination is also prominent. The first politician cited in the six EU countries is Ursula von der Leyen, President of the European Commission. Twitter users questioned her relationship with Pfizer/BioNTech. Note that one news outlet is at the top of the trigrams of the European sub-corpus: the New York Times.

 

 

At this second level of analysis, the debate seems less polarized between pro- and anti-vaccine positions: the bigrams « unvaccinated people » and « antivax people » occur 667 and 252 times respectively, while « booster jab » occurs 7,776 times. The trigrams confirm this observation but show some animosity towards those vaccinated, who are associated with the terms denial, extremism, propaganda, activism, and idiots (to be continued).
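One possible way to see which terms cluster around a given word in a trigram table is sketched below, again in plain Python rather than the R tidytext pipeline used by the project. The `context_terms` helper is hypothetical, and the trigram counts are invented for the example; they are not results from the corpus.

```python
from collections import Counter

def context_terms(trigram_counts, target):
    """For every trigram containing `target`, add its count to each of the other words."""
    neighbours = Counter()
    for trigram, count in trigram_counts.items():
        words = trigram.split()
        if target in words:
            for word in words:
                if word != target:
                    neighbours[word] += count
    return neighbours

# Invented trigram counts, for illustration only.
sample_trigrams = Counter({
    "vaccinated people propaganda": 4,
    "denial vaccinated people": 2,
    "booster jab today": 7,
})

neighbours = context_terms(sample_trigrams, "vaccinated")
print(neighbours.most_common())
```

Weighting each neighbouring word by the trigram's frequency means that a term seen in many tweets ranks above one that appears in a single unusual trigram.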

 

 
