Research Notebook
'Mixology' is an open research project that aims to extract opinions in times of crisis, here from a corpus collected via the Twitter API from December 12 to 31, 2021.
Blog 8: Linguistic and quantitative processing of the ‘vaccination’ corpus (English, part 2)
January 6, 2022
The analysis of the bigrams and trigrams (with the R tidytext package) reminds us that the vaccination campaign is as much about politics as about health. In the subsection of the corpus relating to the United Kingdom, the names of Prime Minister Boris Johnson and Minister of Health Sajid Javid are among the most frequent occurrences. They are notably linked to the vote on a vaccination passport. However, health is also at the center of Twitter users’ concerns, particularly regarding the vaccines’ side effects. Tweets are therefore shaped both by the news and by the well-being of the person writing (« sore arm », « serious illness », « freezing sick scared », « feel better soon »). These findings hold across the entire « vaccination » corpus.
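To make the counting step concrete, here is a minimal Python sketch of bigram and trigram counting. It is not the project's R tidytext pipeline, only an illustration of the same idea: build n-grams from each tweet, drop any n-gram that contains a stop word, and count what remains. The example tweets, the tiny stop-word list and the function names are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# a real analysis would use a full list.
STOP_WORDS = {"a", "an", "the", "i", "my", "to", "of", "and", "after", "got", "is"}


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(texts, n):
    """Count n-grams across texts, skipping n-grams that contain a stop word."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOP_WORDS for tok in gram):
                counts[" ".join(gram)] += 1
    return counts


# Invented example tweets, not taken from the corpus.
tweets = [
    "Sore arm after my booster jab",
    "Got my booster jab today, sore arm already",
    "Feel better soon!",
]

bigrams = ngram_counts(tweets, 2)
trigrams = ngram_counts(tweets, 3)

# Show the most frequent bigrams and trigrams in the toy sample.
print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

Filtering after the n-grams are built, rather than before, keeps only word sequences that were really adjacent in the tweet, so removing a stop word cannot glue two distant words together.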
Among the 58,425 observations relating to all countries other than the United Kingdom (Luxembourg, France, the Netherlands, Belgium, Ireland, Germany, Switzerland; i.e. 20% of this first corpus), the topic of mandatory vaccination is also prominent. The most cited politician in the six EU countries is Ursula von der Leyen, President of the European Commission. Twitter users questioned her ties to Pfizer/BioNTech. Note that one news outlet ranks at the top of the trigrams of the European sub-corpus: the New York Times.
At this second level of analysis, the debate seems less polarized between pro- and anti-vaccine positions: the bigrams « unvaccinated people » and « antivax people » occur 667 and 252 times respectively, while « booster jab » occurs 7,776 times. The trigrams confirm this observation but show some animosity towards vaccinated people, who are associated with the terms denial, extremism, propaganda, activism, and idiots (to be continued).