[Reflection #2] – [01/31] -[Liz Dao]

Benjamin D. Horne and Sibel Adali. “This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News.”

Summary:

The main goal of the research is to build a model that differentiates fake news from real news. The authors also analyze satire, which this paper treats as a type of fake news. The data comes from three data sets: the Buzzfeed 2016 election data set; the Burfoot and Baldwin data set; and a data set of fake, real, and satire news created by the authors. The articles are analyzed and compared on three main feature categories: stylistic, complexity, and psychological. A Support Vector Machine classification model is built with the four most significant features selected using ANOVA and the Wilcoxon rank sum test.
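To make the feature selection step more concrete, here is a minimal sketch of how features could be ranked with a Wilcoxon rank sum test. It is not the authors' code: the function names `rank_sum_test` and `top_features`, the normal approximation, and the toy numbers are all my own assumptions, and the ANOVA branch used for normally distributed features is left out.

```python
import math
from statistics import NormalDist


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test using the normal approximation.

    Simplification (assumption): no tie correction is applied to the variance.
    Returns (z, p_value).
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at i and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    # Rank sum of the first sample, compared with its expectation under H0.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def top_features(real, fake, k=4):
    """Return the k feature names with the smallest p-values.

    real / fake: dict mapping a feature name to a list of per-article values.
    """
    scored = [(rank_sum_test(real[name], fake[name])[1], name) for name in real]
    return [name for _, name in sorted(scored)[:k]]


# Toy, made-up numbers purely for illustration (not from the paper).
toy_real = {"ttr": [0.90, 0.80, 0.85, 0.95, 0.88], "quotes": [1, 2, 1, 2, 1]}
toy_fake = {"ttr": [0.50, 0.40, 0.45, 0.55, 0.48], "quotes": [1, 2, 1, 2, 2]}
print(top_features(toy_real, toy_fake, k=1))
```

The selected feature names would then be the only inputs passed to the classifier, which keeps the model small enough to train on a few hundred articles.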

The results are consistent with previous studies of fake news and can be summarized in three main points:

1.    Fake news articles have more lexical redundancy, more adverbs, more personal pronouns, fewer nouns, fewer analytic words, and fewer quotes. This means that fake news articles are much less informative and require a lower educational level to read than real news articles.

2.    Real news articles convince readers through solid reasoning, while fake news articles support their claims through heuristics.

3.    The content and title of fake news and satire articles are extremely similar.
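As a rough illustration of the kind of surface features behind the first point, the sketch below computes a type-token ratio (a common proxy for lexical redundancy: lower means more repetition) and the share of stop words in a text. The tokenizer, the tiny stop word list, and the function names are my own assumptions, not the lexicons or tools used in the paper.

```python
import re

# Small illustrative stop word list; an assumption, not the paper's lexicon.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "it", "that"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Unique tokens divided by total tokens; 0.0 for empty text."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def stop_word_share(text):
    """Fraction of tokens that appear in STOP_WORDS; 0.0 for empty text."""
    tokens = tokenize(text)
    return sum(t in STOP_WORDS for t in tokens) / len(tokens) if tokens else 0.0


print(type_token_ratio("the cat saw the cat"))  # 3 unique tokens out of 5
print(stop_word_share("the cat saw the cat"))   # "the" appears twice out of 5
```

Running both functions over titles and body texts separately would give one small feature vector per article, which is the shape of input a classifier like the one in the paper needs.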

Reflection:

First of all, the authors definitely practiced what they preached. The title of the paper is packed with verb phrases yet contains zero stop words. Moreover, all three main conclusions are included in the title, which hopefully will increase the chance of this article, or at least its main points, being read. Although it might sound like a half-hearted solution, the authors’ suggestion of transforming real news articles’ titles to resemble those of fake news articles is actually a good idea. What will happen if we create the titles of real news articles using the formula for fake news articles? Will people be more likely to read them? Will people classify them as fake news based on their titles?

    Although the authors successfully build a classification model that distinguishes fake and satire articles from real news articles with relatively high accuracy, the research fails to deliver any new findings. Indeed, the result is nothing but a reconfirmation of previous studies. Furthermore, the difference between fake and real news articles seems obvious to most people. As with clickbait, detecting fake news articles is relatively easy. The bigger question is how we can increase people’s willingness to read real news articles instead of scrolling through a list of fake news titles.

    One interesting finding is that the title features of fake news articles differ between the BuzzFeed 2016 election news dataset and the political news dataset collected by the authors. The former uses significantly more analytical words, whereas the latter has more verb phrases and past tense words. This suggests that there is more than one type of fake news. The difference in word choice also suggests the two types might be targeting different groups or trying to provoke different reactions. It would be interesting to study what causes these feature differences among fake news articles.

    In addition, it is surprising that the SVM model classifies satire articles much more accurately than fake news articles. The authors report that they “achieve a 91% cross-validation accuracy over a 50% baseline on separating satire from real articles.” On the other hand, they only “achieve a 71% cross-validation accuracy over a 50% baseline when separating the body texts of real and fake news articles.” Is that because of the mocking tone that distinguishes satire from real news articles? Furthermore, the model has low accuracy when separating satire from fake news articles. This might pose an issue, as we might want to treat satire articles differently from fake news articles.

    Lastly, the distinction between the titles of clickbait and fake news articles is quite intriguing. Because both lack validity, ethics, and valuable information, many people put clickbait and fake news in the same category. Yet these two types of articles serve completely different purposes. Clickbait encourages readers to visit the web page, so its titles “have many more function words, more stop words, more hyperbolic words (extremely positive), more internet slangs, and more possessive nouns rather than proper nouns.” Fake news, meanwhile, wants to deliver its message even if the majority of the links are never clicked. Hence, its titles, loaded with claims about people and entities, are extremely concise summaries of the whole articles. One way or the other, both fake news and clickbait have found their own strategies to attract readers’ attention and engagement. So why are real news articles falling so far behind? Is it because their writers are not aware of the tricks fake news and clickbait are using? Or are they too proud to give up their formal and boring titles?

Liz
