Summary
Horne and Adali study how the content of fake news differs from that of real news, analyzing three different data sets. Whereas previous work focused on how fake news spreads through networks, the two researchers take a new approach and investigate content-specific differences such as style and complexity. They ran ANOVA and Wilcoxon rank-sum tests to compare metrics such as the use of all-caps words, swear words, and pronouns, and they trained a support vector machine to classify news articles as fake vs. real, satire vs. real, and satire vs. fake.
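To make the kind of content metrics mentioned above more concrete, here is a minimal Python sketch of how a few of them could be counted for a single headline. It is only an illustration: the function name `style_features`, the tokenizing rule, and the short pronoun list are my own assumptions and are not taken from the paper, whose feature definitions may well differ.

```python
import re

# Hypothetical pronoun list for illustration only; the paper's own
# feature definitions may differ.
PRONOUNS = {
    "i", "me", "my", "we", "us", "our", "you", "your",
    "he", "him", "his", "she", "her", "it", "its",
    "they", "them", "their",
}


def style_features(text):
    """Count a few simple style features in a piece of text."""
    # Treat runs of letters and apostrophes as words (an assumed rule).
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "word_count": len(words),
        # Words of two or more characters written entirely in capitals.
        "all_caps": sum(1 for w in words if len(w) > 1 and w.isupper()),
        "pronouns": sum(1 for w in words if w.lower() in PRONOUNS),
    }


features = style_features("BREAKING: You won't BELIEVE what they did")
print(features)
```

Counts like these, computed for every article, are the sort of per-article numbers that could then be compared across groups with a rank-sum test or passed to a classifier.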
Reflection
This paper was an interesting read, and incredibly relevant given today’s technology, which delivers instantaneous news alerts through mobile and other handheld devices. What surprised me was that the researchers offered additional insights beyond the data itself, unlike the previous article. The paper notes that its main finding, that real news and fake news target people differently (real news connects with readers through arguments, while fake news connects with readers through heuristics), can be explained by the Elaboration Likelihood Model. This is an interesting idea. Something the paper mentions only briefly, but which may contribute to the spread of fake news, is echo chambers. If all of a person's news comes from biased, non-objective sources, that person will most likely have a harder time telling real news from fake news. People can be incredibly trusting of the news sources that their friends and family listen to, which creates a situation where objectively fake news is accepted as real news and vice versa, and anything that contradicts it is dismissed as fake news.
Another interesting point raised in the paper is that the support vector machine classifies satire vs. fake news with much lower accuracy than satire vs. real news. On Reddit, there is a subreddit called “AteTheOnion” that collects screenshots of people on social media responding to articles without realizing the articles are satire. It would be interesting to analyze the contents of the articles referenced in that subreddit to see where audiences misclassified them, and to better determine why exactly the confusion between satire and real news occurred. To a careful reader, satire should be clear just from examining the headline (does this make sense logically, could this be a joke; these are relevant questions for media literacy and engaging with online media). To me, though, there are so many reasons why people may be susceptible to reading satire and fake news as real news that it would be hard to predict whether a given person will mistake a new fake or satirical article for real news.
Additional Questions
- Given what we know about the typical headline and overall structure of fake news, can we automate generation of fake news articles that are believable?
- Based on the previous paper, how does audience perception differ for real news vs fake news in terms of tweets, favorites, or other metrics?
- Given a person’s reading history, can we reliably predict how susceptible they are to incorrectly classifying an article as real, fake, or satire?