Reading Reflection #3 – [2/05/2019] – [Matthew Fishman]

Automated Hate Speech Detection and the Problem of Offensive Language

Quick Summary

Why is this study important? It tackles the problem of distinguishing hate speech from merely offensive language. Hate speech targets and potentially harms disadvantaged social groups (promoting violence or social disorder). Without differentiating the two, we erroneously label many people as hate speakers.

How did they do it? The research team obtained a lexicon of “hate words” from hatebase.org and used it to identify over 30k Twitter users who had used those words. They extracted each user’s timeline and took a random sample of 25k tweets containing terms from the lexicon. Using CrowdFlower crowdsourcing, human workers manually labeled each tweet as hate speech, offensive, or neither. The team then extracted many features from these tweets and used them to train a classifier: a logistic regression model with L2 regularization, built in scikit-learn with a separate one-vs-rest classifier for each class. Each tweet was assigned the label of the most confident classifier.
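
For my own reference, here is a minimal sketch of how that one-vs-rest, L2-regularized logistic regression setup might look in scikit-learn. This is not the authors’ actual pipeline: they used a much richer feature set than the single TF-IDF representation here, and the tweets, labels, and hyperparameters below are placeholders.

```python
# Minimal sketch of a one-vs-rest, L2-regularized logistic regression classifier.
# NOT the paper's pipeline: features are reduced to TF-IDF n-grams for brevity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

# Placeholder labeled data: 0 = hate speech, 1 = offensive, 2 = neither
tweets = ["first example tweet", "second example tweet", "third example tweet"]
labels = [0, 1, 2]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 3))),
    # One binary L2-regularized logistic regression per class; each tweet is
    # assigned the label of the most confident classifier.
    ("ovr", OneVsRestClassifier(LogisticRegression(penalty="l2"))),
])

clf.fit(tweets, labels)
print(clf.predict(["another example tweet"]))
```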

What were the results? They found that racist and homophobic tweets are more likely to be classified as hate speech, while sexist tweets are generally classified as offensive. Human coders appear to consider sexist/derogatory words toward women merely offensive. While the classifier was great at predicting non-offensive or merely offensive tweets, it struggled to distinguish true hate speech from offensive language.

Reflection

Ways to Improve the Study:

  • Only a small percentage of the tweets flagged by the Hatebase lexicon were considered hate speech by human coders. This means the dictionary used to identify “hateful” words was extremely broad. Using a more precise lexicon would likely increase the accuracy of the classifier.
  • I think studying the syntax of hate speech could be particularly interesting. It would be worth trying to train a classifier on syntactic structure rather than on particular keywords (see the sketch after this list).
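
To make that second bullet concrete, here is a rough, untested sketch of one keyword-free approach: replace each token with its part-of-speech tag (using NLTK, which is my choice here, not something from the paper) and classify on tag n-grams. The example tweets and labels are made up.

```python
# Sketch: represent each tweet by its POS-tag sequence instead of its words,
# so the classifier learns sentence structure rather than specific keywords.
import nltk
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Grab tokenizer/tagger data (covers both older and newer NLTK versions)
for pkg in ("punkt", "punkt_tab", "averaged_perceptron_tagger", "averaged_perceptron_tagger_eng"):
    nltk.download(pkg, quiet=True)

def to_pos_tags(text):
    """Replace every token with its POS tag, e.g. 'you are awful' -> 'PRP VBP JJ'."""
    tags = nltk.pos_tag(nltk.word_tokenize(text))
    return " ".join(tag for _, tag in tags)

tweets = ["you are awful", "have a nice day"]  # hypothetical examples
labels = [1, 0]                                # 1 = offensive, 0 = neither

syntax_clf = Pipeline([
    # n-grams over POS tags capture structure, not vocabulary
    ("pos_ngrams", CountVectorizer(preprocessor=to_pos_tags, ngram_range=(1, 3))),
    ("logreg", LogisticRegression(penalty="l2")),
])
syntax_clf.fit(tweets, labels)
```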

What I liked About the Study:

  • The use of CrowdFlower was a very interesting approach to labeling the tweets. Although there were clearly some coder errors in the labels, the idea of crowdsourcing to get a human perspective is intriguing, and I plan on looking into this for my future project.
  • They used a TON of features and tested many models. I think a big reason the classifiers were so accurate was the care the team took in engineering their features.

Early Public Responses to the Zika-Virus on YouTube: Prevalence of and Differences Between Conspiracy Theory and Informational Videos

Quick Summary

Why is this Study Important? When alarming news breaks, many internet users see it as a chance to spread conspiracy theories and garner attention. It is important that we learn to distinguish the truth from this kind of fake news and conspiracy theorizing.

How did they do it? The team collected user reactions (comments, shares, likes, dislikes, and the content/sentiment of user responses) to the 35 most popular videos posted on YouTube when the Zika virus outbreak began in 2016.
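
As a note to myself, here is a hedged sketch of how one might pull similar statistics and comments today with the YouTube Data API v3 (via google-api-python-client). The paper does not describe its exact collection code; the API key and video IDs below are placeholders, and public dislike counts have since been removed from the API.

```python
# Sketch of collecting video statistics and top-level comments with the
# YouTube Data API v3. API key and video IDs are placeholders.
from googleapiclient.discovery import build

API_KEY = "YOUR_API_KEY"                   # placeholder
VIDEO_IDS = ["VIDEO_ID_1", "VIDEO_ID_2"]   # hypothetical Zika-related videos

youtube = build("youtube", "v3", developerKey=API_KEY)

# View/like/comment counts and titles for each video
stats = youtube.videos().list(part="statistics,snippet", id=",".join(VIDEO_IDS)).execute()
for video in stats.get("items", []):
    print(video["snippet"]["title"], video["statistics"].get("viewCount"))

# Top-level comments for one video (paginate for full coverage)
comments = youtube.commentThreads().list(
    part="snippet", videoId=VIDEO_IDS[0], maxResults=100, textFormat="plainText"
).execute()
for item in comments.get("items", []):
    print(item["snippet"]["topLevelComment"]["snippet"]["textDisplay"])
```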

What were the Results? The results were not surprising. 12 of the 35 videos in the data set focused on conspiracy theories, but there were no statistical differences between the two types of videos. Informational videos and conspiracy theory videos drew similar numbers of responses, unique users per view, and additional responses per unique user. Both types of videos had similarly negative responses, but the comments on informational videos were concerned with the outcome of the virus, while the comments on conspiracy theory videos were concerned with where the virus came from.

Reflection:

What I would do to Improve the Study:

  • Study user interactions/responses more closely. User demographics might tell a much bigger story about how people react to these two types of videos. For example, older people might be less susceptible to conspiracy theories and respond less than younger people.
  • Study different aspects of the videos altogether. Clearly, user responses to and interactions with informational and conspiracy theory videos are similar. However, looking at differences in content, titles, and publisher credibility would do far more to distinguish the two.

What I liked About the Study:

  • The semantic map of user comments was highly interesting, and I wish I had seen more studies use a similar way of presenting data. The informational videos actually used more offensive words and were more tightly clustered than the conspiracy theory videos. A lot of the information in this graphic seemed obvious (conspiracy theory comments were more concerned with foreign entities), but much of the data we could pull from it was useful. I will definitely be looking into making cluster graphs like this a part of my project; a rough sketch of one way to do it follows below.
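
Since I want to try this myself, here is a generic sketch (not the authors’ semantic-map method) of clustering comment text with TF-IDF, k-means, and a 2-D projection. The comments below are invented examples.

```python
# Toy example of a comment cluster graph: TF-IDF features, k-means clusters,
# and a 2-D SVD projection for plotting. Not the paper's method.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

comments = [
    "where did this virus really come from",
    "the government is hiding the source",
    "how many people have been infected so far",
    "what are the symptoms and the outcome",
]  # hypothetical comments

X = TfidfVectorizer(stop_words="english").fit_transform(comments)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
coords = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

plt.scatter(coords[:, 0], coords[:, 1], c=clusters)
plt.title("Comment clusters (toy example)")
plt.show()
```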


[Reflection #2] – [1/31] – [Matthew Fishman]

Quick Summary

Why is this study important? With the sheer quantity of news stories shared across social media platforms, news consumers must rely on quick heuristics to decide whether a news source is trustworthy. Horne et al. set out to find distinguishing characteristics between real news and fake news, to help news consumers better tell the two apart.

How did they do it? They studied stylistic, complexity, and psychological features of real news, fake news, and satire articles to help distinguish the three categories from one another. Horne et al. used ANOVA tests on normally distributed features and Wilcoxon rank-sum tests on non-normally distributed features to find which features differ between the categories of news. They then selected the top 4 distinguishing features for both the body text and the title text of the articles to build an SVM model with a linear kernel, evaluated with 5-fold cross-validation.
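
To keep the pipeline straight in my head, here is a simplified sketch of that two-step procedure: screen each feature with an ANOVA or Wilcoxon rank-sum test depending on normality, then run a linear-kernel SVM with 5-fold cross-validation on the top features. The feature matrix is random placeholder data, not the paper’s stylistic/complexity/psychological features.

```python
# Sketch: statistical feature screening followed by a linear-kernel SVM.
import numpy as np
from scipy import stats
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))       # placeholder: 200 articles, 20 features
y = rng.integers(0, 2, size=200)     # placeholder labels: 0 = real, 1 = fake

p_values = []
for j in range(X.shape[1]):
    a, b = X[y == 0, j], X[y == 1, j]
    _, p_norm_a = stats.shapiro(a)
    _, p_norm_b = stats.shapiro(b)
    if p_norm_a > 0.05 and p_norm_b > 0.05:
        _, p = stats.f_oneway(a, b)   # ANOVA for (roughly) normal features
    else:
        _, p = stats.ranksums(a, b)   # Wilcoxon rank-sum otherwise
    p_values.append(p)

top4 = np.argsort(p_values)[:4]       # four most discriminative features
scores = cross_val_score(SVC(kernel="linear"), X[:, top4], y, cv=5)
print(scores.mean())
```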

What were the results? Their classifier achieved 71% to 91% cross-validation accuracy over a 50% baseline when distinguishing between real news and either satire or fake news. Among the major differences they found: real news has a much more substantial body, while fake news has much longer titles with simpler words. This suggests that fake news writers try to squeeze as much substance into the title as possible.

Reflection

Again, I was not as surprised by the outcomes of this study as I had hoped. It seems obvious that, for example, fake news articles would have a less substantial body and use simpler words in their titles than real news. However, the lack of stop words and length of fake news titles did surprise me; I had always associated fake news with click-bait, which usually has very short titles.

A few problems I had with the study included:

  • The data sets used. If the real enemy here is fake news, then why would the researchers use only 110 fake news sources (compared to 308 satire news sources and 4111 real news sources)? No wonder the classifier had an easier time distinguishing real news from the other two.
  • The features extracted. The researchers could have used credibility features or user interaction metrics like shares or clicks to better distinguish fake news from real news. If the study had utilized more than just linguistic features, the classifier might have been considerably more accurate.

Going Forward:

Some improvements to the study (beyond the dataset size and the features extracted) could come from user research:

  • How much time do users spend reading a fake news article in comparison to real news?
  • What characteristics of a news consumer are correlated with sharing, liking, or believing fake news?
  • What is the ratio of fake news articles clicked or shared to that of real news?

Questions Raised:

  • Can we predict the type of user to be more susceptible to fake news?
  • How have the linguistics of fake news changed over the years? What can we learn from this in predicting how they might change in the future?
  • Should real news sources change the format of their titles to give lazy consumers as much information as possible without needing to read the article? Or would this hurt their bottom line, since their articles might not get as many clicks if all the information is in the title?
  • Should news aggregators like Facebook and Reddit be using similar classifiers to mark how potentially “fake” a news article is?


[Reading Reflection 1] – [01/28] – [Matthew Fishman]

Quick Summary:

After conducting the largest study of news producers/consumers on Twitter, crawling millions of tweets from 5,000 Twitter accounts as well as those of over a million news consumers, Bagdouri et al. were able to extract some interesting information about Arab and English news producers and the ways they broadcast to and interact with their consumers. They utilized Welch’s t-tests and Kolmogorov-Smirnov tests to quantify statistically significant differences across eight comparisons.
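
For reference, both tests are easy to reproduce with scipy; the sketch below applies them to an invented comparison (tweets per day for organizations vs. journalists), not the paper’s data.

```python
# Illustration of Welch's t-test and the Kolmogorov-Smirnov test on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
org_tweets_per_day = rng.poisson(30, size=500)         # hypothetical
journalist_tweets_per_day = rng.poisson(10, size=500)  # hypothetical

# Welch's t-test: compares means without assuming equal variances
t_stat, t_p = stats.ttest_ind(org_tweets_per_day, journalist_tweets_per_day, equal_var=False)

# Kolmogorov-Smirnov test: compares the full distributions
ks_stat, ks_p = stats.ks_2samp(org_tweets_per_day, journalist_tweets_per_day)

print(f"Welch p={t_p:.3g}, KS p={ks_p:.3g}")
```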

Some key points of interest include:

  • News organizations tweet more often than journalists, using more hashtags and links
  • Journalists seem to interact with their consumers much more often than organizations
  • English journalists are more likely to be verified than their Arabic counterparts
  • Arab journalists are much more distinguishable from their consumers than English journalists

Reflection:

This study was an important step in understanding how the function of news producers is changing in our society. With all the “fake news” going around, it is imperative that we better understand these changes to make sure they are for the better, and not for the worse.

Throughout the paper, I found a lot of the results unsurprising (e.g., journalists tweet from their phones more often than news organizations; British and Irish journalists are similar), but there were a few results that piqued my interest.

  • Organizations post 3x more, share more links, and post more hashtags than journalists. This leads one to believe that news organizations are using Twitter as a medium to GAIN ATTENTION rather than focusing on conveying quality information. And it appears to work; news organizations have significantly more followers than journalists.
  • Journalists mention almost 2x more and reply much more often to their consumers. They also use more personal language. Clearly, journalists interact more with their consumers… but is this a good thing, or is it concerning? Journalists could be furthering an investigation, they could be looking to advance a personal political agenda by interacting with their consumers, or they may just be trying to create a closer relationship with their readers. This is something I personally would find interesting to explore further: what are the motivations behind journalists responding to their consumers so often?
  • English journalists are more often verified than Arab journalists. My concern with this statistic is that there could be a myriad of reasons behind it. Is Twitter not paying close enough attention to the Arab world to endorse its news sources? Or is it ignoring most non-English-speaking countries? Could English journalists be verified and followed more because they reach a greater audience? Is it because more English speakers use Twitter?
  • Arab journalists are 22 times more likely to be verified than their consumers. In comparison, English journalists are only 5 times more likely to be verified than their consumers. This indicates that Arab journalists are much more distinguishable from their consumers, but why? It could be that there are more casual users in the Arab world, or that there are fewer high-profile Arab celebrities relative to their journalists. Either way, this is another question I’d like to see explored further.
  • Finally, I found it interesting that almost 3% of the English news consumers studied were verified. I was SHOCKED to hear that 1 in 34 news-consuming English Twitter users is verified; am I next to get the blue checkmark? Or one of the other 34 people in this class?

After examining this study, I feel much more informed about how to evaluate information from various news sources on Twitter, and I hope a more detailed study is done on some of the specific questions raised in my reflection.
