[Reading Reflection 3] – [2/4] – [Henry Wang]

Article 1: “Early Public Responses to the Zika-Virus on YouTube: Prevalence of and Differences Between Conspiracy Theory and Informational Videos”

Summary

In this paper, the researchers analyze a dataset of Zika-virus videos on YouTube. The videos analyzed were relatively popular (40,000 or more views) and were broadly classified into two distinct groups: informational and conspiracy theory. The main research questions focus on the differences between the two groups of videos as well as the differences between user reactions to them. The investigators used several analysis methods; in particular, topic modeling and semantic network analysis were used for comment/reply analysis.
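
As a concrete illustration of the kind of comment analysis described above, here is a minimal topic-modeling sketch in Python, assuming the comments have already been collected as plain strings; the library choices (scikit-learn's CountVectorizer and LDA) and the number of topics are my own assumptions, not the authors' actual pipeline.

# A minimal sketch (not the authors' exact pipeline) of topic modeling on
# YouTube comments, assuming comments are already collected as plain strings.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "the zika virus is spread by mosquitoes",
    "this outbreak is a government cover up",
    "vaccines are being developed for zika",
    "they created the virus in a lab on purpose",
]

# Bag-of-words counts, dropping very common English stop words.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(comments)

# Fit a small LDA model; the number of topics is a hypothetical choice here.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Print the top words per topic to interpret what each topic is about.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")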

Reflection

This article was an interesting change of pace, focusing now on conspiracy theories and how they relate to the interpretation of real events. This particular topic has always been interesting to me, and I definitely feel it could be a potential research topic. Conspiracy theories involve far-fetched ideas, for example that the world is flat or that Australia does not exist. Anyone can say those words, but how can we analyze the behaviors and sentiments of those who buy into those theories? This is clearly a very tough question to answer, and based on the results of this paper it is disappointing to see that the investigators did not find significant differences.

One issue I found with the paper is that the researchers never explain whether trolls may impact the comment analysis. The researchers cleaned the comments with typical steps such as removing punctuation and making words lowercase, but they do not account for troll interactions with videos. YouTube’s comment section is, for the most part, unmoderated. How can the researchers be sure the comments they analyzed were authentic?
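
To make that cleaning step concrete, here is a rough sketch of the kind of preprocessing described (lowercasing, stripping punctuation); the function name and details are hypothetical and not taken from the paper, and notably none of it says anything about whether a comment came from a troll.

# A rough sketch of the kind of comment cleaning described in the paper
# (lowercasing, stripping punctuation); the function name and details are
# my own, not the authors' code.
import re
import string

def clean_comment(text: str) -> str:
    text = text.lower()
    # Remove punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse repeated whitespace left behind by the removal.
    return re.sub(r"\s+", " ", text).strip()

print(clean_comment("Zika is a HOAX!!! Wake up, people..."))
# -> "zika is a hoax wake up people"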

Additional Questions

  • What differences in user reactions would we see if we analyzed posts on other platforms that referenced these Zika videos (for example, a Facebook or Reddit post linking to the video)?
  • YouTube’s recommender system is personalized, so people who engage with specific content, such as conspiracy-based videos, see related videos recommended. How can we stop the spread of misinformation on a platform like YouTube?


Article 2: “Automated Hate Speech Detection and the Problem of Offensive Language”

Summary

This article addresses automating hate-speech detection by using a classifier to label tweets as hate speech, offensive but not hate speech, or neither. Previous studies have combined the first two categories into one broad category, and though there is no official definition of hate speech, the researchers attempt to build a classifier that can accurately distinguish between the three categories. The investigators tried different models and ultimately settled on a logistic regression model for the dataset.
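
A minimal sketch of what a three-way classifier like this could look like, assuming TF-IDF features over the tweet text; the label scheme, placeholder tweets, and hyperparameters here are my assumptions, not the authors' exact setup.

# A minimal sketch, assuming TF-IDF features and a three-way label scheme
# (0 = hate speech, 1 = offensive but not hate speech, 2 = neither); this is
# illustrative, not the authors' exact feature set or hyperparameters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "some hateful slur targeting a group",        # placeholder text
    "an insult that is offensive but not hate",   # placeholder text
    "just talking about the weather today",
    "another ordinary tweet about sports",
]
labels = [0, 1, 2, 2]

# TF-IDF features feed a multiclass logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(tweets, labels)

print(model.predict(["a brand new tweet to score"]))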

Reflection

The discussion of the model used was relatively brief. Given that this is a research paper, I would have liked to know more about why the investigators first tested “logistic regression, naïve Bayes, decision trees, random forests, and linear SVMs,” because it is not immediately clear to me what these methods have in common or why they are all suitable choices for the data.
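
For reference, a hedged sketch of how one might compare those five model families on the same features; the cross-validation setup and macro-F1 scoring are my assumptions, not necessarily the evaluation protocol the authors used.

# A sketch of comparing the five model families the paper mentions;
# the split and scoring choices here are assumptions, not the authors' exact
# evaluation protocol.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "naive Bayes": MultinomialNB(),
    "decision tree": DecisionTreeClassifier(),
    "random forest": RandomForestClassifier(),
    "linear SVM": LinearSVC(),
}

def compare(texts, labels):
    # 5-fold cross-validated macro F1 for each candidate on TF-IDF features.
    for name, clf in candidates.items():
        pipe = make_pipeline(TfidfVectorizer(), clf)
        scores = cross_val_score(pipe, texts, labels, cv=5, scoring="f1_macro")
        print(f"{name}: {scores.mean():.3f}")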

What was most interesting to me was that, with their model, the researchers discovered that rare types of hate speech were incorrectly classified. This is notable because it suggests the model is only useful for flagging a tweet as hateful when it belongs to one of the more prevalent types of hate speech on Twitter. Future work should focus on identifying the causes of this misclassification and discuss the problem in more depth instead of simply referencing another paper that had the same issue.
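
To illustrate how such rare-class errors surface, here is a small sketch using per-class recall on hypothetical predictions (not results from the paper); the rare “hate” class can have zero recall even while overall accuracy still looks reasonable.

# A small sketch of surfacing rare-class errors with per-class metrics.
# The labels below are hypothetical, not results from the paper.
from sklearn.metrics import classification_report

y_true = ["hate", "hate", "offensive", "offensive", "neither", "neither"]
y_pred = ["offensive", "offensive", "offensive", "offensive", "neither", "neither"]

# Recall for the rare "hate" class collapses to 0 even though overall
# accuracy still looks reasonable.
print(classification_report(y_true, y_pred, zero_division=0))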

Additional Questions/Research

  • Future work could combine attributes of a Twitter user, such as account age and location, with this analysis to see whether they improve or worsen classification.
  • Twitter is a social network platform, so people might naturally monitor their language. How can we apply these same techniques to gaming platforms with online chatrooms and similar environments to automate hate-speech detection, and are the systems already in place (e.g. auto-banning) sufficient?
  • Can we use a similar hate-speech detection approach for spoken content (e.g. voice chat in video games)?


