Reading Reflection #4

Analyzing Right-wing YouTube Channels: Hate, Violence and Discrimination

In this paper, YouTube videos and comments were analyzed using a multi-layered approach to observe trends related to hate, violence, and discrimination. YouTube, a video-sharing website, has raised concerns regarding its potential to be used as a platform for extremist actors to spread partisan and divisive content, some of it untrue. The researchers explored whether popular right-wing YouTube channels exhibited different patterns of hateful, violent, or discriminatory content compared to baseline channels. They collected 3,731 right-wing videos and 5,072,728 corresponding comments, as well as 3,942 baseline videos and 12,519,590 corresponding comments. The following research questions were asked:

“Is the presence of hateful vocabulary, violent content and discriminatory biases more, less or equally accentuated in right-wing channels?”

“Are, in general, commentators more, less or equally exacerbated than video hosts in an effort to express hate and discrimination?”

A unique aspect of this paper was its three-layered approach. Lexical analysis was used to compare the semantic fields of the words used by each group; topic analysis was used to identify the prevalent topics in each group; and implicit bias analysis was used to measure implicit biases in the text. The researchers found that the right-wing channels included a higher percentage of words from negative semantic fields such as "aggression, kill, rage, and violence", while the baseline channels included a higher percentage of fields like joy and optimism. Strong evidence of hate was not found in either the right-wing or the baseline videos, and both categories were found to exhibit a discriminatory bias against Muslims. Comments were found to contain more words related to aggression, rage, and violence than the videos themselves. Seventy-five percent of right-wing videos were shown to have more Muslim bias in the captions than in the comments.
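
To make the lexical-analysis layer concrete, here is a minimal sketch of a category-based comparison in Python. I am assuming a tool like the Empath library, whose built-in categories match the semantic fields named above; the paper does not necessarily use this exact implementation, and the input strings are placeholders.

```python
# pip install empath
from empath import Empath

lexicon = Empath()

def field_scores(text):
    """Score text against a few semantic fields; normalize=True returns
    each field's share of the total word count."""
    return lexicon.analyze(
        text,
        categories=["aggression", "kill", "rage", "violence", "joy", "optimism"],
        normalize=True,
    )

# Placeholder inputs: concatenated captions for each channel group.
right_wing = field_scores("... right-wing captions ...")
baseline = field_scores("... baseline captions ...")
for field in right_wing:
    print(f"{field}: right-wing={right_wing[field]:.4f}, baseline={baseline[field]:.4f}")
```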

I liked the paper, although few of the results surprised me. I struggled to understand what the researchers intended the impact to be. In the conclusion, they mentioned that these findings contribute to a better understanding of the behavior of general and right-wing YouTube users. In the introduction, they mentioned concerns about YouTube being used as an easy platform to spread hateful, violent, and discriminatory content, but they did not elaborate in the conclusion on how their work addresses this concern. I think that by knowing which trends are present, content can be shared in a more informed way, and others can take steps to avoid encouraging hateful or harmful content sharing.

I was surprised by the method of data collection. The researchers used InfoWars as a seed channel and selected other channels favored by its founder as the remaining right-wing channels. I thought that this process could have been done more methodically. In addition, they did not specify why they chose k = 300 for the topic analysis. They may have gotten different results for different values of k, and they did not explain why they selected this value as the best option.
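
One way the researchers could have justified k = 300 is by sweeping several values and comparing topic coherence. A rough sketch of such a sweep, assuming a gensim LDA implementation (the paper does not name its tooling):

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

def pick_k(tokenized_docs, candidate_ks=(100, 200, 300, 400)):
    """Fit one LDA model per candidate k and score each by c_v coherence."""
    dictionary = Dictionary(tokenized_docs)
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]
    scores = {}
    for k in candidate_ks:
        lda = LdaModel(corpus=corpus, id2word=dictionary,
                       num_topics=k, random_state=0, passes=5)
        scores[k] = CoherenceModel(model=lda, texts=tokenized_docs,
                                   dictionary=dictionary,
                                   coherence="c_v").get_coherence()
    # Return the k with the highest coherence, plus the full sweep.
    return max(scores, key=scores.get), scores
```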

I liked the three-layered approach and think it could be useful for other studies that involve text. Future work could include applying these techniques to Twitter or Reddit data, or conducting a similar study on other topics.


Reading Reflection #3

Early Public Responses to the Zika-Virus on YouTube: Prevalence of and Differences Between Conspiracy Theory and Informational Videos

This paper aimed to analyze the public response to different YouTube videos concerning the Zika virus; both conspiracy theory and informational videos were analyzed. The paper defined a conspiracy theory as "explanations for important events that involve secret plots by powerful and malevolent groups". Since the spread of conspiracy theory videos is harmful, because they contain misleading or untrue information and distract from important health messages, the researchers analyzed the sentiment and content of user responses to draw implications for online health campaigns and knowledge dissemination. False news, a category that includes conspiracy videos, has been shown to spread faster and farther than true news. The researchers used 35 of the most popular Zika-virus YouTube videos for their study. They used metrics such as views, comments, replies, likes, dislikes, and shares and compared them between conspiracy and informational videos. They found no statistically significant difference between the two groups of videos; one conclusion was that users respond in similar ways, in terms of these metrics, to both types of videos. In addition, topic modeling showed that informational videos center on the causes and consequences of the virus, while conspiracy videos focus on unfounded theories.
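
Since the headline result is a null statistical finding, it is worth sketching what such a comparison looks like. Here is a minimal example using a Mann-Whitney U test in SciPy; the paper does not specify its exact test, so this choice (a reasonable one for skewed engagement counts) and the numbers below are illustrative:

```python
from scipy.stats import mannwhitneyu

# Hypothetical per-video view counts for each group (not the paper's data).
informational = [120_000, 85_000, 240_000, 51_000, 310_000]
conspiracy = [98_000, 150_000, 47_000, 260_000, 75_000]

# Two-sided test: do the two groups differ in their engagement distribution?
stat, p = mannwhitneyu(informational, conspiracy, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.3f}")  # a large p suggests no significant difference
```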

I did not find this paper to provide any surprising results. I think the most significant conclusion was that users responded in similar ways, in terms of views, shares, and likes, to both types of content, which means that both types of content spread in similar ways. The researchers found that Zika-virus video comments were all slightly negative on average, which contradicts prior research finding that false news triggers more negative sentiment than true news; however, they did not suggest why this happened. A takeaway from this study is that health organizations looking to spread helpful health information should give careful thought to how they target and engage audiences. Future research could explore subjects other than the Zika virus; since some of the findings contradicted prior work, it would be interesting to see whether they hold for other topics. The most effective techniques for spreading true news could also be studied. Is it effective to debunk conspiracy videos? What is the best way to engage viewers with true news?

Automated Hate Speech Detection and the Problem of Offensive Language

Hate speech classifier algorithms have been created before, with limited accuracy. Hate speech is particularly challenging to classify because it overlaps with offensive language, and context must be taken into account when analyzing it. This paper aims to create a hate speech detection model that distinguishes between hate speech and offensive language, with the goal of reaching higher accuracy. The researchers trained a logistic regression model on a sample of about 25,000 tweets.
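
As a concrete picture of this kind of model, here is a minimal sketch: TF-IDF features fed into a logistic regression, as the paper describes, though the exact feature set and the third "neither" label are my assumptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

def train_classifier(tweets, labels):
    """tweets: list of tweet strings; labels: e.g. 'hate', 'offensive', 'neither'."""
    X_train, X_test, y_train, y_test = train_test_split(
        tweets, labels, test_size=0.2, stratify=labels, random_state=0)
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=5),  # word uni- and bigrams
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))
    return model
```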

I thought the paper had significant implications, since many parties are interested in flagging hate speech, such as Twitter, Instagram, or countries with hate speech laws. The researchers described a significant number of challenges related to classifying hate speech. I found it interesting that hate speech is not well defined and that the researchers had to establish a definition as a premise of the paper. In addition, they found and mentioned several tweets that were misclassified by human coders. It seems difficult to train an algorithm to classify something that is not well defined and is often labeled erroneously even by humans. Future work could involve similar research that takes human biases into account and uses more data. I think that a more established definition of hate speech is necessary before these algorithms can become more accurate.


Reading Reflection #2

Fake news is an increasingly frequent problem and has a negative effect on readers. Horne and Adali used three data sets to try to find distinguishing characteristics of fake news, real news, and satire. They looked to answer the question, "Is there any systematic stylistic and other content differences between fake and real news?" This work is important because misleading or incorrect information has been found to have a higher potential to go viral, so it is important to be able to identify fake news and understand what characteristics set it apart. The authors reported the following findings:

-The content of fake and real news is substantially different.

-Titles are a strong differentiating factor.

-Fake content is more closely related to satire than to real news.

-Real news persuades through arguments, while fake news persuades through heuristics.

I thought that the paper was well done. The most surprising finding to me was that real news persuades through arguments while fake news persuades through heuristics. The Elaboration Likelihood Model was interesting, and future work could further study the effect of fake news on readers and how fake news is perceived. I found it interesting that so much can be determined from the titles of articles. If the intent of fake news publishers is to spread incorrect information, could they change their titles and structure to more closely match real news in order to spread misinformation more effectively? In addition, I thought that the data sets used had limitations, which the authors also acknowledged, and that future work would benefit from more comprehensive data. A big challenge seems to be deciding how to classify fake and real news consistently; future work could define this further.
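
To illustrate how much signal shallow title features can carry, here is a toy sketch of the sort of stylistic features one might extract from a headline; this specific feature set is illustrative, not the paper's:

```python
import re

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "on", "for", "is"}

def title_features(title):
    """Shallow stylistic features of a headline (illustrative set)."""
    words = re.findall(r"[A-Za-z']+", title)
    n = max(len(words), 1)
    return {
        "num_words": len(words),
        "stopword_ratio": sum(w.lower() in STOPWORDS for w in words) / n,
        "all_caps_ratio": sum(w.isupper() and len(w) > 1 for w in words) / n,
        "avg_word_length": sum(len(w) for w in words) / n,
    }

print(title_features("BREAKING: You Won't BELIEVE What Happened Next"))
```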


Reading Reflection #1

Twitter is an increasingly popular platform for information and news sharing, and prior research has studied the evolving usage behaviors on this platform. Bagdouri aimed to provide a bird's-eye view of journalists' use of Twitter by collecting and analyzing a large dataset of tweets spanning two regions, three user account types, and three media types. The paper explores and compares the differences in usage between these groups, using more data and carrying out a more comprehensive analysis than previous work. The following questions were asked:

-Do journalists engage personally with their audience compared to news organizations?

-Do observations about English journalists apply to journalists from different regional, cultural, and linguistic backgrounds?

-Do journalists use Twitter in a manner dissimilar from news consumers, and do these (dis)similarities hold across different regions?

-Are journalists a homogeneous group, or do they differ as a function of the type of the news outlet they work for?

-To what extent do journalists who speak the same language, but belong to different countries, share similar characteristics?

Bagdouri collected a large set of tweets and extracted eighteen features with which the analysis was performed. Journalists were found to exhibit more targeted, personalized behavior, while news organizations more often adopt an official, formalized style. Arab journalists were also found to share more tweets than English journalists, and their audience appears to react positively. Finally, print and radio journalists were found to be the most distinguishable groups, while television and radio journalists exhibit similar behavior.
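
To give a sense of what such features might look like, here is a toy sketch of a few per-account usage features computed from raw tweet text; these are illustrative stand-ins, not Bagdouri's actual eighteen features:

```python
def account_features(tweets):
    """Aggregate a few illustrative usage features over one account's tweets."""
    n = max(len(tweets), 1)
    return {
        "mention_rate": sum("@" in t for t in tweets) / n,   # personal engagement
        "link_rate": sum("http" in t for t in tweets) / n,   # link sharing
        "hashtags_per_tweet": sum(t.count("#") for t in tweets) / n,
        "avg_tweet_length": sum(len(t) for t in tweets) / n,
    }

print(account_features([
    "Covering the summit today, follow along #news",
    "Thanks @reader for the tip! Story here: http://example.com",
]))
```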

Bagdouri drew some interesting conclusions about Twitter use, but I felt that more explanation of the work's impact could have been included. The introduction mentioned that the findings could be used to design more customized tools for the referenced group of professionals, but it is unclear to me which tools are being referred to or how they could be customized. Could Twitter use these findings about specific user groups to develop new features for those users?

I would be interested in seeing additional work exploring the impact of the differences in communication styles between journalists and news organizations. In particular, how do audiences respond to more formalized, official communication compared to more personal messages? A study looking at tweets with identical content but different writing styles would be interesting.

Finally, future work could focus on why Arab journalists were found to tweet twice as often, share 75% more links, and include 39% more hashtags than their English counterparts. What are the contributing factors? Arab audiences were found to react positively even to the greater volume of tweets, but additional analysis of audience perceptions could be carried out.
