Reflection #4 – [02/07/2019] – [Numan Khan]

Analyzing Right-wing YouTube Channels: Hate, Violence and Discrimination

Summary

This study investigated whether hateful vocabulary, violent content, and discriminatory biases appear in right-wing YouTube channels, and whether these videos incite commenters to express hate. These research questions were answered by analyzing similarities and differences between users’ comments and video content in a selection of right-wing channels and comparing them to a baseline set, using a three-layered approach: analysis of the lexicon, topics, and implicit biases present in the texts. The researchers collected right-wing videos from Alex Jones’ channel and 12 other channels supported by him, and they collected the baseline videos from the ten most popular channels in the “news and politics” category.
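To make the lexical layer of this approach a bit more concrete, here is a minimal Python sketch of the kind of comparison involved: counting how often words from a category word list appear in captions versus comments. The word lists, placeholder texts, and normalization here are my own assumptions for illustration, not the paper’s actual lexicon or pipeline.

```python
# Minimal sketch of a lexical-layer comparison (hypothetical word lists,
# not the paper's actual lexicon).
import re
from collections import Counter

# Hypothetical category word lists; the paper uses a much richer lexicon.
CATEGORIES = {
    "aggression": {"hate", "destroy", "attack", "enemy"},
    "swearing": {"damn", "hell", "crap"},
}

def category_rates(text):
    """Return occurrences of each category per 1,000 tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return {
        cat: 1000 * sum(counts[w] for w in words) / total
        for cat, words in CATEGORIES.items()
    }

captions_text = "placeholder caption text for one channel"
comments_text = "placeholder comment text for the same channel"

print("captions:", category_rates(captions_text))
print("comments:", category_rates(comments_text))
```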

Reflection

Overall, I found it interesting that this paper chose to analyze right-wing YouTube videos instead of articles. Countless right-wing articles are published by media outlets like Breitbart, and plenty of people will always prefer reading text content such as news articles, magazines, and newspapers over watching videos. Personally, though, I see YouTube videos as a form of content that individuals have been consuming at a rapidly growing rate over the past few years. Therefore, I believe it was a wise choice for the researchers of this paper to analyze videos rather than articles.

After reading this paper, I view this research as very valuable to society, because YouTube as a platform has let right-wing voices be heard by bigger and bigger audiences. This is supported by the “…findings of a 2018 newspaper investigation [32] which shows that YouTube’s recommendations often lead users to channels that feature highly partisan viewpoints – even for users that have not shown interest in such content”. This is especially a problem if behaviors associated with hate, violence, and discriminatory bias are being promoted by these videos, which is the focus of the paper’s first research question.

This paper does a great job with its three-layered approach, thoroughly explaining the methodology and providing thoughtful reflections on the lexical, topical, and implicit-bias analyses. While it seemed somewhat obvious that right-wing videos would display more hate than the baseline videos, it was interesting that the paper was able to show that rage and violence were more prominent in the captions, while swear words dominated the comments. Another finding that made sense to me was that right-wing videos were more topic-specific than the baseline videos: right-wing YouTubers want to target specific topics their audiences are interested in, rather than the broad topics covered by the baseline channels. The implicit-bias analysis also interested me. While I am not surprised that right-wing videos showed greater bias concerning Muslims than the baseline videos, I was surprised that this bias was statistically higher in the captions than in the comments, whereas the comments held the higher discriminatory bias against LGBT people.
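To illustrate what an implicit-bias measurement of this kind can look like, below is a minimal sketch of a WEAT-style association score computed over word embeddings. This is my own simplified illustration, using made-up target/attribute word sets and random placeholder vectors so it is self-contained; I am not claiming it reproduces the paper’s exact procedure.

```python
# Minimal sketch of a WEAT-style association score (simplified illustration).
# `embeddings` would normally come from pre-trained vectors (e.g., GloVe);
# here it is a tiny made-up dictionary of random vectors.
import numpy as np

rng = np.random.default_rng(42)
vocab = ["muslim", "islam", "christian", "church",
         "peaceful", "good", "violent", "bad"]
embeddings = {w: rng.normal(size=50) for w in vocab}  # placeholder vectors

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(vec, attrs_a, attrs_b):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    return (np.mean([cosine(vec, embeddings[a]) for a in attrs_a])
            - np.mean([cosine(vec, embeddings[b]) for b in attrs_b]))

def weat_statistic(targets_x, targets_y, attrs_a, attrs_b):
    """Positive values: X is more associated with A (and Y with B)."""
    return (sum(association(embeddings[x], attrs_a, attrs_b) for x in targets_x)
            - sum(association(embeddings[y], attrs_a, attrs_b) for y in targets_y))

# Hypothetical target/attribute sets, for illustration only.
print(weat_statistic(["muslim", "islam"], ["christian", "church"],
                     ["peaceful", "good"], ["violent", "bad"]))
```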

Further Questions

  • One of the future works proposed in this paper was the addition of a temporal component to the analysis. Would such a temporal component reveal correlations with recent major political events, such as significant events that have occurred during the current presidency?
  • What results would we find if the three-layered approach used in this paper was conducted using left-wing YouTube videos?
  • How different would the results be if this paper’s analysis were applied to content from other platforms, such as Twitter or Facebook posts from right-wing outlets?


Reading Reflection #3 – [2/5/19] – [Numan Khan]

Automated Hate Speech Detection and the Problem of Offensive Language

Summary:

This paper used crowd-sourcing to label a sample of tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither. The researchers then trained a multi-class classifier to differentiate between these three categories. The main obstacle addressed was distinguishing hate speech from merely offensive speech, because the two share many of the same offensive remarks.

Reflection:

Something that I’m interested to see in the future is how major social media platforms will respond to the criticism they receive about regulating hate speech. Because of the increasing legal implications for individuals who post hate speech, Twitter and Facebook must be careful when identifying it. If social media platforms autonomously removed posts, how accurate would their algorithms be? As numerous other studies on distinguishing hate speech from offensive speech show, these algorithms are still being improved. However, if platforms manually remove posts, would their removal rate be too slow compared to the rate at which posts containing hate speech are created? Whatever happens in the future, social media platforms must address the growing problem of hate speech in a careful manner.

Another thing that caught my attention in this paper was that the researchers properly defined hate speech and clearly described the process they used for labeling the data. By giving three or more coders their specific definition of hate speech to apply when labeling each tweet, their process makes a lot of sense and does a good job of ensuring that tweets are accurately labeled for the classifier.

Lastly, I appreciate the fact that they used a variety of models to find which model(s) perform(s) the best, instead of simply choosing one or two models. However, one thing I am curious about is which features were used in the final model, a logistic regression with L2 regularization.
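Out of that curiosity, here is a minimal sketch of what a three-class logistic regression with L2 regularization over a simple TF-IDF feature set could look like in scikit-learn. The n-gram features and example tweets are my own assumptions for illustration; the authors’ final feature set may well differ.

```python
# Minimal sketch: three-class text classifier with L2-regularized logistic
# regression over TF-IDF n-grams (illustrative features, not necessarily
# the paper's exact feature set).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Labels: 0 = hate speech, 1 = offensive only, 2 = neither.
tweets = ["example tweet one", "example tweet two", "example tweet three"]
labels = [0, 1, 2]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 3), min_df=1)),
    ("logreg", LogisticRegression(penalty="l2", C=1.0, max_iter=1000)),
])

clf.fit(tweets, labels)
print(clf.predict(["another example tweet"]))
```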

Further Work:

I believe some future work to improve the model from this paper would be to check whether quotes from a song are being used in appreciation or as hate speech, and to account for cultural context in some offensive language. Furthermore, I am curious whether the current definition of hate speech could be made even more specific in order to improve the labeling of tweets and therefore the classifier. Lastly, the best way of truly addressing hate speech is to understand its root cause. Perhaps by researching the different media sources that incite hate, we could try to better identify users who use hate speech rather than individual posts of hate speech.

Early Public Responses to the Zika-Virus on YouTube: Prevalence of and Differences Between Conspiracy Theory and Informational Videos

Summary:

This paper sought to research how much informational and conspiracy theory videos differ in terms of user activity, such as the number of comments, shares, likes, and dislikes. The researchers also analyzed the sentiment and content of the user responses. They collected data for this study by finding YouTube videos about the Zika virus with at least 40,000 views on July 11, 2016, resulting in a data set of 35 videos. They found that 12 of the 35 videos focused on conspiracy theories; however, no statistically significant differences were found in user activity or sentiment between informational and conspiracy theory videos.

Reflection:

In the present day, YouTube is one of the largest platforms, where countless people access and post new videos. It can be said that communication has been substantially influenced by platforms like YouTube, since it is very easy for people around the world to post videos. With the growth of YouTube come many challenges, such as the spread of misinformation during disease outbreaks like Zika in 2016. I appreciate the effort this study made in trying to differentiate informational and conspiracy theory videos. The researchers provided detailed definitions of the two types of videos and clearly explained their data collection process. Personally, I am surprised that the sentiment in both types of videos was similar; I had thought there would be a significant difference. However, this study had a small dataset, which weakens its arguments.

Future Work:

A sample size of 35 seems too small for any sort of significance test. In the present day, YouTube is one of the largest video platforms, with numerous videos being posted every hour, yet the researchers of this study found only 35 videos. My suggestion to these researchers is to increase their sample size by finding more videos. In addition, they could research other features that might help differentiate informational and conspiracy theory videos.
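To illustrate the sample-size concern, here is a small sketch comparing comment counts between the 12 conspiracy and 23 informational videos with a Mann-Whitney U test. The counts are invented, and I am not claiming this is the exact test the authors ran, only the kind of two-group comparison at stake; with groups this small, only large effects reach significance.

```python
# Sketch: comparing comment counts between conspiracy (n=12) and
# informational (n=23) videos with a Mann-Whitney U test.
# The counts below are made up for illustration.
from scipy.stats import mannwhitneyu

conspiracy_comments = [120, 85, 300, 45, 60, 210, 95, 150, 70, 40, 180, 55]
informational_comments = [100, 90, 250, 30, 75, 200, 110, 160, 65, 50,
                          170, 60, 80, 140, 35, 95, 220, 45, 130, 70,
                          55, 190, 85]

stat, p = mannwhitneyu(conspiracy_comments, informational_comments,
                       alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.3f}")  # small samples need large effects
```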


[Reading Reflection 2] – [01/31] – [Numan Khan]

This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News

Summary:

This paper’s overall goal is to determine whether fake news differs systematically from real news in style and language use. The authors’ motivation is to disprove the assumption that fake news is written to look similar to real news, that is, to fool readers who don’t check the credibility of the source and the arguments mentioned in the article. The paper tests this assumption by studying three data sets and their features using a one-way ANOVA test and the Wilcoxon rank-sum test. The first data set is from Buzzfeed’s analysis of real and fake news from the 2016 US Elections. The second data set has news articles on US politics from real, fake, and satire news sources. The third data set contains real and satire articles from a previous study. The paper chose to include satire articles in order to differentiate itself from other papers on fake news. The paper concluded that fake news articles had less content, more repetitive language, and less punctuation than real news. Furthermore, fake news articles have longer titles and use fewer stop words and fewer nouns compared to real news. When comparing fake news to satire news, Horne and Adali were able to conclude that fake news is more similar to satire than to real news, disproving the assumption from the beginning of the paper.
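As a concrete view of the statistical machinery mentioned above, here is a short SciPy sketch applying a one-way ANOVA and a Wilcoxon rank-sum test to a single hypothetical feature (title word count); all values are made up for illustration and do not reflect the paper’s data.

```python
# Sketch: comparing one stylistic feature (e.g., title word count) across
# real, fake, and satire articles with the two tests the paper mentions.
# All values below are fabricated for illustration.
from scipy.stats import f_oneway, ranksums

real_titles = [8, 9, 7, 10, 8, 9, 11, 7]
fake_titles = [14, 15, 13, 16, 12, 15, 14, 13]
satire_titles = [13, 12, 14, 15, 11, 13, 12, 14]

# One-way ANOVA across all three groups.
f_stat, p_anova = f_oneway(real_titles, fake_titles, satire_titles)

# Wilcoxon rank-sum test for a pairwise, non-parametric comparison.
z_stat, p_ranksum = ranksums(real_titles, fake_titles)

print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")
print(f"Rank-sum (real vs. fake): z = {z_stat:.2f}, p = {p_ranksum:.4f}")
```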

Reflection:

The assumption that this paper is trying to prove wrong is a belief that I held. When fake news became more prevalent during the 2016 Presidential Election, I viewed those articles as trying to appear to be real news while lacking credibility in their sources and arguments. Initially, I found it interesting that the authors were trying to disprove this assumption, because my point of view was that fake and real news were similar. However, I became fascinated with their inclusion of satire news in the data sets; I had never thought of comparing fake news to satire news. I agree with the way the paper defines satire news as “…explicitly produced for entertainment”, so why would fake news, whose purpose is to deceive, be similar to news that is read for entertainment and mockery? But thinking about it more and looking at the bigger picture, I have a much different view of fake news after reading this paper. Fake news articles are not only trying to deceive people but are also created with parody-like qualities, since satirical news can easily grab people’s attention too. Therefore, it would make a lot of sense that fake news is similar to satire news.

While I don’t have much experience with Natural Language Processing (NLP) or Natural Language Understanding (NLU), the features defined in this paper for the datasets made sense to me; in other words, there seem to be no unnecessary or overlooked features. Gauging word syntax, sentence- and word-level complexity, and sentiment all makes sense given the goal of this paper, which is to determine whether fake news differs systematically from real news in style and language use. These features provide both high-level and low-level information for language analysis. Personally, due to my inexperience in this field, I would be eager to learn how to analyze natural language in Python in the future.

Something else I appreciated was that Horne and Adali acknowledged that a statistical test says nothing about predicting classes in the data. Therefore, they used the statistical tests as a form of feature selection for a Support Vector Machine (SVM) model that would help them classify news articles based on small feature subsets. It was impressive that these subsets significantly improved the prediction of fake and satire news, achieving between 71% and 91% accuracy in separating them from real news stories. One question I was curious about is Horne and Adali’s selection of features for their SVM. Why did they choose specifically the top 4 features from their hypothesis testing? Is it because they are trying to avoid over-fitting? Would we see a difference if they used Principal Component Analysis as a way of determining which features would be best for classifying fake versus real news?
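To picture what the SVM step and my PCA question might look like in practice, here is a minimal scikit-learn sketch comparing an SVM trained on a hand-picked four-feature subset against the same SVM trained on PCA-reduced features. The data, feature counts, and linear kernel are my assumptions for illustration, not the authors’ exact setup.

```python
# Sketch: SVM trained on a handful of hand-picked style features, versus the
# same SVM trained on PCA-reduced features. Feature values are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_top4 = rng.normal(size=(100, 4))   # e.g., top 4 features from the tests
X_all = rng.normal(size=(100, 30))   # e.g., the full feature set
y = rng.integers(0, 2, size=100)     # 0 = real, 1 = fake/satire

svm_top4 = make_pipeline(StandardScaler(), SVC(kernel="linear"))
svm_pca = make_pipeline(StandardScaler(), PCA(n_components=4),
                        SVC(kernel="linear"))

print("top-4 features:", cross_val_score(svm_top4, X_top4, y, cv=5).mean())
print("PCA components:", cross_val_score(svm_pca, X_all, y, cv=5).mean())
```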

The reflections on the statistical tests based on their defined features were clear and made sense. The paper often reflects effectively on the results found; however, some of the reflections are not surprising. For example, I already expected that fake news articles tend to have less body content and longer titles than real news. My belief, from before reading this paper, is that fake news intends to grab readers’ attention through those long titles, so it makes sense that the writers of fake news articles pack as much information into the titles as possible. This leads me to think that readers tend to be more interested in the titles of fake news than those of real news, which raises another question that could be a project idea: what can we do to help the general public easily distinguish real news from fake news, when readers are clearly attracted to the titles of fake news articles? Should real news articles adjust their titles? While I critique that some of the reflection was obvious, many of the results from the syntax features were very interesting, such as fake news using less punctuation and its titles using fewer stop words but more proper nouns. Overall, this paper was very well written and could have included more analysis of its results, but I really appreciate the authors creating a working SVM model for classifying fake and real news.


[Reading Reflection 1] – [01/28] – [Numan Khan]

Journalists and Twitter: A Multidimensional Quantitative Description of Usage Patterns

Summary:

This paper is an extension of work done by De Choudhury, Diakopoulos, and Naaman, in which they trained a classifier that categorized Twitter accounts into organizations, journalists/bloggers, and ordinary individuals. This study, however, seeks to establish the statistical significance of eight different comparisons: journalists and news organizations, journalists and news consumers, four media types (newspaper, magazine, radio, television), and two regions (European English and Arabic speaking countries). Bagdouri chose to use both Welch’s t-test and the Kolmogorov-Smirnov test because of the limitations a t-test alone poses, such as assuming normality of distributions and being constrained to the comparison of means.
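For a concrete view of the two tests named here, below is a short SciPy sketch comparing a single hypothetical metric (tweets per day) between journalists and news organizations; the values are invented placeholders, not numbers from the study.

```python
# Sketch: Welch's t-test (unequal variances) and the two-sample
# Kolmogorov-Smirnov test on one metric, e.g. tweets per day.
# The values are invented placeholders for illustration.
from scipy.stats import ttest_ind, ks_2samp

journalists = [3.2, 4.1, 2.8, 5.0, 3.6, 4.4, 2.9, 3.8]
organizations = [9.5, 11.2, 8.7, 10.4, 12.1, 9.9, 10.8, 11.5]

# equal_var=False makes this Welch's t-test rather than Student's.
t_stat, p_welch = ttest_ind(journalists, organizations, equal_var=False)

# KS compares whole distributions, not just means.
ks_stat, p_ks = ks_2samp(journalists, organizations)

print(f"Welch: t = {t_stat:.2f}, p = {p_welch:.4f}")
print(f"KS:    D = {ks_stat:.2f}, p = {p_ks:.4f}")
```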

Bagdouri was able to conclude that journalists target their communication and personally interact with their readers, while news outlets avoid a personal style and broadcast their posts instead. From the regional comparison, Bagdouri determined that Arab journalists broadcast more tweets and that their audiences react positively to this. When comparing the different media types, print and radio journalists are the most dissimilar groups; notably, television journalists share similarities with both radio and print journalists. For journalists of the same language but from different countries, in this paper’s case British and Irish journalists, very few dissimilarities are observed.

Reflection:

The study states that “Journalists and organizations also differ in the medium used to publish their tweets. In fact, while they both use a desktop in about 30% of the time, mobile is the preferred medium for journalists (54.95%), and organizations tend to use special Twitter applications for posting more than 28% of the tweets”. Personally, I’m not surprised that the preferred medium for journalists is mobile. I would be very interested in the age demographics of journalists, because I believe that many of the journalists in this paper are from Generation Y, who are often associated with the current technological revolution. My reasoning is that Generation Y grew up alongside significant changes in technology, so they would most likely be more comfortable posting on Twitter from their mobile devices than news organizations are.

One conclusion made in this paper that stood out to me was that most journalists tend to target their communication and interact with their readers more than news outlets. However, Arab journalists go against this notion because they broadcast their tweets. The study mentions that “The broadcast communication behavior is evident for Arab journalists. They tweet more than twice as much as the English ones, share 75% more links, and use 39% more hashtags…For each original tweet, on average, Arab journalists receive over four times more retweets than the English ones do.” If Arab journalists are broadcasting more tweets, I am curious about what conclusions can be made from a comparison of Twitter features between Arab journalists and Arab news organizations. How similar are journalists and news organizations from Arab speaking countries? Furthermore, this study only covered European English and Arabic speaking regions. Consequently, what other regions have journalists broadcasting more tweets similar to Arab speaking countries? Do those regions receive positive reactions similar to Arab speaking countries?

Further Work:

  • After reading through Bagdouri’s study, I am very interested in the vast number of features that can be extracted from posts on Twitter, which makes me eager to use it as a media platform to extract data for my data science project this semester.
  • One of the features this paper utilizes to make many of its conclusions is “targeted communication”. Specifically, the paper discusses mentions, which indicate the average number of mentions of other users per original tweet for a given account, and questions, which indicate the ratio of original tweets that include a question mark in its Arabic or English form. It would be interesting to see whether there is any pattern in the individuals that journalists and news outlets mention in their tweets, and how the types of questions journalists ask differ from those of news outlets (a rough sketch of computing these two features follows this list).
  • This paper clearly emphasizes the fact that Twitter is becoming a primary platform for breaking news. However, there are other platforms for breaking news as well. For example, Facebook is a platform that numerous individuals around the world use, including journalists and news outlets. While Facebook is clearly different from Twitter, I would like to see whether the conclusions made in this study hold up on Facebook. If not, which conclusions don’t hold up and why?
  • Lastly, with the rise of fake news in the present day, a question that arises is how much do news consumers trust journalists compared to news outlets? Consequently, how would we statistically prove the extent news consumers trust a news source? Furthermore, how are the features “audience perception” and “audience reaction” of journalists and news outlets changing over time? Is fake news affecting these features over time?
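Following up on the “targeted communication” bullet above, here is a rough sketch of how those two features could be computed from an account’s original tweets. The input format, regular expressions, and example tweets are my assumptions; the paper may define the features slightly differently.

```python
# Sketch: the two "targeted communication" features described above, computed
# from an account's original (non-retweet) tweets. Field format and regexes
# are assumptions for illustration.
import re

def targeted_communication_features(original_tweets):
    """original_tweets: list of tweet text strings (retweets excluded)."""
    n = max(len(original_tweets), 1)
    mention_count = sum(len(re.findall(r"@\w+", t)) for t in original_tweets)
    # Count tweets containing a question mark in its English or Arabic form.
    question_count = sum(1 for t in original_tweets if re.search(r"[?\u061F]", t))
    return {
        "mentions_per_tweet": mention_count / n,
        "question_ratio": question_count / n,
    }

tweets = [
    "Talking to @someone about tonight's debate?",
    "Here is our latest report on the election.",
    "What do you think, @reader1 @reader2\u061F",
]
print(targeted_communication_features(tweets))
```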
