Reading Reflection #4 – [02/07] – [Alon Bendelac]

Summary:

This paper studies the presence of hateful, violent, and discriminatory content in YouTube channels of two categories: right-wing and baseline. The study analyzes the lexicon, topics, and implicit biases in the text surrounding the videos; specifically, the text in video titles, captions, and comments was studied. The dataset consists of over 7,000 videos and 17 million comments. The study found that right-wing channels contain more “negative” words, cover more topics related to war and terrorism, and are more discriminatory toward minorities such as Muslims and the LGBT community.

Reflection:

Compare against left-wing channels: Instead of comparing right-wing YouTube channels to the rest of YouTube, I think the study should have narrowed its baseline to a set of left-wing channels. This would make the comparison more symmetric and make the contrast in the results clearer.

Video Transcript: A significant aspect of a YouTube video is the transcript (i.e., what is actually being said). I think including transcripts would have given the study considerably more textual data to analyze.

I think the introduction could be improved. Although the idea of the study is clear, the introduction does not give a motivation for studying this topic, and it does not say what real-world applications the findings could have. Also, the related work section belongs at the beginning of the paper, not at the end.

In addition to looking at specific words, I think the study should also examine the presence of n-grams (e.g., bigrams and trigrams).
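As a rough illustration (my own sketch, not from the paper), n-grams could be counted over titles or comments with a few lines of Python; the titles below are invented placeholders.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented video titles; the study's own corpus would be used instead.
titles = [
    "breaking news about the border crisis",
    "the truth about the border crisis exposed",
]

bigram_counts = Counter()
for title in titles:
    bigram_counts.update(ngrams(title.lower().split(), 2))

print(bigram_counts.most_common(3))
```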

Data over time: Is there a pattern to the findings when they are examined over an extended time period? I think it would be interesting to do a similar study that looks at how the content of right-wing channels changes over time, and to see how the findings relate to real-world events.

In order to study the content of the video itself, and not just the text surrounding it, I think crowdsourcing could be used to label videos with tags that describe their contents. Such a dataset could then support other classification studies of YouTube videos as well.

Blocked words: Channels can create a list of words that they want to review before allowing them in the comments section. This might limit what users are able to say. Do the right-wing channels block more or fewer comments than the baseline channels? How might this affect the results of the study?

Verified channels: Is the percentage of channels that are verified different in the two categories (right-wing and baseline)?


Reading Reflection #3 – [02/05] – [Alon Bendelac]

Summary:

Automated Hate Speech Detection: This study is about differentiating hate speech from other forms of offensive language. It uses crowdsourcing to create a dataset of tweets classified into three categories: hate speech, offensive language, and neither. Multiple models were tested: logistic regression, naïve Bayes, decision trees, random forests, and linear SVMs. The results show that 40% of hate speech was misclassified as offensive language or neither.

Early Public Responses: This paper studies the presence of conspiracy theories on YouTube, specifically in Zika virus-related videos. The study looks for differences in user activity, sentiment, and content between two classes of videos: informational and conspiracy. It finds that user activity and sentiment are similar across the two classes, but the content differs.

Reflection:

Accuracy of CrowdFlower: I wonder how consistent the labeling of different CrowdFlower workers is. This could be tested as follows: let all the workers classify the same set of 100 tweets, and for each tweet measure how much the workers’ labels disagree. High disagreement means that the workers are classifying the same tweets differently.
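One way to operationalize this (my own sketch, not from the paper) is to score each tweet by the share of workers whose label differs from the majority label; the worker labels below are made up.

```python
from collections import Counter

# Hypothetical labels: each row is one tweet, each entry is one worker's label.
labels_per_tweet = [
    ["hate", "hate", "offensive", "hate"],
    ["neither", "offensive", "offensive", "offensive"],
    ["hate", "neither", "offensive", "neither"],
]

def disagreement(labels):
    """Share of workers whose label differs from the majority label."""
    counts = Counter(labels)
    majority_size = counts.most_common(1)[0][1]
    return 1 - majority_size / len(labels)

for i, labels in enumerate(labels_per_tweet):
    print(f"tweet {i}: disagreement = {disagreement(labels):.2f}")
```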

Flagged posts: I think it would be interesting to investigate patterns in posts that get flagged for review or reported as inappropriate. Do people tend to flag posts excessively, or not enough? I think these results might affect how social media sites, such as Facebook and Twitter, develop policies on hate speech.

Hate speech in retweets and replies: The paper doesn’t mention whether the dataset contained only original tweets or also retweets and replies. I think it would be interesting to study how hate speech differs between these types of tweets. Where is hate speech most prevalent on Twitter?

I think the conclusion of the “Automated Hate Speech Detection” study could be improved. The significance of the findings and the directions for future work should both be clearer and more concrete.

In the “Early Public Responses” paper, I think the data could be presented better; bar graphs would probably be easier to interpret than tables.

Small sample size: The sample size is very small (n=23 for informational videos and n=12 for conspiracy videos). I think the paper should have discussed more thoroughly how this might affect its results.

Conspiracy theories on other social media platforms: I think the same type of study could be done on platforms other than YouTube. For example, we could study the prevalence of conspiracy theories on Twitter. Number of views would be replaced by number of retweets, while replies and likes would stay the same.


Reading Reflection #2 – [01/31] – [Alon Bendelac]

Summary:

The issue of fake news has received a lot of attention lately due to the 2016 US Presidential Election. This research paper compares real, fake, and satire sources using three datasets. The study finds that fake news is more similar to satire than to real news and relies on heuristics rather than arguments. It also finds that article titles differ significantly between real and fake news. Three categories of features were studied: stylistic, complexity, and psychological. The study uses the one-way ANOVA test and the Wilcoxon rank sum test to determine whether the news categories (real, fake, and satire) show statistically significant differences in any of the features studied. Support Vector Machine (SVM) classification was then used to demonstrate that the strong differences between real, fake, and satire news can be used to predict the category of articles of unknown classification.

Reflection:

Punctuation: In Table 3(c), one of the stylistic features is “number of punctuation.” The study treats all punctuation as a whole and only considers the total count of punctuation marks. I think it would be interesting to look at specific punctuation types separately. For example, maybe fake news articles are more likely than real news articles to have an ellipsis in the title; an ellipsis might be a common technique used by fake news organizations to attract readers. Similarly, question marks in the title might also be more common in fake news articles.
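As a hypothetical illustration of splitting this feature apart (the headlines below are invented), per-title counts of specific punctuation marks could be computed like this:

```python
import re

# Invented headlines; the study's actual titles would be used instead.
titles = [
    "You won't believe what happened next...",
    "Is the government hiding the truth?",
    "Senate passes the budget bill",
]

for title in titles:
    features = {
        "ellipses": len(re.findall(r"\.\.\.|…", title)),
        "question_marks": title.count("?"),
        "exclamations": title.count("!"),
    }
    print(title, features)
```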

Neural networks: The study used a Support Vector Machine (SVM) to predict whether an article is real or fake. I wonder how a neural network, which is more flexible than an SVM, would perform. In Table 6, only two of the three categories (real, fake, and satire) are tested at a time, because the SVM is set up for two classes. A neural network could be designed to classify articles into one of the three categories directly. This would make more sense than pairwise SVMs, since we usually can’t eliminate one of the three categories ahead of time and then test for just the other two.
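As a toy sketch of what a three-way classifier might look like (my own example, not the paper’s feature set), a small feed-forward network from scikit-learn could be trained on TF-IDF features of the article text; the documents and labels below are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Placeholder documents and labels; the real study would use its three datasets.
docs = [
    "Officials confirmed the agreement on Tuesday.",
    "SHOCKING: doctors hate this one weird trick!",
    "Local man declares war on Mondays, demands treaty.",
    "The committee released its quarterly report.",
]
labels = ["real", "fake", "satire", "real"]

# TF-IDF features feeding a small neural network that handles all three classes at once.
clf = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
clf.fit(docs, labels)
print(clf.predict(["Experts question the shocking new claim..."]))
```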

Processing embedded links: This study only looks at the bodies of articles as plain text, without considering possible links within the text. I think looking at where embedded links lead could help detect fake news. For example, if an article links to another article known to be fake news, then the first article is likely fake news as well. The research question could be: can embedded links be used to predict whether a news article is fake or real?
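One way to start on such a feature (my own sketch; the domain list is invented) would be to pull the outbound links from an article’s HTML and check their domains against a list of sources already labeled as fake:

```python
from urllib.parse import urlparse
from bs4 import BeautifulSoup

# Hypothetical list of domains previously labeled as fake-news sources.
KNOWN_FAKE_DOMAINS = {"examplefakenews.com", "totallyrealnews.net"}

def fake_link_count(article_html):
    """Count outbound links pointing to domains on the known-fake list."""
    soup = BeautifulSoup(article_html, "html.parser")
    count = 0
    for a in soup.find_all("a", href=True):
        domain = urlparse(a["href"]).netloc.lower().removeprefix("www.")
        if domain in KNOWN_FAKE_DOMAINS:
            count += 1
    return count

html = '<p>See <a href="http://examplefakenews.com/story">this report</a>.</p>'
print(fake_link_count(html))  # -> 1
```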

Number of citations and references: I believe real news stories are more likely than fake news stories to contain citations, references, and quotes. The number of quotes was one of the stylistic features, but the number of references was not studied. A reference could point to a study, to another news article related to the one in question, or to a past event.


Reading Reflection #1 – [01/29] – [Alon Bendelac]

Summary:
This research paper compares the microblogging patterns of Twitter accounts. The accounts are categorized by country (United Kingdom, Ireland, Gibraltar), media format (radio stations, TV channels, newspapers, magazines), and profile (journalists, organizations). Eighteen numerical features were tested with two statistical tests: Welch’s t-test (which tests whether two samples have equal means) and the Kolmogorov-Smirnov test (which tests whether two samples come from the same distribution). Journalists were classified into two cultures, English and Arab, and comparisons between the two cultures were analyzed. The study found differences in how organizations and journalists disseminate news, as well as differences in how Arab and English journalists disseminate news.
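For reference, both of these tests are available in SciPy; the two samples below are synthetic stand-ins for one of the paper’s numerical features (e.g., hashtags per tweet), not the paper’s data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins for one numerical feature measured on two account groups.
organizations = rng.poisson(lam=2.0, size=200)  # e.g., hashtags per tweet
journalists = rng.poisson(lam=1.5, size=200)

# Welch's t-test: do the two groups have equal means? (unequal variances allowed)
t_stat, t_p = stats.ttest_ind(organizations, journalists, equal_var=False)

# Two-sample Kolmogorov-Smirnov test: do the samples come from the same distribution?
ks_stat, ks_p = stats.ks_2samp(organizations, journalists)

print(f"Welch t-test: t={t_stat:.2f}, p={t_p:.3f}")
print(f"KS test: D={ks_stat:.2f}, p={ks_p:.3f}")
```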

Reflection:
Categorizing journalists by political party affiliation: The paper compares the behaviors of news organizations and individual journalists. It would be interesting to compare journalists by classifying them according to political party: Republican, Democrat, or Independent. Features such as number of followers, percent verified, and number of hashtags could be compared between Republican and Democratic Twitter accounts. The new research question could be: to what extent do Democratic and Republican journalists share common characteristics?
Percent verified: It would be interesting to investigate whether the percentage of journalist accounts that are verified differs between different countries or regions. For example, do English-speaking countries have higher verification percentages among journalists than non-English-speaking countries do?
Customized tools: The paper suggests that the findings of this research can be used to develop “more customized tools” for journalists. I think the author should have expanded on that in the conclusion, because it is difficult to tell what exactly they had in mind. One idea could be a program that crawls Twitter to determine which hashtags in a journalist’s region are most popular, in order to recommend hashtags to the journalist. Similarly, another program could check whether journalist tweets that ask questions are more popular than tweets that don’t, in order to recommend whether a journalist should ask questions in their tweets.
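As a toy sketch of the hashtag-recommendation idea (assuming tweets from journalists in a region have already been collected; the tweets below are invented), the most common hashtags could be tallied like this:

```python
import re
from collections import Counter

# Hypothetical tweets already collected from journalists in one region.
tweets = [
    "Council approves new transit plan #localnews #transit",
    "Storm warning issued for the coast #weather #localnews",
    "Election results expected tonight #election #localnews",
]

hashtag_counts = Counter(
    tag.lower() for tweet in tweets for tag in re.findall(r"#\w+", tweet)
)

# The top hashtags could then be suggested to the journalist.
print(hashtag_counts.most_common(3))
```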
Connection between journalist and news organization: The study treats journalists and news organizations as separate. However, most journalists work for a single news organization. I think it would be interesting to look at each journalist’s connection to their news organization. One of the conclusions of the paper is that “organizations broadcast, journalists target.” Do journalists’ techniques of disseminating news more closely resemble the techniques of their own news organization than those of other organizations? Are there any similarities between a journalist’s account and their news organization’s account?
