Paper 1: Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. "Antisocial Behavior in Online Discussion Communities." ICWSM, 2015.
In this impressive and thorough work, Cheng et al. demonstrate a way of observing and predicting antisocial behavior in online forums. They retrospectively study banned users from three communities to identify antisocial features of their online activity. In keeping with Donath's paper, a similar definition of antisocial behavior appears here: users act provocatively by posting aberrant content. One salient strength of this paper is its well-defined problem setup and the normalizing measures taken before analysis to compensate for the peculiarities of each platform. For example, the authors use strong signals such as banning and post deletion to pre-filter the bad users, and they also perform a qualitative analysis (human evaluation) to assess post quality. Their observation of a bimodal post-deletion distribution reflects the complexity of online interactions and suggests that antisocial behavior can be innate as well as adaptive. The paper suggests several interesting directions that could be explored.
What are the assessment signals within a group of trolls? The paper mentions several signals of antisocial behavior such as down-voting, reporting, blocking, and banning. These signals come from the point of view of normal users and community standards. However, it would be interesting to observe whether such users identify each other and show group trolling behavior within communities. The paper notes that FBUs are able to garner increased responses by posting provocative content. Does this contention grow when multiple trolls are involved in the same thread? If so, how do they identify and support each other?
What about users who were not banned but are extremely antisocial on occasion? Consider a scenario with two users, U1 and U2. U2 frequently trolls people across communities by posting harmless but annoying, argument-fuelling content. Based on the observations in the paper, U2 will most likely get banned because of community bias and post deletions. Now consider U1, who posts rarely but with extremely offensive content. U1 will most likely have their posts deleted without attracting much attention. Comparing by the number of post deletions (the driving feature of this paper's predictive model), U2 is more likely to be banned. But which of the two (U1 or U2) is actually more dangerous for the community? The answer is: both! To capture both types of behavior, antisocial behavior should be analyzed at multiple levels of granularity: per post, per thread, and per user. Analyzing hateful or flaming language in individual posts could carry some weight in the predictive model for deciding whether a user is antisocial.
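As a rough illustration of what such a multi-granularity feature set might look like, here is a minimal sketch, assuming a placeholder per-post toxicity scorer and toy field names (none of this is the authors' actual pipeline). It aggregates the hypothetical per-post score into per-thread and per-user features alongside the deletion rate the paper relies on, so that a rare-but-severe U1 is not masked by low posting volume.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

@dataclass
class Post:
    user: str
    thread: str
    text: str
    deleted: bool  # moderator deletion, the signal the paper leans on

def toxicity_score(text: str) -> float:
    """Placeholder for a real lexicon- or classifier-based toxicity score in [0, 1]."""
    flame_words = {"idiot", "stupid", "hate"}  # toy lexicon, illustration only
    words = text.lower().split()
    return sum(w in flame_words for w in words) / max(len(words), 1)

def user_features(posts: List[Post], user: str) -> List[float]:
    """Combine per-post, per-thread, and per-user granularities into one feature vector."""
    own = [p for p in posts if p.user == user]
    if not own:
        return [0.0, 0.0, 0.0, 0.0]
    per_post_tox = [toxicity_score(p.text) for p in own]
    threads = {p.thread for p in own}
    per_thread_tox = [
        mean(toxicity_score(p.text) for p in own if p.thread == t) for t in threads
    ]
    return [
        sum(p.deleted for p in own) / len(own),  # per-user: deletion rate (paper's main signal)
        max(per_post_tox),                       # per-post: worst single post (catches rare-but-severe U1)
        mean(per_thread_tox),                    # per-thread: typical thread-level behavior (catches chronic U2)
        float(len(own)),                         # activity volume
    ]

if __name__ == "__main__":
    posts = [
        Post("U1", "t1", "you are an idiot and I hate you", deleted=True),
        Post("U2", "t1", "well actually you are wrong again", deleted=True),
        Post("U2", "t2", "this argument is stupid prove it", deleted=False),
    ]
    for u in ("U1", "U2"):
        print(u, user_features(posts, u))
```

A downstream classifier (for example, a logistic regression like the one used in the paper) could then weight the per-post maximum separately from the deletion rate, rather than letting the deletion count alone drive the prediction.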
Finally, will moderators be biased if they are exposed to prediction results based on a user's first five posts? In a setting where banning and blocking already infringe on the user's freedom of speech, knowing the predicted likelihood that a particular user is "good" or "bad" might increase community bias and the degree to which these users get censored. So features based on deeper linguistic analysis are certainly relevant to deciding which users moderators should be warned about.
Lastly, there are a few more specific questions that I want to mention:
- Were the Lo-FBUs slower to get banned compared to the Hi-FBUs? Did they also have a longer or shorter lifespan compared to NBUs and Hi-FBUs? This analysis might give clues about how to build more accurate predictive models for Lo-FBUs.
- Why is the reported Krippendorff's alpha so low? (And why was it considered acceptable?)
- Can we also quantify the sensitivity of the community and use it as a feature for prediction? (A way of customizing the banning algorithm to each community's taste; see the sketch after this list.)
- How should antisocial behavior be defined in anonymous communities, or in inherently offensive communities like 4chan?
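On the community-sensitivity question above, here is a minimal sketch, assuming that a community's overall deletion rate is an acceptable proxy for its sensitivity (this proxy and the toy records are my own illustration, not something proposed in the paper). The idea is to compute a per-community baseline and then judge each user's deletion rate relative to that baseline, so the same behavior is penalized less in a more tolerant community.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def community_sensitivity(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Baseline deletion rate per community: a crude proxy for how 'sensitive' its moderation is.

    Each record is (community_name, was_deleted) for one observed post.
    """
    totals = defaultdict(int)
    deletions = defaultdict(int)
    for community, deleted in records:
        totals[community] += 1
        deletions[community] += deleted
    return {c: deletions[c] / totals[c] for c in totals}

def relative_deletion_rate(user_rate: float, community_rate: float) -> float:
    """A user's deletion rate normalized by the community baseline (> 1 means worse than typical)."""
    return user_rate / max(community_rate, 1e-6)

if __name__ == "__main__":
    # Toy records; community labels echo the paper's datasets but the numbers are invented.
    records = [("cnn", True), ("cnn", False), ("cnn", False), ("ign", True), ("ign", True)]
    baselines = community_sensitivity(records)
    print(baselines)                                   # {'cnn': 0.333..., 'ign': 1.0}
    print(relative_deletion_rate(0.5, baselines["cnn"]))  # 1.5: above this community's norm
```

Either the baseline itself or the normalized rate could be added as a feature, effectively tuning the banning threshold to each community's taste rather than applying one global cutoff.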