Reflection #2 – 08/30 – [Viral Pasad]

Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec (2015) – “Antisocial Behavior in Online Discussion Communities” – Proceedings of the Ninth International AAAI Conference on Web and Social Media.

The paper discusses the analysis and early detection of antisocial behavior in online discussion communities. The authors analyzed user data from three online discussion communities, namely IGN, CNN, and Breitbart. They mention that link spammers and temporary bans have been excluded from the study. However, antisocial behavior would also involve posting media that the community finds unpleasant, which is outside the scope of this study. Further, the metrics they use are features that can be grouped into Post, Activity, Community, and Moderator feature sets, with the strongest being the Moderator and Community features, respectively. They used a random forest classifier. They also used a bag-of-words model, a logistic regression trained on bigrams, which, in spite of performing reasonably well, is less generalizable across communities.
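To make the bag-of-words baseline concrete, here is a minimal sketch of a logistic regression trained on bigram counts. It is not the authors' implementation: the toy posts, the labels, the function names (`bigrams`, `featurize`, `train_logreg`, `predict_proba`), and the hand-rolled gradient descent are all assumptions made for illustration, using only the Python standard library.

```python
import math
from collections import Counter


def bigrams(text):
    """Return the adjacent word pairs of a lowercased, whitespace-split post."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def featurize(text, vocab):
    """Map a post to a vector of bigram counts over a fixed vocabulary."""
    counts = Counter(bigrams(text))
    return [counts[bg] for bg in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and a bias with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this post under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(text, vocab, w, b):
    """Estimated probability that the post's author is later banned."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy posts: label 1 = author was later banned, 0 = not banned.
posts = [
    ("you are an idiot", 1),
    ("shut up you are wrong", 1),
    ("thanks for the link", 0),
    ("great point thanks for sharing", 0),
]
vocab = sorted({bg for text, _ in posts for bg in bigrams(text)})
X = [featurize(text, vocab) for text, _ in posts]
y = [label for _, label in posts]
w, b = train_logreg(X, y)
```

Because the vocabulary is built from the bigrams of one community's posts, any bigram that never appears there contributes nothing to the prediction, which is one plausible reason such a model transfers poorly across communities.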

 

  • The paper repeatedly mentions and relies heavily on Moderators in the Online Discussion Community. It may be the case that the Online Communities that the study was conducted upon had reliable moderators, but that need not be the case for other Online Discussion Platforms.
  • Going back to the last discussion in class, in a platform which lacks Moderators, a set of (power-)users with reliably high karma/reputation points could perhaps be made to ‘moderate’ or answer surveys about certain potential Future Banned Users (FBUs).
  • The early detection of such users raises the question: how soon would be too soon to ban them, and how late would be too late? Furthermore, could an FBU be put on a watchlist after having received a warning or some hit to their reputation? (Extrapolating from the finding that unfair, draconian post deletions make some users’ writing worse, it is also possible that warnings would make them harsher.)

But this would also probably eliminate some fraction of the 20% of false positives that get identified as FBUs.

  • The study excluded occurrences of multiple/temporary bans from the data. However, studying temporary bans could provide more insight into behavior change, and into whether temporary bans worsen users’ writing just as unfair post deletion does.
  • The paper states that “the more posts a user eventually makes, the more difficult it is to predict whether they will get eventually banned later on”. But using a more complex and robust classifier instead of random forest would perhaps shed light on behavior change and perhaps even increase the accuracy of the model!
  • Further, we could also learn about the role of communities in incubating antisocial behaviour by monitoring the kind of ‘virtual’ circles that the users interact with after the lift of their temporary ban. It would provide information as to what kind of ‘virtual’ company promotes or exacerbates antisocial behaviour.
  • Another useful addition to the study would be to examine self-deletion of posts by the users.
  • Another thing to think about is the handling of false positives (innocent users getting profiled as FBUs) and also false negatives (crafty users who instigate debates surreptitiously or use cleverly disguised sarcasm), which the model will be unable to detect.
  • Furthermore, I might be unnecessarily skeptical regarding this, but I believe that the accuracy of the same model might not carry over to other communities or platforms (such as Facebook, Quora, or Reddit, which cater to multi/different domain discussions and have different social dynamics as compared to CNN.com, a general news site, Breitbart.com, a political news site, and IGN.com, a computer gaming site).

But then again, I could be wrong here, thanks to

  • Facebook’s Comment Mirroring and RSS Feeders, due to which most of Facebook Comments would also get posted on the CNN or IGN threads. 
  • The feature set used in the study which covers the community aspects as well.
