Reflection #2 – 08/30 – [Viral Pasad]

Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec (2015) – “Antisocial Behavior in Online Discussion Communities.” Proceedings of the Ninth International AAAI Conference on Web and Social Media.

The paper discusses the analysis and early detection of Antisocial Behaviour in Online Discussion Communities. The authors analyzed the user data of three Online Discussion Communities, namely IGN, CNN, and Breitbart. They mention that link spammers and temporary bans have been excluded from the study. However, antisocial behavior would also involve the posting of media often found unpleasant by the community, which is out of the scope of this study. Further, the metrics they use are feature sets that can be classified into Post, Activity, Community, and Moderator features, with the Moderator and Community features being the strongest. They used a random forest classifier. They also used a bag-of-words model with logistic regression trained on bigrams, which, in spite of performing reasonably well, is less generalizable across communities.
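
To make the text baseline concrete, here is a minimal sketch, assuming a scikit-learn style pipeline, of a bigram bag-of-words model with logistic regression like the one mentioned above; the posts, labels, and test comment are invented placeholders, not the authors’ code or data.

```python
# Minimal sketch (not the authors' code) of a bigram bag-of-words baseline:
# logistic regression over bigram counts of users' posts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus: one post history per user; 1 = eventually banned.
posts = [
    "you are all idiots and this thread is garbage",
    "great article, thanks for the detailed explanation",
    "nobody here knows anything, what a waste of time",
    "interesting point, I had not considered that angle",
]
banned = [1, 0, 1, 0]

bigram_lr = make_pipeline(
    CountVectorizer(ngram_range=(2, 2)),   # bigram bag-of-words features
    LogisticRegression(max_iter=1000),
)
bigram_lr.fit(posts, banned)

# Estimated probability that a new (made-up) comment comes from a future-banned user.
print(bigram_lr.predict_proba(["what a waste of time this was"])[0, 1])
```

As noted above, the paper found this text-only approach less generalizable across communities than the feature-based random forest.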

  • The paper repeatedly mentions and relies heavily on Moderators in the Online Discussion Community. It may be the case that the Online Communities the study was conducted on had reliable moderators, but that need not be the case for other Online Discussion Platforms.
  • Going back to the last discussion in class, on a platform that lacks Moderators, a set of (power-)users with reliably high karma/reputation points could perhaps be made to ‘moderate’ or answer surveys about certain potential Future Banned Users (FBUs).
  • The early detection of users begs the question: how soon would be too soon to ban these users, and how late would be too late? Furthermore, could an FBU be put on a watchlist after having received a warning or some hit to their reputation? (Extrapolating from the paper’s point that unfair, draconian post deletions make some users’ writing worse, it is also possible that warnings make them harsher.)

But this would also probably eliminate some fraction of the 20% of false positives that get identified as FBUs.

  • The study excluded occurrences of multiple/temporary bans from the data; however, studying temporary bans could provide more insight into behavior change, and also into whether temporary bans worsen users’ writing just as unfair post deletion does.
  • The paper states that “the more posts a user eventually makes, the more difficult it is to predict whether they will get eventually banned later on”. But using a more complex and robust classifier in place of the random forest might shed light on behavior change and even improve the accuracy of the model (a rough sketch of this idea appears at the end of this reflection).
  • Further, we could also learn about the role of communities in incubating antisocial behaviour by monitoring the kind of ‘virtual’ circles that users interact with after their temporary ban is lifted. This would provide information as to what kind of ‘virtual’ company promotes or exacerbates antisocial behaviour.
  • Another useful addition to the study would be to examine self-deletion of posts by users.
  • Another thing to think about is the handling of false positives (innocent users getting profiled as FBUs) and also false negatives (crafty users who instigate debates surreptitiously or use cleverly disguised sarcasm), which the model will be unable to detect.
  • Furthermore, I might be unnecessarily skeptical here, but I believe that the accuracy of the same model might not translate to other communities or platforms (such as Facebook, Quora, or Reddit, which cater to multi-domain discussions and have different social dynamics compared to CNN.com, a general news site, Breitbart.com, a political news site, and IGN.com, a computer gaming site).

But then again, I could be wrong here, thanks to:

  • Facebook’s Comment Mirroring and RSS feeds, due to which most Facebook comments would also get posted on the CNN or IGN threads.
  • The feature set used in the study, which covers the community aspects as well.
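
As a follow-up to the bullet above about trying a stronger classifier, here is a minimal sketch, assuming a per-user feature matrix along the lines of the paper’s Post/Activity/Community/Moderator features (replaced here by random placeholder numbers), of comparing a random forest against a gradient-boosted model via cross-validation. Nothing below reproduces the paper’s data or results; it only shows how such a comparison could be run.

```python
# Hedged sketch: random forest vs. gradient boosting on a stand-in feature matrix.
# The features and labels are random placeholders, not the paper's data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))       # placeholder per-user feature vectors
y = rng.integers(0, 2, size=1000)     # placeholder labels: 1 = eventually banned

for name, clf in [
    ("random forest", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("gradient boosting", GradientBoostingClassifier(n_estimators=200, random_state=0)),
]:
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.3f}")
```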


Reflection #2 – [08/30] – [Lindah Kotut]

  • Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec (2015). “Antisocial Behavior in Online Discussion Communities.” Proceedings of the Ninth International AAAI Conference on Web and Social Media.

Brief:

Cheng et al. considered discussion posts from CNN, Breitbart, and IGN to study anti-social behavior (mostly trolling), using banned users from these discussion communities as the ground truth. They applied a retrospective longitudinal analysis to these banned users in order to categorize their behavior. Most of the hypotheses about behaviors (changes in posting language and frequency, community chastisement, and moderator intervention through warnings and temporary or permanent bans) bear out as useful markers for creating a classifier that can predict a Future Banned User (FBU) within a few posts.
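
To picture the “predict an FBU within a few posts” setup described in the brief, the sketch below (with entirely hypothetical users, per-post signals, and labels, none of them taken from the paper) aggregates only each user’s first k posts into a feature vector and scores a classifier at a few values of k.

```python
# Hedged sketch of a "first k posts" evaluation pattern.
# `users` is a made-up structure: one array per user of shape (n_posts, n_signals),
# e.g. post length, downvotes, and whether a moderator deleted the post.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
users = [rng.normal(size=(20, 3)) for _ in range(500)]   # placeholder post signals
banned = rng.integers(0, 2, size=500)                    # placeholder FBU labels

def first_k_features(posts, k):
    """Aggregate one user's first k posts into a fixed-length feature vector."""
    window = posts[:k]
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])

for k in (5, 10, 20):
    X = np.stack([first_k_features(u, k) for u in users])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    auc = cross_val_score(clf, X, banned, cv=5, scoring="roc_auc").mean()
    print(f"first {k} posts: mean AUC = {auc:.3f}")
```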

Reflection:

Considering the anti-social markers and other factors surrounding the posters, we can reflect on different facets and their implications on the classifier and/or the discussion community.

The drunk uncle hypothesis: A cultural metaphor for the relative who makes a nuisance of themselves at formal/serious events (deliberately?). The drunk uncle’s behavior is equivalent to the anti-social behavior of online trolls as defined by Cheng et al.: they are given multiple chances and warnings to behave accordingly, they cause chaos in discussions, and the community may tolerate them for a time before they are banned. Questions surrounding the drunk uncle serve as an excellent springboard for querying online troll behavior:

  • What triggered it? (What can be learned from the delineating point between innocuous and anti-social posts?)
  • Once the drunk uncle is banned from future formal events, do they cease to be the ‘drunk uncle’? This paper considers some aspect of this with temporary bans. On banning, does the behavior suddenly stop, and is the FBU suitably chastised?

Hijacked profiles and mass chaos: The authors did not make any assumptions about why posting behavior/language changes (a troll marker); they only observed that such changes could be used to predict an FBU, not that the account could have been compromised. I point to the curious case of the Florida dentist posting markedly different sentiments on Twitter (an intrepid commenter found that the good dentist had been dead for 3 years, and included an obituary conveniently bearing the same picture as the profile). With this lens in mind:

  • When viewing posts classified as written by FBUs, and given the authors’ claim that their model generalizes: if we swivel the lens and assume commenters are acting in good faith and that a sudden change in behavior is an anomaly, what tweaks would need to be made in order to recognize a hijacked account (would other markers have to be considered, such as time differences, mass changes of behavior, or bot-like comments)?
  • The model relies heavily on moderators to classify FBUs, and down-voting is an unreliable signal, so what happens when a troll cannot be stopped? Do other commenters ignore the troll, or abandon the thread entirely?
  • On trolling-as-a-service, and learning from the mass manipulation of Yelp and Amazon reviews whenever a controversy is linked to a place/book (and how the posters have become more sophisticated at beating the Yelp classifier): (how) does this manifest in commenting?

The Disqus Effect: The authors used Disqus data (either partly or wholly) for this work, and proposed looking at other online communities both to challenge the generalizability of their model and to observe differences in specialized groups. There is another factor to consider in this case: since commenters register with Disqus and the platform is used by a multitude of websites…

  • What can be learned about an FBU from one community (assuming CNN was using Disqus), and how does this behavior transfer to other sites (especially since all of a user’s comments across different sites are viewable from their account)?
