Reflection #3 – [1/25] – Meghendra Singh

Cheng, J., Danescu-Niculescu-Mizil, C., & Leskovec, J. (2015, April). Antisocial Behavior in Online Discussion Communities. In ICWSM (pp. 61-70).

The paper presents an interesting analysis of users in news communities. The objective here is to identify users who engage in antisocial behavior such as trolling, flaming, bullying, and harassment in these communities. Through this paper, the authors reveal compelling insights into the behavior of users who were banned: banned users post irrelevantly, garner more replies, focus on a small number of discussion threads, and post heavily on these threads. Additionally, posts by such users are less readable, lack positive emotion, and more than half of these posts are deleted. Further, the text quality of their posts declines and the probability of their posts being deleted increases over time. Furthermore, the authors suggest that certain user features can be used to detect users that will potentially be banned. To this end, a few techniques to identify “bannable” users are discussed towards the end of the paper.

First, I would like to quote from the Wikipedia article about Breitbart News:

Breitbart News Network (known commonly as Breitbart News, Breitbart or Breitbart.com) is a far-right American news, opinion and commentary website founded in 2007 by conservative commentator Andrew Breitbart. The site has published a number of falsehoods and conspiracy theories, as well as intentionally misleading stories. Its journalists are ideologically driven, and some of its content has been called misogynist, xenophobic and racist.

My thought after looking through Breitbart.com was, isn’t this community itself somewhat antisocial? One can easily imagine a lot of liberals getting banned in this forum for contesting the posted articles. And this is what the homepage of Breitbart.com looked like that morning.

While the paper itself presents a stimulating discussion about antisocial behavior in online discussion forums, I feel that there is a presumption that a user’s antisocial behavior always results in them being banned. The authors discuss that communities are initially tolerant of antisocial posts and users, and this bias can easily be used to evade getting banned. For example, a troll may initially post antisocial content, switch to the usual positive discussions for a substantial period of time, and then return to posting antisocial content. Also, what is to stop a banned user from creating a new account and returning to the community? All one needs is a new e-mail account for Disqus. This is important because most of these news communities don’t attach any notion of reputation to posting comments on their articles. On the other hand, I feel that the “gamified” reputation system on communities like Stack Exchange would act as a deterrent against antisocial behavior. Hence, it would be interesting to find out who gets banned in such “better designed” communities, and whether the markers of antisocial behavior are similar to those of news communities. An interesting post here.

Another question to ask is whether there are deeper tie-ins of antisocial behavior on online discussion forums. Are these behaviors predictors of some pathological condition in the human posting the content? The authors briefly mention these issues in the related work. Also, it would be interesting to discover whether a troll in one community is also a troll in another community. The authors mention that this research can lead to new methods for identifying undesirable users in online communities. I feel that detecting undesirable users beforehand is a bit like finding criminals before they have committed the crime, and there may be some ethical issues involved here. A better approach might be to look for linguistic markers that suggest antisocial themes in the content of a post and warn the user of the consequences of submitting it, instead of recommending users for banning to the moderator after the damage has already been done. This also leads to the question of which events/news/articles generally lead to antisocial behavior. Are there certain contentious topics that lead regular users to bully and troll others? Another question to ask here is: can we detect debates in the comments on a post? This might be a relevant feature that can predict antisocial behavior. Additionally, establishing a causal link between the pattern of replies in a thread and the content of the replies may help to identify “potential” antisocial posts. A naïve approach to handle this might be to simply restrict the maximum number of comments a user can submit to a thread. Another interesting question may be to find out whether FBUs start contentious debates, i.e., do they generally start a thread or do they prefer replying to existing threads? The authors provide some indication towards this question in the section “How do FBUs generate activity around themselves?”.
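To make the pre-submission warning idea concrete, here is a minimal sketch in Python. The marker lexicon and threshold are purely illustrative assumptions; a real system would rely on a learned model or a curated lexicon rather than this toy list.

```python
# Minimal sketch of a pre-submission warning check (illustrative only).
import re

ANTISOCIAL_MARKERS = {"idiot", "stupid", "moron", "shut up", "troll"}  # toy lexicon

def warn_before_submit(post_text: str, threshold: int = 1) -> bool:
    """Return True if the draft post should trigger a warning to its author."""
    text = post_text.lower()
    hits = sum(1 for marker in ANTISOCIAL_MARKERS
               if re.search(r"\b" + re.escape(marker) + r"\b", text))
    return hits >= threshold

if __name__ == "__main__":
    draft = "You are an idiot, shut up."
    if warn_before_submit(draft):
        print("Warning: this post may violate the community guidelines.")
```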

Lastly, I feel that a classifier precision of 0.8 is not good enough for detecting FBUs. I say this because the objective here is to recommend potential antisocial users for banning to human moderators, so as to reduce their manual labor, and having a lot of false positives will defeat this purpose in some sense. Also, I don’t quite agree with the claim that the classifiers are cross-domain. I feel that there will be a huge overlap between CNN and Breitbart.com in the area of political news. Also, the dataset is derived primarily from news websites where people discuss and comment on articles written by journalists and editors. These findings might not apply to Q&A websites (e.g., Quora, StackOverflow), places where users can submit articles (e.g., Medium), or more technically inclined communities (e.g., TechCrunch).


Reflection #3 – [1/25] – [Md Momen Bhuiyan]

Paper: Antisocial Behavior in Online Discussion Communities

Summary:
Although antisocial behavior in online communities is very common, most of the recent research on this subject has focused on qualitative analysis of a small group of users. This paper uses data from three online communities (CNN.com, Breitbart.com, and IGN.com) for a quantitative analysis to get a general understanding of the antisocial behavior of users. For the sake of comparison, the paper discusses two types of users: Future-Banned Users (FBUs) and Never-Banned Users (NBUs). By analyzing users' posts throughout their activity span on the forum, the authors find that posts by FBUs are very different from other posts in the thread and harder to read. They also find that the quality of their posts worsens over time. Censorship also plays a role in guiding the writing style of FBUs. FBUs post a lot and get a higher number of responses. Another finding of the study is that two types of FBUs exist in an online community: one with a higher post deletion rate and one with a lower rate. Finally, the authors use four types of features to create a classifier for predicting antisocial behavior: post features, activity features, community features, and moderator features.

Reflection:
The paper explains the process of analyzing antisocial behavior, from data preparation to analysis, in great detail. One interesting aspect of the process was using crowdsourcing for the initial classification. The authors' analysis of the final classifier provided some interesting insights. For example, the classifier's performance peaks upon seeing the attributes of the first 10 posts. This correlates with the idea that other community members judge FBUs in a similar fashion. The performance of the classifier on Hi-FBUs suggests that it learns the post deletion ratio as one of the primary indicators, which explains why prediction performance peaks after seeing the first 10 posts. The authors' analysis of the cross-platform performance of the algorithm was very intuitive. Although the prediction quality of the classifier is good enough, there remains the issue of how such a tool should be applied. Finally, this paper discusses the sensitive issue of antisocial behavior and creates a tool for prediction; even with good performance, the human factor is still necessary in preventing such behavior.
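To illustrate the "performance peaks after the first 10 posts" observation, here is a minimal sketch of how one might reproduce that analysis. The per-post fields (deleted, readability, num_replies) are hypothetical placeholders, not the paper's exact post, activity, community and moderator features.

```python
# Sketch: cross-validated AUC as a function of how many of a user's posts are observed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def features_from_first_n_posts(user_posts, n):
    """Aggregate toy features over a user's first n posts."""
    posts = user_posts[:n]
    return [
        np.mean([p["deleted"] for p in posts]),      # fraction of deleted posts
        np.mean([p["readability"] for p in posts]),  # average readability score
        np.mean([p["num_replies"] for p in posts]),  # average replies received
    ]

def auc_vs_posts_observed(users, labels, max_posts=20):
    """Estimate AUC after observing the first 1..max_posts posts of each user."""
    results = {}
    y = np.array(labels)  # 1 = FBU, 0 = NBU
    for n in range(1, max_posts + 1):
        X = np.array([features_from_first_n_posts(u, n) for u in users])
        clf = LogisticRegression(max_iter=1000)
        results[n] = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    return results
```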


Reflection #3 – [01/25] – [Patrick Sullivan]

Title: Antisocial Behavior in Online Discussion Communities.  Cheng et al. are exploring the dynamics of large online communities with members behaving undesirably.

The authors only looked into three news websites to learn about antisocial behavior. While each site has its own target audience to set it apart, there may be little other variation among them. Antisocial behavior is very present in other forms of media, so the publication’s conclusions could be more generalized if they covered online communities on Facebook, Reddit, and Youtube. Perhaps this would show that there are several more variants of antisocial behavior and of its evolution in an online community, depending on the constraints and features of a platform. It also could help protect the results if one community were found to be dysfunctional or especially unique. I do not fault the authors for this oversight for two reasons: it is more difficult to structure and follow up a longitudinal study across several more rapidly changing social media platforms, and it is quite difficult to cross-compare all of the platforms I listed, whereas Cheng’s platforms each have the same overall structure. I do wish that there was more discussion related to this idea than the small mention in the paper’s conclusion.

Cheng et al. also look into the changes that happen over time to an online community and to the users that are eventually banned. While this can be used to predict growing antisocial behavior from within a community, it could be impacted by the overall climate of that platform. I imagine that these platforms undergo rapid changes to both moderation and antisocial activity when an event occurs. A sudden rise in concern over ethics and harassment surrounding video game journalism surely would have a profound effect on a gaming news website’s community. Likewise, approval or disapproval of a news network’s articles from a high-profile political leader would likely harness more attention, moderation action, or antisocial behavior. A longitudinal study might be measuring how an event changed a community (either temporarily or permanently), and not how the members of a community typically moderate and react to moderation. A simple investigation I did using Google Trends shows a possibly significant spike in ‘gaming journalism’ interest in the latter half of 2014, which could very well have been captured in the 18-month window over which this longitudinal study took place. In addition, Google’s related topics during that time show terms like ‘ethics’, ‘corruption’, and ‘controversy’ (see image source). These terms have special meaning and connection to the idea of online community moderation, and should not be taken lightly. The authors’ omission of even a mention of this event makes me question whether they were so focused on antisocial behavior that they did not monitor the communities for events that could devalue their data.

Google Trends: 'Gaming Journalism' in 2014

Source: https://trends.google.com/trends/explore?date=2014-01-01%202014-12-31&q=gaming%20journalism


Reflection #3 – [1/23] – [Jiameng Pu]

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM. 2015.

Summary:

Users’ contributions, e.g., posts, comments, and votes, are an important part of all kinds of social platforms. While most users are civil, a few antisocial users greatly contaminate the environment of the internet. By studying users who were banned from specific communities and comparing two user groups, FBUs (Future-Banned Users) and NBUs (Never-Banned Users), the authors try to characterize antisocial behavior, e.g., how FBUs write and how FBUs generate activity around themselves. The “Evolution over time” analysis shows that FBUs write worse than other users over time and tend to exacerbate their antisocial behavior when there is stronger criticism from the community. By designing features based on these observations and then categorizing them, the work can potentially help relieve community moderators of heavy manual labor. Besides, it proposes a typology of antisocial users based on post deletion rates. Finally, a system is introduced to identify undesired users early on in their community life.

Reflection:

The paper presents extensive discussion and analysis on the topic of antisocial behavior; I highlight the points that impress and inspire me most. First, the analysis of how to measure undesired behavior in the data preparation section is a useful one. It reminds me that down-voting activity cannot be interpreted as undesirable in the context of “antisocial behavior”, which is a much narrower concept. Personally, I don’t use the down-vote functionality much when I browse Q&A websites like Quora, Zhihu, and StackOverflow. And it turns out many people keep the same habit, which is a good instance where considering fewer features/data, i.e., report records and post deletion rates, makes more sense. Second, instead of predicting whether a particular post or comment is malicious, the authors put more focus on individual users and their whole community life, which is harder to analyze but brings more convenience to community moderators, since they can do their job like real community police rather than simply cleaners. Third, the four categories of features properly cover all the feature classes, but the authors don’t mention some potentially important features in Table 3, e.g., the comments on a post, which could be categorized as post features, and a user’s followings and followers, which could be categorized as community features. Intuitively, these two features are strong indicators of a user’s properties: people of one mind fall into the same group, and harsh criticism would show up in the comment area of malicious posts.
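As a concrete illustration, here is a minimal sketch of how a per-user feature vector covering the four categories, plus the two hypothetical additions suggested above, might be assembled. The field names are assumptions and not the paper's exact Table 3 features.

```python
# Sketch: assembling one user's feature vector across the four categories.
def build_user_features(user):
    return {
        # Post features
        "avg_readability": user["avg_readability"],
        "avg_post_length": user["avg_post_length"],
        # Activity features
        "posts_per_day": user["posts_per_day"],
        "posts_per_thread": user["posts_per_thread"],
        # Community features
        "avg_upvotes": user["avg_upvotes"],
        "avg_replies": user["avg_replies"],
        # Moderator features
        "deleted_post_fraction": user["deleted_post_fraction"],
        "reported_post_fraction": user["reported_post_fraction"],
        # Hypothetical additions proposed above (not in the paper)
        "avg_comments_on_posts": user.get("avg_comments_on_posts", 0.0),
        "follower_following_ratio": user.get("follower_following_ratio", 0.0),
    }
```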

I notice that the authors perform the above task on a balanced dataset of FBUs and NBUs (N=18758 for CNN, 1164 for IGN, 1138 for Breitbart), suggesting that the learned models generalize to multiple communities. Though the number of FBUs and NBUs is balanced, would the very different numbers of user samples from the three platforms influence the generalization of the resulting classifier? In my view, it would be more rigorous for the authors to correct for the lopsided sample sizes or add more discussion about how the data can be properly sampled.

Questions & thoughts:

  1. Where is the proper line between the definitions of antisocial and non-antisocial? We should avoid confusing unpleasant users with antisocial users.
  2. Compared to the last paper, there is less description of implementation tools throughout the different phases of the research. I’m pretty curious about how to carry out specific procedures practically, e.g., data collection, feature categorization, and the investigation of the evolution of user behavior and of community response.
  3. I think the choice of classifier probably makes a difference in the prediction accuracy, so it might be better to compare the performance of several classifiers to find the most suitable one for this task (see the sketch after this list).
  4. Although we can roughly see the contribution of each feature category from Table 4, I think a more extensive and quantitative analysis would complete the research.
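For point 3, a minimal sketch of such a classifier comparison, assuming a feature matrix X and labels y (1 = FBU, 0 = NBU) have already been built. The model choices and hyperparameters are illustrative.

```python
# Sketch: comparing several classifiers on the same features via cross-validated AUC.
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def compare_classifiers(X, y):
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "gradient_boosting": GradientBoostingClassifier(random_state=0),
    }
    return {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
            for name, m in models.items()}
```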


Reflection #3 – [1/23] – [Deepika Kishore Mulchandani]

[1]. Cheng, J., Danescu-Niculescu-Mizil, C. and Leskovec, J. 2015. Antisocial behavior in online discussion communities.

Summary:

In this paper, Justin Cheng et al. study antisocial behavior in 3 online discussion communities: CNN, a general news site, Breitbart.com, a political news site, and IGN.com, a computer gaming site. The authors first characterize antisocial behavior by comparing the activity of users banned from the community (FBUs) and users never banned (NBUs). They then perform a longitudinal analysis, i.e., a study of the behavior of the users over their active tenure in the community. They also consider the readability of posts and users' post deletion rates as features to train their model. After developing a model, they predict the users who will be banned in the future. With their model, they need to observe only 5 to 10 of a user's posts to accurately predict that the user will be banned. They present two hypotheses and try to answer the following research questions: Do users become antisocial later? Does a community's reaction affect their behavior? Can antisocial users be identified early?

Reflection:

Antisocial behavior is a worry no matter whether it happens online or in person. That said, this research is an indication of the advances being made to alleviate the ill effects of such behavior. The authors describe the four feature categories that help in recognizing antisocial users in a community. Of these, the most salient in the study conducted by the authors are the ‘moderator features’. Moderators delete posts and ban users in a community. They have a particular set of guidelines based on which they delete the posts that they consider antisocial. This raises a few questions. Do these moderators delete posts based only on the language of the post, or do factors like the number of down-votes and whether the post was reported also affect the decision? The point of this question is to figure out which they weigh more heavily. Also, this opens up a variety of questions like, ‘Do moderator demographics (e.g., age) play a role in how offensive they find a post to be?’ The authors mention that there were more ‘swear’ words in the posts written by the FBUs. Moderators who are more tolerant of swear words may not delete the posts of potential FBUs.

I admire the efforts of the authors in studying the entire history of a particular user to identify patterns in user behavior over time. I also like the other features used by the authors. The activity features (time spent in a thread) are not that intuitive and yet end up playing a significant role. The authors made an important observation that a model trained on one community performs relatively well on the other communities too. Also, they report that FBUs survived over 42 days on CNN, 82 days on Breitbart, and 103 days on IGN. This could be an effect of the category of the online discussion community. One could expect an online community which hosts only political news to be more tolerant of antisocial behavior, by virtue of the fact that opposition is inherent in the news. Most of the posts in such a community could have down-votes and replies to comments. These are both significant features of the model as well as factors that influence a moderator’s decision. Thus the question: ‘Does the category of the online discussion community affect the banning of an antisocial user?’ I also agree with the authors that it is difficult to track users who might instigate arguments but maintain NBU-like behavior. This could be a crucial research question to look into.


Reflection #3 – [1/25] – Aparna Gupta

Paper: Antisocial Behavior in Online Discussion Communities

Summary:

This paper talks about characterizing antisocial behavior, which includes trolling, flaming, and griefing, in online communities. For this study the authors have focused on CNN.com, Breitbart.com, and IGN.com. The authors present a retrospective longitudinal analysis to quantify antisocial behaviour throughout an individual user’s tenure in a community. They divide users into two groups, Future-Banned Users (FBUs) and Never-Banned Users (NBUs), and compare the language and the frequency of their posts.

Reflection:

The paper ‘Antisocial Behavior in Online Discussion Communities’ focuses on detecting antisocial users at an early stage by evaluating their posts. The results are based on four kinds of features: post content, user activity, community response, and the actions of community moderators. In my opinion, “what leads an ordinary person to exhibit trolling behaviour” should also have been considered as a contributing factor.

For example, in communities or forums where political discussions are held, comments exhibiting strong opinions are bound to be seen. I therefore feel that “what is considered antisocial depends on the particular community and the topic around which the respective community is formed” [1].

What struck me was that there can be scenarios where the discussion context determines the trolling behaviour of an individual. However, the ‘Readability Index’ parameters which the authors have considered looked promising.
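For reference, the kind of readability metric the authors rely on can be computed along the lines of the Automated Readability Index (ARI). The sketch below uses deliberately simplified tokenization (whitespace words; ., !, ? sentence breaks) and is only an approximation of a production implementation.

```python
# Sketch: Automated Readability Index (ARI) with simplified tokenization.
import re

def automated_readability_index(text: str) -> float:
    words = text.split()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    characters = sum(len(w.strip(".,!?;:")) for w in words)
    if not words or not sentences:
        return 0.0
    return (4.71 * (characters / len(words))
            + 0.5 * (len(words) / len(sentences))
            - 21.43)

# Lower scores roughly correspond to text that is easier to read.
print(automated_readability_index("This is a short, simple comment."))
```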

In the data preparation stage, to measure “undesired behaviour”, the authors state that “at the user-level, bans are similarly strong indicators of antisocial behaviour”. But how well does getting banned from an online community actually indicate antisocial behaviour? For example, a user could get banned from Stack Overflow simply because all of the questions they posted were out of scope.

The paper largely revolves around the two hypotheses which the authors state to explain an increase in the post deletion rate: H1, a decrease in posting quality, and H2, an increase in community bias. To test H1 and H2, the authors conduct two studies. 1. Do writers write worse over time? This study is somewhat agreeable, since one can analyse how a user’s writing changes over time. 2. Does community tolerance change over time? According to the results presented by the authors, this indeed looks true. However, in my opinion it also depends on how opinions/comments are perceived by other members of the community.

On a closing note, the paper presents some interesting findings about how to identify trolls and ban them at a very early stage.

 

[1] https://www.dailydot.com/debug/algorithm-finds-internet-trolls/


Reflection #3 -[01/25]- [Vartan Kesiz-Abnousi]

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM. 2015.

Summary

The authors of the article aim to study antisocial behavior in online discussion communities, as the title suggests. They use data from three online discussion-based communities: Breitbart.com, CNN.com, and IGN.com. Specifically, they use the comments on articles that are posted on the websites. The data covers a period of over 18 months and 1.7 million users who have contributed nearly 40 million posts. The authors characterize antisocial behavior by comparing the activity of users who are later banned from a community, namely Future-Banned Users (FBUs), with that of users who were never banned, or Never-Banned Users (NBUs). They find significant differences between the two groups. For instance, FBUs tend to write less similarly to other users, and their posts are harder to understand according to standard readability metrics. In addition, they are more likely to use language that may stir further conflict. FBUs also make posts that tend to be concentrated in individual threads rather than spread out across several. They also receive more replies than average users. In the longitudinal analysis, the authors find that the writing of an FBU worsens over their active tenure in a community. Moreover, they show that not only do FBUs enter a community writing worse posts than NBUs, but the quality of their posts also worsens more over time. They also find that the distribution of users’ post deletion rates is bimodal, where some FBUs have high post deletion rates, while others have relatively low deletion rates. Finally, they demonstrate that a user’s posting behavior can be used to make predictions about who will be banned in the future with over 80% AUC.

Reflections

User-generated content has become important for the success of websites. As a result, maintaining a civil environment is important. Anti-social behavior includes trolling, bullying, and harassment. Therefore, platforms implement mechanisms designed to discourage antisocial behavior, such as moderation, up and down voting, reporting posts, mute functionality and blocking users’ ability to post.

The design of the platform might render the results non-generalizable or more platform-specific. It would be interesting to see whether the results hold for other, different platforms. There are cases of discussion platforms where the moderators have the option of issuing a temporary ban. Perhaps this could work as a mechanism to “rehabilitate” users. For instance, the authors find there are two groups of users, where some FBUs have high post deletion rates, while others have relatively low deletion rates. It should be noted that the authors excluded users who were banned multiple times so as not to confound the effects of temporary bans with behavior change.

In addition, it should be stressed that these specific discussion boards have an idiosyncrasy. The primary function of these websites is to be a news network, not a discussion board. This is important because the level of scrutiny is different on such platforms. For instance, they might choose to ban opposing views expressed in inflammatory language more frequently, in order to support their editors or authors. The authors write that “In contrast, we find that post deletions are a highly precise indicator of undesirable behavior, as only community moderators can delete posts. Moderators generally act in accordance with a community’s comment policy, which typically covers disrespectfulness, discrimination, insults, profanity, or spam”. The authors do not provide evidence to support this position. This does not necessarily mean they are wrong, since their criticism of other methods is valid.

However, the authors propose a method that explores this problem by measuring text quality. They do this by sending a sample of posts to Amazon Mechanical Turk for labeling, and then training a classification model on this labeled sample to generalize the text-quality scores to the rest of the posts.
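A minimal sketch of that "label a sample, then generalize" step might look as follows, assuming crowdworker majority-vote labels. The bag-of-words features and logistic regression are illustrative, not the paper's exact model.

```python
# Sketch: train on crowdsourced text-quality labels, then score the remaining posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_text_quality_model(labeled_posts, labels):
    """labeled_posts: post texts rated by crowdworkers;
    labels: 1 = acceptable quality, 0 = poor quality (majority vote)."""
    model = make_pipeline(TfidfVectorizer(min_df=2), LogisticRegression(max_iter=1000))
    model.fit(labeled_posts, labels)
    return model

def score_remaining_posts(model, unlabeled_posts):
    """Return the predicted probability of acceptable quality for each post."""
    return model.predict_proba(unlabeled_posts)[:, 1]
```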

They find some interesting results. The deletion rate increases over time for FBUs, but is constant for NBUs. In addition, they find that text quality decreases for both groups. This could be attributed either to a decrease in posting quality, which would explain the deletions, or to community bias. Interestingly enough, the authors find evidence that supports both hypotheses. For the community-bias hypothesis, the authors use propensity score matching and find that, for the same text quality, posts from late in a user's tenure (last 10% of time) are more likely to be deleted than early posts (first 10% of time) for FBUs but not for NBUs. They also find that excessive censorship causes users to write worse.

Questions

  1. How would a mechanism of temporary bans affect the discussion community?
  2. The primary purpose of these websites is to deliver news to their target audience. Are the results the same for websites whose primary purpose is to provide a discussion platform, such as discussion boards?
  3. Propensity score matching is biased if there are unobserved variables, which is usually the case in non-experimental (observational) studies. Nearest-neighbor matching with fixed effects to control for contemporaneous trends, or matching users by time in addition to text quality, might be a better strategy (a sketch of the basic matching step follows below).
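For point 3, a minimal sketch of the propensity-score-matching step used in the community-bias analysis: match "late" posts to "early" posts on a propensity estimated from text quality alone and compare deletion rates. The covariates and caliper are illustrative assumptions.

```python
# Sketch: nearest-neighbor propensity score matching of late vs. early posts.
import numpy as np
from sklearn.linear_model import LogisticRegression

def matched_deletion_gap(quality, is_late, deleted, caliper=0.05):
    """quality: text-quality scores; is_late: 1 if a post is from the last 10% of a
    user's tenure, 0 if from the first 10%; deleted: 1 if the post was deleted."""
    X = np.asarray(quality).reshape(-1, 1)
    t = np.asarray(is_late)
    y = np.asarray(deleted)
    propensity = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]

    gaps = []
    control_idx = np.where(t == 0)[0]
    for i in np.where(t == 1)[0]:  # for each "late" post, find the nearest "early" post
        j = control_idx[np.argmin(np.abs(propensity[control_idx] - propensity[i]))]
        if abs(propensity[j] - propensity[i]) <= caliper:
            gaps.append(y[i] - y[j])
    # Average difference in deletion rates between matched late and early posts.
    return float(np.mean(gaps)) if gaps else float("nan")
```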


Reflection #3 -[01/25]- [Jamal A. Khan]

This paper studies the antisocial behavior of people on three different platforms, which I believe is quite relevant given the ever-increasing consumption of social media. First off, in my opinion what the authors have studied is not antisocial behavior, but rather negative, unpopular and/or inflammatory behavior (which also might not be the case, as I’ll highlight a bit later). Nonetheless, the findings are interesting.

Referring to Table 1 in the paper, I’m surprised to see so few posts deleted. I was expecting something in the vicinity of 9-10%, but that might just be me; maybe I have a tendency to run into more trolls online 🙁. What are other people’s experiences? Do these numbers reflect the number of trolls you find online?

Now, a fundamental problem that I have with the paper is the use of moderators’ actions of “banning or not banning” as the ground truth. This approach fails to address a few things. First, what about moderators’ biases? One moderator might consider certain comments on a certain topic acceptable while another might not, and this varies based on how the person in question feels about the topic at hand. For example, I very rarely talk or care about politics, hence most comments seem innocuous to me, even ones that I see other people react very strongly to. That being the case, if I were a moderator who saw some politically charged comments, I would most probably ignore them.

Second, unpopular opinions expressed by people most certainly don’t count as antisocial behavior or troll remarks, or even as attempts to derail the discussion or cause inflammation. For example, one such topic that could pop up on IGN would be the strict gender binary enforced by most video game studios, which will, in my experience, get down-voted pretty quickly because people are resistant to such changes. So this raises a few questions: what metric is used to deal with unpopular posts? Are down-votes used by the moderators as a metric for removing posts?

Third, varying use of English based on demographics would throw off the language similarity between posts of FBUs and NBUs by a fair margin, and the authors don’t seem to have catered for it. The paper relies quite heavily on this metric for many of its observations. So, if we were conducting a follow-up study, how would we go about taking cultural differences in the use of English into account? Do we even have to, i.e., will demographically more diverse platforms automatically have a normalizing effect?

Finally, the idea of detecting problematic people beforehand seems like a good idea at first, but on second thought I think it might not be, though that depends on how the tool is used. The reason I say this is: suppose we had an omnipotent classifier that could predict with 100% accuracy; what would we do once we have the predictions? Ban the users beforehand? Wouldn’t that be a violation of the right to opinion and freedom of speech? Wouldn’t the classifier just reflect what people like to see and hear, and end up tailoring content to their points of view? And, in a dystopian scenario, wouldn’t it just lead to a snowflake culture?

As a closing note, how would the results look if the study were repeated on Facebook pages? Would the results from this study generalize?


Reflection #3 – [1/25] – Ashish Baghudana

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM. 2015.

Summary

In this paper, Cheng et al. attempt to characterize anti-social behavior in three online communities – CNN, IGN, and Breitbart. The study addresses three major questions:

  • Is anti-social behavior innate or dependent on community influences?
  • Does the community help in improving behavior or worsen it?
  • Can anti-social users be identified early on?

They find that the posts of banned users were less readable, got more replies, and were concentrated in fewer threads. The authors also find that communities affect writing styles over time. If a user’s posts are unfairly deleted, their writing quality is likely to decrease over time. Subsequently, the authors characterized the banned users based on their deleted posts and found the distribution to be bimodal. An interesting characteristic of banned users is that the frequency of their posts is much higher than that of non-banned users. Finally, the authors build a classifier to predict whether a user might be banned based on their first 10 posts, and report an AUC of 0.8.

Reflection

The paper comes at a time when cyberbullying, harassment and trolling are at their peak on the Internet. I found their research methodology very instructive in how it effectively summarizes and divides up 1.7 million users and 40 million posts. It is also interesting to read about their use of Amazon Mechanical Turk to generate text quality scores, especially because such a metric does not exist in the NLP sphere.

At several points in the paper, I found myself asking: what kinds of anti-social behavior do people exhibit online? While the paper focused on the users involved, and on the characteristics of their posts that made them undesirable in such online communities, it would have been much more informative had the authors also focused on the posts themselves. Topic modeling (LDA) or text clustering would have been a great way of analyzing why they were banned. Many of the elements of anti-social behavior discussed in the paper would hold true for bots and spam as well.
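A minimal sketch of the LDA idea applied to banned users' posts, using scikit-learn's LatentDirichletAllocation; the number of topics and the preprocessing choices are assumptions.

```python
# Sketch: LDA topic modeling over the posts of banned users.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def topics_of_banned_posts(banned_posts, n_topics=10, n_top_words=8):
    vectorizer = CountVectorizer(stop_words="english", min_df=2)
    counts = vectorizer.fit_transform(banned_posts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(counts)
    vocab = vectorizer.get_feature_names_out()
    # Return the top words of each topic as a rough description of why posts drew bans.
    return [[vocab[i] for i in topic.argsort()[-n_top_words:][::-1]]
            for topic in lda.components_]
```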

Another fascinating aspect that the paper only briefly touched upon was the community effect. The authors chose the three discussion communities very effectively – CNN (left of center), Breitbart (right of center) and IGN (video games). Analyzing the posts of the banned users in each of these communities might indicate community bias and allow us to ask questions such as: are liberal views generally banned on Breitbart?

The third set of actors on this stage (the first two being banned users and the community) are the moderators. Since the final decision to ban a user rests with the moderators, it would be interesting to ask: what kinds of biases do the moderators display? Is their behavior erratic, or does it follow a trend?

One tiny complaint I had with the paper was their visualizations. I often found myself squinting to be able to read the graphs!

Questions

  • How could one study the posts themselves rather than the users?
    • This would help understand anti-social behavior holistically, and not just from the perspective of non-banned users
  • Is inflammatory language a key contributor to banning certain users, or are users banned even for disagreeing with long-standing community beliefs?
  • Do the banned posts on different communities exhibit distinct features or are they generalizable for all communities?


Reflection #3 – [1/24] – Hamza Manzoor

Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec. “Antisocial Behavior in Online Discussion Communities” 

 

Summary:

In this paper, Cheng et al. present a study of the antisocial behavior of users from the moment they join a community up to when they get banned. The paper addresses a very important topic, cyberbullying, and the identification of cyberbullies is one of the most relevant problems of the current digital age. The authors study user behavior on three online discussion communities (CNN, Breitbart and IGN) and characterize antisocial behavior by analyzing users who were banned from these communities. The analysis of these banned users reveals that over time they start writing worse than other users and, secondly, that the community’s tolerance towards them decreases.

 

Reflection:

Overall, I felt that the paper was well organized and showed all the steps of the analysis, from data preparation to findings, along with visualizations. However, the correctness of the analysis is questionable because the entire analysis rests on the number of deleted posts, yet the authors did not consider all the reasons a post might be deleted. Some posts get deleted because they are in a different language, or, on controversial topics like politics, because the reported post does not conform to the moderators’ opinions. Sometimes users engage in an off-topic discussion and those posts are deleted to maintain the relevance of the comments to the article. The biases of moderators should be considered.

The paper does not mention the population size for some analyses, which makes me question whether the sample size was significant. For example, when they analyze whether excessive censorship causes users to write worse, one population consists of users who had four or more posts deleted among their first five posts, which, unless stated otherwise, I believe would be a negligibly small group. Also, the entire analysis is more or less dependent on the first five or ten posts, which is also questionable because these posts can be on the same thread on a single day. This approach has two caveats. First, since the authors did not analyze the text, it is unfair to ban a user based on their first few posts because it is possible that the user simply had a conflicting opinion rather than being a troll. Second, the paper itself shows that many NBUs initially had negative posts and got better over time. Therefore, banning users based on their first few deleted posts means that they will not have an opportunity to improve.

The strongest features in the statistical analysis are the moderator features, and without them the results drop significantly. These moderator features require moderators, whereas the purpose of this analysis was to automate the process of finding FBUs; the heavy dependency on these features makes the analysis look less significant.

Finally, my take on the analysis in this paper is that relying on the number of deleted posts is too simplistic, and the text of the posts should be analyzed before automating any process that bans users from posting.

 

Questions:

Are posts deleted only because of inflammatory language, or because of differences of opinion as well?

One question that everyone might raise is that analyzing users based on their first few posts is unfair, but then what should the threshold be? Can we come up with a solid analysis without topic modeling and analyzing the text?

What kinds of biases do moderators display? Do they play a role in post deletions and ultimately in user bans?
