Reflection #2 – [08/30] – [Dhruva Sahasrabudhe]

Paper-

“Antisocial Behavior in Online Discussion Communities” – Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec.

Summary-

The paper explores the characteristics of “antisocial” users, i.e. trolls, online bullies, etc., by creating a category of users called FBUs (Future Banned Users) and trying to distinguish their habits from those of NBUs (Never Banned Users). It finds that FBUs do not write in tune with the rest of the discourse, write more incomprehensibly, and express more negative emotion than NBUs. Furthermore, it builds a model to predict whether a user will be banned, based on features like post content, posting frequency, community interaction, and moderator interaction. The results are presented quantitatively.

Reflection-

Firstly, the paper seems limited in its choice of source websites for data gathering. It selects only three: CNN, Breitbart News, and IGN. Its results could be augmented by similar analyses done on other websites, or on a large, diverse set of source websites at once.

CNN is a news organization with a left-wing bias (Source), Breitbart News is an extremely right-wing website (Source), while IGN, being a gaming website, can be thought of as politically neutral. It may be a coincidence, but IGN has the best average cross-domain generalizability for the user-banning prediction system. This might suggest that political leaning has some effect on generalizability outside the source website, since the politically neutral source generalizes the best.

The paper asks, quite early on, whether negative community reaction to antisocial behavior encourages or discourages its continuation, and finds that it actually exacerbates the problem. There are clear parallels between this finding and certain studies on the effectiveness of the prison system, where “correctional” facilities do little to actually steer criminals away from their previous life of crime once they are released.

The paper tries to compare the behavioral patterns of FBUs against NBUs, but through a process called “matching”, it only selects NBUs whose posting frequency matches that of the FBUs. It is worth noting that this frequency is 10 times the posting frequency of regular users, so these NBUs may themselves have anomalous usage patterns, or might be a special subset of users. Despite the paper’s claim that this selection gives better results, it might be useful to balance it by collecting the same statistics for a third, randomly chosen set of users.
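
To make the suggestion concrete, here is a minimal sketch of frequency-matched sampling plus a random control group. The data and column names are made up for illustration; in practice they would come from the Disqus activity logs, and the nearest-frequency pairing is a simplification of the paper's matching procedure (the same NBU can be reused here).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical user table: user id, banned flag, and posting frequency.
users = pd.DataFrame({
    "user_id": np.arange(10_000),
    "banned": np.r_[np.zeros(9_500, bool), np.ones(500, bool)],
    "posts_per_month": rng.lognormal(mean=1.0, sigma=1.0, size=10_000).round(1),
})

fbus = users[users["banned"]]
nbus = users[~users["banned"]]

# "Matching": pair each FBU with the never-banned user whose posting
# frequency is closest (both frames must be sorted on the key).
matched = pd.merge_asof(
    fbus.sort_values("posts_per_month"),
    nbus.sort_values("posts_per_month"),
    on="posts_per_month",
    direction="nearest",
    suffixes=("_fbu", "_nbu"),
)

# The third group suggested above: a plain random sample of users, regardless
# of activity, to check whether frequency-matched NBUs are themselves atypical.
random_control = users.sample(n=len(fbus), random_state=0)

print(matched[["user_id_fbu", "user_id_nbu", "posts_per_month"]].head())
print("FBU mean posts/month:        ", round(fbus["posts_per_month"].mean(), 2))
print("Random-user mean posts/month:", round(random_control["posts_per_month"].mean(), 2))
```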

Moreover, the paper claims that FBUs, despite not contributing anything positive to the discussion, receive many more replies to their comments. A parallel is news shows with sensationalized, inflammatory coverage or deliberately incendiary panel guests, where the discussion does not enlighten viewers on the issue, but the ensuing argument attracts a large viewership.

The predictive model that the authors create could be augmented with other features, like post-time data, login/logout times, and data about the frequency and duration of personal messages between antisocial users and other community members. I suspect that antisocial users would have a number of short, high-volume personal message exchanges (maybe arguments with users who were angry enough to message the antisocial individual directly), but not many sustained long-term exchanges. The predictive model, as the paper mentions, could also be something more powerful/expressive than a simple piecewise linear model, like a neural network or an SVM.
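
As a rough illustration of that suggestion, the sketch below swaps in an SVM and a small neural network over a hypothetical feature matrix. The features and labels are synthetic stand-ins, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Hypothetical feature matrix: rows are users, columns mix the paper's feature
# groups (post, activity, community, moderator) with the extra signals proposed
# above (post-time entropy, private-message counts, etc.).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 12))        # stand-in for real extracted features
y = rng.integers(0, 2, size=1000)      # 1 = banned within the observation window

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
mlp = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500))

for name, model in [("SVM", svm), ("small neural net", mlp)]:
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean cross-validated AUC = {auc:.2f}")
```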

Lastly, the predictive model, if implemented in real-world systems to ban potentially antisocial users, has some problems. As the paper briefly mentions, it raises an interesting question about whether we should give these “antisocial” users the benefit of the doubt, and whether it is okay for an algorithm to pre-emptively ban someone before society (in this case the moderators or the community) has decided that the time has come for them to be banned (as is the case in today’s online and real-world systems).

 


Reflection #2 – [08/30] – [Lindah Kotut]

  • Justin Cheng, Cristian Danescu-Niculescu-Mizil, Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.”

Brief:

Cheng et al. considered discussion posts from CNN, Breitbart, and IGN to study anti-social behavior (mostly trolling), using banned users from these discussion communities as the ground truth. They applied retrospective longitudinal analysis to these banned users in order to categorize their behavior. Most of the hypotheses about their behavior (changes in posting language and frequency, community chastisement, and moderator intervention through warnings and temporary or permanent bans) bear out as useful markers for creating a classifier that can predict a Future Banned User (FBU) within a few posts.

Reflection:

Considering the anti-social markers and other factors surrounding the posters, we can reflect on different facets and their implications for the classifier and/or the discussion community.

The drunk uncle hypothesis: the cultural metaphor of the relative who (deliberately?) makes a nuisance of themselves at formal or serious events is an equivalent of the anti-social behavior of online trolls as defined by Cheng et al.: they are given multiple chances and warnings to behave, they cause chaos in discussions, and the community may tolerate them for a time before they are banned. Questions surrounding the drunk uncle serve as an excellent springboard for querying online troll behavior:

  • What triggered it? (What can be learned from the delineating point between innocuous and anti-social posts?)
  • Once the drunk uncle is banned from future formal events, do they cease to be the ‘drunk uncle’? This paper considers some aspect of this with its treatment of temporary bans. On banning, does the behavior suddenly stop, with the FBU suitably chastised?

Hijacked profiles and mass chaos: The authors did not make any assumption about why posting behavior/language changes, itself a troll marker. They only observed that such changes could be used to predict an FBU, not that the account could have been compromised. I point to the curious case of the Florida dentist posting markedly different sentiments on Twitter (an intrepid commenter found that the good dentist had been dead for three years, and included an obituary conveniently bearing the same picture as the profile). With this lens in mind:

  • When viewing posts classified as being by FBUs, and given the authors’ claim that their model generalizes, suppose we swivel the lens and assume commenters to be in good faith and a sudden change in behavior to be an anomaly: what tweaks would need to be made in order to recognize a hijacked account (would other markers have to be considered, such as time differences, mass changes of behavior, or bot-like comments)?
  • The model relies heavily on moderators to classify FBUs. Given the unreliability of down-voting as a signal, what happens when a troll cannot be stopped? Do other commenters ignore the troll, or abandon the thread entirely?
  • On trolling-as-a-service: learning from the mass manipulation of Yelp and Amazon reviews whenever a controversy is linked to a place or book (and from how posters have become more sophisticated at beating the Yelp classifier), (how) does this manifest in commenting?

The Disqus Effect: The authors used Disqus data (either partly or wholly) for this work, and proposed looking at other online communities both to challenge the generalizability of their model and to observe differences in specialized groups. There is another factor to consider here: since commenters are registered with Disqus and the platform is used by a multitude of websites…

  • What can be learned about an FBU from one community (say CNN, which uses Disqus), and how does this behavior transfer to other sites (especially since all of a user’s comments across different sites are viewable from their account)?

 


Reflection #2 – [08/30] – [Bipasha Banerjee]

The paper for today’s discussion:

Cheng, Justin, et al. (2015) – “Antisocial Behavior in Online Discussion Communities” – Proceedings of the Ninth International AAAI Conference on Web and Social Media (61-70).

Summary

The paper focused mainly on analyzing antisocial behavior in large online communities, namely CNN.com, Breitbart.com, and IGN.com. The authors describe undesirable behavior such as trolling, flaming, bullying, harassment, and other unwanted online interactions as antisocial behavior. They categorized the users studied into two broad groups, namely the Future-Banned Users (FBUs) and the Never-Banned Users (NBUs). The authors conducted statistical modelling to predict which individual users will eventually be banned from the community. They collected data from the above-mentioned sites via Disqus for a period of about 13 months, and based their measure of undesirable behavior on posts that were deleted by moderators. The main characteristics of FBU posts are:

  • They post more than an average user would and contribute more posts per thread.
  • They generally post off-topic content, which tends to express negative emotion.
  • Their posting quality decreases over time, possibly as a result of censorship.

It was also found that community tolerance changes: the community becomes less tolerant of a user’s posts over time.

The authors further classified FBUs into Hi-FBUs and Lo-FBUs, the names signifying the amount of post deletion that occurs. Hi-FBUs exhibited strong anti-social characteristics, and their post deletion rates were consistently high, whereas for Lo-FBUs the post deletion rates were low until the second half of their lives, when they rose; Lo-FBUs start to attract (negative) attention later in their life. The paper established a set of features for identifying antisocial users, namely post features, activity features, community features, and moderator features. Thus, the authors were able to build a system that can identify undesirable users early on.

Reflection

This paper was an interesting read on how the authors conducted a data-driven study of anti-social behavior in online communities. The paper on identity and deception by Judith Donath had introduced us to online “trolls”, how their posts are not welcomed by the community, and how such posts might even lead to system administrators banning them. This paper delved further into the topic by analyzing the types of anti-social users.

One issue which comes to mind is how moderators are going to block users when the platform is anonymous. The paper on 4chan’s popular board /b/, which was also assigned as a reading, focused on the anonymity of users posting on threads and on how much of the site attracts anti-social behavior. Is it possible to segregate such users and ultimately block them from posting profanity on anonymous platforms?

One platform where I have witnessed such unwanted comments is YouTube. Google’s famous platform has a comment section where anyone with a Google account can post their views. I recently read the article “Text Analysis of YouTube Comments” [1]. The article focused on videos from a few categories such as comedy, science, TV, and news & politics. It was observed that news- and politics-related channels attracted the majority of the negative comments, whereas the TV category was mostly positive. This leads me to think that the subject of discussion matters as well. What kinds of topics generate the most antisocial behavior in discussion communities?
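
As a rough sketch of the kind of per-topic analysis the article describes, assuming comments have already been scored for sentiment by some off-the-shelf model (the data and the negativity threshold below are made up):

```python
import pandas as pd

# Placeholder data: (video_category, sentiment_score) pairs, where the score
# would come from a sentiment model applied to each comment.
comments = pd.DataFrame({
    "category": ["news", "news", "comedy", "science", "tv", "news", "comedy"],
    "sentiment": [-0.8, -0.4, 0.6, 0.1, 0.7, -0.9, 0.3],
})

# Share of clearly negative comments per category: a crude proxy for which
# topics attract the most antisocial discussion.
negative_rate = (comments.assign(negative=comments["sentiment"] < -0.3)
                         .groupby("category")["negative"].mean()
                         .sort_values(ascending=False))
print(negative_rate)
```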

Social media in general has now become a platform for cyberbullying and unwanted comments. If these users and their patterns were detected, and such comments automatically filtered out as “anti-social”, it would be a huge step in the right direction.

[1] https://www.curiousgnu.com/youtube-comments-text-analysis

 


Reflection #2 – [8/30] – [Deepika Rama Subramanian]

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.”

Summary
In this paper, Cheng et al. exhaustively study antisocial behaviour in online communities. They classify their dataset into Future Banned Users (FBUs) and Never Banned Users (NBUs) in order to compare differences in their activity along the following factors: post content, user activity, community response, and actions of the community moderators. The paper suggests that posts by FBUs tend to be difficult to understand and full of profanity, and that FBUs tend to attract attention to themselves and engage in or instigate pointless arguments. With such users, even tolerant communities over time begin to penalise FBUs more harshly than they did in the beginning; this may be because the quality of the FBUs’ posts has degraded, or simply because the community no longer wants to put up with the user. The paper points out, after extensive quantitative analysis, that it is possible for FBUs to be identified as early as 10 posts into their contribution to a discussion forum.

Reflection
As I read this paper, there are a few questions that I wondered about:
1. What was the basis for the selection of their dataset? While trolling is prevalent in many communities, I wonder whether Facebook or Instagram might have been a better choice, because trolling is at its most vitriolic when the perpetrator has some access to your personal data.
2. One of the bases for the classification was the quality of the text. There are several groups of people who have reasons other than trolling for poor text quality, viz. non-native speakers of English and teens who have taken to unsavoury variations of words like lyk (like), wid (with), etc.
3. Another characteristic of anti-social users online is leading other users of the community into pointless and meaningless discussions. I have been part of a group that was frequently led into pointless discussions by legitimate, well-meaning members of the community. In this community, ‘Adopt a Pet’, users are frequently outraged by the enthusiasm people show for adopting pedigrees versus local mutts. Every time there is a post about pedigree adoptions, a number of users are invariably outraged. Are these users considered anti-social?
4. The paper mentions that some NBUs started out being deviant but improved over time. If, as this paper proposes, platforms begin banning members based on a couple of posts soon after they join, wouldn’t we be losing these users? And, as suggested by the paper, users who believe they have been wrongly singled out (their posts deleted while other posts with similar content were not) tend to become more deviant. When people feel they have been wrongly characterised based on a few posts, wouldn’t they come back with a vengeance to create more trouble on the site?
5. Looking back at the discussion in our previous class, how would this anti-social behaviour be managed on largely anonymous websites like 4chan? It isn’t really possible to ‘ban’ any member of that community. However, because of the ephemerality of the website, if the community ignores a troll, the post may disappear on its own.
6. What if we look at communities where deviant behaviour is welcome? If a visitor to, say, r/watchpeopledie reports a post to the moderators, would the moderator have to delete the post, given that such content is the norm on that discussion board?


Reflection #2 – [8/30] – [Parth Vora]

[1] Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM, 2015.

Summary

In this paper, Cheng et al. study three discussion-driven online communities (CNN, IGN, and Breitbart) to understand the dynamics behind antisocial behavior in online communities. Their results are largely derived from a quantitative study, in which they inspect the data for trends in antisocial behavior and use those trends to support their conclusions. The study also tracks how users’ activities transform over time. The main deciding factor for an individual to be classified as antisocial is that they have been banned by moderators. The paper then goes on to describe a model for predicting antisocial behavior using the various features discussed earlier.

 

Reflection

The paper answers many questions while leaving many others unanswered. Although many conclusions seem intuitive, it is remarkable how simply going through the data answers those same questions, like the one where the authors ask “if excessive censorship causes users to write worse”. Intuitively, if one is punished for doing the right thing, the chances of repeating that behavior drop considerably.

What exactly is antisocial behavior? The definition changes from one online community to another; there cannot be a single defining line. For instance, 4chan users will tolerate more antisocial elements than users on Quora. Also, as we move from one geographical area to another, speaking habits change. What is offensive and inappropriate in one culture might not be inappropriate in another. So, what content is acceptable, and to what extent?

Antisocial posts in this paper are labeled by moderators. These are human moderators, and their views are subjective. How can we validate that these posts are actually antisocial and not constructive criticism or some form of sarcasm? Secondly, on huge social networking websites that produce millions of posts every day, how can moderation be carried out at such a large scale? The paper provides four feature groups, and among them the “moderator” features carry more weight in the classifier than the others. But with such large-scale networks, how can one rely on community and moderator features? The model also has decent accuracy, but when extrapolated to a large user base it could result in the banning of millions of innocent user accounts.

Coming to the technical side, the model shows relatively high accuracy in cross-platform testing using simple random forest classifiers and basic NLP techniques. While a bag-of-words model with a random forest classifier is a strong combination, it is insufficient for building the “post features” in this case. Users have many different writing styles, and much depends on the context in which words appear, so something more advanced than bag of words is needed. Word vectors would be a very good choice, as they help capture context through the relative distance between words, and they can easily be tailored to the common writing style of the platform.
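
To illustrate the contrast, here is a minimal sketch of bag-of-words features versus averaged word vectors. The embedding table is a tiny made-up dictionary standing in for pre-trained vectors such as word2vec or GloVe.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

posts = ["you people are clueless", "great point, thanks for sharing"]

# Bag of words: each post becomes a sparse count vector; order and context are lost.
bow = CountVectorizer().fit_transform(posts)
print(bow.toarray())

# Word vectors: each word maps to a dense vector; a post is represented by the
# average of its word vectors, so similar wording lands near similar wording.
toy_embeddings = {w: np.random.default_rng(hash(w) % 2**32).normal(size=50)
                  for post in posts for w in post.split()}

def post_vector(text):
    vecs = [toy_embeddings[w] for w in text.split() if w in toy_embeddings]
    return np.mean(vecs, axis=0)

vectors = np.vstack([post_vector(p) for p in posts])
print(vectors.shape)  # (2, 50): one dense 50-dimensional vector per post
```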

By taking posts from the same user, we can build a sentiment index for each user. A sentiment index would help predict what the user generally feels about a particular topic and prevent incorrect banning; it is comparable to a browser keeping your search history to understand your usage patterns. One can also look at all posts together and create an “antisocial index” for each post, and only if the index is above a certain threshold should the user be banned or penalized. Penalties could include disabling a user’s posting privileges for a certain number of hours, so that even if there is a false positive, an NBU is not banned outright.
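
A minimal sketch of this thresholding idea, with a placeholder scoring function standing in for whatever per-post classifier is actually used; the thresholds and penalty names are made up:

```python
def antisocial_score(post_text: str) -> float:
    """Placeholder: in practice this would be a trained per-post classifier
    returning a probability that the post is antisocial."""
    profane = {"idiot", "idiots", "moron", "trash"}
    words = post_text.lower().split()
    return sum(w in profane for w in words) / max(len(words), 1)

def user_index(posts):
    """Antisocial index for a user: mean score over all of their posts."""
    return sum(antisocial_score(p) for p in posts) / max(len(posts), 1)

BAN_THRESHOLD = 0.5       # permanently ban only well above this line
TIMEOUT_THRESHOLD = 0.2   # below the ban line, apply a temporary posting timeout

def moderate(user_posts):
    idx = user_index(user_posts)
    if idx >= BAN_THRESHOLD:
        return "ban"
    if idx >= TIMEOUT_THRESHOLD:
        return "suspend posting for 24h"  # soft penalty limits false-positive damage
    return "no action"

print(moderate(["you are all idiots", "trash take, moron"]))
print(moderate(["interesting article", "thanks for the link"]))
```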

In conclusion, the paper provides an informative and intriguing baseline for tracking antisocial behavior. Many techniques can be used to enhance the proposed model and create an autonomous content-filtering mechanism.


Reflection #2 – [08/30] – [Prerna Juneja]

Antisocial Behavior in Online Discussion Communities

Summary:

In this paper the authors perform a quantitative, large-scale longitudinal study of antisocial behavior in three online discussion communities, namely CNN, IGN, and Breitbart, by analyzing users who were banned from these platforms. They find that such users use obscene language that contains fewer positive words and is harder to understand. Their posts are concentrated in a few threads and are likely to amass more responses from other users. The longitudinal analysis reveals that the quality of their posts degrades with the passage of time, and the community becomes less and less tolerant of their posts. The authors also discover that excessive censorship in the initial stages might aggravate antisocial behavior later on. They identify features that can be used to predict whether a user is likely to be banned, namely the content of the posts, the community’s response and reaction to the posts, user activity ranging from posts per day to votes given to other users, and the actions of moderators. Finally, they build a classifier that can make this prediction after observing just 5-10 posts, with an AUC of around 0.80.

Reflections:

Antisocial behavior can manifest in several forms, such as spamming, bullying, trolling, flaming, and harassment. Leslie Jones, the actress starring in the movie “Ghostbusters”, became a target of online abuse. She started receiving misogynistic and racist comments in her Twitter feed from several people, including the polemicist Milo Yiannopoulos and his supporters; he was later permanently banned from Twitter. Sarahah was removed from app stores after Katrina Collins started an online petition accusing the app of breeding haters when her 13-year-old daughter received hateful messages, one even saying “I hope your daughter kills herself”. According to one article, 1.4 million people interacted with Russian spam accounts on Twitter during the 2016 US elections. Detecting such content has become increasingly important.

The authors note that several techniques exist in online communities to discourage antisocial behavior, ranging from down-voting, reporting posts, muting, and blocking a user to manual human moderation. It would be interesting to examine how these features fit into the design of a community. How are they being used? Are all of these “signals” true indicators of anti-social behavior? For example, the authors suggest that downvoting is sometimes used to express disagreement rather than to flag antisocial behavior, which is quite true of Quora and YouTube; both websites have an option to downvote as well as to report a post. Will undesirable content always receive a larger number of downvotes? Do the posts of users exhibiting antisocial behavior receive more downvotes, and do their posts get muted by most of their online friends?

All of the author’s inferences make sense. The FBUs use more profanity and less positive words and get more replies which is expected since they use provocative arguments and attempt to bait users. We saw examples of the similar behavior in the last paper we read ”Identity and deception in the virtual community”. I also decided to visit the 4chan website to see if concept of moderators exist there. Surprisingly it does. But as answered in one of the FAQs one hardly gets to see moderated posts since there are no public records of deletion and since the content is deleted it gets removed from the page. I wonder if it’s possible to study the moderated content using the archives and if the archives keep temporal snapshots of the website’s content. Secondly, the website is famous for it’s hateful and pornographic content. How do you pick less hateful stuff from the hateful. I wondered if hate and sexual content are even considered criteria there. On checking their wiki I found the answer to “how to get banned in 4chan” {https://encyclopediadramatica.rs/4chan_bans=>quite an interesting read}. This makes one thing clear, criteria to moderate content is not universal. It depends a lot on the values and culture of the online community.

Having an automated approach to detecting such users will definitely lessen the burden on human moderators, but I wonder about the false positives. How will it affect the community if a series of harmless posts gets some users banned? Also, some users might redeem themselves later. In Fig. 6(c) and the corresponding explanation in “Characterizing users who were not banned”, we find that even though some FBUs were improving, they still got banned. Punishing someone who is improving all but guarantees they will stop improving, and the community might lose faith in the moderators. Considering these factors, is it wise to ban a user after observing their first few posts? Even exposing such users to moderators will bias the latter against the former. How long should one wait before forming a judgment?

Overall I think it was a good paper, thorough in every aspect, from data collection, annotation, and analysis to specifying future work. I’ll end by mentioning a special app, “ReThink” [1], that I saw in an episode of Shark Tank (a show where millionaires invest their money in unique ideas). The app detects when a user writes an offensive message and gives them a chance to reconsider sending it by showing an alert. Aimed at adolescents, the app’s page mentions that 93% of people change their mind when alerted. Use of such apps by young people might make them more responsible adults and might help reduce the antisocial behavior we see online.

[1] http://www.rethinkwords.com/whatisrethink


Reflection #2 – [08/30] – [Neelma Bhatti]

Assigned reading: Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM, 2015.

This paper talks about trolling and undesirable behavior in online discussion communities. The authors studied data from three online communities, namely CNN, Breitbart, and IGN, to see how people behave in these large discussion-based communities, with the aim of creating a typology of antisocial behavior and using it for early detection of trolls. The authors used complete time-stamped data of user activity and the list of banned users on these websites over the course of about a year, obtained from Disqus.

Since services like Disqus help fetch data about user profiles, it would be interesting to have users’ demographics in order to observe whether age or geographical location has anything to do with their anti-social behavior. These behaviors differ greatly from one type of community to another, e.g. gaming, personal, religion, education, politics, and news communities. The patterns also vary on gender-specific platforms.

Community bias is very real. In online discussion boards or groups on Facebook, people whose opinions differ from the admin or moderator are promptly banned. A person who is fairly harmless, in fact likeable, in one group can be banned in another. It has nothing to do with their content being profane; it is more about a difference of opinion. There are also instances where people gang up against an individual and report or harass them in the name of friendship or to gain approval from the group’s moderator or admin. Analyzing or using such data to categorize users as trolls produces inequitable results.

Some other questions which need consideration are:

 

  • There is a fairly large number of people (without quoting exact statistics) who have more than one account on social media websites. What if a banned user rejoins the community under a different identity? Can the suggested model detect such users early based on historical data?
  • Does the title correctly portray the subject matter of the study? The word “anti-social” refers to someone who is uncommunicative and avoids company, but based on the results, FBUs not only tend to post more, they are also capable of attracting a larger audience and steering the topic to their liking.
  • Having read previous articles revolving around identity and anonymity, would the results be the same for communities with anonymous users?
  • How can we control trolling and profanity in communities such as 4chan, where the main point of unison among seasoned (but anonymous) users is trolling new users and posting explicit content?
  • The authors also try to assess whether excessive censorship makes users hostile or antisocial. Considering real-life instances such as a teacher calling out and scolding a student for someone else’s mistake, would some users be likely to refrain from posting altogether once treated unfairly?


Reflection #2 – [08/30] – [Shruti Phadke]

Paper 1: Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM, 2015.

Cheng et al., in this impressive and thorough work, demonstrate a way of observing and predicting antisocial behavior on online forums. They study banned users from three communities retrospectively to identify antisocial features of their online activity. In keeping with Donath’s paper, we find a similar definition of antisocial behavior here, where users act provocatively by posting aberrant content. One salient strength of this paper is the well-defined problem setup and the normalizing measures taken before the analysis to compensate for the peculiarities of social media. For example, the authors use strong signals such as banning and post deletion to pre-filter the bad users, and they also perform qualitative analysis (human evaluation) to assess the quality of posts. Their observation of a bimodal post-deletion distribution illustrates the complexity of online interactions overall; it also suggests that antisocial behavior can be innate as well as adaptive. Based on the paper, there are several interesting directions that can be explored.

What are the assessment signals within a group of trolls? The paper mentions several signals of antisocial behavior, such as down-voting, reporting, blocking, banning, etc. These signals are from the point of view of normal users and community standards. However, it would be interesting to observe whether such users identify each other and show group trolling behavior in communities. The paper mentions that FBUs are able to garner increased responses by posting provocative content. Does this contention increase if multiple trolls are involved in the same thread? If so, how do they identify and support each other?

What about users who were not banned but are extremely antisocial on occasion? Consider a scenario with two users, U1 and U2. U2 frequently trolls people in different communities by posting harmless but annoying, argument-fuelling content. Based on the observations made in the paper, U2 will most likely get banned because of community bias and post deletion. Now consider U1, who posts rarely but with extremely offensive content. U1 will most likely have their posts deleted without attracting much attention. Comparing by the number of post deletions (the driving factor of this paper’s predictive model), U2 is more likely to get banned. But which of the two is actually more dangerous for the community? The answer is: both! To observe both types of behavior, antisocial behavior should be analyzed at multiple levels of granularity: per post, per thread, and per user. Analyzing hateful or flaming language in individual posts can carry some weight in the predictive model for deciding whether a user is antisocial or not.
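
A sketch of that multi-granularity idea, assuming a per-post offensiveness score is already available (the users, threads, and scores below are placeholders mirroring the U1/U2 scenario):

```python
import pandas as pd

# Placeholder: one row per post with a pre-computed offensiveness score in [0, 1].
posts = pd.DataFrame({
    "user":   ["U1", "U1", "U2", "U2", "U2", "U2"],
    "thread": ["t1", "t2", "t3", "t3", "t4", "t5"],
    "score":  [0.95, 0.90, 0.30, 0.35, 0.40, 0.30],
})

# Per-user view: by sheer volume of flaggable posts U2 dominates, which is what a
# deletion-count style model keys on; the mean/max columns already hint at U1.
per_user = posts.groupby("user")["score"].agg(["count", "mean", "max"])

# Per-post view: U1's rare but extremely offensive posts stand out on their own.
extreme_posts = posts[posts["score"] > 0.8]

# Per-thread view: how many flagged posts each user contributes to each thread.
per_thread = (posts.assign(flagged=posts["score"] > 0.25)
                   .groupby(["thread", "user"])["flagged"].sum())

print(per_user, extreme_posts, per_thread, sep="\n\n")
```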

Finally, will moderators be biased if they are exposed to prediction results based on a user’s first five posts? In a scenario like this, where banning and blocking infringe on the user’s freedom of speech, knowing the likelihood of a particular user being “good” or “bad” might increase community bias and the extent to which these users get censored. So features based on deeper linguistic analysis are definitely relevant in deciding which users moderators should be warned about.

Lastly, there are a few more specific questions that I want to mention:

  1. Were the Lo-FBUs slower to get banned than the Hi-FBUs? And did they have a longer or shorter lifespan compared to NBUs and Hi-FBUs? This analysis might give clues about how to build more accurate predictive models for Lo-FBUs.
  2. Why is Krippendorff’s alpha so low? (And how was it accepted? See the sketch after this list.)
  3. Can we also quantify the sensitivity of the community and use it as a feature for prediction? (Sort of customizing the banning algorithm to the individual community’s taste.)
  4. How do we define antisocial behavior in anonymous communities and in inherently offensive communities like 4chan?
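
On question 2, here is a small sketch of how Krippendorff's alpha behaves, assuming NLTK's AnnotationTask and toy labels of my own invention: heavy label skew plus a single disputed item pulls alpha well below the raw agreement rate, which may partly explain a low reported value even when annotators mostly agree.

```python
from nltk.metrics.agreement import AnnotationTask

# Toy annotation data: (annotator, post_id, label) triples. Three annotators
# agree on three posts and split on the fourth; raw agreement looks high, but
# the rarity of the "antisocial" label drives expected agreement up and alpha down.
data = [
    ("a1", "p1", "ok"), ("a2", "p1", "ok"), ("a3", "p1", "ok"),
    ("a1", "p2", "ok"), ("a2", "p2", "ok"), ("a3", "p2", "ok"),
    ("a1", "p3", "ok"), ("a2", "p3", "ok"), ("a3", "p3", "ok"),
    ("a1", "p4", "antisocial"), ("a2", "p4", "ok"), ("a3", "p4", "antisocial"),
]

task = AnnotationTask(data=data)
print(f"Krippendorff's alpha: {task.alpha():.2f}")
```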


Reflection #2 – [08/30] – [Subil Abraham]

Summary:

This paper examines the behavior of antisocial users – trolls – in the comment sections of three different websites. The goal of the authors was to identify and characterize these users and separate them from the normal users, i.e. the ones who are not intentionally creating problems. The paper analyzed more than a year’s worth of comment-section activity on these sites and identified the users with a long history of post deletions who were eventually banned as the trolls (referred to as “Future Banned Users” (FBUs) in the paper). The authors analyzed the posting history and activity of the trolls, looking at post content, frequency of posting, and the distribution of their posts across different articles, and compared them with non-problematic users (“Never Banned Users” (NBUs)) who have similar posting activity. The trolls were found to post more on average than regular users, tended to have more posts under an article, wrote comment replies less similar in text to earlier posts than NBUs did, and engaged more users. Trolls also aren’t a homogeneous bunch: one section of trolls has a higher proportion of deleted posts than the rest, is banned faster, and consequently spends less time on the site. The results of this analysis were used to create a model that identifies trolls with reasonable accuracy by examining their first 10 posts and determining whether they will be banned in the future.

 

Reflection:

This paper seems to me like an interesting follow-up to the section of Donath’s “Identity and Deception” paper on trolls. Where Donath studied and documented troll behavior, Cheng et al. seem to have gone further, performing quantitative analysis of trolls and their life in the online community. Their observation that a troll’s behavior gets worse over time because the rest of the community actively stands against them seems to parallel human behavior in the physical world. Children who grow up abused tend not to be the most well-adjusted adults, with studies showing higher rates of crime among adults who were abused or shunned by family and/or community as children compared to those who were treated well. Of course, the difference here is that trolls start off with the intention of making trouble, whereas children do not. So an interesting question we could look at is: if an NBU is treated like an FBU in an online community without a chance for reconciliation, will they take on the characteristics of an FBU over time?

It is interesting that the authors were able to get an AUC of 0.80 for their model, but I feel that is hardly sufficient for deployment (my machine learning background is minimal, so I cannot comment on whether 0.80 is a relatively good result from an ML perspective). This is also something the authors touched upon, recommending a human moderator on standby to verify the algorithm’s claims. Considering that a non-trivial fraction of flagged users would still be false positives at any practical decision threshold, what other factors could we add to increase the accuracy? Given that memes these days play a big role in the activities of trolls, could they also be factored into the analysis, or is meme usage still too small compared to plain text to be a useful consideration?
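
To make the point about false positives concrete, here is a hedged sketch on purely synthetic data: AUC summarizes ranking quality, while the false-positive rate depends on the operating threshold chosen afterwards. None of this uses the paper's actual features; the feature matrix and labels are stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for "features from a user's first 10 posts"; real features
# would be the post/activity/community/moderator measurements from the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=2.0, size=5000) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
probs = clf.predict_proba(X_te)[:, 1]

print("AUC:", round(roc_auc_score(y_te, probs), 2))

# The same model yields very different false-positive / miss trade-offs
# depending on where the banning threshold is set.
for threshold in (0.5, 0.7, 0.9):
    tn, fp, fn, tp = confusion_matrix(y_te, probs >= threshold).ravel()
    print(f"threshold={threshold}: false positive rate={fp / (fp + tn):.2f}, "
          f"missed FBUs={fn / (fn + tp):.2f}")
```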

 

Other Questions for consideration:

  1. How will the statistics change if we analyze cases where multiple trolls work together? Are they banned faster? Or can they cover for and support each other in the community, leading to their being banned more slowly?
  2. What happens when you stick a non-antisocial user into a community of antisocial users? Will they adopt the community’s mindset or will they leave? How many will stay, how many will leave, and what factors determine whether they stay or leave?

 


Reflection #1 – [8/28] – [Timothy Stelter]

Reading:

[1] Donath, Judith S. “Identity and deception in the virtual community.” In Communities in cyberspace, pp. 37-68. Routledge, 2002.

Summary 

This paper focused on how identity plays a key role in virtual communities. Specifically, Donath used Usenet, a structured bulletin-board system hosting many different newsgroups, each with communal standards for how its community is run, to extrapolate how identity is established when social cues and reputations are non-existent. To capture how identity is established, Donath provides models of honesty and deception in the form of signalling theory taken from biology. Two kinds of signals are mentioned: assessment signals, which are “directly related to the trait being advertised” and therefore reliable, and conventional signals, which are “correlated with a trait by custom or convention” and therefore unreliable. These signals help provide context for the detailed breakdown of a Usenet letter, where writing style, personal interactions, signatures, domain address, and account name/ID act as clues, or signals, to one’s identity. This established a baseline for how people could point out trolls and even deceptive accounts trying to impersonate a respected member of a Usenet sub-community. Donath concludes her paper by asking how online communities can be built with improved communication of implicit social cues, but cautions about possible unexpected social ramifications.

Reflections

Virtual environments have changed extensively since 1996 (the last update to the paper). Having grown up during the early stages of the internet, I find that establishing an identity, whether your current physical one or a new virtual identity (or even multiple virtual identities), is an intricate problem. While Usenet provided a great example of extrapolating key features to establish one’s identity, it is now a very outdated system. New social platforms like Facebook, Twitter, Reddit, and Instagram have come into the light. What are some new cues for trusting one’s identity on the internet? For example, with data breaches being the norm in today’s society, where does identity theft come into play? Can establishing an identity virtually help protect your physical one? Additionally, with so many social media platforms, keeping one identity across all of them would be a hard problem. I would even argue that, unlike on Usenet, we can now see social cues on a given social media platform.

Questions

  • What are new ways of establishing identity and noting deception? Would it still be through assessment and conventional signaling?
  • What new design methodologies could be utilized to strengthen one’s identity online and in the physical world?
  • Does identity theft encourage establishing yourself online? Could this be used to help retain your identity?

——————-

Reading:

[2] Bernstein, Michael S., Andrés Monroy-Hernández, Drew Harry, Paul André, Katrina Panovich, and Gregory G. Vargas. “4chan and /b/: An Analysis of Anonymity and Ephemerality in a Large Online Community.” In ICWSM, pp. 50-57. 2011.

Summary 

This paper explored 4chan’s /b/ forum community with goals to quantify ephemerality of /b/ and an analysis of identity and anonymity. 4chan is a forum board known for generating (among other things) memes and being the backbone to some of the internet’s culture. The author’s are pursuing  an anomaly where 4chan’s /b/ forum functions entirely of anonymous users and is extremely ephemeral which is not entirely supported through related works. This presented an opportunity to explore a large online community in the wild where design implications can be extrapolated for online social environments. The first study engaged in data collection over a two week time period where the lifetime of all threads was observed. With the forum moving at such a fast pace it correlated with high ephemerality where actions like bumping and sage, forms of extending a thread’s lifetime, were also observed. The second study looked at the impact of anonymity which left a user’s identity hidden. The same two week sample was used where a 90% of all user’s remain anonymous while posting. But the interesting outcome was anonymity is viewed as a positive feature where open conversation can thrive which supports experimentation with new ideas (such as memes). Interestingly enough, one’s identity is never known but how a user interacts in the community gives an air of seniority to the community (not including moderators).  The author’s conclude the paper seeking to conduct a closer study on the content of 4chan and its users.

Reflections

For the first study, I’v never really considered the pace of the threads before. In fact, how the threads work seems unique in that what is interesting to the current crowd at that moment. Of course, with Figure 3 from the paper does a great job highlighting this.

The authors show that the culture of /b/ is highly influenced by this fast-paced environment (enough that users of /b/ cope by capturing some threads, downloading images as they come in). It raises the question of what design decisions we could take from such a fast-paced, highly influential community and culture into another social platform.

As someone who is quite familiar with 4chan, not as a regular user but as an occasional lurker over the years of its existence, I found study 2’s results unsurprising. From observation over the years, the heavy use of the anonymous (anon) option for the majority of posts was expected. Still, how can we use 4chan’s successful anonymous structure to help support today’s online identity? Donath mentioned looking at key elements of a Usenet message, but here we lose the account name/ID and email address information. Would identifying writing style be enough? Would this open up identity deception? The authors do offer some comments here: “Instead, the /b/ community uses non-technical mechanisms like slang and timestamping to signal status and identity.” This aligns with Donath’s comments to some degree, but it would be interesting to see whether it is enough for identity if a platform is to support masses of anonymous users.

Questions

  • What other platforms can take advantage of anonymous interaction between users? Can this feature be extrapolated to a specific problem domain?
  • Does content (memes) change anonymous interaction in the long term, such that it becomes hard to follow those who are anonymous but have been around for a long period of time?
