Reflection #4 – [09/06] – [Parth Vora]

[1] Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.

Summary
This paper analyses various aspects of sockpuppets and their behavior in online discussion communities through a dataset of 62,744,175 posts, studying both the users and the discussions they participate in. The authors identify and highlight various traits of sockpuppets and sockpuppet pairs, and then compare them with ordinary users to gain insight into how sockpuppets operate. They also propose two models: one to distinguish sockpuppets from regular users, and another to identify pairs of sockpuppets in an online community.

Reflection
Widespread access to the internet and the ease with which one can create an account online have encouraged many individuals to create sockpuppets and use them to deceive others and drive online consensus. The scale of online communities, which results in delayed repercussions for deceptive behavior, has only emboldened such individuals. If one has ever used social media, the chances are that one has been followed or mentioned by a suspicious-looking account. Facebook and Twitter have millions of such fake sockpuppet accounts, and entire industries have been built around them. The paper brings to light some fascinating facts about the dynamics of sockpuppets in online communities and how we can identify and handle them. However, it leaves some questions unanswered.

Sockpuppeting, coupled with the power of spambots and astroturfing, has become a powerful tool for organizations, and in some cases states, to manipulate public opinion and spread misinformation. One can build a system to flag such users and fake posts, but when people are operating at such a high level of expertise, can such a system actually work? Even if we ban such accounts, it takes barely a few minutes to create a new account and come back online. How do we deal with this?

Twitter has an anti-spam measure that removes hashtags from the trending section if the content of the tweets is irrelevant. While it sounds like a good measure, consider a scenario where an actual topic of concern is buried because sockpuppets flood Twitter with spam content over critical trending hashtags. The very mechanism designed to defeat spam then ends up burying essential topics. How can we guarantee that such systems will adequately serve the purpose they are designed for? Also, in large-scale social media settings, do sockpuppets actually exist in pairs?

Not only do sockpuppets create a disturbance, but they also sow doubt among ordinary users. Although people have grown accustomed to nonsense-spouting accounts on social media, there has been a significant shift in trust in content published online. Sockpuppets have increased the credibility of fake news while reducing the credibility of genuine news. This is very prevalent in the Indian political sphere. Operating under the guise of IT-Cells (party-sponsored organizations responsible for online campaigns), these groups use sockpuppets masquerading as influential people to draw attention away from essential topics. Follow comment threads [Example 1][Example 2].

From a technical point of view, the models could be improved by using Empath [1] instead of LIWC. Empath is built using word embedding models such as word2vec and GloVe and has a larger lexicon than LIWC. One problem with unigram-based features is that the model fails to capture the underlying meaning of a sentence. For example, to the model there is no difference between the two sentences "the suspect was killed by the woman" and "the woman was killed by the suspect." Studies have also shown that deep learning based models perform significantly better than standard machine learning models, especially in text/image classification [2]. Such complex models with advanced feature sets could be considered for more effective labeling of posts.
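The unigram limitation above is easy to demonstrate: a bag-of-words representation keeps only word counts, so the two example sentences become indistinguishable. A minimal sketch (the `bag_of_words` helper is illustrative, not from the paper):

```python
from collections import Counter

def bag_of_words(sentence):
    # Unigram (bag-of-words) representation: count each word,
    # discarding all word-order information.
    return Counter(sentence.lower().split())

a = bag_of_words("the suspect was killed by the woman")
b = bag_of_words("the woman was killed by the suspect")

# Both sentences map to the identical feature vector,
# even though who killed whom is reversed.
print(a == b)  # True
```

Any classifier trained on such features sees these opposite claims as the same input, which is why sequence-aware models can help.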

In conclusion, although the paper highlights essential features for detecting sockpuppets and proposes a model to identify them, sockpuppets have evolved to become more sophisticated and technologically backed. One must think of an efficient way to stop them at the source rather than to filter them after the damage is done.

References
[1] Fast, Ethan, Binbin Chen, and Michael S. Bernstein. “Empath: Understanding topic signals in large-scale text.” Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 2016.
[2] Bengio, Yoshua, and Yann LeCun. “Scaling learning algorithms towards AI.” Large-scale kernel machines 34.5 (2007): 1-41.
