Paper #1: Uncovering Social Spammers: Social Honeypots + Machine Learning
Paper #2: An Army of Me: Sockpuppets in Online Discussion Communities
Reflection #1:
In this paper, the authors used social honeypots, a type of harmless bot, to collect information about social spammers from two early social networking sites, Twitter and MySpace. The authors provided a clear motivation for the paper. According to the authors, email spammers and social spammers are very different. The framework introduced in this paper keeps precision high when detecting spam. It was deployed in the early days of social networking sites, which probably made it easier to use specific attributes from user profile information to detect social spammers automatically. More recently, social spammers seem to have been repurposed for new tasks, such as introducing misinformation into the network. One of the interesting things in the paper was the use of different ensemble-based classifiers, which I didn’t fully understand. The authors introduced a spam precision metric to evaluate their classifier in the wild but didn’t say how it differs from the standard precision metric.
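To make the ensemble idea and the precision metric more concrete for myself, here is a minimal sketch. It is not the authors' method: the paper's ensembles are learned, whereas this sketch swaps in a plain majority vote over three hand-written rules. The profile features, thresholds, and labels are all invented for illustration.

```python
from collections import Counter

# Hypothetical profile records: feature names, values, and labels are made up.
profiles = [
    {"following": 2000, "followers": 10, "urls_in_bio": 2, "is_spammer": True},
    {"following": 150, "followers": 180, "urls_in_bio": 0, "is_spammer": False},
    {"following": 900, "followers": 30, "urls_in_bio": 1, "is_spammer": True},
    {"following": 1200, "followers": 1100, "urls_in_bio": 1, "is_spammer": False},
    {"following": 50, "followers": 40, "urls_in_bio": 0, "is_spammer": False},
]

# Three weak, hand-written rules (assumed thresholds, not taken from the paper).
def rule_ratio(p):
    return p["following"] > 10 * max(p["followers"], 1)

def rule_urls(p):
    return p["urls_in_bio"] >= 1

def rule_volume(p):
    return p["following"] > 800

WEAK_CLASSIFIERS = [rule_ratio, rule_urls, rule_volume]

def ensemble_predict(profile):
    """Flag a profile as spam when a majority of the weak rules vote spam."""
    votes = Counter(clf(profile) for clf in WEAK_CLASSIFIERS)
    return votes[True] > votes[False]

def precision(predictions, labels):
    """Standard precision: true positives / (true positives + false positives)."""
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    return tp / (tp + fp) if (tp + fp) else 0.0

predictions = [ensemble_predict(p) for p in profiles]
labels = [p["is_spammer"] for p in profiles]
spam_precision = precision(predictions, labels)
print(predictions, round(spam_precision, 3))
```

On this toy data the ensemble flags three profiles, one of which is legitimate, so precision is 2/3. Whether the paper's spam precision is computed any differently from this standard definition is exactly what I could not tell from the text.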
Reflection #2:
Sockpuppets are duplicate accounts used to deceive other users in an online discussion forum. The authors of this paper looked at nine online forums to analyze the attributes of sockpuppets. The first problem I see with the data collection is the use of IP addresses to detect sockpuppets. Although the authors tried to filter out IP addresses that use NAT, it is not clear whether that was effective, assuming very few people join discussion forums. The authors did use prior work to characterize the other parameters of their framework. They found that several attributes of sockpuppets differ from those of other spammers such as bots and trolls. Sockpuppets tend to participate in discussions about controversial topics. They are also more likely to be downvoted. The authors used entropy to characterize the usage patterns of different sockpuppets, but it is not clear whether that measurement works. The authors used ego networks to analyze user-user interaction and found that sockpuppets have higher PageRank in the network. The main success of the paper seems to be finding sockpuppets once one pair has been detected.
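As a rough illustration of the two measurements mentioned above, the sketch below computes Shannon entropy over a user's posting distribution and PageRank over a small reply graph. It rests on assumptions of mine, not on the paper: that entropy is taken over which threads a user posts in, and that an edge points from the replier to the user being replied to. The user names, threads, and edges are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed graph of (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no outgoing edges spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical posting histories: one user stays in a single thread,
# the other spreads posts evenly over four threads.
focused_user = ["thread_a"] * 4
spread_user = ["thread_a", "thread_b", "thread_c", "thread_d"]
print(shannon_entropy(focused_user))  # 0.0 bits: all posts in one thread
print(shannon_entropy(spread_user))   # 2.0 bits: uniform over four threads

# Hypothetical reply graph: an edge (a, b) means user a replied to user b.
reply_edges = [("u1", "sock"), ("u2", "sock"), ("u3", "sock"), ("sock", "u1")]
ranks = pagerank(reply_edges)
print(max(ranks, key=ranks.get))
```

In this toy graph the account that attracts the most replies ends up with the highest PageRank, which is one way an account that draws a lot of interaction could score highly. It does not settle whether entropy is a reliable signal for sockpuppets.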