Reflection #10 – [03/22] – [Nuo Ma]

  1. Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.

Summary:

Sockpuppets, extra accounts controlled by deceptive users, may mislead or otherwise negatively impact online discussion communities. In this paper, the authors present a study of sockpuppets in online discussion communities. Their data comes from nine online discussion communities and consists of 2.9 million users. The authors first identify sockpuppets using features like similar usernames and posts made from the same IP address in close time proximity. They then analyze the posting behavior and linguistic features of these sockpuppets. As a result, the authors find that the behavior of sockpuppets differs from that of ordinary users: sockpuppets tend to write more posts than ordinary users, and their posts are shorter and contain many first-person pronouns.

 

Reflection:
I like this paper, and there are some points worth discussing. In the data selection, I can see that on average each puppetmaster owns two puppet accounts. This is oddly consistent across all nine communities; might there be a reason why? One level deeper into this question: what is the motivation behind such sockpuppets? As we can see in figure 6, certain topics (usa, world, politics, justice, opinion) have a significantly higher number of sockpuppets than other topics, so I think it's safe to assume that people use sockpuppets mainly for politically oriented discussions. But in the data statistics, we can see that political sites still average the same two accounts per puppetmaster as MLB and allkpop, with not even a significant difference in the ratio of sockpuppets to users. This also raises the question of what a better way to verify the results of such detection methods would be. There can also be multiple purposes for such sockpuppets. They can come from PR companies posting positive comments about a celebrity, a certain event, or a product. In fact, this is a really common practice in some regions, where the image of a celebrity can greatly affect the revenue of related movies and products. But I'd say it's almost impossible to get a dataset from the PR companies that own large numbers of such sockpuppets.


Reflection #10 – [03/22] – [Md Momen Bhuiyan]

Paper #1: Uncovering Social Spammers: Social Honeypots + Machine Learning
Paper #2: An Army of Me: Sockpuppets in Online Discussion Communities

Reflection #1:
In this paper the authors used honeypots, harmless bot-operated decoy profiles, to collect information on social spammers from two early social network sites, Twitter and MySpace. The authors provide a clear motivation for the paper: according to them, email spammers and social spammers are very different. The framework introduced in this paper keeps precision high for spam detection. It was deployed in the early stage of these social network sites, which probably made it easier to use specific attributes from user profile information to automatically detect social spammers. In recent times, it seems social spammers have been repurposed for new tasks like introducing misinformation into the network. One of the interesting things in the paper was the use of different ensemble-based classifiers, which I didn't entirely understand. The authors introduced a spam precision metric to evaluate their classifier in the wild, but didn't say how it differs from the standard precision metric.

Reflection #2:
Sockpuppets are duplicate accounts used to deceive users in a social discussion forum. The authors of this paper look into nine online forums to analyze the attributes of sockpuppets. The first problem I see with the data collection is the use of IP addresses to detect sockpuppets. Although the authors tried to filter out IP addresses that sit behind NAT, it is not clear whether that was effective, since it rests on the assumption that very few people behind a shared IP address join the same discussion forums. The authors did use prior work to characterize the other parameters of their framework. They found that several attributes of sockpuppets differ from other bad actors like bots and trolls: sockpuppets tend to participate in discussions about controversial topics, and they are more likely to be downvoted. The authors used entropy to characterize the usage patterns of different sockpuppets, but it is not clear whether that measurement works. They also used ego networks to analyze user-user interaction and found that sockpuppets have higher PageRank in the network. The main success of the paper seems to be finding additional sockpuppets once one pair has been detected.
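On the entropy point: a minimal sketch of what such a measure typically computes, here the Shannon entropy of how a group's posts are split across its accounts (the exact formulation in the paper may differ, and the distributions below are made up):

```python
import numpy as np

def shannon_entropy(p) -> float:
    # entropy in bits of a discrete distribution; ignores zero entries
    p = np.asarray(p, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log2(p)).sum())

print(shannon_entropy([0.9, 0.1]))  # ~0.47 bits: one account dominates
print(shannon_entropy([0.5, 0.5]))  # 1.0 bit: both accounts used equally
```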


Reflection #10 – [03/22] – [Jamal A. Khan]

Both of the papers revolve around the theme of fake profiles, albeit of different types.

  1. Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
  2. Lee, Kyumin, James Caverlee, and Steve Webb. “Uncovering social spammers: social honeypots+ machine learning.” Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010.

The first paper, about sockpuppets, is well written and pretty well explained throughout. However, the motivation of the paper seems weak! I see that there are ~3,656 sockpuppets out of a total of ~2.9 million users. That brings me to my question: is that even a problem worth tackling? Do we need an automated classification model? Why do sockpuppets need to be studied? Do they have harmful effects that need to be mitigated?
Moving forward, the entirety of the paper builds up to a classifier, and though that is not a bad thing, I get the feeling that the work was conducted top-down (idea of a classifier for sockpuppets -> features needed to build it) while the writing is bottom-up (need to study sockpuppets and then use the generated material to make a classifier). Regardless, the study does raise some follow-up questions, some of which seem pretty interesting to check out:

  • Why do people make sockpuppets? What purpose are the puppets being used for? Are they created for a targeted objective, or are they more troll-like (just for fun, or just because someone can)?
  • How do puppeteers differ from ordinary users?
  • Can a community influence the creation of sockpuppets? I realize the paper already partially answers this question, but I think much closer attention needs to be paid to the temporal effects of the community on a puppeteer's behavior before the puppets are created.

Coming to the classifier, I have a few grievances. Like many other papers we have discussed in class, this one lacks the details of the classifier model used, e.g. the number of trees in the ensemble, the max tree depth, and the voting strategy: do all trees get the same vote count? However, I will give the authors credit, because this is the first paper among the ones we've read that uses a powerful classifier as opposed to simple logistic regression. Still, the model has poor predictive power. Since I'm working on an NLP classification problem, I wonder whether sequential models might work better.
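As a concrete illustration of the missing details (this is not the authors' code; the data, feature count, and hyperparameter values are all hypothetical), here is how those unreported choices look when made explicit in scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))    # stand-in for activity/community/linguistic features
y = rng.integers(0, 2, size=1000)  # stand-in labels: 1 = sockpuppet, 0 = ordinary user

clf = RandomForestClassifier(
    n_estimators=100,  # number of trees: unreported in the paper
    max_depth=None,    # tree depth (unbounded here): also unreported
    random_state=0,
)
# Note: scikit-learn's random forest averages per-tree class probabilities
# rather than giving each tree one hard vote, which is one possible answer
# to the "voting strategy" question for this particular implementation.
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```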

 

Moving onto the second paper: it's a great idea executed poorly, so I apologize for the harsh critique in advance. The idea of honeypot profiles is intriguing, but just as social features can be used to sniff out spammer profiles, they can be used to sniff out honeypots, and hence the trap can be avoided. So I think the paper's approach is naive in the sense that it needed more work on why and how the social honeypots are robust to changes in strategy by the spammers.

Regardless, the good thing about the project is the actual deployment of the honeypots. However, the main promise, being able to classify spammers and non-spammers, has not been delivered. The scale of the training dataset is minuscule and not representative: there are only 627 deceptive and 388 legitimate profiles for the MySpace classification task. Hence, the validity of the reported results table becomes questionable.

With a dataset of the same scale as the one used here, we could have also fitted a multinomial regression and perhaps gotten similar results. The choice of classifiers has not been motivated; why have so many been tested? It seems the paper has fallen victim to the approach of "when you have a hammer, everything looks like a nail." The same story is repeated with the Twitter classification task.

Regardless of my critique, the paper presents the classifier results in more detail than most papers, so that's a plus. It was quite interesting to see that age in figure 2 of the paper had such a large area under the ROC curve. So my question is: are spammer profiles younger than legitimate user profiles?

Another question regards how the study stands the test of time: would a similar classifier perform well on the MySpace of today (i.e., Facebook)? Since the user base is probably much larger and more diverse now, the traits of legitimate users have changed.

Finally, I would like other people's opinion on the last portion of the paper, the "in-the-wild" testing. I think this last section is plain wrong and misleading at best. The authors say that

“… the traditional classification metrics presented in the previous section would be infeasible to apply in this case. Rather than hand label millions of profiles, we adopted the spam precision metric to evaluate the quality of spam predictions. For spam precision, we evaluate only the predicted spammers (i.e., the profiles that the classifier labels as spam). …”

Correct me if I'm wrong, but the proposed spam precision metric measures correctness only over the profiles that were classified as spam, not over the profiles that were actually spam. This is misleading because it ignores the profiles that weren't detected in the first place; for all we know, the false negatives (spammers that were never flagged) may have been orders of magnitude more numerous than the detected ones. For example, suppose in actuality we had 100,000 spam profiles among 500,000 overall profiles, of which only 5,000 were detected. The authors are only reporting how many of the 5,000 were true positives, not how many of the 100,000 were detected (i.e., recall). There is no shortcut to research, and I think the reason cited above in italics is simply a poor excuse to avoid a repetitive and time-consuming task. In the past few years, good data is what has driven machine learning to its current popularity, and so to make a claim using ML, the data needs to be to a great degree unquestionable. It's for the same reason that most companies don't mind releasing their deep learning models' architectures: they know that without the data the company had, no one will be able to reproduce similar results. Therefore, to me, all the results in section 5 are bogus and irrelevant at best. Again, I apologize for the harsh critique.
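To make the arithmetic of that example concrete (the true-positive count below is an assumed, hypothetical number; everything else comes from the example above):

```python
# 100,000 true spam profiles among 500,000 total; the classifier flags 5,000.
# Assume (hypothetically) that 4,750 of those flags are correct.
flagged = 5_000
true_positives = 4_750     # hypothetical
actual_spam = 100_000

spam_precision = true_positives / flagged  # the only number the paper reports
recall = true_positives / actual_spam      # the number the paper omits

print(f"spam precision: {spam_precision:.2%}")  # 95.00%: looks impressive
print(f"recall:         {recall:.2%}")          # 4.75%: over 95% of spam missed
```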


Reflection #10 – [03/22] – [Meghendra Singh]

  1. Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
  2. Lee, Kyumin, James Caverlee, and Steve Webb. “Uncovering social spammers: social honeypots+ machine learning.” Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010.

In the first paper, Kumar et al. study “sockpuppets” in nine discussion communities (a majority of these are news websites). The study shows that sockpuppets behave differently than ordinary users, and that these differences can be used to identify them. Sockpuppet accounts have specific posting behavior: they don’t usually start discussions, and their posts are generally short and contain certain linguistic markers (e.g., greater use of personal pronouns such as “I”). The authors begin by automatically labeling 3,656 users as sockpuppets in the nine discussion communities using their IP addresses and user session data. The authors identify two types of sockpuppets: pretenders (pretending to be legitimate users) and non-pretenders (easily identifiable as fake users, i.e. sockpuppets, by the discussion community). The two main results of the paper are classifying sockpuppet user pairs from ordinary user pairs (ROC AUC = 0.91) and predicting whether an individual user is a sockpuppet (ROC AUC = 0.68), using activity, community, and linguistic (post) features. The paper does a good job of explaining the behavior of sockpuppets in the comments sections of articles on typical news websites, and how these behaviors can be used to detect sockpuppets and thereby help maintain healthy and unbiased online discussion communities. The paper references a lot of prior work, and I really appreciate that most of the decisions about features, parameters, and other assumptions made in the study are grounded in past literature. While reading the paper, a fundamental question came to my mind: if we can already identify sockpuppets using IP addresses and temporal features of their comments, what is the point of using predictive modeling to differentiate sockpuppets from ordinary users? In essence, if we already have a high-precision, rule-based approach to detect sockpuppets, why rely on predictive modeling that performs only a little better than random chance (ROC AUC = 0.68)?

I found the sockpuppet-ordinary user conversation example at the end of section 2 really funny, and I feel that the first comment itself is rather suspicious. This example also seems to indicate that the puppetmaster (S2) is the author of the article on which these comments are being posted. This leads to the question: given that a puppetmaster has multiple sockpuppet accounts, will their main account be considered an ordinary user? If not, does this mean that some of the articles themselves are being written by sockpuppets? A research question in this context could be: “detecting news articles written by sockpuppets on popular news websites.” Another question I had was why the authors used cosine similarity between users’ feature vectors, and what the statistics for this metric are (mean and standard deviation of cosine similarities between sockpuppet and ordinary user feature vectors). Additionally, is there a possibility of using a bag-of-words model here, instead of numeric features like LIWC and ARI computed from users’ posts? Moreover, there is potential to experiment with other classification techniques and see whether they can outperform Random Forest.
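For reference, the cosine-similarity computation in question is straightforward; in this sketch the feature vectors (a few LIWC-style fractions plus an ARI score) are invented for illustration:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    # cos(theta) = (u . v) / (||u|| * ||v||)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# hypothetical per-user features: [rate of "I", rate of "you", sentiment, ARI]
sockpuppet = np.array([0.076, 0.017, 0.31, 6.2])
ordinary = np.array([0.074, 0.015, 0.29, 7.8])
print(cosine_similarity(sockpuppet, ordinary))
```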

Lastly, as the authors suggest in discussion and conclusion, it would be interesting to repeat this experiment on big social platforms like Facebook and Twitter. This becomes really important in today’s world, where online social communities are rife with armies of sockpuppets, spambots and astroturfers, hell-bent on manipulating public opinion, scamming innocent users and enforcing censorship.

The second paper, by Lee et al., addresses the related problem of detecting spammers on MySpace and Twitter using social honeypots and classifiers. The study presents an elegant infrastructure for capturing potential spammer profiles, extracting features from these profiles, and training popular classifiers to detect spammers with high accuracy and a low false positive rate. The most interesting findings for me were the most discriminative features (i.e., the About Me text and the number of URLs per tweet) for separating spammers from legitimate users, and the fact that ensemble classifiers (Decorate, etc.) performed best. Given that deep learning was not really popular in 2010, it would be interesting to apply state-of-the-art deep learning techniques to the classification problem discussed in this paper. As we have already seen that the discriminative features separating spammers from regular users vary from one platform or domain to another, it would be interesting to see whether there exist common cross-platform, cross-domain (universal) features that are equally discriminative. Although MySpace may not be dead, it would be interesting to redo this study on Instagram, which is much more popular now and has a very real spammer problem. Based on personal experience, I have observed legitimate users on Instagram becoming spammers once they have enough followers. Will a social-honeypot-based approach work for detecting such users? Another challenge with detecting spam (or spammers) on a platform like Instagram is that most of the spam is in the form of stories (posts that automatically disappear in 24 hours), while the profiles may look completely ordinary.


Reflection #10 – [03/22] – [Vartan Kesiz-Abnousi]

Topic: Bots & Sock puppets

Definitions:
Sockpuppets: a “fake persona used to discuss or comment on oneself or one’s work, particularly in an online discussion group or the comments section of a blog” [3]. Paper [1] defines a sockpuppet as “a user account that is controlled by an individual (or puppetmaster) who controls at least one other user account.” The authors [1] also use the term “sockpuppet group/pair” to refer to all the sockpuppets controlled by a single puppetmaster.
Bots: “An Internet bot, also known as a web robot, WWW robot or simply bot, is a software application that runs automated tasks (scripts) over the Internet.” [4]
Summary [1]
The authors [1] study the behavior of sockpuppets. The research goal is to identify, characterize, and predict sockpuppetry. The study [1] spans nine discussion communities. The authors demonstrate that sockpuppets differ from ordinary users in terms of their posting behavior and linguistic traits, as well as their social network structure. Using IP addresses and user session data, they identify 3,656 sockpuppets comprising 1,623 sockpuppet groups, where a group of sockpuppets is controlled by a single puppetmaster. For instance, when studying avclub.com, the authors find that sockpuppets tend to interact with other sockpuppets and are more central in the network than ordinary users. Their findings suggest a dichotomy in the deceptiveness of sockpuppets: some are pretenders that masquerade as separate users, while others are non-pretenders, i.e. sockpuppets that are overtly visible to other members of the community. Furthermore, they find that deceptiveness is only important when sockpuppets are trying to create an illusion of public consensus. Finally, they create a model to automatically identify sockpuppetry.
Reflections [1]
Across the nine discussion communities that were studied, there is heterogeneity with respect to: a) the genre, b) the number of users, and c) the percentage of sockpuppets. While these are interesting cases to study, none of them are “discussion forums”. Their main function as websites, and their business model, is not to be a discussion platform. This has several ramifications. For instance, “ordinary” users, and possibly moderators, who participate in such websites might find it harder to identify sockpuppetry, because they cannot observe users’ long-term behavior as they could in a discussion forum.
Their analysis focuses on sockpuppet groups that consist of two sockpuppets. However, sockpuppet groups that consist of three or even four sockpuppets are not negligible. What if these sockpuppets demonstrate a different pattern? What if groups of 3, 4, or more sockpuppets are more likely to engage in systematic propaganda? This is a hypothesis that would be interesting to explore.
I also believe we can draw some parallels between this paper and another paper we reviewed in this class, on “Antisocial Behavior in Online Discussion Communities” [5]. For instance, the two papers define concepts such as “threads” differently. As a matter of fact, two of the authors are the same in both papers: Justin Cheng and Jure Leskovec. Furthermore, both papers use Disqus, the commenting platform that hosted these discussions. Would the results generalize to platforms other than Disqus? This, I believe, remains a central question.
The “matching” that utilizes the propensity score is questionable. The propensity score is a good matching measure only when we account/control for all the relevant factors, i.e. when we know the “true” propensity score. This does not happen in the real world. It might be a better idea to add fixed effects and restrict the matches to a specific time window, i.e. match observations within the same week to control for seasonal effects. The fact that the dataset is “balanced” after the matching is not evidence that the matching was done correctly. It is the features used for matching (i.e. similar numbers of posts, and posts to the same set of discussions) that should be balanced, not the “dataset”. They should have at least shown a QQ plot comparing ex-ante and ex-post matching performance. A poor matching procedure will feed bad inputs into the subsequent machine learning model, in this case the random forest. Note that the authors performed the exact same matching procedure in their earlier 2015 paper [5]. Apparently nobody pointed this out.
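A minimal sketch of the kind of balance check argued for above, using the standardized mean difference of one matching covariate before and after matching (the data and the crude nearest-neighbor matching step are simulated stand-ins, not the authors' procedure):

```python
import numpy as np

def smd(a: np.ndarray, b: np.ndarray) -> float:
    """Standardized mean difference of one covariate between two groups."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

rng = np.random.default_rng(1)
sock_posts = rng.poisson(20, 500)        # simulated post counts, sockpuppets
ordinary_posts = rng.poisson(15, 5000)   # simulated post counts, ordinary users
# crude 1-nearest-neighbor matching on the covariate itself,
# standing in for propensity-score matching
matched = np.array([ordinary_posts[np.abs(ordinary_posts - x).argmin()]
                    for x in sock_posts])

print("SMD before matching:", round(smd(sock_posts, ordinary_posts), 3))
print("SMD after matching: ", round(smd(sock_posts, matched), 3))  # near 0 = balanced
```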
Questions [1]
 
[Q1] I am curious why the authors decided to take the following action: “We also do not consider user accounts that post from many different IP addresses, since they have high chance of sharing IP address with other accounts.” I am not sure I understand their justification. Is there research that backs up this hypothesis? No reference is provided.
In general, removing outliers for the sake of removing outliers is not a good idea. Outliers are usually removed when a researcher believes a specific portion of the data is a data-entry error, e.g., a housing price of $0.
[Q2] A possible extension would be to explore the relationship beyond sockpuppets groups that consist of only two sockpuppets.
[Q3] There is no guarantee that the matching was done properly, as I analyze in the reflection.
Summary [2]
The authors propose and evaluate a novel honeypot-based approach for uncovering social spammers in online social systems. They define social honeypots as information system resources that monitor spammers’ behavior and log their information. The authors propose a method to automatically harvest spam profiles from social networking communities, develop robust statistical user models for distinguishing between social spammers and legitimate users, and filter out unknown (including zero-day) spammers based on these user models. The data is drawn from two communities, MySpace and Twitter.
Reflections [2]
While reading the article, I was thinking of IMDB ratings. I have observed that a lot of movies, usually controversial ones, receive ratings only at the extremes of the rating scale, either “1” or “10”. In some other cases, movies are rated even though they have not yet been publicly released. Which fraction of that would be considered “social spam”, though? Is the mobilization of an organized group that is meant to down-vote a movie “social spam” [6]?
Regardless, I think it is very important to make sure ordinary users are not classified as spammers, since this could impose a cost on the social networking site, including damage to its public image. This means there should be an acceptable “false positive rate”, tied to the trade-off between having spammers and penalizing ordinary users, a concept known in mathematical finance as “Value at Risk (VaR)”.
Something that we should stress is that in the MySpace random sample, the profiles have to be public and the About Me information has to be valid. I found the authors’ interpretation of the “About Me” feature as the best predictor very interesting. As they argue, it is the most difficult feature for a spammer to vary, because it contains the actual sales pitch or deceptive content that is meant to target legitimate users.
Questions [2]
[Q1] How would image recognition features perform as predictors?
[Q2] Should an organized group of ordinary people who espouse an agenda be treated as “social spammers”?
References
[1] Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
[2] Lee, Kyumin, James Caverlee, and Steve Webb. “Uncovering social spammers: social honeypots+ machine learning.” Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010.
[5] Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” International AAAI Conference on Web and Social Media (ICWSM), 2015. Available at: <https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10469>.


Reflection #10 – [03/22] – [Hamza Manzoor]

[1]. Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.

Summary:

In this paper, Kumar et al. present a study of sockpuppets in online discussion communities. They perform their study on nine different online discussion communities, with data consisting of around 2.9 million users. The authors use multiple logins from the same IP address to identify sockpuppets, and evaluate the posting behavior and linguistic features of sockpuppets to build a classifier. They find that ordinary users and sockpuppets use language differently, and that sockpuppets tend to write more posts than ordinary users (699 vs. 19). Sockpuppets also use more first-person and second-person singular personal pronouns, and are more likely to be down-voted, reported, or deleted. The authors also describe types of sockpuppets: “pretenders vs. non-pretenders” and “supporters vs. dissenters”.

Reflection:

I really enjoyed reading this paper because the study was very different, especially the creation of labels using IP addresses and user sessions. The authors received non-anonymized data from Disqus, which makes me question whether it is legal for Disqus to share non-anonymized data.

Some of the findings in the study were very surprising, such as that the email addresses and usernames of sockpuppets are more similar. First, I do not like the approach of classifying sockpuppets based on username, and second, I have a hard time believing that sockpuppets have similar usernames and emails. Do they mean that all the usernames and emails of one puppetmaster are similar to one another, or that they are similar to those of other sockpuppets?

Their findings also show that 30% of sockpuppets are non-supporters, 60% are supporters, and only 10% are dissenters. It would have been interesting to find out on what kinds of topics sockpuppets support each other. Do we see more supporters on political topics? Are there sockpuppets belonging to one puppetmaster that are supporters on left-leaning topics and dissenters on right-leaning topics, or vice versa? If yes, can we claim that some specific party pays someone to create these sockpuppets?


Reflection #10 – [03/22] – [Jiameng Pu]

An Army of Me: Sockpuppets in Online Discussion Communities

Summary:

People interact with each other on the Internet mainly through discussion mechanisms provided by social networks such as Facebook and Reddit. However, sockpuppets created by malicious users damage the network environment by engaging in undesired behavior such as deceiving others or manipulating discussions. Kumar et al. study sockpuppetry across nine discussion communities. After first identifying sockpuppets using multiple signals indicating that accounts might share the same user, they characterize their behavior by inspecting different aspects of it. They find that the behavior of sockpuppets differs from that of ordinary users in many ways: e.g., they start fewer discussions, write shorter posts, and use more personal pronouns such as “I”. The study contributes toward the automatic detection of sockpuppets by presenting a data-driven view of deception in online communities.

Reflection:

For the process of identifying sockpuppets, the strategy is inspired by Wikipedia administrators, who identify sockpuppets by finding accounts that make similar edits to the same Wikipedia article, at nearly the same time, and from the same IP address, which makes sense. But for the hyperparameter, the top percentage (5%) of most-used IP addresses, is there a better strategy that could set the percentage more numerically rather than intuitively? When measuring the linguistic traits of sockpuppets, LIWC word categories are used to measure the fraction of each type of word across all posts, and VADER is used for the sentiment of posts. The LIWC categories seem powerful and heavily used in social science research, but I have never used VADER before. In the double-life experiment, although they match sockpuppets with ordinary users that have similar posting activity and that participate in similar discussions, I feel there is too much uncertainty in the linguistic features of ordinary users, i.e., different users have different writing styles. That would make the cosine similarity of the feature vectors for each account less convincing.
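For anyone else unfamiliar with VADER: it is a rule-based sentiment scorer available as the vaderSentiment Python package. A minimal usage sketch (the example posts are invented):

```python
# pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
for post in ["I completely agree, great article!",
             "This is the dumbest take I have ever read."]:
    # returns a dict with 'neg', 'neu', 'pos', and a normalized
    # 'compound' score in [-1, 1]
    print(post, "->", analyzer.polarity_scores(post))
```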

Uncovering Social Spammers: Social Honeypots + Machine Learning

Summary:

Both web-based social networks (e.g., Facebook, MySpace) and online social media sites (e.g., YouTube, Flickr) rely on their users as the primary contributors of content, which makes them prime targets for social spammers. Social spammers engage in undesirable behavior such as phishing attacks and disseminating malware and commercial spam messages, which seriously impacts the user experience. Lee et al. propose a honeypot-based approach for uncovering social spammers in online social systems, harvesting deceptive spam profiles from social networking communities and creating spam classifiers to actively filter out existing and new spammers. The machine-learning-based classifier is able to identify previously unknown spammers with high precision and a low rate of false positives.

Reflection:

The section on the machine-learning-based classifier impressed me a lot, since it shows how to investigate the discriminative power of individual classification features apart from only evaluating the effectiveness of the classifiers, and the ROC curve plays an important role there. Also, AMContent, the text-based feature modeling user-contributed content in the “About Me” section, shows how to use more complicated text features besides simple fields like age, marital status, and gender. I had never heard of MySpace before, but there is also a Twitter experiment; otherwise I would think this a strange choice of experimental dataset. For Twitter spam classification, we can clearly see the differences in how account features are collected, i.e., Twitter accounts are noted for their short posts, activity-related features, and limited self-reported user demographics. This is a reminder that feature design varies with the subject of study.

 


Reflection #10 – [03/22] – [John Wenskovitch]

This pair of papers describes aspects of those who ruin the Internet for the rest of us.  Kumar’s “An Army of Me” paper discusses the characteristics of sockpuppets in online discussion communities (as an aside, the term “sockpuppet” never really clicked for me until seeing its connection with “puppetmaster” in the introduction of this paper).  Looking at nine different discussion communities, the authors evaluate the posting behavior, linguistic features, and social network structure of sockpuppets, eventually using those characteristics to build a classifier which achieved moderate success in identifying sockpuppet accounts.  Lee’s “Uncovering Social Spammers” paper uses a honeypot technique to identify social spammers (spam accounts on social networks).  They deploy their honeypots on both MySpace and Twitter, capturing information about social spammer profiles in order to understand their characteristics, using some similar characteristics as Kumar’s paper (social network structure and posting behavior).  These authors also build classifiers for both MySpace and Twitter using the features that they uncovered with their honeypots.

Given the discussion that we had previously when reading the Facebook papers, the first thing that jumped out at me when reading through the results of the “Army of Me” paper was the small effect sizes, especially in the linguistic traits subsection.  Again, these included strong p-values of p<0.001 in many cases, but also showed minute differences in the rates of using words like “I” (0.076 vs 0.074) and “you” (0.017 vs 0.015).  Though the authors don’t specifically call out their effect sizes, they do provide the means for each class and should be applauded for that.  (They also reminded me to leave a note in my midterm report to discuss effect sizes.)
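To illustrate how small these differences likely are in effect-size terms, here is a quick Cohen's d computation on the reported means; the standard deviations are not given in the excerpt, so the SD values below are hypothetical placeholders:

```python
import math

def cohens_d(mean1: float, mean2: float, sd1: float, sd2: float) -> float:
    # Cohen's d with a simple pooled standard deviation
    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)
    return (mean1 - mean2) / pooled_sd

# "I"-usage means from the paper; the SDs are assumed for illustration only
print(cohens_d(0.076, 0.074, sd1=0.05, sd2=0.05))  # 0.04: a very small effect
```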

One limitation of “Army of Me” that was not discussed was the fact that all nine communities that they evaluated use Disqus as a commenting platform.  While this made it easier for the authors to acquire their (anonymized) data for this study, there may be safety checks or other mechanisms built into Disqus that bias the characteristics of sockpuppets that appear on that platform.  Some of their proposed future work, such as studying the Facebook and 4chan communities, might have made their results stronger.

“Army of Me” also reminded me of the drama from several years ago around the reddit user unidan, the “excited biologist,” who was banned from the community for vote manipulation.  He used sockpuppet accounts to upvote his own posts and downvote other responses, thereby inflating his own reputation on the site.

Besides identifying MySpace as a “growing community” in 2010, I thought that the “Uncovering Social Spammers” paper was a mostly solid and concise piece of research.  The use of a human-in-the-loop approach to obtain human validation of spam candidates to improve the SVM classifier appealed to the human-in-the-loop researcher in me.  Some of the findings from their honeypot data acquisition were interesting, such as the fact that Midwesterners are popular spamming targets and that California is a popular profile location.  I’m wondering if these patterns are indicative of some bias in the data collection (is the social honeypot technique biased towards picking up spammers from California?), or if there actually is a trend in spam accounts to pick California as a profile location.  This wasn’t particularly clear to me; instead, it was just stated and then ignored.

I really liked their use of both MySpace and Twitter, as the two different social networks enabled the collection of different features (e.g., F-F ratio for Twitter, number of friends for MySpace) in order to show that the classifier can work on multiple datasets.  It’s almost midnight and I haven’t slept enough this month, but I’m still puzzled by the confusion matrix that they presented in Table 1.  Did they intend to leave variables in that table?  If so, it doesn’t really add much to the paper, as they’re just describing the standard definitions of precision, recall, and false positive.  They don’t present any other confusion matrices in the paper, so it seems even more out of place.


Reflection #10 – [03/22] – [Ashish Baghudana]

Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
Lee, Kyumin, James Caverlee, and Steve Webb. “Uncovering social spammers: social honeypots+ machine learning.” Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010.

Summary 1

Kumar et al. present a data-driven view of sockpuppets in online discussion forums and social networks. They identify pairs of sockpuppets in online discussion communities hosted by Disqus. Since the data isn’t labeled, the authors devise an automatic technique for identifying sockpuppets based on the frequency of posting, looking for multiple logins from the same IP address. In their research, the authors find linguistic differences between the posts of sockpuppets and ordinary users. Primarily, they find that sockpuppets use the first person or second person more often than normal users. They also find that sockpuppets write poorer English and are more likely to be downvoted, reported, or deleted. The authors note that there is a dichotomy of sockpuppets: pretenders and non-pretenders. Finally, the authors build a classifier for sockpuppet pairs and determine the predictors of sockpuppetry.

Critique 1

I found the paper very well structured and the motivations clearly explained. The authors received non-anonymous data on Disqus users. Their dataset creation technique using IP addresses, user sessions, and frequency of posting was very interesting. However, it appears that they used some sense of intuition in choosing these three factors for identifying sockpuppets. In my opinion, they should have attempted to validate their ground truth externally, possibly using Mechanical Turk workers. Their results also seem to suggest that there is a primary account and that the remaining ones are secondary. In today’s world of fake news and propaganda, I wonder if accounts are created solely for the purpose of promoting one view. I was equally fascinated by the dichotomy of sockpuppets. In the non-pretenders group, users post different content on different forums. This would mean that the sockpuppets are non-interacting. Why then would a person create two identities?

Summary 2

Following the theme of today’s class, the second paper attempts to identify social spammers in social networking contexts like MySpace and Twitter. The authors propose a social honeypot framework to lure spammers and record their activity, behavior, and information. A honeypot is a user profile with no activity. On a social network platform, if this honeypot receives unsolicited friend requests (MySpace) or followers (Twitter), they likely come from social spammers. The authors collect information about such candidate spam profiles and build an SVM classifier to differentiate spammers from genuine profiles.

Critique 2

Unlike a traditional machine learning pipeline, the authors opt for a human-in-the-loop model. A set of profiles selected by the classifier is shown to human validators, and based on their feedback, the model is revised. I think this is a good approach to data collection, validation, and model training: as more feedback is incorporated, the model keeps getting better and encompasses more social spam behaviors. The authors also find an interesting classification of social spammers: more often than not, they attempt to sell pornographic content or enhancement pills, promote their businesses, or attempt to phish user details by redirecting people to phishing websites. Since the paper is from 2010, they also use MySpace (a now-defunct social network?). It would have been nice to see an analysis of which features stood out in the classification task; however, the authors only presented the results of different models.
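One common way such a feedback loop is implemented is uncertainty-sampling active learning, sketched below. This is an assumption-laden illustration (simulated data and a hypothetical ask_validator() oracle), and the authors' exact protocol, which sends predicted spam profiles to validators, may differ:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X_train = rng.normal(size=(200, 10))      # simulated labeled profiles
y_train = rng.integers(0, 2, size=200)    # 1 = spammer, 0 = legitimate
X_pool = rng.normal(size=(1000, 10))      # simulated unlabeled candidates

def ask_validator(x: np.ndarray) -> int:
    # hypothetical stand-in for a human validator labeling one profile
    return int(x.sum() > 0)

clf = SVC(probability=True).fit(X_train, y_train)
for _ in range(3):  # a few feedback rounds
    proba = clf.predict_proba(X_pool)[:, 1]
    queried = np.argsort(np.abs(proba - 0.5))[:20]  # least-confident profiles
    new_labels = np.array([ask_validator(x) for x in X_pool[queried]])
    X_train = np.vstack([X_train, X_pool[queried]])
    y_train = np.concatenate([y_train, new_labels])
    clf = SVC(probability=True).fit(X_train, y_train)  # retrain with feedback
```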


Reflection #10 – [03/15] – [Pratik Anand]

Paper 1 : An Army of Me: Sockpuppets in Online Discussion Communities

Paper 2 : Uncovering Social Spammers: Social Honeypots + Machine Learning

The first paper is of special interest to me. It deals with sockpuppets and fake accounts on social media and forums; being an active user, I see a lot of this in action. The paper identifies sockpuppet accounts as those maintained by a single person, referred to as the puppetmaster, who uses these accounts either to promote/denounce a certain viewpoint or to cause general dissent, without suffering the consequences of account bans by moderators.
The authors acknowledge that it is very hard to get ground-truth data on this matter, so they use observational studies to get insights into sockpuppet behavior. Could the techniques used to obtain ground truth on spam messages be applied to sockpuppets? In my opinion, a list of banned users in a social forum is a good place to start.
Some of the traits observed are :

1. Sockpuppet accounts are created quite early on by their users.
2. They usually post at the same time and on the same topics.
3. They rarely start a discussion but usually participate in discussions.
4. The discussion topics are always very controversial.
5. They write very similar content.

The authors also observed that sockpuppets are treated harshly by the community. Is that due to their behavior, or just a side effect of the fact that they mostly participate in posts about controversial topics? Not all sockpuppets are malicious. The question of pretenders vs. non-pretenders was very intriguing. Some people keep a sockpuppet for entirely comical or other purposes, and I don't believe the authors' method of classifying them based on username is effective enough: many non-pretenders may keep multiple sockpuppet accounts based around a joke, which the authors' method would fail to classify as non-pretending.
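For concreteness, a username-similarity check of the kind being criticized might look like the following sketch (difflib's edit-based ratio here; the paper's exact measure may differ, and the usernames are invented):

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    # ratio in [0, 1]; 1.0 means identical strings
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(name_similarity("john_doe", "john_doe2"))     # high: flagged as non-pretending
print(name_similarity("banana_lord", "kale_king"))  # low: a joke pair slips through
```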

The authors present a case where two sockpuppets run by the same puppetmaster argue against each other. They explain this behavior as a means to increase traffic to the given post. I am not sure that is the reason; they didn't provide a way to verify that those sockpuppets are indeed handled by the same person. There is also the possibility of a group of people maintaining certain sockpuppet accounts. This would make their patterns ever-changing and also provide alternate reasoning for the argument raised above.

The second paper deals with creating honeypots to learn about the traits of spam accounts and using them in a spam classifier. The authors do a good job of explaining how social spam differs from email spam: it has a touch of personalized messaging, which is a more effective strategy for luring users. Though the paper doesn't go into the details of how the honeypots were set up, it shares observations from analyzing the spammers who fell into them. The honeypots were created on MySpace and Twitter, and spammer behavior varies considerably between the two. The authors note that MySpace is more of a long-form social communication platform; thus they identify the “About Me” section as the most important part of a spammer profile for use in classification. They assume it won't change radically, since it is like a sales pitch, and that spam classifiers will therefore be able to detect spammers. I believe this is a limitation of the technique: About Me can be changed as easily as any other section. It is indeed important, but replacing it is just replacing one sales pitch with another. Hence, that justification doesn't hold up.

The paper details that the authors created MySpace profiles with geographic locations covering every state in the USA. What was the reasoning behind this? Do different geographic locations provide a level of genuineness that these honeypot profiles require?
Lastly, could a reverse technique be used by spammers to identify honeypot profiles and take safeguards against them?
