Reflection #4 – [09/06] – Subil Abraham

Summary:

This paper analyzes the phenomenon of sock puppets – multiple accounts controlled by a single user. The authors identify these accounts on discussion platforms from various signals that are characteristic of sock puppets, and build a model to detect them automatically. They also characterize different kinds of sock puppet behavior and show that not all sock puppets are malicious (though keep in mind that they use a broader definition of what a sock puppet account is). They find that it is easier to identify a pair of sock puppets (from the same sock puppet group) by their behavior with respect to each other than it is to find a single sock puppet in isolation.

 

Reflection:

It seems to me that though this paper explicitly adopts a broad definition of what a sock puppet is and distinguishes between pretenders and non-pretenders, it is geared more towards the study and identification of pretenders. The model they build seems better trained at identifying the deceptive kinds of sock puppets (specifically, pairs of deceptive sock puppets from the same group), given the features it uses to identify them. I think that is fair, since the paper notes that most sock puppets are used for deception and identifying them is of high benefit to the discussion platform. But if the authors were going to discuss non-pretenders too, they should be explicit about their goals with regard to the detection they are trying to do. Simply stating “Our previous analysis found that sockpuppets generally contribute worse content and engage in deceptive behavior.” goes against their earlier and later statements about non-pretenders and lumps them together with the pretenders. I know I’m rambling a bit here, but it stood out to me. I would separate out the discussion of non-pretenders and mention them only briefly, and focus the detection work exclusively on pretenders.

Following that train of thought, let’s talk about non-pretenders. I like the idea of having multiple online identities and using different identities for different purposes. I believe this was more widely practiced in the earlier era when everyone was warned not to use their real identity on the internet (whereas in the era of Facebook and Instagram and personal branding, everyone seems to have gravitated towards using one identity – their real identity). It’s nice to see that there are still some holdouts, and it’s something I would like to see studied. I want to ask questions like: Why use different identities? How many explicitly try to keep their separate identities separate (i.e. not allow anyone to connect them)? How would you identify non-pretender sock puppets, since they don’t tend to share the features of the pretenders that the model seems (at least to me) to be optimised for? Perhaps one could compare the writing styles of suspected sockpuppets using word2vec, or look at what times they post (i.e. the time period in which they are active, rather than how quickly they post one after another as you would for a pretender); a rough sketch of what that might look like follows below.
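To make that concrete, here is a minimal sketch of the two signals I mentioned – stylistic similarity via averaged word2vec vectors and overlap in the hours of day at which two accounts are active. This is my own illustration, not anything from the paper; the function names and data shapes are assumptions.

import numpy as np
from gensim.models import Word2Vec

def train_style_model(tokenized_posts):
    # Train word2vec on tokenized posts from the whole platform (hypothetical input:
    # a list of token lists, one per post).
    return Word2Vec(sentences=tokenized_posts, vector_size=100, window=5,
                    min_count=2, workers=4)

def account_style_vector(model, account_posts):
    # Average the word vectors of everything a single account has written.
    vecs = [model.wv[w] for post in account_posts for w in post
            if w in model.wv.key_to_index]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def hourly_activity(post_hours):
    # Normalized 24-bin histogram of the hours (0-23) at which an account posts.
    hist = np.bincount(post_hours, minlength=24).astype(float)
    return hist / hist.sum() if hist.sum() else hist

# Two accounts of the same non-pretending puppetmaster might score high on
# style similarity and on overlapping active hours even if they never
# interact with each other:
# style_sim = cosine(account_style_vector(model, posts_a),
#                    account_style_vector(model, posts_b))
# hour_overlap = np.minimum(hourly_activity(hours_a),
#                           hourly_activity(hours_b)).sum()

The point of the sketch is that neither signal depends on the two accounts replying to or reinforcing each other, which is exactly the behavior a non-pretender is unlikely to show.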

The authors point out that sock puppets share some linguistic similarities with trolls. This takes me back to the paper on antisocial users [1] we read last week. Obviously, not all sock puppets are trolls. But an interesting question is how many puppetmasters fall under the umbrella of antisocial users, seeing as the two are somewhat similar. The antisocial paper focused on single users, but what if you threw the idea of sock puppets into the mix? With the findings of this paper and that one, you could more easily identify the antisocial users who operate sock puppet accounts. But they are probably only a fraction of all antisocial users, so it may or may not be very helpful for the larger-scale problem of identifying all antisocial users.

One final thing I thought about was studying and identifying teams of different users who post and interact with each other the way sock puppet accounts do. How would identifying these be different? I think they might have similar activity feature values to sock puppets and at least slightly different post features (a sketch of what such pairwise features might look like follows below). Will having different users, rather than the same user, post, interact, and reinforce each other muddy the waters enough that ordinary users, moderators, and algorithms can’t identify them and kick them out? Can they muddy the waters even further by having each user in the team maintain their own sock puppet group, where the sock puppets within a group avoid interacting with each other – just as a regular pretender sock puppet group would – and instead interact only with the sock puppets of the other users on their team? I think the findings of this paper could effectively be used to identify these cases as well with some modification, since the teams of users are essentially doing the same thing as single-user sock puppets. But I wonder what these teams could do to bypass that. Perhaps they could write longer and more varied posts than a usual sock puppet to bypass the post-feature tests. Perhaps they could post at different times and interact more widely to fool the activity tests. The model in this paper could provide a basis, but it would definitely need tweaks to be used effectively.
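As a rough illustration of the pairwise activity features I have in mind – again not from the paper, and with hypothetical data shapes – one could measure how often two accounts show up in the same discussions, how quickly they post after one another there, and how often they reply to each other. A coordinated team pair and a single-user sock puppet pair would plausibly both score high on these.

from statistics import median

def pairwise_activity_features(posts_a, posts_b, id_a, id_b):
    # Each posts_* entry is assumed to be a tuple of
    # (discussion_id, unix_timestamp, replied_to_account_id).
    shared = {d for d, _, _ in posts_a} & {d for d, _, _ in posts_b}
    gaps = []
    for d in shared:
        times = sorted([t for disc, t, _ in posts_a if disc == d] +
                       [t for disc, t, _ in posts_b if disc == d])
        gaps.extend(t2 - t1 for t1, t2 in zip(times, times[1:]))
    mutual_replies = (sum(r == id_b for _, _, r in posts_a) +
                      sum(r == id_a for _, _, r in posts_b))
    return {
        "shared_discussions": len(shared),
        "median_gap_seconds": median(gaps) if gaps else None,
        "mutual_replies": mutual_replies,
    }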

 

[1] Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM. 2015.
