Reflection #4 – [09/06] – [Lindah Kotut]

Reading Reflection:

  • Kumar, S., Cheng, J., Leskovec, J., and Subrahmanian, V.S. “An Army of Me: Sockpuppets in Online Discussion Communities.”

Brief:
The authors argue that anonymity encourages deception via sockpuppets, and so propose a means of identifying, characterizing, and predicting sockpuppetry using user IP addresses (if there are at least 3 posts from the same IP) and user session data (if posts occur within 15 minutes of each other). Some of the characteristics found to be attributable to sockpuppets include the use of more first-person pronouns, fewer negations, and fewer English parts-of-speech (i.e., worse writing than the average user). They also found sockpuppets to start fewer conversations but participate in more replies within the same discussion than can be attributed to random chance. Their posts were also more likely to be down-voted, reported, and/or deleted by moderators, and they tended to have higher PageRank and higher local clustering coefficients in the reply network.
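
A minimal sketch of how this identification rule could be operationalized, assuming a post log with hypothetical user, ip, discussion, and time fields (the field names and exact thresholds below are my assumptions for illustration, not the paper's code):

```python
from collections import defaultdict
from datetime import timedelta

def sockpuppet_candidates(posts, window=timedelta(minutes=15), min_posts=3):
    """Flag (IP, discussion) groups where several posts land in close temporal
    proximity and involve more than one account, per the rule summarized above."""
    by_key = defaultdict(list)
    for p in posts:  # each post: {"user", "ip", "discussion", "time"}
        by_key[(p["ip"], p["discussion"])].append(p)

    candidates = []
    for key, group in by_key.items():
        group.sort(key=lambda p: p["time"])
        for i, anchor in enumerate(group):
            close = [p for p in group[i:] if p["time"] - anchor["time"] <= window]
            users = {p["user"] for p in close}
            if len(close) >= min_posts and len(users) > 1:
                candidates.append((key, sorted(users)))
                break
    return candidates
```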

The authors also note some concerns regarding the use of sockpuppets in discussion communities: notably, the potential to create a false equivalence, and their use as acts of vandalism and/or cheating.

Reflection:
What happens when deceptive sockpuppets are capable of usurping and undermining the will of the majority? I do not have a good example of a case where this is true on social media (separate from the battle of bots during the 2016 U.S. election cycle), but there are ample cases where this question could be examined: the FCC request for comment during the net neutrality debate in 2017/2018 and the saga of Boaty McBoatface serve as cautionary tales, for there was no do-over to correct for sockpuppets, especially in the FCC case. This is a concern because the phenomenon can erode the very foundation on which trust in the democratic process is built (beyond the fact that some of these events happened over two years ago with no recourse/remedies applied to date). A follow-up open question would be: what then would replace the eroded system? If there is no satisfactory answer to this, then maybe we should have some urgency in shoring up these systems. How then do we mitigate sockpuppetry apart from using human moderators to moderate and/or flag suspected accounts? A hypothetical solution that uses the characteristics pointed out by the authors to automate the identification and/or suspension of suspected accounts is not sufficient as a measure in itself.

The authors, in giving an example of an exchange between two sockpuppets and of a user who identifies a sockpuppet as such, reveal the presence and power of user skepticism. How many users are truly fooled by these sockpuppets remains an open question. A simple way to examine this would be to recruit users and ask them to determine whether a given discussion can be attributed to regular users or to sockpuppets. This consideration can lead down the path of measuring for over-correction:

  • does pervasive knowledge of the presence of these sockpuppets lead users to doubt even legitimate discussions (and to what extent is this prevalent)?

This paper’s major contribution is in looking at sockpuppets in discussions/replies (so this point is not meant to detract from that contribution). On the matter of the (mis)use of pseudonyms: uses range from a benign case such as Reddit’s “throw-away account,” used when a regular user wants to discuss a controversial topic they do not want associated with their regular account, to the extreme end of a journalist using one to “hide” their activities in alt-right community discussions.

  • Can these discussions be merged, or does the fact that such use does not strictly adhere to the authors’ definition disqualify it? (I believe it is worth considering why users resort to sockpuppets beyond faking consensus/discussion and sowing discord.)

A final point regards positive(ish) uses. A shopkeeper with a new shop who wants customers can loudly hawk their wares in front of the shop to attract attention: which is to say, could we consider positive use-cases of this behavior, or do we categorize it all as bad? A forum could attract shy contributors and spark a debate by using friendly sockpuppetry to get things going. Is that ethical?


Reflection #4 – [09/06] – [Neelma Bhatti]

Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.

Summary

Kumar et al. focus this paper on the identification, characterization, and prediction of sockpuppets in online discussion communities. They define sockpuppets as extra account(s) created and managed by a single user for influencing or manipulating public opinion, igniting debates, or vandalizing content (in the case of Wikipedia). They characterize sockpuppets as pretenders vs. non-pretenders and supporters vs. dissenters based on their linguistic traits, online activity, and reply network structure.

Reflection

Although this study nicely situates itself in the body of work currently done in the domain of deception, I felt that it does not establish a very strong motivation for being carried out.

It would also be interesting to see if a sockpuppet account, or a pair of such accounts, is operated by more than one person interchangeably, which not only makes the concept of a single puppetmaster imprecise, but also weakens statistics computed with a single user in mind. The hypothesis of puppetmasters leading double lives reminded me of Facebook, where spouses access each other’s accounts without any problem, sometimes simply to peek into the content of ladies-only groups, and even comment or react on different posts just for fun. Although very different from the topic under discussion, this poses the question of whether a study of the online behavior of such individuals would produce accurate results, given the multiple users associated with a single account.

The authors have also used IP addresses as a means to cluster different sockpuppets; I was wondering whether users logging in to the social platform through proxy servers would be as easy to identify using the same approach. What if the puppetmaster uses both sockpuppets and bots to steer the discussion? In such a case, the detection system could be made more robust by incorporating mechanisms that consider not only linguistic traits and activity, but also the amount of customization in the user profile and geographical metadata [1]. This would not only help in detecting sockpuppets, but would also help distinguish bots from sockpuppets.
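
A sketch of what such an augmented feature vector could look like, purely illustrative and with hypothetical field names (avatar, bio, login countries) standing in for the profile-customization and geographical metadata mentioned above:

```python
def augmented_features(account: dict) -> dict:
    """Combine activity/linguistic signals with assumed profile and geo signals."""
    return {
        # activity/linguistic features in the spirit of Kumar et al.
        "first_person_rate": account["first_person_rate"],
        "posts_per_discussion": account["posts_per_discussion"],
        # profile customization: bot accounts often keep near-default profiles
        "has_custom_avatar": int(account["has_custom_avatar"]),
        "bio_length": len(account.get("bio", "")),
        # geographical metadata: logins scattered across many countries can hint
        # at proxies or bot infrastructure rather than a single human puppetmaster
        "distinct_login_countries": len(set(account["login_countries"])),
    }
```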

The authors also rightly point out that a study of behavior or personality traits would add another dimension to this research. The reasons for having more than one identity online can go beyond sadism; they can also be a product of sheer boredom or a desire to brag in front of friends. The puppetmaster can also create multiple identities to evade a previous ban.

[1] Bessi, A., & Ferrara, E. (2016). Social bots distort the 2016 US Presidential election online discussion. First Monday, 21(11).


Reflection #4 – [09/06] – Vibhav Nanda

Reading:

[1] An Army of Me: Sockpuppets in Online Discussion Communities

Summary:

The authors of this paper have devoted their energy to sockpuppets in online discussion communities. To comprehensively study sockpuppets and their associated online behavior, the authors obtained data from nine different online discussion communities consisting of 2,129,355 discussions, 2,897,847 users, and 62,744,175 posts. They then identified sockpuppets by using a combination of three elements: IP address, activity in the discussion post, and the time at which the comment(s) were made. Using this combination of factors, they were able to formally define sockpuppets — “a user account that post from the same IP address in the same discussion in close temporal proximity at least 3 times.” Utilizing this formal definition and an analytical model, the authors identified 1,623 sockpuppet groups and 3,656 sockpuppets across the nine online discussion communities. The project yielded a plethora of intuitive but interesting results, including but not limited to the following:

  1. Sockpuppets start fewer discussions, and post more in existing discussions.
  2. Sockpuppets tend to participate in discussions with more controversial topics.
  3. Sockpuppets are treated harshly by the community.
  4. Sockpuppets in a pair interact with each other more than ordinary pairs of users do.

Reflection and Questions:

I had really never thought about this area of research, and hence this reading ensnared my attention and interest. Howbeit, as I read through the paper it seemed as if it was focused more on the pretenders and less on the non-pretenders, and that was reflected in the way they defined sockpuppets — which is totally fine, but in my opinion the authors should have mentioned this focus somewhere in the introduction. Since I didn’t find much material on non-pretenders, I started thinking about how I would define sockpuppets with respect to non-pretenders. Assuming complete access to a user’s profile, I would start by correlating the user’s basic information: for instance their birthday, secret questions, name (in some cases), small variations in username, family information (if available), and contact information. Since non-pretenders do not masquerade, and simply use different accounts for different use cases, I would assume that they would have no reason to manipulate their basic information — unless the platform prevents the user from doing so.

Whilst reading the paper, I started to contemplate what the emboldening factor behind puppetmasters could be. The only reason I could think of was the motivation to push their own or their sponsors’ political and ideological agenda, or to dilute the opponents’ agenda. Howbeit, in both cases I would assume that puppetmasters would be more articulate in their writing to effectively sway the audience in either direction, so the results of this paper — that sockpuppets write shorter sentences with more swear words and use more personal pronouns — were counterintuitive to me.

As I was reading through the fifth section of the paper, it occurred to me to ask how long these accounts have been active, and how frequently a supposed puppetmaster creates new accounts. I am not sure yet what new things we might discover by seeking answers to these questions, but I think they are interesting. Another correlation I thought about was checking whether sockpuppets are recycled among different puppetmasters/groups. If we find this to be true, and do some analysis on the topics that these sockpuppets try to propagate or demolish support for, then we can group the groups according to their affiliations; and if we add a spatial aspect to the groups of groups, we may be able to identify what kinds of ideologies are widespread in which parts of the world. We might also be able to find out whether a group is trying to propagate its own ideology or demolish another region’s. For instance, if a group from country X is spreading hate toward topic Y, but topic Y is in fact appreciated in country X, then we know that this group is trying to demolish an ideology in a different region; the opposite holds where topic Y is hated in country X.


Reflection #4 – [09/06] – [Shruti Phadke]

Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.” Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.

Summary:

Sockpuppets, in reference to this paper, are online discussion accounts belonging to the same user, who is also referred to here as the “puppetmaster.” Kumar et al.’s work on sockpuppets in online communities observes the posting behavior, interactions, and linguistic characteristics of sockpuppets, finally leading to a predictive model. They observe that sockpuppets tend to comment more, write shorter posts, use more personal pronouns such as “I”, and are more likely to interact with each other. Finally, in the prediction task the authors find that activity, community, and post features are most relevant to detecting sockpuppets, with 0.68 AUC. Here are some thoughts on the data collection, method, and impact of this work:

 

Why is this research even important? Even though this paper has excellent technical and analytical aspects, I believe that there should have been some more stress on why sockpuppetry is harmful in the first place.

“In 2011, a California company called Ntrepid was awarded a $2.76 million contract from US Central Command for “online persona management” operations[42] to create “fake online personas to influence net conversations and spread US propaganda” in Arabic, Persian, Urdu and Pashto.” (Wikipedia)

I found some more reasons which I think are important for situating this research in the context of community betterment:

  1. Bypassing the ban on the account by creating another account (mentioned in the paper)
  2. Sockpuppeting during an online poll to submit multiple votes in favor of the puppeteer.
  3. Endorsing a product by writing multiple good reviews
  4. Enforcing public opinion about a policy or candidate by sheer numbers

 

How to build a better ground truth? One obvious point of contention with this paper is the way the data is collected and labeled as sockpuppet accounts. There is no solid validation regarding whether the selected accounts are actually sockpuppets. The authors mention that they had conservative filters while selecting the sockpuppet accounts but it also means that they might have missed significant true positives. So what can be done to build a better ground truth?

  1. Building a strong “anti-ground truth”. There are performance comparisons between sockpuppets and ordinary users throughout the paper. If the sampled list of ordinary accounts were vetted more strongly (if they formed a stronger anti-group), the comparisons would have been more telling. One way to do this is to collect accounts which posted from different IPs or locations at the exact same time (see the sketch after this list).
  2. Asking the discussion communities for known sockpuppets. Even though this seems harder, it could form a very strong ground truth and validation point.
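
A rough sketch of the first suggestion, assuming a pandas DataFrame of posts with hypothetical user, ip, and (datetime) time columns: account pairs that post near-simultaneously from different IPs are unlikely to share a puppetmaster, so they can seed a vetted “ordinary pair” sample.

```python
import pandas as pd

def ordinary_pair_candidates(posts: pd.DataFrame, bucket: str = "1min") -> pd.DataFrame:
    """Return account pairs that posted within the same time bucket from
    different IPs, i.e. pairs unlikely to be one person's sockpuppets."""
    posts = posts.copy()
    posts["bucket"] = posts["time"].dt.floor(bucket)
    # Self-join on the time bucket to find co-occurring posts by different accounts.
    merged = posts.merge(posts, on="bucket", suffixes=("_a", "_b"))
    mask = (merged["user_a"] < merged["user_b"]) & (merged["ip_a"] != merged["ip_b"])
    return merged.loc[mask, ["user_a", "user_b"]].drop_duplicates()
```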

Lastly, there are several comparisons between pairs of sockpuppets and pairs of ordinary users. I am not sure whether the ordinary-user measure was a normalized aggregate of all pairwise ordinary measures. In any case, instead of comparing sockpuppet-pair activity with generic pairwise activity, it would be better to compare it with pairs of ordinary users who have some probability of interaction (e.g., same discussion, location, posting time, etc.). Also, when comparing pretenders and non-pretenders, it would be beneficial to have a comparison with ordinary users as a ground-truth measure.

In the discussion, the authors claim that not all sockpuppets are malicious. Further research can be focused on finding characteristics of only malicious sockpuppets or online deception “artists”!


Reflection #4 – [09/06] – Subil Abraham

Summary:

This paper analyzes the phenomenon of sock puppets – multiple accounts controlled by a single user. The authors attempt to identify these accounts specifically on discussion platforms from the various signals that are characteristic of sock puppets and build a model to then identify them automatically. They have also characterized different kinds of sock puppet behavior and show that not all sock puppets are malicious (though keep in mind that they use a wider definition of what a sock puppet account is). They have found that it is easier to identify a pair of sock puppets (of a sock puppet group) from their behaviour with respect to each other than it is to find a single sock puppet in isolation.

 

Reflection:

It seems to me that though this paper specifically mentions that they have a broad definition of what a sock puppet is and distinguishes between pretenders and non-pretenders, the paper is geared more towards the study and identification of pretenders. The model that is built seems to be better trained at identifying the deceptive kinds of sock puppets (specifically, pairs of deceptive sock puppets in the same group), given the features it uses to identify them. I think that is fair, since the paper mentions that most sock puppets are used for deception and identifying them is of high benefit to the discussion platform. But I feel that if the authors were going to discuss non-pretenders too, they should have been explicit about their goals with regards to the detection they are trying to do. Just stating “Our previous analysis found that sockpuppets generally contribute worse content and engage in deceptive behavior.” seems to go against their earlier and later statements about non-pretenders and seems to lump them together with the pretenders. I know I’m rambling a bit here, but it stood out to me. I would say separate out the discussion of non-pretenders and only briefly mention them, and focus exclusively on pretenders.

Following that train of thought, let’s talk about non-pretenders. I like the idea of having multiple online identities and using different identities for different purposes. I believe it was something more widely practiced in the earlier era, when everyone was warned not to use their real identity on the internet (but in the era of Facebook and Instagram and personal branding, everyone seems to have gravitated towards using one identity – their real identity). It’s nice to see that there are still some holdouts, and it’s something I would like to see studied. I want to ask questions like: Why use different identities? How many explicitly try to keep their separate identities separate (i.e., not allow anyone to connect their different identities)? How would you identify non-pretender sock puppets, since they don’t tend to share the same features as the pretenders that the model seems (at least to me) to be optimised for? Perhaps one could compare the writing styles of suspected sockpuppets using word2vec, or look at what times they post (i.e., looking at the time period in which they are active rather than at how quickly they post one after another, as you would for a pretender).
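
As a rough illustration of the writing-style comparison (using character n-gram TF-IDF as a simpler stand-in for the word2vec idea above; the function and example texts are hypothetical, not a validated detector):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def style_similarity(posts_a: list, posts_b: list) -> float:
    """Crude stylistic similarity between two accounts' concatenated posts."""
    docs = [" ".join(posts_a), " ".join(posts_b)]
    vectors = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5)).fit_transform(docs)
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])

# Accounts with similar phrasing score closer to 1.0.
print(style_similarity(["honestly this policy is a disaster imo"],
                       ["honestly that ruling is a disaster imo"]))
```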

The authors have pointed out that sock puppets share some linguistic similarities with trolls. This takes me back to the paper on antisocial users [1] we read last week. Obviously, not all sock puppets are trolls. But I think an interesting question is how many of the puppet masters fall under the umbrella of antisocial users, seeing as they are somewhat similar. The antisocial paper focused on single users, but what if you threw the idea of sock puppets into the mix? I think with the findings of this paper and that paper, you would be able to more easily identify the antisocial users who use sock puppet accounts. But they are probably only a fraction of all antisocial users, so it may or may not be very helpful in the larger-scale problem of identifying all antisocial users.

One final thing I thought about was studying and identifying teams of different users who post and interact with each other similar to how sock puppet accounts work. How would identifying these be different? I think they might have similar activity feature values to sock puppets and at least slightly different post features. Will having different users rather than the same user post, interact, and reinforce each other muddy the waters enough that ordinary users, moderators, and algorithms can’t identify them and kick them out? Can they muddy the waters even further by having each user on the team run their own sock puppet group, where the sock puppets within a group avoid interacting with each other like a regular pretender sock puppet group would, and instead interact only with the sock puppets of the other users on their team? I think the findings of this paper could effectively be used to identify these cases as well, with some modification, since in this case the teams of users are essentially doing the same thing as single-user sock puppets. But I wonder what these teams could do to bypass that. Perhaps they could write longer and different posts than a usual sock puppet to bypass the post-feature tests. Perhaps post at different times and interact more widely to fool the activity tests. The model in this paper could provide a basis but would definitely need tweaks to be used effectively.

 

[1] Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM, 2015.


Reflection #3 – [09/04] – [Parth Vora]

[1] Mitra, Tanushree, Graham P. Wright, and Eric Gilbert. “A parsimonious language model of social media credibility across disparate events.” Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017.

 

Summary

This paper proposes a model to quantify the perceived credibility of posts shared over social media. Using Twitter, the authors collect data on 1,377 topics spanning three months and comprising a total of 66 million tweets. They study how the language used can define an event’s perceived credibility. Mechanical Turkers were asked to label the posts on a 5-point Likert scale ranging from -2 to 2. The authors then defined four credibility classes and 15 linguistic measures as indicators of perceived credibility level. The rest of the paper discusses the results and what can be learned from them.
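
For intuition, a minimal sketch of this modeling setup on placeholder data; the paper’s penalized ordinal regression is approximated here with scikit-learn’s L1-penalized logistic regression, and the feature matrix is random stand-in data rather than the real CREDBANK features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 21))    # placeholder: linguistic measures + controls per event
y = rng.integers(0, 4, size=200)  # placeholder: one of four credibility classes

# The L1 penalty shrinks weak predictors to zero, keeping the model parsimonious.
model = LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=5000)
print(cross_val_score(model, X, y, cv=5).mean())
```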

 

Reflections

There is large-scale penetration of social media in our daily lives. It has become a platform for people to share their views, opinions, feelings, and thoughts. Individuals use language that is frequently accompanied by ambiguity and figures of speech, which makes it difficult even for humans to comprehend. With any event occurring around the world, tweets are one of the first things that start floating around on the internet. This calls for a credibility check.

Use of Twitter is very apt because the language of Twitter is brief due to its character limitations, which makes it ideal to study language features. Although the paper performs a thorough analysis to create a feature set that can help quantify credibility, there are many features that can improve the model.

Social media is filled with informal language, which is hard to process from a natural language processing point of view. It is unclear how the model deals with this. For example, the word “happy” carries a positive sentiment, while the word “happppyyyyyyyyy” conveys an even more positive sentiment. The paper considers punctuation marks like question marks and quotations but fails to acknowledge a very important sentence modifier – the exclamation mark. It serves as an emotion booster. For example, observe the difference between the sentences “the royals shutdown the giants” and “the royals shutdown the giants !!!!”.
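
A toy sketch of how such informal-language signals could be normalized and captured as features (the function name and feature choices are illustrative assumptions, not part of the paper):

```python
import re

def informal_features(text: str) -> dict:
    """Capture elongation ("happppyyyyyyyyy") and exclamation boosting as features."""
    tokens = text.lower().split()
    # Collapse any character repeated 3+ times down to two ("happppyyy" -> "happyy").
    collapsed = [re.sub(r"(.)\1{2,}", r"\1\1", t) for t in tokens]
    return {
        "has_elongation": any(c != t for c, t in zip(collapsed, tokens)),
        "exclamation_count": text.count("!"),
        "normalized_text": " ".join(collapsed),
    }

print(informal_features("the royals shutdown the giants !!!!"))
```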

Twitter has evolved over the years, and with it the way people use it. Real-time event reporting now spans multiple tweets, where each reply is a continuation of the previous tweet. Tweets reporting news also include images or some other visual media to give a better idea of the ground reality. The credibility of the author and of the people retweeting also changes the perceived credibility. For example, if someone with a high follower-to-following ratio makes a tweet or retweets some other tweet, its credibility will naturally increase. Can we include these changes to better understand perceived credibility?

Subjective words like the ones associated with trauma, anxiety, fear, surprise, and disappointment are observed to contribute to credibility. This raises the question: can emotion detection in these tweets contribute to perceived credibility? Having worked with emotional intelligence on Twitter data, I believe we could come up with complex feature sets that consider the emotion of the tweet as well as the topic at hand, and study how emotions play a role in estimating credibility.

One contradiction I observed is that when hedge words like “to my knowledge” are used in the tweet, they contribute to higher perceived credibility, whereas the use of evidential words like “reckon” results in lower perceived credibility. In regular language, both can be used interchangeably, yet evidently one increases credibility while the other decreases it. Why would this be the case?

There is one more general trend in the observations which is intriguing. In most cases, the credibility of a post is high if it tends to agree with the situation at hand. Does that mean a post will have high credibility if it agrees with a fake event and low credibility if it disagrees with it?

In conclusion, the paper does an exhaustive study of the different linguistic cues that change the perceived credibility of posts and discusses in detail how credibility changes from one language feature to another. However, considering how social media has evolved over time, many new amendments can be made to this existing model to create an even more robust and general model.

 


Reflection #3 – [9/4] – [Viral Pasad]

  1. “A Parsimonious Language Model of Social Media Credibility Across Disparate Events.” Tanushree Mitra, Graham P. Wright, Eric Gilbert
    Mitra et al. put forth a study assessing the credibility of events and the related content posted on social media websites like Twitter. They present a parsimonious model that maps linguistic cues to perceived credibility level, and the results show that certain linguistic categories and their associated phrases are strong predictors of credibility surrounding disparate social media events.

    The model captures text used in tweets covering 1,377 events (66M tweets); labeled credibility annotations were obtained using Amazon Mechanical Turk. The authors trained a penalized logistic regression employing 15 linguistic and other control features to predict the credibility (Low, Medium, or High) of event streams.

    The authors mention that the model is not deployable. However, the study is a great base for future work on this topic. It is a simple model that deals only with linguistic cues, and the penalized ordinal regression seems like a prudent choice; coupled with other parameters such as location and timestamp, among other things, it could be designed as a complete system in itself.

    • The study mentions that the content of a tweet is more reliable than its source when it comes to assessing credibility. This would hold true almost always, except when the account posting a certain news item/article is notorious for fake news or conspiracy theories. A simple additional classifier could weed out such outliers from general consideration.
    • A term used in the paper, ‘stealth advertisers’, stuck in my head and got me thinking about ‘stealth influencers’ masquerading as unbiased and reliable members of the community. They often use click-bait, and the linguistic cues they exhibit are generally extreme, such as “Best Gadget of the Year!!” or “Worst Decision of my Life”.
    • Their tweets may often fool a naive user/model looking for linguistic cues to assess credibility. This relates to the study by Flanagin and Metzger, as there are characteristics worthy of being believed and then there are characteristics likely to be believed [2]. This begs the question: is the use of linguistic cues to assess credibility on social media hackable?
    • Further, location-based context is a great asset for assessing credibility. Let me refer to the flash-flood thunderstorm warning issued recently in Blacksburg. A similar downpour or notification would not be taken as seriously in a place that experiences more intense rainfall. Thus location-based context can be a great marker in the estimation of credibility.
    • The authors included the number of retweets as a predictive measure; however, if the reputation/verified status/karma of the retweeters were factored in, the prediction might become a lot easier. This is because multiple trolls retweeting a sassy/fiery comeback is different from reputed users retweeting genuine news.
    • Another factor is that linguistic cues picked up from a certain region/community/discipline may not be generalizable, as every community has a different way of speaking online, with its own jargon and argot. The community here may refer to a different academic discipline or ethnicity. The point being that linguistic-cue knowledge has to be learned per community and cannot simply be transferred.

    [2] Flanagin, Andrew J., and Miriam J. Metzger. “Digital Media and Youth: Unparalleled Opportunity and Unprecedented Responsibility.”


Reflection #3 – [9/04] – Eslam Hussein

Tanushree Mitra, Graham P. Wright and Eric Gilbert “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”

Summary:

The authors in this paper did great work trying to measure the credibility of a tweet/message based on its linguistic features, and they also incorporated some non-linguistic ones. They built a statistical credibility classification model that depends on 15 linguistic features, which can be grouped into two main categories:

1- Lexicon-based: features that depend on lexicons built for special tasks (negation, subjectivity, …)

2- Non-lexicon-based: questions, quotations, etc.

They also included some control features used to measure the popularity of the content, such as the number of retweets and tweet length.

They used a credibility-annotated dataset, CREDBANK, to build and test their model, and several lexicons to measure the features of each tweet. Their model achieved 67.8% accuracy, which suggests that language usage has a considerable effect on the assessed credibility of a message.

 

Reflection:

1- I like how the authors addressed the credibility of information on social media from a linguistic perspective. They neglected the source-credibility factor when assessing the credibility of the information, citing studies that show information receivers pay more attention to content than to source. In my opinion, the credibility of the source is a very important feature that should have been integrated into their model. Most people tend to believe information delivered by credible sources and question information that comes from unknown sources.

2- I would like to see the results after training a deep learning model with this data and those features.

3- Although this study is a very important step in countering misinformation and rumors in social media, I wonder how people/groups who spread misinformation would misuse these findings and linguistically engineer their false messages in order to deceive such models. What other features could be added to prevent them from using language features to deceive their audience?

4- This work inspires me to study the linguistic features of rumors that were spread during the Arab Spring.

5- I find the following finding very interesting and deserving of further study: the authors found that the number of retweets was one of the top predictors of low perceived credibility, which means the higher the number of retweets, the less credible the tweet; retweets and replies with longer message lengths were associated with higher credibility scores. That finding reminds me of online misinformation and rumor attacks during the political conflict between Qatar and its neighboring countries, where paid online campaigns were organized to spread misinformation through Twitter, characterized by huge numbers of retweets without any further replies or comments, just retweets. How misleading numbers can be.


Reflection #3 – [9/04] – Dhruva Sahasrabudhe

Paper-

A Parsimonious Language Model of Social Media Credibility Across Disparate Events – Tanushree Mitra, Graham P. Wright, Eric Gilbert.

Summary-

The paper attempts to use the language of a Twitter post to detect and predict the credibility of a text thread. It focuses on a dataset of around 1,300 event streams, using data from Twitter. It relies solely on theory-guided linguistic category types, both lexicon-based (like conjunctions, positive and negative emotion, subjectivity, anxiety words, etc.) and non-lexicon-based (like hashtags, question tags, etc.). Using these linguistic categories, it creates variables corresponding to words belonging to each category. It then fits a penalized ordered logistic regression model to a dataset (CREDBANK), which contains perceived credibility information corresponding to each event thread, as determined by Amazon Mechanical Turk workers. It then tries to predict the credibility of a thread, and also to determine which linguistic categories are strong predictors of credibility, which are weak indicators, and which words among these categories are positively or negatively correlated with credibility.

 

Reflection-

The paper is thorough with its choice of linguistic categories, and acknowledges that there may be even more confounding variables, but some of the variables chosen do not intuitively seem like they would actually influence credibility assessments, e.g. question marks, hashtags. It does turn out, from the results, that these variables do not correlate with credibility judgements. Moreover, I fail to understand why the paper is using both average length of tweets and no. of words in the tweets as control variables. This seems strange, as both these variables are very obviously correlated, and thus will be redundant.

The appendix mentions that the Turkers were instructed to be knowledgeable about the topic. However, it seems that this strategy would make the credibility judgements susceptible to the biases of the individual labeler. The Turker will have preconceived notions about the event and its credibility, and it is not guaranteed that they will be able to separate that out from their assessment of the perceived credibility. This is a problem, since the study focuses on extracting information only from linguistic cues, without considering any external variables. For example, a labeler who believes global warming is a myth will be biased towards labeling a thread about global warming as less credible. This can perhaps be improved by assigning Turkers topics which they are neutral towards, or are not aware of.

The paper uses a logistic regression classifier, which, of course, is a fairly simplistic model that cannot map a very complex function over the feature space. Using penalized logistic regression makes sense given that the number of features was almost 9 times the number of event threads, but a more complex model, like a shallow neural network, could be used if more data were collected.
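
A small sketch of what the shallow-network alternative could look like on stand-in data (random placeholders, not the CREDBANK features; a single hidden layer with L2 regularization keeps the model modest when features rival the number of event threads):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1300, 50))    # placeholder per-event linguistic feature matrix
y = rng.integers(0, 3, size=1300)  # placeholder credibility labels

# One hidden layer keeps it "shallow"; alpha adds L2 regularization against overfitting.
mlp = MLPClassifier(hidden_layer_sizes=(32,), alpha=1e-2, max_iter=2000, random_state=0)
print(cross_val_score(mlp, X, y, cv=5).mean())
```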

The paper has many interesting findings about the correlation of words and linguistic categories with credibility. I found it fascinating that subjective phrases associated with newness/uniqueness, complexity/weirdness, and even certain strongly negative words were positively correlated with credibility. It was also surprising that boosters (an expression of assertiveness) were negatively correlated if in the original tweet, and hedges (an expression of uncertainty) were positively correlated, if in the original tweet. The inversion in correlation of the same category of words, based on whether they appeared in the original tweet or in the replies, speaks to a fundamental truth of communication, where different expectations are placed on the initiator of the communication than on the responder.

Finally, the paper states that this system would be useful for early detection of credibility of content, while other systems would need time for the content to spread, to analyze user behavior to help them make predictions. I believe that in today’s world, where information spreads to billions of users within minutes, the time advantage gained by only using linguistic cues would not be enough to offset the drawbacks of not considering information dissemination and user behavior patterns. However, the paper has a lot of insights to offer social scientists or linguistics researchers.



Reflection #3 – [9/4] – [Mohammad Hashemian]

Mitra, T., Wright, G. P., & Gilbert, E. (2017, February). A parsimonious language model of social media credibility across disparate events. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 126-145). ACM.

 

Today, people can easily share literally everything they want through social networks, and a huge amount of data is produced every day. But how can we distinguish true information from rumors in social networks? To address the information credibility problem in social networks, the authors of this paper built a model to predict perceived credibility from language, using a corpus of Twitter messages called CREDBANK.

As the authors mention, three components of information credibility have been defined: message credibility, media credibility, and source credibility. The authors state that their proposed credibility assessment focuses more on information quality, and they did not consider source credibility. Referring to several studies, they express their reasons for emphasizing linguistic markers instead of the source of information. But some questions came to my mind when I read their reasons. They quote Sundar: “it is next to impossible for an average Internet user to have a well-defined sense of the credibility of various sources and message categories on the Web…”. I have no doubt about what Sundar mentioned, but if it is not possible for an average Internet user to evaluate the credibility of various sources, is it possible for that user to evaluate the content to assess the credibility of the information? In my opinion, assessing the credibility of sources in social networks can be much easier for an average Internet user than assessing the credibility of the content.

Users usually trust a social media user who is more popular: for example, the more followers you have (popularity), the more trustworthy you are taken to be. To measure the popularity, or in other words the credibility, of a user, there are several other approaches, such as the number of retweets and mentions on Twitter or the number of viewers and likes on Facebook (and YouTube). I agree with Sundar when he talks about the multiple layers of source in social networks, but I think most of the time popular users share reliable information. So, even if, for example, a tweet has been passed along several times, it is possible to assess the credibility of the tweet by evaluating the users who have retweeted it.

I have also been thinking about using these approaches to spot fake reviews. The existence of fake reviews even on Amazon or Yelp is undeniable. Although Amazon repeatedly claims that more than 99 percent of its users’ reviews are real (written by real users), several reliable studies suggest otherwise.

There are many websites on the Internet where sellers look for shoppers to give positive feedback in exchange for money or other compensation. The existence of paid reviews has made customers suspicious about the credibility of reviews. One approach to spotting fake reviews is evaluating the credibility of the sources (reviewers). Does a given reviewer leave only positive reviews? Do they tend to focus on products from unknown companies? Ranking users on their credibility can be considered a solution for evaluating the credibility of reviews. Amazon has taken this approach by awarding badges to customers based on the type of their contributions on Amazon.com, such as sharing reviews frequently. However, it seems that these solutions have not been sufficient.

I think employing the same approach demonstrated in this paper to spot fake reviews could be useful. I still believe that source credibility plays a very important role in information credibility; however, could we have a better evaluation of the information by combining these two approaches?
