Reflection #4 – [09/06] – Subil Abraham

Summary:

This paper analyzes the phenomenon of sock puppets – multiple accounts controlled by a single user. The authors study these accounts specifically on discussion platforms, identify the signals that are characteristic of sock puppets, and build a model to detect them automatically. They also characterize different kinds of sock puppet behaviour and show that not all sock puppets are malicious (though keep in mind that they use a wider definition of what a sock puppet account is). They find that it is easier to identify a pair of sock puppets (of a sock puppet group) from their behaviour with respect to each other than it is to find a single sock puppet in isolation.

 

Reflection:

It seems to me that although this paper explicitly adopts a broad definition of what a sock puppet is and distinguishes between pretenders and non-pretenders, it is geared more towards the study and identification of pretenders. The model that is built seems better trained at identifying the deceptive kinds of sock puppets (specifically, pairs of deceptive sock puppets from the same group), given the features it uses to identify them. I think that is fair, since the paper mentions that most sock puppets are used for deception and identifying them is of high benefit to the discussion platform. But if the authors were going to discuss non-pretenders too, they should be explicit about their goals with regard to the detection they are trying to do. Simply stating “Our previous analysis found that sockpuppets generally contribute worse content and engage in deceptive behavior.” seems to go against their earlier and later statements about non-pretenders and clumps them together with the pretenders. I know I’m rambling a bit here, but it stood out to me. I would separate out the discussion of non-pretenders and only briefly mention them, focusing exclusively on pretenders.

Following that train of thought, let’s talk about non-pretenders. I like the idea of having multiple online identities and using different identities for different purposes. I believe this was more widely practiced in the earlier era when everyone was warned not to use their real identity on the internet (but in the era of Facebook and Instagram and personal branding, everyone seems to have gravitated towards using one identity – their real identity). It’s nice to see that there are still some holdouts, and it’s something I would like to see studied. I want to ask questions like: Why use different identities? How many explicitly try to keep their separate identities separate (i.e. not allow anyone to connect them)? How would you identify non-pretender sock puppets, since they don’t tend to share the features of the pretenders that the model seems (at least to me) to be optimised for? Perhaps one could compare the writing styles of suspected sockpuppets using word2vec, or look at what times they post (i.e. the time period in which they are active, rather than how quickly they post one after another as you would for a pretender). A rough sketch of both ideas follows.
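Here is a minimal sketch of those two checks in Python, assuming a pretrained word2vec model on disk and simple whitespace tokenization (both placeholder choices of mine, not anything from the paper): account similarity via averaged word vectors, and overlap of active hours rather than reply latency.

```python
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # assumed pretrained vectors

def style_vector(posts):
    """Average the word vectors of all in-vocabulary tokens across an account's posts."""
    tokens = [t for p in posts for t in p.lower().split() if t in wv]
    return np.mean([wv[t] for t in tokens], axis=0)

def style_similarity(posts_a, posts_b):
    """Cosine similarity of two accounts' averaged style vectors."""
    a, b = style_vector(posts_a), style_vector(posts_b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def active_hour_overlap(hours_a, hours_b):
    """Histogram intersection over hours of day: when the accounts post, not how fast."""
    ha, _ = np.histogram(hours_a, bins=24, range=(0, 24), density=True)
    hb, _ = np.histogram(hours_b, bins=24, range=(0, 24), density=True)
    return float(np.minimum(ha, hb).sum())
```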

The authors point out that sock puppets share some linguistic similarities with trolls. This takes me back to the paper on antisocial users [1] we read last week. Obviously, not all sock puppets are trolls. But I think an interesting question is how many of the puppet masters fall under the umbrella of antisocial users, seeing as they are somewhat similar. The antisocial paper focused on single users, but what if you threw the idea of sock puppets into the mix? I think with the findings of this paper and that paper, you would be able to more easily identify the antisocial users who use sock puppet accounts. But they are probably only a fraction of all antisocial users, so it may or may not be very helpful for the larger-scale problem of identifying all the antisocial users.

One final thing I thought about was studying and identifying teams of different users who post and interact with each other the way sock puppet accounts do. How would identifying these be different? I think they might have similar activity feature values to sock puppets and at least slightly different post features. Will having different users, rather than the same user, post, interact, and reinforce each other muddy the waters enough that ordinary users, moderators, and algorithms can’t identify them and kick them out? Could they muddy the waters even further by having each user on the team maintain their own sock puppet group, where the sock puppets within a group avoid interacting with each other (as a regular pretender sock puppet group would) and instead interact only with the sock puppets of the other users on their team? I think the findings of this paper could be used effectively to identify these cases as well with some modification, since the teams of users are essentially doing the same thing as single-user sock puppets. But I wonder what these teams could do to bypass that. Perhaps they could write longer and more varied posts than a usual sock puppet to bypass the post-feature tests. Perhaps they could post at different times and interact more widely to fool the activity tests. The model in this paper could provide a basis but would definitely need tweaks to be effective here.

 

[1] Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM, 2015.


Reflection #3 – [09/04] – [Parth Vora]

[1] Mitra, Tanushree, Graham P. Wright, and Eric Gilbert. “A parsimonious language model of social media credibility across disparate events.” Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017.

 

Summary

This paper proposes a model to quantify the perceived credibility of posts shared over social media. Using Twitter, the authors collect data on 1377 topics spanning three months and comprising a total of 66 million tweets. They study how the language used can define an event’s perceived credibility. Mechanical Turk workers were asked to label the posts on a 5-point Likert scale ranging from -2 to 2. The authors then defined four credibility classes and 15 linguistic measures as indicators of perceived credibility level. The rest of the paper discusses the results and what can be learned from them.

 

Reflections

Social media has penetrated deeply into our daily lives. It has become a platform for people to share their views, opinions, feelings, and thoughts. Individuals use language that is frequently accompanied by ambiguity and figures of speech, which makes it difficult even for humans to comprehend. For any event occurring around the world, tweets are among the first things to start floating around the internet. This calls for a credibility check.

The use of Twitter is very apt because the language of Twitter is brief due to its character limit, which makes it ideal for studying language features. Although the paper performs a thorough analysis to create a feature set that can help quantify credibility, there are many features that could improve the model.

Social media is filled with informal language, which is hard to process from a natural language processing point of view, and it is unclear how the model deals with it. For example, the word “happy” has a positive sentiment, while the word “happppyyyyyyyyy” conveys an even more positive sentiment. The paper considers punctuation marks like question marks and quotations but fails to acknowledge a very important sentence modifier – the exclamation mark. It serves as an emotion booster: observe the difference between the sentences “the royals shutdown the giants” and “the royals shutdown the giants !!!!”.
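A minimal sketch of both points, assuming whitespace tokenization and a keep-at-most-two-repeats rule (my own choices for illustration, not anything proposed in the paper):

```python
import re

def normalize_elongation(text):
    # Reduce any character repeated 3+ times to two repeats
    # ("happppyyyyyyyyy" -> "happyy"); a dictionary or sentiment lexicon lookup
    # can then recover the base word, while the flag preserves the fact that
    # the writer intensified it.
    collapsed = re.sub(r"(\w)\1{2,}", r"\1\1", text)
    return collapsed, collapsed != text

def exclamation_features(text):
    # Treat exclamation marks as an emotion booster, alongside the question
    # marks and quotations the paper already counts.
    runs = re.findall(r"!+", text)
    return {"num_exclam": text.count("!"),
            "longest_exclam_run": max((len(r) for r in runs), default=0)}

print(normalize_elongation("happppyyyyyyyyy"))                      # ('happyy', True)
print(exclamation_features("the royals shutdown the giants !!!!"))  # {'num_exclam': 4, 'longest_exclam_run': 4}
```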

Twitter has evolved over the years, and with it the way people use it. Real-time event reporting now spans multiple tweets, where each reply is a continuation of the previous tweet. Tweets reporting news also often include images or some other visual media to give a better idea of the ground reality. The credibility of the author and of the people retweeting also changes the perceived credibility. For example, if someone with a high follower-to-following ratio posts or retweets a tweet, its credibility will naturally increase. Can we include these changes to better understand perceived credibility? A sketch of such extra features follows.
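As a rough illustration (not part of the paper's feature set), such author-side signals could be added as extra control features. The field names below follow the Twitter REST API's tweet JSON, and the whole function is an assumption of mine:

```python
def author_features(tweet):
    # `tweet` is assumed to be a dict shaped like the Twitter API's tweet JSON.
    user = tweet["user"]
    followers = user.get("followers_count", 0)
    following = user.get("friends_count", 0)
    return {
        # Popularity of the author, as suggested above.
        "follower_following_ratio": followers / max(following, 1),
        "is_verified": int(user.get("verified", False)),
        # Whether the tweet carries visual media.
        "has_media": int(bool(tweet.get("entities", {}).get("media"))),
        # Whether the tweet continues the author's own previous tweet (a thread).
        "is_thread_continuation": int(
            tweet.get("in_reply_to_user_id") == user.get("id")
            and tweet.get("in_reply_to_status_id") is not None
        ),
    }
```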

Subjective words like those associated with trauma, anxiety, fear, surprise, and disappointment are observed to contribute to credibility. This raises the question: can emotion detection in these tweets contribute to perceived credibility? Having worked with emotion analysis on Twitter data, I believe we could come up with richer feature sets that consider the emotion of the tweet as well as the topic at hand, and study how emotions play a role in estimating credibility.

One contradiction I observed is that when hedge words like “to my knowledge” are used in a tweet, they contribute to higher perceived credibility, whereas the use of evidential words like “reckon” results in lower perceived credibility. In regular language the two can be used interchangeably, yet evidently one increases credibility while the other decreases it. Why would this be the case?

There is one more general trend in the observations that is intriguing. In most cases, the credibility of a post is high if it tends to agree with the situation at hand. Does that mean a post will have high credibility if it agrees with a fake event and low credibility if it disagrees with it?

In conclusion, the paper does an exhaustive study of the different linguistic cues that change the perceived credibility of posts and discusses in detail how credibility changes from one language feature to another. However, considering how social media has evolved over time, many amendments could be made to this existing model to create an even more robust and general one.

 


Reflection #3 – [9/4] – [Viral Pasad]

  1. “A Parsimonious Language Model of Social Media Credibility Across Disparate Events.” Tanushree Mitra, Graham P. Wright, Eric Gilbert
    Mitra et al. put forth a study assessing the credibility of events and the related content posted on social media websites like Twitter. They present a parsimonious model that maps linguistic cues to perceived credibility levels, and the results show that certain linguistic categories and their associated phrases are strong predictors of perceived credibility across disparate social media events.

    The model captures the text of tweets covering 1377 events (66M tweets); labeled credibility annotations were obtained from workers on Amazon Mechanical Turk. The authors trained a penalized logistic regression employing 15 linguistic and other control features to predict the credibility (Low, Medium, High, or Perfect) of event streams.
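    As a rough stand-in for that setup (the paper fits a penalized ordinal logistic regression; the sketch below substitutes an L1-penalized multinomial logistic regression from scikit-learn, and the feature matrix is random placeholder data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 24))      # placeholder: 15 linguistic + 9 control features per event
y = rng.integers(0, 4, size=200)    # placeholder labels: 0=Low, 1=Medium, 2=High, 3=Perfect

# The L1 penalty keeps the model parsimonious by driving most coefficients to zero.
clf = LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=5000)
print(cross_val_score(clf, X, y, cv=5).mean())   # ~chance accuracy on this random data
```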

    The authors mention that the model is not deployable as is. However, the study is a great base for future work on this topic. It is a simple model that deals only with linguistic cues, and the penalized ordinal regression seems like a prudent choice, but coupled with other parameters such as location and timestamp, among other things, it could be designed into a complete system in itself.

    • The study mentions that the content of a tweet is more reliable than its source when it comes to assessing credibility. This would hold true almost always, except when the account posting a certain news item/article is notorious for fake news or conspiracy theories. A simple additional classifier could weed out such outliers from general consideration.
    • A term used in the paper, ‘stealth advertisers’, stuck in my head and got me thinking about ‘stealth influencers’ masquerading as unbiased and reliable members of the community. They often use click-bait and linguistic cues that tend toward extremes, such as “Best Gadget of the Year!!” or “Worst Decision of my Life”.
    • Their tweets may often fool a naive user/model looking for linguistic cues to assess credibility. This relates to the study by Flanagin and Metzger, as there are characteristics worthy of being believed and then there are characteristics likely to be believed [2]. This begs the question: is the use of linguistic cues to affect credibility on social media hackable?
    • Further, location-based context is a great asset for assessing credibility. Let me refer to the flash-flood thunderstorm warning issued recently in Blacksburg. A similar downpour or notification would not be taken as seriously in a place that experiences more intense rainfall. Thus location-based context can be a great marker in the estimation of credibility.
    • The authors included the number of retweets as a predictive measure; however, if the reputation/verified status/karma of the retweeters were factored in, prediction might become a lot easier. This is because multiple trolls retweeting a sassy/fiery comeback is different from reputed users retweeting genuine news.
    • Another factor is that linguistic cues picked up from a certain region/community/discipline may not be generalizable, as every community has a different way of speaking online, with its own jargon and argot. The community here may refer to a different academic discipline or ethnicity. The point being, linguistic-cue knowledge has to be learned and cannot simply be transferred.

    [2] Andrew J. Flanagin and Miriam J. Metzger, “Digital Media and Youth: Unparalleled Opportunity and Unprecedented Responsibility.”


Reflection #3 – [9/04] – Eslam Hussein

Tanushree Mitra, Graham P. Wright and Eric Gilbert “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”

Summary:

The authors of this paper did great work trying to measure the credibility of a tweet/message based on its linguistic features, and they also incorporated some non-linguistic ones. They built a statistical credibility classifier that depends on 15 linguistic features, which fall into two main categories:

1- Lexicon based: features that depend on lexicons built for special tasks (negation, subjectivity …)

2- Non-lexicon based: questions, quotations, etc.

They also included some control features that measure the popularity of the content, such as the number of retweets, tweet length, etc.

They used the credibility-annotated CREDBANK dataset to build and test their model, and they also used several lexicons to measure the features of each tweet. Their model achieved 67.8% accuracy, which suggests that language usage has a considerable effect on the assessed credibility of a message.

 

Reflection:

1- I like how the authors addressed the credibility of information on social media from a linguistic perspective. They neglected the source credibility factor when assessing the credibility of information, citing studies which show that information receivers pay more attention to content than to its source. In my opinion, the credibility of the source is a very important feature that should have been integrated into their model. Most people tend to believe information delivered by credible sources and question information that comes from unknown sources.

2- I would like to see the results after training a deep learning model with this data and those features.

3- Although this study is a very important step in countering misinformation and rumors on social media, I wonder how the people/groups who spread misinformation would misuse these findings and linguistically engineer their false messages in order to deceive such models. What other features could be added to prevent them from using language features to deceive their audience?

4- This work inspires me to study the linguistic features of rumors that have been spread during the Arab spring.

5- I find the following finding very interesting and deserving of further study: the authors found that the number of retweets was one of the top predictors of low perceived credibility – meaning the higher the number of retweets, the less credible the tweet – while retweets and replies with longer message lengths were associated with higher credibility scores. That finding reminds me of the online misinformation and rumor attacks during the political conflict between Qatar and its neighboring countries, where paid online campaigns were organized to spread misinformation through Twitter, characterized by huge numbers of retweets without any further replies or comments – just retweets. How numbers can be misleading.


Reflection #3 – [9/04] – Dhruva Sahasrabudhe

Paper-

A Parsimonious Language Model of Social Media Credibility Across Disparate Events – Tanushree Mitra, Graham P. Wright, Eric Gilbert.

Summary-

The paper attempts to use the language of a Twitter post to detect and predict the credibility of a text thread. It focuses on a dataset of around 1300 event streams, using data from Twitter. It relies solely on theory-guided linguistic category types, both lexicon based (like conjunctions, positive and negative emotion, subjectivity, anxiety words, etc.) and non-lexicon based (like hashtags, question tags, etc.). Using these linguistic categories, it creates variables corresponding to words belonging to each category. It then fits a penalized ordered logistic regression model to a dataset (CREDBANK) which contains perceived credibility information for each event thread, as determined by Amazon Mechanical Turk. It then tries to predict the credibility of a thread, and also determines which linguistic categories are strong predictors of credibility, which ones are weak indicators, and which words within these categories are positively or negatively correlated with credibility.

 

Reflection-

The paper is thorough with its choice of linguistic categories and acknowledges that there may be even more confounding variables, but some of the variables chosen do not intuitively seem like they would actually influence credibility assessments, e.g. question marks and hashtags. It does turn out, from the results, that these variables do not correlate with credibility judgements. Moreover, I fail to understand why the paper uses both the average length of tweets and the number of words in the tweets as control variables. This seems strange, as the two are very obviously correlated and thus redundant.

The appendix mentions that the Turkers were instructed to be knowledgeable about the topic. However, it seems that this strategy would make the credibility judgements susceptible to the biases of the individual labeler. The Turker will have preconceived notions about the event and its credibility, and it is not guaranteed that they will be able to separate that out from their assessment of the perceived credibility. This is a problem, since the study focuses on extracting information only from linguistic cues, without considering any external variables. For example, a labeler who believes global warming is a myth will be biased towards labeling a thread about global warming as less credible. This can perhaps be improved by assigning Turkers topics which they are neutral towards, or are not aware of.

The paper uses a logistic regression classifier, which is, of course, a fairly simple model that cannot map a very complex function over the feature space. Using penalized logistic regression makes sense given that the number of features was almost 9 times the number of event threads, but a more complex model, like a shallow neural network, could be used if more data were collected.

The paper has many interesting findings about the correlation of words and linguistic categories with credibility. I found it fascinating that subjective phrases associated with newness/uniqueness and complexity/weirdness, and even certain strongly negative words, were positively correlated with credibility. It was also surprising that boosters (an expression of assertiveness) were negatively correlated if in the original tweet, while hedges (an expression of uncertainty) were positively correlated if in the original tweet. The inversion in correlation of the same category of words, based on whether they appeared in the original tweet or in the replies, speaks to a fundamental truth of communication: different expectations are placed on the initiator of the communication than on the responder.

Finally, the paper states that this system would be useful for early detection of credibility of content, while other systems would need time for the content to spread, to analyze user behavior to help them make predictions. I believe that in today’s world, where information spreads to billions of users within minutes, the time advantage gained by only using linguistic cues would not be enough to offset the drawbacks of not considering information dissemination and user behavior patterns. However, the paper has a lot of insights to offer social scientists or linguistics researchers.



Reflection #3 – [9/4] – [Mohammad Hashemian]

Mitra, T., Wright, G. P., & Gilbert, E. (2017, February). A parsimonious language model of social media credibility across disparate events. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 126-145). ACM.

 

Today, people can easily share literally everything they want through social networks, and a huge amount of data is produced every day. But how can we distinguish true information from rumors in social networks? To address the information credibility problem in social networks, the authors of this paper built a model to predict perceived credibility from language, using a corpus of Twitter messages called CREDBANK.

As the authors mention, three components of information credibility have been defined: message credibility, media credibility, and source credibility. The authors state that their proposed credibility assessment focuses more on information quality, and they did not consider source credibility. Referring to several studies, they express their reasons for emphasizing linguistic markers instead of the source of information. But some questions came to my mind when I read their reasons. They quote Sundar: “it is next to impossible for an average Internet user to have a well-defined sense of the credibility of various sources and message categories on the Web…”. I have no doubt about what Sundar mentioned, but if it is not possible for an average Internet user to evaluate the credibility of various sources, is it possible for that user to evaluate the content to assess the credibility of the information? In my opinion, assessing the credibility of sources in social networks can be much easier than assessing the credibility of the content for an average Internet user.

Users usually trust a social media user who is more popular. For example, the more followers you have (popularity), the more trustworthy you appear. To measure the popularity, or in other words the credibility, of a user, there are several other signals, such as the number of retweets and mentions on Twitter or the number of viewers and likes on Facebook (and YouTube). I agree with Sundar when he talks about the multiple layers of source in social networks, but I think most of the time popular users share reliable information. So, even if, for example, a tweet has been passed along several times, it is possible to assess the credibility of the tweet by evaluating the users who have retweeted it.

I have also been thinking about using these approaches to spot fake reviews. The existence of fake reviews even on Amazon or Yelp is undeniable. Although Amazon repeatedly claims that more than 99 percent of its users’ reviews are real (written by real users), several reliable studies show otherwise.

There are many websites on the Internet where sellers look for shoppers to give positive feedback in exchange for money or other compensation. The existence of paid reviews has made customers suspicious of the credibility of reviews. One approach to spotting fake reviews is evaluating the credibility of the sources (the reviewers). Does a given reviewer leave only positive reviews? Do they tend to focus on products from unknown companies? Ranking users by their credibility can be considered a way to evaluate the credibility of reviews. Amazon has taken this approach by awarding badges to customers based on the type of their contributions on Amazon.com, such as sharing reviews frequently. However, it seems that these solutions have not been sufficient.

I think employing the same approach demonstrated in this paper to spot fake reviews could be useful. I still believe that source credibility plays a very important role in information credibility; however, can we get a better evaluation of the information by combining these two approaches?


Reflection #3 – [9/4] – [Nitin Nair]

  1. T. Mitra, G. P. Wright, and E. Gilbert, “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”

Written language, having grown out of word of mouth, has for centuries been the primary mode for discourse and the transportation of ideas. Due to its structure and capacity it has shaped our view of the world. But, due to changing social landscapes, written language’s efficacy is being tested. The emergence of social media, the preference for short blobs of text, citizen journalism, the emergence of the cable reality show, I mean, the NEWS, and various other related occurrences are driving a change in the way we are informed about our surroundings. These not only affect our ability to quantify credibility but also inundate us with more information than one can wade through. In this paper, the author explores the idea of whether language from one social media website, Twitter, can be a good indicator of the perceived credibility of the text written.

The author tries to predict the credibility of news by creating a parsimonious model (one with a small number of input parameters) using penalized ordinal regression with the scores “low”, “medium”, “high” and “perfect.” The author uses the CREDBANK corpus along with other linguistic repositories and tools to build the model. The author picks modality, subjectivity, hedges, evidentiality, negation, exclusion and conjunction, anxiety, positive and negative emotions, boosters and capitalization, quotations, questions and hashtags as the linguistic features, while using the number of original tweets, retweets and replies, the average length of original tweets, retweets and replies, and the number of words in original tweets, retweets and replies as the control variables. Measures were also taken, such as using a penalized version of ordered logistic regression, to handle multicollinearity and sparsity issues. The author then goes on to rank and compare the different input variables listed above by their explanatory power.

One of the things I was unsure of after reading the paper is whether the author accounted for long tweets where the author of the tweet uses replies as a means to extend their own tweet. Eliminating these could make the use of the number of replies as a feature more credible. One could also see that the author does not account for spelling mistakes and the like; such a preprocessing step could improve the performance and reliability of the model.
It would be an interesting idea to test whether the method the author describes can be translated to other languages, especially languages that are linguistically different.
Language has been evolving ever since its inception. New slang and dialects add to this evolution. Certain social struggles and changes also have an impact on language use and vice versa. Given such a setting, is understanding credibility from language use a reliable method? It would be an interesting project to see whether these underlying lingual features have remained the same across time. One could pick out texts involving discourse from the past and see how the reliability of the model built by the author changes, if it does. But this method would need to account for data imbalance.
When a certain behaviour is penalized, the repressed always find a way back. This can also apply to the purveyors of fake news. They could game the system by using certain language constructs and words to evade it. Due to the way the system is built by the author, it could be susceptible to such acts. To avoid this, one could automate the feature selection: the model could routinely recalculate the importance of certain features while also adding new words to its dictionary.
Can a deep learning model be built to better the performance of credibility measurement? One could try building a sequential model, be it an LSTM or, even better, a TCN [2], to which vectors of the words in a tweet generated using word2vec [3] could be given as input, along with some attention mechanism or even [4] to give us an interpretable model. Care has to be taken that models, especially in this area, are interpretable, so as to avoid a lack of accountability in the system.
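As a very rough illustration of that idea, here is a minimal PyTorch sketch of an LSTM over pretrained word vectors (e.g. word2vec) with a linear head for the four credibility classes; the dimensions and the absence of attention are my own simplifications, not the author's proposal:

```python
import torch
import torch.nn as nn

class CredibilityLSTM(nn.Module):
    def __init__(self, embed_dim=300, hidden_dim=128, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):                 # x: (batch, seq_len, embed_dim) word2vec vectors
        _, (h_n, _) = self.lstm(x)        # final hidden state: (1, batch, hidden_dim)
        return self.head(h_n.squeeze(0))  # logits over Low/Medium/High/Perfect

model = CredibilityLSTM()
dummy_batch = torch.randn(8, 30, 300)     # 8 tweets, 30 tokens each, 300-dim vectors
print(model(dummy_batch).shape)           # torch.Size([8, 4])
```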

[2] Colin Lea et al, “Temporal Convolutional Networks for Action Segmentation and Detection”
[3] T. Mikolov et al, “Distributed Representations of Words and Phrases and their Compositionality”
[4] Tian Guo et al, “An interpretable {LSTM} neural network for autoregressive exogenous model”


Reflection #3 – [9/4] – [Deepika Rama Subramanian]

  1. T. Mitra, G. P. Wright, and E. Gilbert, “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”

SUMMARY:

This paper proposes a model that aims to classify the credibility level of a post/tweet as one of Low, Medium, High or Perfect. This is based on 15 linguistic measures, including modality and subjectivity, which are lexicon-based measures, and questions and hashtags, which are not lexicon based. The study uses the CREDBANK corpus, which contains events, tweets, and crowdsourced credibility annotations. It takes into consideration not only the original tweet but also retweets and replies to the original tweet, along with other parameters such as tweet length. The penalized ordinal regression model shows that several linguistic factors have an effect on perceived credibility, most of all subjectivity, followed by positive and negative emotions.

REFLECTION:

  1.  The first thing I was concerned about was tweet length, which was set as a control. We have, however, discussed in the past how shorter tweets tend to be perceived as truthful because the tweeter wouldn’t have much time to type a tweet while in the middle of a major event. The original tweet length itself negatively correlated with perceived credibility.
  2. The language itself is constantly evolving; wouldn’t we have to continuously retrain with a newer lexicon as time goes by? Ten years ago, the words ‘dope’ and ‘swag’ (nowadays used interchangeably with amazing or wonderful) would have meant very different things.
  3. A well-known source is one of the most credible ways of getting news offline. Perhaps combining the model with one that tests perceived credibility based on source could give us even better results. Twitter has some select verified accounts that have higher credibility than others. The platform could look to assign something akin to karma points for accounts that have in the past given out only credible information.
  4. This paper has clearly outlined that some words evoke the sense of a certain tweet being credible more than some others. Could these words be intentionally used by miscreants to seem credible and spread false information? Since this model is lexicon based, it is possible that the model cannot automatically adjust for it.
  5. One observation that initially irked me in this study was that negative emotion was tied to low credibility. This seems about right when we consider that the Kubler-Ross model’s first step is denial. If this is the case, I first wondered how anyone was ever going to be able to deliver bad news to the world. However, taking a closer look, the words with a negative correlation are specifically ones that are accusatory (cheat, distrust, egotist) as opposed to sad (missed, heartbroken, sobbed, devastate). While we may be able to get the word out about, say, a tsunami and be believed, outing someone as a cheat may be a little more difficult.


Reflection #3 – [09/04] – [Neelma Bhatti]

  1. Garrett, R.K. and Weeks, B.E., 2013, February. The promise and peril of real-time corrections to political misperceptions. In Proceedings of the 2013 conference on Computer supported cooperative work (pp. 1047-1058). ACM.
  2. Mitra, T., Wright, G.P. and Gilbert, E., 2017, February. A parsimonious language model of social media credibility across disparate events. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 126-145). ACM.

Summary and Reflection for paper 1

Article one talks about how people tend to be picky and choosy when it comes to rumors and their correction. They find news hard to believe if it doesn’t align with their preconceived notions, and find it even harder to make amends for proliferating false news if it does align with their agenda/beliefs. It presents plausible recommendations about weaving corrections into the user’s view in a fine-grained way so that they are more easily digestible and acceptable. I personally related to recommendation 2, about letting users know the risks associated with hanging on to the rumor, or their moral obligation to correct their views. However, do the same user profiling and preference-guessing algorithms work across sources of news other than the traditional ones, i.e. Twitter, CNN, etc.?

As delayed correction seemed to work better in most cases, could a system decide, based on a user’s profile, how likely they are to pass the news on, and present real-time corrections to users who tend to proliferate fake news faster than others, using a mix of all three recommendations presented in this paper?

 

Summary for paper 2

As long as there is a market for juicy gossip and the misinterpretation of events, rumors will keep spreading in one form or another. People have a tendency to anticipate, and readily believe, things which are either consistent with their existing beliefs or give an adrenaline rush without potentially bringing any harm to them. Article 2 talks about using language markers and cues to authenticate a news item or its source, which, when combined with other approaches to classifying credibility, can work as an early detector of false news.

Reflection and Questions

  • A credibility score could be maintained and publicly displayed for each user, starting from 0 and decreased every time the user is reported for posting or spreading misleading news (a toy sketch of this follows the list). Can such a credibility score be used to determine how factual someone’s tweets/posts are?
  • Can such a score be maintained for news items too?
  • Can a more generous language model be developed, one which also takes multilingual postings into account?
  • How can the number of words used in a tweet, retweets, and replies be an indicator of the authenticity of a news item?
  • Sometimes users place emoticons/emojis at the end of a tweet to indicate satire and mockery of the otherwise seriously portrayed news. Does the model include their effect on the authenticity of the news?
  • What about rumors posted via images?
  • So much propaganda is spread via videos or edited images on social media. Sometimes, all the textual news that follows is the outcome of a viral video or picture circulating around the internet. What mechanism can be developed to stop such false news from being rapidly spread and shared?
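As a toy sketch of the per-user score in the first bullet (the starting value of 0 and the fixed penalty per confirmed report are illustrative assumptions, not a worked-out design):

```python
from collections import defaultdict

class CredibilityLedger:
    """Publicly displayable per-user credibility score: starts at 0 and drops
    each time a post by the user is reported and confirmed as misleading."""

    def __init__(self, penalty=1.0):
        self.penalty = penalty
        self.scores = defaultdict(float)   # user_id -> score, defaults to 0

    def report_confirmed(self, user_id):
        self.scores[user_id] -= self.penalty

    def score(self, user_id):
        return self.scores[user_id]

ledger = CredibilityLedger()
ledger.report_confirmed("user42")
ledger.report_confirmed("user42")
print(ledger.score("user42"))   # -2.0
```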


Reflection #3 – [09/04] – [Bipasha Banerjee]

Today’s topic of discussion is Credibility and Misinformation online.

Mitra, Tanushree et al. (2017) – “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”- CSCW 2017 (126-145).

Summary

The paper focuses mainly on establishing the credibility of news on social media. The authors identified 15 theoretically grounded linguistic measures and used the CREDBANK corpus to construct a model that maps language to perceived levels of credibility. Credibility has been broadly described as believability, trust, and reliability, along with other related concepts; however, it has been treated as either subjective or objective depending on the area of expertise of the researcher. CREDBANK [1] is essentially a corpus of tweets, topics, events, and associated human credibility judgements, with credibility annotations on a 5-point scale (-2 to +2). The paper deals with the perceived credibility of reported Twitter news about a particular event, based on the proportion of “Certainly Accurate” annotations. The proportion Pca = (“Certainly Accurate” ratings of the event) / (total ratings for that event) was calculated, and an event was rated “Certainly Accurate” if its Pca fell in the “Perfect” credibility class (0.9 ≤ Pca ≤ 1). All events were assigned a credibility class from Low to Perfect (ranked Low ≤ Medium ≤ High ≤ Perfect). The linguistic measures were treated as potential predictors of perceived credibility; the candidate credibility markers were modality, subjectivity, hedges, evidentiality, negations, exclusions and conjunctions, anxiety, positive and negative emotions, boosters and capitalization, quotations, questions, and hashtags. Nine variables were used as controls, namely the number, average length, and number of words of original tweets, retweets, and replies. The regression technique used an alpha (=1) parameter to determine the distribution of weight among the variables. It was found that retweets and replies with longer message lengths were associated with higher credibility scores, whereas a higher number of retweets was correlated with lower credibility scores.
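A small sketch of that event-level labelling step (only the Perfect band, 0.9 ≤ Pca ≤ 1, is stated above; the two lower cut-offs here are placeholders of mine):

```python
def credibility_class(ratings, thresholds=(0.6, 0.8, 0.9)):
    # Pca = proportion of "Certainly Accurate" annotations for the event.
    pca = sum(r == "Certainly Accurate" for r in ratings) / len(ratings)
    low_cut, med_cut, perfect_cut = thresholds   # only the 0.9 bound comes from the summary
    if pca >= perfect_cut:
        return "Perfect"
    if pca >= med_cut:
        return "High"
    if pca >= low_cut:
        return "Medium"
    return "Low"

print(credibility_class(["Certainly Accurate"] * 28 + ["Uncertain"] * 2))  # "Perfect" (Pca ≈ 0.93)
```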

Reflection

It has become increasingly common for people to experience news through social media and with this comes the problem of the authenticity of that news. The paper dealt with few credibility markers which assessed the credibility of the particular post. It spoke about the variety of words used in the post and how they are perceived to be.

Firstly, I would like to point out that certain groups of people have their own jargon: millennials speak in a specific way, and a medical professional may use a certain technical language. This may be perceived as negative or dubious language, which may in turn reduce the perceived credibility. Does the corpus include a variety of informal terms and group-specific language in the database to avoid erroneous results?

Additionally, a statement in the paper says, “Moments of uncertainty are often marked with statements containing negative valence expressions.” However, negative expressions are also used to describe unfortunate events. Take the example of the missing plane MH370: people are likely to use negative emotion words while tweeting about that incident, which certainly doesn’t make it uncertain or less credible.

Although this paper dealt with the credibility of news in the social media realm, namely Twitter, the credibility of news is a valid concern for all forms of news sources. Can we apply this to television and print media as well? They are often accused of reporting unauthenticated news, or even of being biased in some cases. If a credibility score for such media were also measured, beyond the infamous TRP or rating, it would push these outlets towards being credible as well. It would force the news agencies to validate their sources, and this index or score would also help the readers or followers of a network judge the authenticity of the news being delivered.

[1] Mitra, Tanushree et al. (2015) – “CREDBANK: A Large-scale Social Media Corpus With Associated Credibility Annotations.”
