Reading:
[1] A Parsimonious Language Model of Social Media Credibility Across Disparate Events
Summary:
This paper examines the perceived credibility of information disseminated on social media platforms. To study it, the authors analyzed 66 million Twitter messages (tweets) associated with 1,377 events that occurred over a period of 3 months, between October 2014 and February 2015. To examine these tweets from a linguistic vantage point, the authors defined fifteen linguistic dimensions and used them to build a model that “maps language cues to perceived credibility.” In addition to the dimensions themselves, the authors highlighted the importance of particular phrases within them. To establish the credibility of the tweets, the authors ran experiments in which subjects were asked to rate a tweet on a 5-point Likert scale ranging from -2 (certainly inaccurate) to +2 (certainly accurate). Alongside the experimental ratings, the linguistic dimensions, and the phrases identified within those dimensions, the authors employed nine control variables to account for the effect of content popularity. Combining these linguistic and statistical analyses yielded a parsimonious language model, although the authors warn against using it on its own. They nevertheless argue that the language model is an important step towards a fully autonomous system.
Reflection and Questions:
I was surprised to learn that specific words, writing styles, and sentence constructions can alter the perceived credibility of a post made on social media, partly because it's a new finding for me and partly because, as an engineer, I have never really paid attention to how language is formed, only to the facts in the text. Throughout the paper the authors discuss the credibility of the post/tweet; however, in my understanding it is the source of the information that calls for “credibility,” while the text presented by the source calls for “accuracy” and “reliability.” The authors write that “words indicating positive emotion were correlated with higher perceived credibility;” the question then arises: what about posts bearing bad news? Take the death of a world leader, for instance: that news will not carry any “positive emotion.” While reading the paper I came across a sentence stating that disbelief elicits ambiguity, which I disagree with. Disbelief can be expressed in a variety of combinations, none of which I think elicits ambiguity.
Reading the paper, I couldn’t help but wonder how this model handles slang. A credible post could well contain slang, because in my view millennials are more likely to trust a post written in colloquial language than in formal language, unless the source of the information is associated with mainstream media. That question leads to the next: how are slang in different countries and the usage of English in other countries taken into consideration? I ask because specific words are interpreted differently in different countries, or in different regions of the same country, which results in different perceived credibility. While we are on the topic of how language is interpreted in different regions, a further question arises: is this model suitable for all the languages in the world (with slight alterations), or would different languages require different models? The main reason for this question is that people tweet in many languages, and a language barrier could change the perceived credibility of the post/tweet. Let's suppose that a post is originally made in a language other than English, but English readers use the translate button on Twitter/Facebook to read it; now the perceived credibility of the post depends on the region the reader resides in and on the accuracy of the translate feature of the particular social media platform. How can these multiple considerations be synthesized to create a more suitable perceived credibility score for a specific situation?