Reflection #4 – [1/30] – [Meghendra Singh]

  1. Garrett, R. Kelly, and Brian E. Weeks. “The promise and peril of real-time corrections to political misperceptions.” Proceedings of the 2013 ACM Conference on Computer Supported Cooperative Work. ACM, 2013.
  2. Mitra, Tanushree, Graham P. Wright, and Eric Gilbert. “A parsimonious language model of social media credibility across disparate events.” Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017.

Both papers focus on issues surrounding the credibility of information available on the web and provide directions for future research on the subject. Garrett and Weeks examine the implications of correcting inaccurate information in real time versus presenting corrections after a delay (following a distractor task). The authors used a between-participants experiment to compare participants’ beliefs about electronic health records (EHRs) when a news article about EHRs is presented with real-time corrections as opposed to delayed corrections. The study was conducted on 574 demographically diverse U.S.-based participants. Mitra et al., on the other hand, present a model for assessing the credibility of social media events, trained on linguistic and control features extracted from the 1,377 event streams (66M Twitter posts) of the CREDBANK corpus. The authors first had Mechanical Turk workers score the credibility of individual event streams and then trained a penalized logistic regression (with LASSO regularization) to predict the ordinal credibility level (Low, Medium, or High) of each event stream.
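To make the modeling step concrete, here is a minimal sketch of an L1-penalized (LASSO) logistic regression of the kind Mitra et al. describe. The features and data below are hypothetical placeholders, and scikit-learn’s multinomial logistic regression stands in for the ordinal-regression variant actually fit in the paper.

```python
# Minimal sketch of the modeling step described above: an L1-penalized
# (LASSO) logistic regression mapping per-event features to a credibility
# class. All features and labels are toy placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Toy stand-in for the per-event feature matrix (e.g., subjectivity, hedges,
# negations, plus control features such as tweet volume).
X = rng.normal(size=(1377, 20))        # 1,377 events x 20 features
y = rng.integers(0, 3, size=1377)      # 0 = Low, 1 = Medium, 2 = High credibility

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000),
)
model.fit(X, y)

# LASSO drives many coefficients to exactly zero, which is what makes the
# resulting language model "parsimonious": only a small set of features
# survives with non-zero weights.
coefs = model.named_steps["logisticregression"].coef_
print("non-zero coefficients per class:", (np.abs(coefs) > 1e-6).sum(axis=1))
```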

In their paper, Garrett and Weeks explore a subtle yet interesting issue: real-time corrections can lead individuals to reject carefully documented evidence and to distrust the source. The paper suggests that people who are predisposed to a certain ideology are more likely to question and distrust real-time corrections to online information (articles, blog posts, etc.) that go against their ideology, whereas people who already have doubts about the information are more likely to agree with the corrections. Upon reading the initial sections of the paper, I felt that delayed corrections to information available online might not be very useful. I say this because people rarely revisit an article they have read in the past. If the corrections are not presented as readers go through the information, how will they ultimately receive the corrected information? It seems highly unlikely that they will revisit the same article in the future. I also feel that the study might be prone to sample bias, since the attitudes, biases, and predispositions of people in the U.S. may not reflect those of other geographies. Additionally, as the authors mention in the study’s limitations, we might get different results if the particular issue being analyzed were changed (e.g., if the issue were anti-vaccination).

In the second paper, Mitra et al. focus on predicting the “perceived” credibility of social media event reportage using linguistic and non-linguistic features. Although the approach is interesting, I feel that there can be a difference between the perceived and actual credibility of an event. For example, given that Mitra et al. report subjectivity in the original tweets to be a good predictor of credibility, malicious Twitter users wanting to spread misinformation might artificially incorporate language features that manipulate the subjectivity of their tweets so that they appear more credible. A system based on the model presented in the paper would likely assign high perceived credibility to such misinformation. One research question, then, is how to build a model that can detect and compensate for such adversarial manipulation. Another interesting question is whether a system could measure and present users with an event’s “actual” credibility (perhaps using crowdsourcing or dependable journalistic channels) instead of the “perceived” credibility inferred from language markers in the tweets about the event.
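To illustrate the kind of language marker at stake, here is a small, hypothetical sketch of a tweet-level subjectivity score. TextBlob’s pattern-based subjectivity is used purely as a convenient stand-in (it is not the lexicon used in the paper); the point is how easily the surface signal shifts when a tweet is reworded.

```python
# Hypothetical illustration of a tweet-level subjectivity score.
# TextBlob is only a stand-in for the subjectivity features in the paper.
from textblob import TextBlob

tweets = [
    "Officials confirm the bridge closed at 5 PM after the inspection.",  # factual wording
    "I honestly feel this is the most shocking closure I've ever seen!",  # subjective wording
]

for text in tweets:
    score = TextBlob(text).sentiment.subjectivity  # 0.0 = objective, 1.0 = subjective
    print(f"{score:.2f}  {text}")
```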

Another question I have is why the authors use this specific form of Pca (i.e., why were the [+1] “Maybe Accurate” ratings not used in computing Pca?). Also, there are 66M tweets in CREDBANK; given that these are clustered into 1,377 event streams, there should be roughly 48K tweets in each event stream (assuming an even distribution). Did each of the 30 Turkers who rated an event read through all 48K tweets, or were the tweets divided among the Turkers? Although I agree with the authors that this study circumvents the problem of sampling bias by analyzing a comprehensive collection of social media events, I feel there is a fair chance of “Turker bias” creeping into the model. In Table 2, we generally see a majority of Turkers rating the events as [+2], i.e., “Certainly Accurate.” I am curious whether there was a group of Turkers who always rated any event stream presented to them as “Certainly Accurate.”
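The question about the form of Pca can be made concrete with a small sketch. Assuming Pca is the fraction of an event’s 30 ratings that are [+2] “Certainly Accurate” (my reading of the paper), the snippet below compares it with a hypothetical alternative that also counts [+1] ratings as accurate; the ratings themselves are made up.

```python
# Sketch of the Pca question above, with hypothetical ratings from 30
# Turkers on the -2..+2 scale for a single event.
from collections import Counter

ratings = [2] * 18 + [1] * 8 + [0] * 2 + [-1] * 1 + [-2] * 1  # 30 ratings

counts = Counter(ratings)
p_ca = counts[2] / len(ratings)                       # only [+2] "Certainly Accurate"
p_accurate = (counts[2] + counts[1]) / len(ratings)   # alternative: also count [+1]

print(f"Pca (only +2): {p_ca:.2f}")        # 0.60
print(f"Including +1:  {p_accurate:.2f}")  # 0.87
```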
