Reflection #4 – [01/30] – [Vartan Kesiz-Abnousi]

Zhe Zhao, Paul Resnick, and Qiaozhu Mei. 2015. Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1395-1405. DOI: https://doi.org/10.1145/2736277.2741637

 

Summary

The authors aim to identify trending rumors in social media, including topics that are not pre-defined. They use Twitter as their source of data; specifically, they analyze 10,417 tweets related to five rumors. They present a technique for identifying trending rumors, which they define as topics that include disputed factual claims, and they argue that identifying such rumors as early as possible is important. Since it is difficult to determine whether an individual post makes a factual claim, the authors reframe the problem: they look for clusters of posts whose shared topic is a disputed factual claim. Furthermore, when a rumor circulates there are usually posts that raise questions about it using signature text phrases such as "Is this true?". The authors search for these enquiry phrases and find that many rumor diffusion processes contain such enquiry posts quite early in the diffusion. They then develop a rumor detection method built around these phrases. It follows five steps: identify signal tweets, form signal clusters, detect candidate statements, capture related non-signal tweets, and rank the candidate rumor clusters. In other words, the method clusters similar signal posts together, gathers the related posts that do not contain the enquiry phrases, and finally ranks the clusters of posts by their likelihood of containing a disputed factual claim. The authors find that the method performs well: about a third of the top 50 clusters were judged to be rumors, a high enough precision.
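
To make the five-step pipeline concrete, below is a minimal Python sketch of the enquiry-phrase idea, assuming a hypothetical (and much shorter) phrase list and a crude ranking score of my own; the authors' actual lexicon, clustering procedure, and statistical ranking features are more elaborate.

    import re

    # Hypothetical enquiry patterns; the paper's verification/correction lexicon is longer.
    ENQUIRY_PATTERNS = [
        r"\bis (this|that|it) true\b",
        r"\breally\?",
        r"\bunconfirmed\b",
        r"\brumou?r\b",
        r"\bdebunk",
    ]
    ENQUIRY_RE = re.compile("|".join(ENQUIRY_PATTERNS), re.IGNORECASE)

    def is_signal_tweet(text):
        # Step 1: a tweet is a "signal tweet" if it contains an enquiry phrase.
        return ENQUIRY_RE.search(text) is not None

    def rank_candidate_clusters(clusters):
        # Steps 2-5, heavily simplified: given clusters mapping a candidate
        # statement to all of its tweets (signal and non-signal), rank the
        # clusters by the share of signal tweets as a crude proxy for
        # "likely contains a disputed factual claim".
        scored = []
        for statement, tweets in clusters.items():
            signal = sum(is_signal_tweet(t) for t in tweets)
            scored.append((signal / max(len(tweets), 1), statement))
        return sorted(scored, reverse=True)

A real system would of course cluster by content similarity and use richer ranking features, but the sketch shows why early enquiry posts make early detection possible.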

 

Reflections

 

The broad success of online social media has created fertile soil for the emergence and fast spread of rumors. A notable example is that, one week after the Boston bombing, the official Twitter account of the Associated Press (AP) was hacked, and the hacked account sent out a tweet about two explosions in the White House and the President being injured. Against this backdrop, the authors have an ambitious goal: instead of relying solely on human observers to identify trending rumors, they propose an automated tool to identify potential rumors. I find the idea of identifying rumors in real time, instead of retrospectively retrieving all the tweets related to them, very novel and intelligent. To their credit, the authors acknowledge that identifying the truth value of an arbitrary statement is very difficult, probably as difficult as any natural language processing problem. They stress that they make no attempt to assess whether rumors are true or not, or to classify or rank them by the probability that they are true; they rank the clusters by the probability that they contain a disputed claim, not a false claim.

 

I am particularly concerned about the potential adverse effects of automated rumor detection, in particular its use in damage control or disinformation campaigns. The authors write: "People who are exposed to a rumor, before deciding whether to believe it or not, will take a step of information enquiry to seek more information or to express skepticism without asserting specifically that it is false". However, this statement is not self-evident. What if the flagging mechanism for a rumor, a "disputed claim", does not work in all cases? Official government statements would probably not be flagged as "rumors". A classic example is the existence, or lack thereof, of WMDs in Iraq, where most of the media corroborated the government's (dis)information. To put it in more technical terms: what if the Twitter posts do not contain any of the enquiry phrases (e.g., "Is this true?")? They would then never be detected as "signal tweets", and the automated algorithm would never find a "rumor" to begin with. The algorithm would do exactly what it was programmed to do, yet it would fail to detect the rumor.

 

Perhaps the greatest controversy surrounds how "rumor" is defined. According to the authors, "A rumor is a controversial and fact-checkable statement". By "fact-checkable" they mean that, in principle, the statement has a truth value that could be determined right now by an observer who had access to all relevant evidence. By "controversial (or disputed)" they mean that, at some point in the life cycle of the statement, some people express skepticism. I think the "controversial" part might be the weakest part of the definition. Would the statement "the earth is round" be controversial because at "some point in the life cycle of the statement, some people express skepticism"? The authors try to capture such expressions of skepticism in the category of tweets they label "signal tweets".

Regardless, I particularly liked the rigorous definitions provided in the "Computational Problem" section, which leave no room for misinterpretation. There is room for further research in automated rumor detection, especially in broadening the definition of "rumor" and embedding that broader definition in the detection method.

Questions

  1. What if the human annotators are biased in manually labeling rumors?
  2. What is the logic behind the length of the time interval? Is it ad hoc? How sensitive are the results to this choice?
  3. Why was the Jaccard similarity threshold set to 0.6? Is this a standard choice in this type of research? (A small sketch of how such a threshold could drive the clustering is given after these questions.)
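
Regarding question 3, here is a small illustration, under my own simplifying assumptions, of how a Jaccard similarity threshold such as 0.6 could drive a greedy clustering of tweets; the paper's actual clustering procedure may differ.

    def jaccard(a, b):
        # Jaccard similarity between two token sets.
        a, b = set(a), set(b)
        if not a and not b:
            return 0.0
        return len(a & b) / len(a | b)

    def greedy_cluster(tweets, threshold=0.6):
        # Assign each tweet to the first cluster whose seed is similar enough,
        # otherwise start a new cluster (a simplification, not the paper's exact step).
        clusters = []  # list of (seed_tokens, member_tweets)
        for text in tweets:
            tokens = text.lower().split()
            for seed, members in clusters:
                if jaccard(tokens, seed) >= threshold:
                    members.append(text)
                    break
            else:
                clusters.append((tokens, [text]))
        return clusters

Lowering the threshold merges more loosely related tweets into one cluster, while raising it fragments the clusters, which is why the sensitivity question seems worth asking.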

 

 

Mitra, Tanushree, Graham P. Wright, and Eric Gilbert. “A parsimonious language model of social media credibility across disparate events.” Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017.

 

Summary

The main goal of this article is to examine whether the language in unfolding Twitter events provides information about the events' credibility. The data is a corpus of public Twitter messages containing 66 million messages corresponding to 1,377 real-world events over a span of about three months, from October 2014 to February 2015. The authors identify 15 theoretically grounded linguistic dimensions and present a parsimonious model that maps language cues to perceived levels of credibility. The results demonstrate that certain linguistic categories and their associated phrases are strong predictors of perceived credibility across disparate social media events. In other words, the language used by millions of people on Twitter carries considerable information about an event's credibility.

Reflections

With ever-increasing doubt about the credibility of information found on social media, it is important for both citizens and platforms to identify non-credible information. My intuition, even before finishing the paper, was that the type of language used in Twitter posts could serve as an indicator of an event's credibility. Even though not all non-credible events can be captured by language alone, we could still capture a subset of them. Interestingly enough, the authors verify this hypothesis. This is important in the sense that a parsimonious model can serve as a first "screening" step: after flagging posts with it, we could apply more complex models as additional "filters" for non-credible content. One danger I see is eliminating credible posts by mistake, a false positive error, where "positive" means being flagged as non-credible. The second important contribution is that, instead of retrospectively labeling whether the information is credible, they use CREDBANK in order to overcome dependent-variable bias. The choice of a PCA-based transformation of the Likert-scale credibility ratings renders the results interpretable. To make sure the results are robust, the authors compare this index with hierarchical agglomerative clustering (HAC) and find high agreement between the PCA-based and HAC-based groupings.
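
To illustrate the kind of robustness check described above (not the authors' exact procedure), the sketch below derives a PCA-based credibility grouping from a hypothetical matrix of Likert ratings and compares it with a hierarchical agglomerative clustering of the same data; the random ratings, the number of classes, and the agreement metric are all placeholders of my own.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.metrics import adjusted_rand_score

    # Hypothetical (n_events x n_raters) matrix of Likert credibility ratings in [-2, 2].
    rng = np.random.default_rng(0)
    ratings = rng.integers(-2, 3, size=(200, 30)).astype(float)

    # PCA-based index: project each event onto the first principal component,
    # then bin the scores into four credibility classes at the quartiles.
    pca_score = PCA(n_components=1).fit_transform(ratings).ravel()
    pca_classes = np.digitize(pca_score, np.quantile(pca_score, [0.25, 0.5, 0.75]))

    # HAC-based classes: hierarchically cluster the same rating vectors.
    hac_classes = AgglomerativeClustering(n_clusters=4).fit_predict(ratings)

    # Agreement between the two partitions (1.0 means identical groupings).
    print("adjusted Rand index:", adjusted_rand_score(pca_classes, hac_classes))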

Questions

As the authors discuss, there is no broad consensus on the meaning of "credibility". In this paper, credibility is treated as the accuracy of the information, and that accuracy is assessed by instructed raters; the authors thus use an objective definition of credibility that depends on those raters. Are there other ways to assess "credibility", for example based on "information quality"? Would that yield different results?

 

Garrett, R. Kelly, and Brian E. Weeks. "The promise and peril of real-time corrections to political misperceptions." Proceedings of the 2013 Conference on Computer Supported Cooperative Work (CSCW '13). ACM, 2013.

Summary

This paper presents an experiment comparing the effects of real-time corrections with corrections presented after a short distractor task. Real-time corrections appear somewhat more effective overall, but closer inspection reveals that this is only true among individuals predisposed to reject the false claim. The authors also find that individuals whose attitudes are supported by the inaccurate information distrust the source more when corrections are presented in real time, yielding beliefs comparable to those of people never exposed to a correction.

Reflections

I find it interesting that providing factual information is a necessary, but not sufficient, condition for facilitating learning, especially around contentious issues and disputed facts. Furthermore, the authors argue that individuals are affected by a variety of biases that can lead them to reject carefully documented evidence, and that correcting misinformation at its source can actually amplify the effects of these biases. In behavioral economics there is a concept related to such biases: "bounded rationality". Economic models traditionally assumed that humans make rational choices; this "rationality" was formalized mathematically, and economists then built optimization problems around it to model human behavior. Newer economic models, however, incorporate bounded rationality in various ways. Perhaps it could be useful for the authors to draw on this literature.

Questions

1. Would embedding the concept of “Bounded Rationality” provide a theoretical framework for a possible extension of this study?
