Reflection #4 – [01/30] – [Jiameng Pu] | CS6724 Spring18: Computational Social Science

A parsimonious language model of social media credibility across disparate events

Summary:

As social media comes to dominate how people learn about news and events, the content on these platforms tends to be vetted less rigorously than content from traditional journalism. In this paper, the authors analyze the credibility of content on social media and propose a parsimonious model that maps language cues to perceived levels of credibility. The model is built by examining a credibility corpus of Twitter messages along 15 theoretically grounded linguistic dimensions. It turns out that the language people use carries considerable indicators of an event’s credibility.

Reflection:

The paper inspired me by walking through a complete research process, from forming an idea to implementing it. In the data preparation phase, annotating the content under study with ordinal values is a very common task; the proportion-based ordinal scale PCA used in this paper helps moderate extreme cases, which is a trick I can take away and try in other studies. The authors use logistic regression as the classifier, but I think neural networks would also be a good choice for classification. Specifically, a neural network classifier typically has one input unit per feature and one output unit per class. Neural networks might yield better classification performance, which in turn could help the analysis of feature contributions.
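To make the input/output shape of such a classifier concrete, here is a minimal sketch of a forward pass through a one-hidden-layer network. It is not the paper's model: the weights are random and untrained, one input per linguistic dimension is a simplification, and the number of classes (4) and hidden units (8) are assumptions chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Create a dense layer: n_out rows of n_in small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute weights @ x + biases for one input vector."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def forward(x, hidden, output):
    """Input -> ReLU hidden layer -> softmax over classes."""
    h = [max(0.0, v) for v in dense(x, hidden)]
    return softmax(dense(h, output))


N_FEATURES = 15  # simplification: one input per linguistic dimension
N_CLASSES = 4    # assumption: number of credibility levels
N_HIDDEN = 8     # arbitrary hidden-layer width

rng = random.Random(0)
hidden_layer = init_layer(N_FEATURES, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_CLASSES, rng)

# A made-up feature vector standing in for one event's language cues.
example = [rng.random() for _ in range(N_FEATURES)]
probs = forward(example, hidden_layer, output_layer)
print(len(example), "inputs ->", len(probs), "class probabilities")
```

One trade-off worth noting: the coefficients of a logistic regression can be read directly as feature contributions, whereas a network like this would need an extra attribution step before its features could be interpreted.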

The Promise and Peril of Real-Time Corrections to Political Misperceptions

Summary:

Computer scientists have created real-time correction systems that label inaccurate political information in order to warn users about it. However, the authors are skeptical about the effectiveness of the real-time correction strategy and run an experiment comparing the effects of real-time correction with those of delayed correction. In designing the comparative experiment, the researchers hold other variables constant across conditions and assess participants’ perceptions of the facts through questionnaires. They then construct a linear regression model to analyze the relationship between participants’ belief accuracy and the correction strategy. The paper concludes that real-time corrections are modestly more effective only among individuals predisposed to reject the false claim.

Reflections:

This paper conducts a comparative study of the effects of immediate and delayed corrections on readers’ belief accuracy. Generally, one of the most important parts of comparative research is designing a reasonable and feasible experimental scheme. Although much big data research collects ready-made data and preprocesses it, some research requires the researchers themselves to “produce” the data. The scheme for obtaining data therefore has a significant impact on the subsequent experiments and analysis.
In the survey-based first step of data collection, the choice of survey sample, the setting of control variables, and the evaluation methods are the main points that can greatly affect the experimental results. Examples include the diversity of participants (race, gender, age), the design of the delayed correction, and the design of the questionnaire.

In particular, to implement the delayed correction, the authors employed a distraction task: participants were asked to complete a three-minute image-comparison task. Although this task achieves the desired purpose, it is not the only strategy available. For example, the duration of the distraction task may change how participants process the news facts, so researchers could try multiple durations to see whether the effect differs. In the analysis, the authors use linear regression, one of the most common models for analyzing results. However, for complex problems that do not follow a strictly linear rule, the error of a linear regression model is potentially larger than that of a nonlinear regression model. Nonlinear regression with appropriate regularization is another option.
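The point about linear versus regularized nonlinear regression can be illustrated with a small sketch. It uses synthetic data with a deliberately curved relationship, not the paper's data, and polynomial ridge regression as one stand-in for "nonlinear regression with regularization"; the degree, the penalty `lam`, and the noise level are all assumptions made for the example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def features(x, degree):
    """Polynomial features [1, x, x^2, ..., x^degree]."""
    return [x ** d for d in range(degree + 1)]


def fit_ridge(xs, ys, degree, lam):
    """Fit polynomial ridge regression via (X^T X + lam I) w = X^T y."""
    rows = [features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(1, k):  # leave the intercept unpenalized
        xtx[i][i] += lam
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def mse(w, xs, ys):
    """Mean squared error of the fitted polynomial on (xs, ys)."""
    degree = len(w) - 1
    preds = [sum(wi * f for wi, f in zip(w, features(x, degree))) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


rng = random.Random(0)


def make_data(n):
    """Synthetic curved relationship y = x^2 + noise (an assumption for the demo)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 0.05) for x in xs]
    return xs, ys


train_x, train_y = make_data(60)
test_x, test_y = make_data(60)

linear_w = fit_ridge(train_x, train_y, degree=1, lam=0.0)  # plain linear regression
poly_w = fit_ridge(train_x, train_y, degree=3, lam=0.01)   # regularized nonlinear fit

linear_mse = mse(linear_w, test_x, test_y)
poly_mse = mse(poly_w, test_x, test_y)
print("held-out MSE, linear:", linear_mse)
print("held-out MSE, cubic ridge:", poly_mse)
```

Comparing the two errors on held-out data, rather than on the training set, is what keeps the more flexible model honest; on data that really is linear, the simpler model would be expected to do about as well.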

Question:

As discussed in the limitations section, although the authors try to make the best possible experimental design, many design decisions still affect the experimental results. How can we design the experiment to minimize the resulting error?
Intuitively, it would be more appropriate to study the mainstream group of online readers; the average age of the participants in this study seems too high for that purpose.

