Reflection #3 – [01/25] – [Vartan Kesiz-Abnousi]

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.” ICWSM. 2015.

Summary

The authors of the article aim to study antisocial behavior in online discussion communities, as the title suggests. They use data from three online discussion-based communities: Breitbart.com, CNN.com, and IGN.com. Specifically, they use the comments on articles posted on these websites. The data cover a period of over 18 months and 1.7 million users who contributed nearly 40 million posts. The authors characterize antisocial behavior by comparing the activity of users who are later banned from a community, namely Future-Banned Users (FBUs), with that of users who were never banned, or Never-Banned Users (NBUs). They find significant differences between the two groups. For instance, FBUs tend to write less similarly to other users, and their posts are harder to understand according to standard readability metrics. In addition, they are more likely to use language that may stir further conflict. FBUs also tend to concentrate their posts in individual threads rather than spread them across several, and they receive more replies than average users. In the longitudinal analysis, the authors find that an FBU's behavior worsens over their active tenure in a community. Moreover, they show that FBUs not only enter a community writing worse posts than NBUs, but that the quality of their posts also worsens more over time. They also find that the distribution of users' post deletion rates is bimodal: some FBUs have high post deletion rates, while others have relatively low deletion rates. Finally, they demonstrate that a user's posting behavior can be used to predict who will be banned in the future with over 80% AUC.
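
To make the prediction setup concrete, the sketch below shows the kind of user-level classifier, evaluated with AUC, that the paper describes. The features, labels, data, and model choice here are my own illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of a ban-prediction setup: user-level posting features -> binary
# FBU label, evaluated with AUC. All features and data below are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_users = 5000

# Hypothetical per-user features loosely inspired by those discussed in the paper:
# deletion rate, readability, replies received, thread concentration.
X = np.column_stack([
    rng.beta(2, 8, n_users),        # fraction of posts deleted
    rng.normal(60, 15, n_users),    # readability score of posts
    rng.poisson(3, n_users),        # average replies received per post
    rng.beta(2, 5, n_users),        # concentration of posts in few threads
])
# Hypothetical label: 1 = eventually banned (FBU), 0 = never banned (NBU).
y = (X[:, 0] + 0.3 * X[:, 3] + rng.normal(0, 0.1, n_users) > 0.45).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"held-out AUC: {auc:.2f}")
```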

Reflections

User-generated content has become important for the success of websites, so maintaining a civil environment matters. Antisocial behavior includes trolling, bullying, and harassment. Platforms therefore implement mechanisms designed to discourage such behavior, such as moderation, up- and down-voting, post reporting, muting, and blocking users' ability to post.

The design of the platform might render the results non-generalizable or platform-specific. It would be interesting to see whether the results hold for other platforms. There are discussion platforms where moderators have the option of issuing a temporary ban; perhaps this could work as a mechanism to “rehabilitate” users. For instance, the authors find two groups of FBUs: some with high post deletion rates and others with relatively low ones. It should be noted that the authors excluded users who were banned multiple times, so as not to confound the effects of temporary bans with behavior change.

In addition, it should be stressed that these specific discussion boards have an idiosyncrasy: the primary function of these websites is to be a news network, not a discussion board. This is important because the level of scrutiny is different on such platforms. For instance, they might ban opposing views expressed in inflammatory language more frequently in order to support their editors or authors. The authors write that “In contrast, we find that post deletions are a highly precise indicator of undesirable behavior, as only community moderators can delete posts. Moderators generally act in accordance with a community’s comment policy, which typically covers disrespectfulness, discrimination, insults, profanity, or spam”. The authors do not provide evidence to support this position. This does not necessarily mean they are wrong, since their criticism of other methods is valid.

However, the authors propose a method that explores this problem by measuring text quality. They do this by sending a sample of posts to Amazon Mechanical Turk for labeling, and then train a classification model on this labeled sample to generalize the text-quality scores to the rest of the posts.
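
A minimal sketch of how crowdsourced quality labels might be generalized with a simple text classifier follows; the example posts, labels, and TF-IDF/logistic-regression pipeline are my assumptions, since the paper does not spell out this exact setup.

```python
# Sketch: train on a small crowdsourced sample, then score unlabeled posts.
# Posts and labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical crowdsourced sample: 1 = acceptable quality, 0 = poor quality.
labeled_posts = ["thanks for the thoughtful reply", "you are all idiots",
                 "interesting point about the data", "go away troll"]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(labeled_posts, labels)

# Generalize: score the remaining, unlabeled posts with the learned classifier.
unlabeled_posts = ["great article, well sourced", "nobody cares what you think"]
quality_scores = model.predict_proba(unlabeled_posts)[:, 1]
print(quality_scores)
```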

They find some interesting results. The deletion rate increases over time for FBUs but stays constant for NBUs. In addition, they find that text quality decreases for both groups. This could be attributed either to a decrease in posting quality, which would explain the deletions, or to community bias. Interestingly, the authors find evidence supporting both hypotheses. For the community-bias hypothesis, they use propensity score matching and find that, for posts of the same text quality, later posts (from the last 10% of a user's tenure) are more likely to be deleted than early posts (from the first 10%) for FBUs but not for NBUs. They also find that excessive censorship causes users to write worse.
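
As a rough illustration of how such a propensity-score-matched comparison could be set up, the sketch below matches "late" posts to "early" posts of similar text quality and compares deletion rates. The data are entirely synthetic and the column names are my own; this is not a reproduction of the authors' analysis.

```python
# Sketch of propensity score matching: match late posts to early posts of
# similar text quality, then compare deletion rates. Data are synthetic.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n = 2000
posts = pd.DataFrame({
    "text_quality": rng.uniform(0, 1, n),   # hypothetical quality score
    "late": rng.integers(0, 2, n),          # 1 = last 10% of a user's tenure
})
# Hypothetical deletions: lower-quality and later posts deleted more often.
posts["deleted"] = (rng.uniform(0, 1, n) <
                    0.2 + 0.3 * (1 - posts.text_quality) + 0.1 * posts.late).astype(int)

# Propensity of being a "late" post, given text quality.
ps_model = LogisticRegression().fit(posts[["text_quality"]], posts["late"])
posts["pscore"] = ps_model.predict_proba(posts[["text_quality"]])[:, 1]

treated = posts[posts.late == 1]
control = posts[posts.late == 0]

# 1-nearest-neighbor matching on the propensity score.
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

print("deletion rate, late posts:   ", round(treated.deleted.mean(), 3))
print("deletion rate, matched early:", round(matched_control.deleted.mean(), 3))
```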

Questions

  1. How would a mechanism of temporary bans affect the discussion community?
  2. The primary purpose of these websites is to deliver news to their target audience. Are the results the same for websites whose primary purpose is to provide a discussion platform, such as discussion boards?
  3. Propensity score matching is biased if there are unobserved variables, which is usually the case in non-experimental, observational studies. Would nearest-neighbor matching with fixed effects to control for contemporaneous trends, or matching users by time in addition to text quality, be a better strategy?


Reflection #2 – [01/23] – [Vartan Kesiz Abnousi]

Tanushree Mitra and Eric Gilbert. 2014. The language that gets people to give: phrases that predict success on kickstarter. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing (CSCW ’14). ACM, New York, NY, USA, 49-61. https://doi.org/10.1145/2531602.2531656


Summary

The authors of this paper aim to identify the type of language used in crowdfunding project pitches that leads to successful funding. The raw dataset comes from the crowdfunding website Kickstarter and covers roughly 45K crowdfunded projects posted between June and August 2012, from which the authors analyze about 9M phrases along with 59 other variables commonly present on crowdfunding sites. The authors use a statistical model in which the response variable is whether the project was funded or not. The predictive variables are partitioned into two broad categories: first, control variables such as project goal and project duration; second, the predictive variables of interest, which are phrases scraped from the textual content of each project’s Kickstarter page. The statistical model is a penalized logistic regression that aims to predict whether the project was funded, and the preferred model is the LASSO on the grounds that it is parsimonious. The model achieves about 2.41% cross-validation error and 2.24% prediction error, and adding the phrases decreases the predictive error by 15%. The authors thus find that phrases have significant predictive power, and they proceed to rank the coefficients (“weights”) from highest to lowest. Furthermore, they group the phrases into categories using the Linguistic Inquiry and Word Count (LIWC) program, and compare the non-zero β-coefficient phrases to the Google 1T corpus data. Through a series of statistical tests they arrive at a subset of 494 positive and 453 negative predictors. Finally, the authors discuss the theoretical implications of the results, arguing that projects whose phrases signal reciprocity, scarcity, social identity, liking, and authority are more likely to be funded.
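
To make the modeling step concrete, here is a minimal sketch of an L1-penalized (LASSO) logistic regression with the penalty chosen by cross-validation, run on a synthetic bag-of-phrases matrix; the data, dimensions, and settings are my assumptions, not the authors' code.

```python
# Sketch of LASSO (L1-penalized) logistic regression on a synthetic
# bag-of-phrases matrix. Everything here is illustrative, not the paper's data.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical design matrix: rows are projects, columns are control variables
# plus binary phrase indicators (a small stand-in for the paper's ~9M phrases).
n_projects, n_features = 1000, 500
X = rng.integers(0, 2, size=(n_projects, n_features)).astype(float)
# Hypothetical outcome: funding depends weakly on the first few features.
y = (X[:, :5].sum(axis=1) + rng.normal(0, 1, n_projects) > 3).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# The L1 penalty drives most coefficients to exactly zero, which is what makes
# the model parsimonious and the surviving phrase coefficients easy to rank.
model = LogisticRegressionCV(
    Cs=10, cv=5, penalty="l1", solver="liblinear", scoring="accuracy"
).fit(X_train, y_train)

nonzero = np.flatnonzero(model.coef_[0])
print(f"non-zero coefficients: {len(nonzero)} of {n_features}")
print(f"held-out error: {1 - model.score(X_test, y_test):.3f}")

# Rank surviving features by coefficient ("weight"), highest to lowest,
# analogous to the paper's ranking of predictive phrases.
ranked = nonzero[np.argsort(model.coef_[0][nonzero])[::-1]]
print("top feature indices:", ranked[:5])
```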


Reflection

This paper demonstrates the power of big data in dealing with research questions that researchers were not able to explore until a few years ago. Moreover, not only does it analyze a large amount of data, 9 million phrases, but it selects a subset and then groups them into meaningful categories. Finally, theories from social psychology are used to draw conclusions that could generalize the results. In addition, businesses that opt for crowdfunding could use more of these phrases to improve their chances of receiving funding. Interestingly enough, most of the limitations that the authors mention are inherent to big data problems, as discussed in the previous lecture.

I find the “all or nothing” funding principle interesting. I think this should be highlighted, because it means that businesses should choose their project goal and duration carefully to ensure funding. As the literature review suggests, projects with longer durations and higher goals are less likely to be funded. Both project goal and project duration are controlled for in the model.

In addition, it should be noted that the projects belong to 13 distinct categories. It would be interesting to know the demographics of the people who fund the projects. This could answer a number of questions, such as whether every project is funded by a specific demographic category, or whether some phrases are more appealing to a specific demographic. Perhaps businesses would prefer to receive their funding from the same demographic category that they target as future clients or customers.

Another piece of information that would be interesting is how “concentrated” the funds are among a small number of backers. Was 90% of the funding for a given project contributed by one person and the rest by hundreds of people? This heterogeneity in the sources of funding may affect the dependent variable, whether a project is funded or not, yet it is not captured by the model.
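
If per-backer pledge amounts were available (they are not part of the paper's dataset), one simple way to quantify this concentration would be the top-backer share or a Herfindahl-Hirschman Index over pledge shares, as in the hypothetical sketch below.

```python
# Sketch: measure how concentrated a project's funding is among its backers.
# Pledge amounts are hypothetical; Kickstarter does not expose per-backer data.
import numpy as np

def funding_concentration(pledges):
    """Return the top-backer share and the HHI of a list of pledge amounts."""
    pledges = np.asarray(pledges, dtype=float)
    shares = pledges / pledges.sum()
    return shares.max(), float(np.sum(shares ** 2))

# One dominant backer vs. many equal small backers.
top_share, hhi = funding_concentration([9000] + [10] * 100)
print(f"dominant backer:  top share {top_share:.2f}, HHI {hhi:.2f}")

top_share, hhi = funding_concentration([100] * 100)
print(f"dispersed backers: top share {top_share:.2f}, HHI {hhi:.2f}")
```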

The authors chose the LASSO because it is parsimonious. An additional advantage of the LASSO is that it yields a narrower subset of non-zero coefficients for further analysis, since it works as a model selection technique; if ridge regression were used instead, the authors would have to analyze far more phrases, most of which would probably not be important. However, a drawback of penalized regression approaches is that their coefficients are harder to interpret. For instance, the coefficient of a classical logistic regression indicates how the likelihood of a project being funded changes when a specific phrase is used, ceteris paribus, whereas the shrunken LASSO coefficients do not carry the same straightforward interpretation. Still, the LASSO is preferable to artificial neural networks here, because the authors are not only interested in the predictive power of the model but ultimately in interpreting the results. Perhaps a decision tree approach would also be useful, because it likewise selects a subset of variables and allows for interpretation.


Questions

  • Would using other statistical models improve the predictive performance?
  • Can we find information about the demographics of the people who fund the projects? Is there a way to find the demographics of the donors? We could then link the phrases to demographics. For instance, are some phrases more effective depending on gender?


Reflection #1 – [01/18] – [Vartan Kesiz Abnousi]

Danah Boyd & Kate Crawford: Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society. https://doi.org/10.1080/1369118X.2012.678878

David Lazer and Jason Radford: Data ex Machina: Introduction to Big Data. Annual Review of Sociology. https://doi.org/10.1146/annurev-soc-060116-053457


Summary

The articles aim to provide a critical analysis of “Big Data”. They include a brief historical account of how the term “Big Data” was born and how it is defined. They stress that Big Data is more than large datasets: it is about the capacity to search, aggregate, and cross-reference large data sets. They underline the great potential of Big Data in solving problems, but they also warn about the dangers it brings. One danger is that it could be perceived as a panacea for all research problems and reduce the significance of other fields, like the humanities. It is argued that the tools of Big Data do not necessarily offer an “objective” narration of reality, as some of its champions boast. For instance, Big Data samples are often not representative of the entire population, so making generalizations about all groups of people solely on the basis of Big Data is erroneous. As the authors argue, “Big Data and whole data are also not the same”. It is not necessary to have more data to get better insight into a particular issue. In addition, there are many steps in the analysis of raw Big Data, such as “data cleaning”, where human judgement is required; as a result, the research outcome could differ dramatically depending on the subjective choices made during the analysis. Concerns regarding privacy are also raised: human subjects may not be aware of, or may not have consented to, the collection of their data, and the line between private and public information has become more obscure. State institutions might use the data to curtail individual civil liberties, a phenomenon known as “Big Brother”. A particularly important problem is that of research inequality, which takes numerous forms. For example, companies do not provide full access to the collected data to public research institutions, so the privileged few within those companies who have complete access can find different, more accurate results. In addition, these companies usually partner with, or provide access to their data to, specific elite universities, giving the students of those schools a comparative advantage in their skills compared to the rest. This sharpens both social and research inequalities. The very definition of knowledge is changing: people now obtain large volumes of epidemiological data without even designing an experiment. As the authors argue, “it is changing the objects of knowledge”.

The authors also argue that Big Data is vast and heterogeneous. They classify the data into three sources: digital life, digital traces, and digitalized life. By digital life they refer to platforms such as Twitter, Facebook, and Wikipedia, where behavior takes place entirely online; they argue that these platforms can be viewed either as generalizable microcosms of society or as distinctive realms in which much of the human experience now resides. Digital traces include information collected from sources such as phone calls, while an example of digitalized life is video recordings of individuals.

Reflections

Both articles are very well written, and I agree with the points they raise. However, I am particularly cautious about the notion of viewing digital life as a microcosm of our society. Such a generalization is more than an abstract, subjective idea: representativeness is rigorously defined in probability theory, and there are mathematical rules for deciding whether a sample is representative or not. A famous example is the 1948 US presidential election, which Truman won even though the polls predicted otherwise, because the polls suffered from sampling errors. I am also worried that some of these digital platforms bolster a form of herd behavior that renders individuals less rational. This herd behavior, studied by social scientists such as Freud and Jung, among many others, has been argued to be one of the causes of the rise of Fascism.

Finally, I have some questions that could develop into research ideas such as:

  1. Does the nature of the digital platform (e.g., Twitter) change an individual’s behavior? If so, how?
  2. Is the increasing polarization in the United States related to these digital platforms?
  3. Does digital anonymity alter someone’s behavior?
  4. Do people behave the same way across different digital platforms?
  5. Can we, as researchers, develop a methodology to render digital platforms and traces representative of the population?
