Reflection #2 – [1/23] – Aparna Gupta

Paper: The Language that Gets People to Give: Phrases that Predict Success on Kickstarter.

Summary:

The paper discusses crowdfunding websites like Kickstarter, where entrepreneurs and artists look to the internet for funding. It explores the factors that lead to the successful funding of a crowdfunding project and seeks to answer the question: “What makes a project succeed in getting funded?” The presented work focuses on the predictive power of content, more precisely the words and phrases project creators use to pitch their projects. The authors analyzed 45K Kickstarter projects and, to ensure generalization, used only phrases that occurred more than 50 times across all the projects under consideration. The paper concludes that projects whose pitches exhibit reciprocity, scarcity, social proof, authority, social identity, and liking are more likely to get funded.

Reflection:

The paper gives a sense of some of the important phrases that can determine the probability of being successfully funded on Kickstarter. However, the question that struck me is: “Can these phrases be generalized across various genres?” The paper states that by analyzing project content and the most commonly occurring phrases, one can understand the social reactions of individuals. However, I feel this can be biased, in the sense that the reasoning behind specific reactions cannot be known. It might happen that a project does not get funded simply because the person reading the pitch has no interest in the field. How should such (possibly biased) reactions be interpreted or taken into consideration?

The paper lists factors like project goal, duration, category, and the presence of a video, which play a significant role in predicting whether a project will get funded. I agree with these, since a video can explain a concept better; visualizations expedite the viewer’s understanding. However, I am curious whether what the product is about, and how useful it will be in the future, can also serve as features in determining a project’s funding status.

The statistical analyses explained in the paper depict an amalgamation of modeling and sociology. The authors used LASSO to determine feature importance. Could other statistical models be used in this scenario as well? The modeling results, however, highlight phrases like ‘good karma’ and ‘used in a’ that appeared in funded projects yet seem misplaced; the authors themselves raise a similar question about a phrase like ‘cat’ being present in many projects that got funded. What intrigues me is that, although a lot of research has already been conducted to understand sociology using statistical modeling, there are still aspects of social behavior that remain unexplored and difficult to understand.
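To make the modeling step concrete, here is a minimal sketch of LASSO-style (L1-penalized) logistic regression over phrase counts, in the spirit of the paper’s approach. The scikit-learn usage, the toy pitches, and the labels are my own illustrative assumptions, not the authors’ actual pipeline or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy pitch texts and funding outcomes (purely illustrative, not the paper's data)
pitches = [
    "we will give you good karma and a signed copy",
    "help us reach our goal before time runs out",
    "backers also receive a signed copy and good karma",
    "we need your help to reach our funding goal",
]
funded = [1, 0, 1, 0]

# Count unigram-to-trigram phrases; the paper keeps only frequent phrases,
# here min_df is tiny only because this toy corpus is tiny
vectorizer = CountVectorizer(ngram_range=(1, 3), min_df=1)
X = vectorizer.fit_transform(pitches)

# The L1 (LASSO-style) penalty shrinks most phrase weights to exactly zero,
# leaving a sparse set of predictive phrases; C is loosened for this tiny example
model = LogisticRegression(penalty="l1", solver="liblinear", C=10.0)
model.fit(X, funded)

# Phrases with non-zero weights are the ones the model associates with the outcome
for phrase, weight in zip(vectorizer.get_feature_names_out(), model.coef_[0]):
    if weight != 0:
        print(f"{weight:+.3f}  {phrase}")
```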

Overall, this paper explores the challenging question of which features, language, and phrases compel people to invest in a project.


Reflection #1 – [1/18] – Aparna Gupta

Summary:

Both papers discuss what Big Data means in today’s world and how the definition of Big Data changes depending on the field being studied. They focus on the vulnerabilities and pitfalls of Big Data and on how combining data from various sources poses a challenge to researchers in every field. They also present the enormous institutional challenges Big Data raises for sociology. The first paper, ‘Critical Questions for Big Data’, focuses on significant emerging questions in the Big Data field and offers six provocations to spark conversations about its issues. The second paper, ‘Data Ex Machina’, focuses on the evolution of Big Data and its vulnerability issues.

Reflection:

The part I found most interesting is how Big Data has evolved over time and proliferated across society. However, its manifestation varies by field of study: in astronomy it takes the form of images, in the humanities it can take the form of digitized books, and in social science it takes the form of data from social media websites like Twitter and Facebook.

What caught my interest is how data from various social media platforms can be combined and applied in a field like sociology to analyze and understand human behavior, as life transitions to the digital realm and the entire Internet can be viewed as an enormous platform for human expression. However, I wonder how accurately what happens on these platforms (like tweets and posts) can be applied in the social sciences to infer human experiences.

What I found concerning was ‘The Ideal User Assumption’ discussed in Lazer’s article, which draws attention to how data presumed to come from unique, genuine people can in fact be fake, and how various organizations and government agencies use bots to achieve surreptitious goals, further posing a validity threat. According to Xinhua (2016), the Chinese peer-to-peer lending platform Ezubao was revealed to be a Ponzi scheme in which 95% of the proposals were fake. Although committees like IRBs exist to ensure ethics in particular lines of research inquiry (such as those involving human subjects), whether they can address these data-validity concerns remains an open question.

Questions:

  1. How can we ensure data integrity across various social media platforms?
  2. To reiterate, who is responsible for making certain that individuals and communities are not hurt by the research process?
  3. How can we claim what our data represents?
  • Does it represent a group of individuals or just an individual?
  • Can insights drawn from data collected about an individual be generalized, especially in the field of social science?
