Summary:
Both papers discuss what Big Data means in today’s world and how its definition changes depending on the field being studied. They focus on the vulnerabilities and pitfalls of Big Data and on how combining data from various sources poses a challenge to researchers in every field. They also describe the enormous institutional challenges that Big Data presents to sociology. The first paper, ‘Critical Questions for Big Data’, focuses on the significant questions emerging in the Big Data field and offers six provocations to spark conversations about its issues. The second paper, ‘Data Ex Machina’, focuses on the evolution of Big Data and on its vulnerabilities.
Reflection:
The part I found most interesting is how Big Data has evolved over time and proliferated across society. Its manifestation, however, varies by field of study. For example, in astronomy it takes the form of images, in the humanities it can take the form of digitized books, and in social science it takes the form of data from social media websites such as Twitter and Facebook.
What caught my interest is how data from various social media platforms can be combined and applied in a field like sociology to analyze and understand human behavior. I was also struck by the transition to digital life and by the idea that the entire Internet may be viewed as an enormous platform for human expression. However, I wonder how accurately what happens on these platforms (such as tweets and posts) can be used in the social sciences to infer human experiences.
What I found disappointing was ‘The Ideal User Assumption’ explained in Lazer’s article, which raises the concern that data assumed to come from unique, real people can in fact be fake. Various organizations and government agencies are using bots to achieve surreptitious goals, which poses a further threat to validity. According to Xinhua (2016), the Chinese peer-to-peer lending platform Ezubao was revealed to be a Ponzi scheme in which 95% of the proposals were fake. Committees like IRBs do exist to ensure the ethics of particular lines of research inquiry (such as research on human subjects).
Questions:
- How can we ensure data integrity across various social media platforms?
- Reiterating, who is responsible for making certain that individuals and communities are not hurt by the research process?
- How can we claim what our data represents?
- Does it represent a group of individuals or just an individual?
- Can insights drawn from data collected from an individual be generalized, especially in the field of social science?