First paper: Critical Questions for Big Data
Second paper: Data ex Machina: Introduction to Big Data
Summary
Both papers start with an introduction to big data. One of the main points raised is the changing definition of big data: earlier it was about the size of the data, whereas now it is about the capacity to search, aggregate, and process the data. The first paper jumps directly into a critical analysis of the big data phenomenon. The second paper, on the other hand, takes a more comprehensive and structured approach to the analysis of big data. Both papers discuss the sources from which data is acquired, the emergence of an entirely new set of digital data, data privacy concerns, and how the data is analyzed and conclusions are drawn from it. They show how big data is changing the approach to research, and how an abundance of gathered data may still not be a true representation of the whole population being studied. Even conclusions drawn from balanced, representative data can be subjective and biased. The mention of “Big Data Hubris” shows that having a large volume of data does not necessarily lead to better results. While the first paper continues its critical analysis of big data, the second paper outlines future trends: data will grow in volume and in the diversity of platforms it comes from, and more generic data models will take over.
Reflection
The first paper does a really good job of raising important questions related to big data, starting with its definition. The second paper is more of an introduction to big data and provides only a high-level view of the issues related to it. For an initial reading, the second paper can be used to provide an overview of the big data field, followed by the first paper for discussion of its most important issues. Issues like privacy and unethical data collection are central to this debate, as individuals, corporations, or even governments can easily misuse the data. The contrast between the nature of the two papers is quite evident. The second paper mentions the problems faced by the big data field today and its future trends, whereas the first paper discusses the problems caused by big data and critically analyses its various aspects. Aspects like uneven access to digital data and poorly understood analysis results can have ramifications for large sections of society, and the first paper raises the right kind of questions for civil discussion.
Questions
1) Since the volume of data generated will continue to grow, how can the government ensure its protection and ethical treatment?
2) Who should actually own digital footprint data: the individual or the companies that collect it?
3) Is the data-driven approach the right approach for all kinds of social problems? Will it lead to less focus on areas where it is inherently difficult to generate large volumes of data?