In the last decade, the evolution of digital systems and connectivity has created a gushing stream of data that contains latent treasures. Multiple domains, such as biology, astronomy, and social science, are facing an unprecedented challenge: how to deal with big data? Big data is perhaps one of the most resonant scientific terms of the last decade. A specific field of study in computer science has been established to develop hardware and software systems that scale with big data.
In social science, the big data gushing out of online social platforms represents an invaluable gold mine. In their work titled Data ex Machina: Introduction to Big Data, David Lazer et al. explore the opportunities and challenges of using big data to study and analyze different social phenomena. From their perspective, big social data comes from three sources: digital life (e.g. online social platforms), digital traces (e.g. call records), and digitized life (e.g. digitized old books and newspapers).
Many opportunities exist in analyzing data generated from the aforementioned sources. These data could reflect actual patterns of social activity that are hard to extract from research surveys and questionnaires. They also create the opportunity to analyze social interactions around breaking events as they happen, rather than performing a retrospective analysis of past events.
From my perspective, archived data from social platforms could also serve as a treasure for retrospectively analyzing some special events. An example that always influences my ideas is the Arab Spring uprisings: the Egyptian people lived through a very precious experience between 2011 and 2013. In some places in the world, online social platforms could be the best places to find recorded traces of such events.
Online social platforms could also serve as a field for natural experiments, a great opportunity that comes with great ethical concerns. An example is Facebook’s experiment on social influence and emotional contagion, a very promising experiment that raised a huge ethical debate.
David Lazer et al. also explore a set of challenges and vulnerabilities. Those challenges include the generalizability of analysis performed on an individual data source (e.g. Twitter). On this point, they shed light on the problem that the social activities of individuals are spread across multiple social platforms. They also discuss the credibility and legitimacy of data given the prevalence of bots and fake identities.
In my view, no matter how hard the aforementioned challenges are, the continuous research effort to understand and solve these problems will likely converge at some point. The harder challenge that is likely to linger is the ethical one. David Lazer et al. shed light on research ethics: how can we guarantee the rights of human research subjects, have informed consent in place, and still preserve the quality and benefits of collecting and analyzing their online social data? Some people have commented on social platforms being free by saying “if it is offered to you for free, then probably you are the goods being sold”. From my perspective, establishing and applying rules that preserve the rights of every user, and being transparent about everything, is a lingering challenge.