Reflection #13 – [11/29] – [Mohammad Hashemian]

Data ex machina: Introduction to big data – David Lazer and Jason Radford

 

Although the dynamic nature of social media and Big Data gives social network researchers a great opportunity to analyze human behavior, analyzing Big Data also poses many challenges. This paper reviews a number of Big Data projects in light of those challenges.

The paper breaks Big Data down into three domains: digital life, digital traces, and digitalized life. What I took from this categorization is that the digitization of human life has made it much easier than before to analyze people's social behavior. As the paper mentions, not only the owners of social network platforms but also third parties can harvest data from these platforms. One might think this is always in the users' interest, because researchers can use this valuable information to study human behavior, and the results of that research ultimately benefit the users. Unfortunately, that is not always the case.

Nowadays, one of the most common ways of finding people is through social media sites (digital life). This is a valuable feature for debt collectors, who can use social networks to find people who owe money. They call this practice skip tracing: tracking down a debtor when there is no information about their current address, phone number, or place of employment. Researchers at these companies apply many state-of-the-art data analytics techniques to whatever they can obtain, including freely available information on social media sites, when they want to find someone in order to collect a debt. This is one application of Big Data that grew out of digital life and that the public does not accept. If I focus only on Big Data and the challenges researchers face, however, I think there are some other challenges worth pointing out that the paper does not mention.

In the section “The Ideal User Assumption: Bots, Puppets, and Manipulation”, the authors show why users' true identities matter, but they focus only on how bots, puppets, and manipulation make the data vulnerable. However, there is another identity-related research challenge in mapping Big Data.

Unlike traditional data collection methods, which produce comprehensive user profiles, many Big Data sources do not contain detailed demographic information. Without knowing who the users are, Big Data research may be biased. For instance, the actual users of social media services generally skew toward the younger generation[1]. Thus, data collected from social media represent only a narrow slice of the whole population. More research seems to be needed to understand the user profiles behind Big Data sources.[2]
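To make the bias argument above concrete, here is a minimal sketch in Python. The platform age shares follow footnote [1]; the population shares and the per-group rates are hypothetical numbers chosen only for illustration, and the group labels and function name are my own.

```python
# Share of each age group among platform users (18-29 and 30-49 from footnote [1];
# "other" is simply the remainder).
sample_share = {"18-29": 0.40, "30-49": 0.25, "other": 0.35}

# HYPOTHETICAL share of each age group in the whole population.
population_share = {"18-29": 0.20, "30-49": 0.35, "other": 0.45}

# HYPOTHETICAL rate of some behavior or opinion within each age group.
group_rate = {"18-29": 0.70, "30-49": 0.50, "other": 0.30}


def overall_rate(shares, rates):
    """Average the per-group rates, weighting each group by its share."""
    return sum(shares[group] * rates[group] for group in shares)


# Estimate obtained by treating platform users as if they were the population.
naive_estimate = overall_rate(sample_share, group_rate)

# Value we would get if every group were represented in its true proportion.
population_value = overall_rate(population_share, group_rate)

bias = naive_estimate - population_value
print(f"naive: {naive_estimate:.2f}, population: {population_value:.2f}, bias: {bias:+.2f}")
```

With these made-up numbers, the estimate drawn from platform users overstates the population value simply because younger users are over-represented. Without demographic information about the users, a researcher cannot even tell that this is happening.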

Another Big Data problem is that researchers cannot re-run published Big Data research. In general, other researchers can verify published scientific results by applying the same or different methods to the same data, which is one of the defining features of scientific research. But many Big Data sources, such as social media messages and mobile phone data sets, are proprietary. As a result, researchers are unable to re-test most of the recently published social media research. Consider the Twitter API as an example. Under the Twitter API agreement, researchers cannot distribute the original raw tweets to anyone outside their own research groups, and many researchers can retrieve only a 1% random sample of the data through the public APIs. This problem could become a major obstacle to the advancement of social media research.

 

 

[1] 40% of Twitter users are between the ages of 18 and 29, and 25% are between 30 and 49 (https://www.omnicoreagency.com/twitter-statistics/)

[2] https://www.tandfonline.com/doi/full/10.1080/15230406.2015.1059251
