Reflection #13 – [11/29] – [Deepika Rama Subramanian]

This article, surprisingly originating from a sociology journal, succinctly covers just about everything we spoke about this semester in our class. The authors discuss big data's enormous value in understanding human behaviour, especially behaviours and behaviour patterns that are difficult to record or are often misreported in self-reports.

As befits a sociology journal, they cast the world of big data into various manifestations – digital life, digital traces, digitalized life and instrumentation of human behaviour. When the authors say that digital life can be viewed as a generalizable microcosm of society, I wonder if this is appropriate. Given the fluid and incremental way in which platforms grow, does it make sense to attach such a label and expect it to hold?

As we’ve seen several times before in this class, one worries about the ethical conundrum of mining big data for sociological insight. When researchers tracked the phones of several students to understand ties between friends (drawn from Facebook), the students were at least aware of it. By contrast, we previously read a paper about Facebook tweaking its users’ feeds, without their knowledge, to learn about emotional reactions to the posts in those feeds.

We are also surrounding ourselves with gadgets and objects that constantly produce information which could be exploited for purposes we would not consent to. In a sense, we are creating our own surveillance state. Dr. Michael Nelson, during a recent seminar here, spoke about this very problem: by using fitness tracking devices and virtual assistants, we are unknowingly and unwittingly providing valuable data for analysis. The fitness app Strava faced flak for this when US soldiers in Afghanistan used the app to track their runs around army bases there; unbeknownst to them, the soldiers were giving away the locations of secret bases. Clearly, we are still having trouble keeping all this data from being misused.

Another issue with big data that the authors mention, and that seems prevalent, is inclusivity. During the talk given by Dr. Rajan Vaish, I remember him mentioning that even in their study (Whodunit?), they found it difficult to involve the rural population in India. Of course, this is in part due to expensive, metered internet connections, but it is also a question of how much interest that community has in other major online platforms.

The authors themselves outline several other issues that come with big data analytics in sociology. One of the more familiar ones is the issue of bots, puppets and manipulation. Over the course of this class, we have seen several ways to curb this behaviour and make the available data more meaningful. For now, though, the problem persists and skews a lot of ongoing analyses.

Finally, the authors talk about qualitative approaches to big data, and I almost laughed out loud! We were battling this very issue with a much, much smaller data set until only a few days ago. The authors promise computationally enabled qualitative analyses that will help us analyse data beyond the capacity of armies of grad students to read. To this we say: thank you!

This article was a fitting way to end the semester and the course. It was in itself a summary of sorts, making it slightly difficult to summarise and reflect on.


Reflection #12 – [10/23] – [Deepika Rama Subramanian]

[1]- Robert M. Bond, Christopher J. Fariss, Jason J. Jones, Adam D. I. Kramer, Cameron Marlow, Jaime E. Settle and James H. Fowler, “A 61-million-person experiment in social influence and political mobilization”

[2]- Adam D. I. Kramer, Jamie E. Guillory, and Jeffrey T. Hancock, “Experimental evidence of massive-scale emotional contagion through social networks”

SUMMARY:
Both papers talk about the influence of social networks on individuals’ decision making. The first looks at social ‘desirability’ during the US election: it compares an ‘I Voted’ button against a message that simply gave users information regarding the election (polling booth locations, etc.) to measure user engagement. It was no surprise that they found the ‘I Voted’ button to be the more effective of the two. The second paper manipulates (in a controlled experiment) the number of positive and negative posts users see and then studies the kinds of posts a user, under this influence, goes on to make. They found that users who had been exposed to positive posts tended to put up positive posts themselves.

 

REFLECTIONS:

Mob mentality has been a problem since the time of Caesar, and the power to influence and sway it comes with much responsibility. The first question one would ask is whether the second study ([2]), while within Facebook’s user terms and agreement, is an ethical one. Should it be OK to toy with users’ emotions in the name of science? In our everyday lives, we may be led to read various kinds of posts that influence us to behave one way or another, and while this is again a huge bone of contention, should we allow social media giants to control what comes onto our feeds in order to give us a largely positive experience? Also, in their experiments, the authors don’t seem to consider how positive or negative a post is. An intensely positive post can often drain out a lot of negative emotion. This could also be harnessed to useful ends by charities and other NGOs that aim at helping people: many sad stories followed by a positive one (posted by users online) can influence other people to help and also ‘restore faith in humanity’.

The first study ([1]), on the other hand, has probably given rise to a lot of positive movements on Facebook. When people see their friends donating and supporting causes and movements, they feel the need to do so themselves as well, whatever the motivation may be.


Reflection #10 – [10/02] – [Deepika Rama Subramanian]

[1] Starbird, Kate – Examining the Alternative Media Ecosystem through the Production of Alternative Narratives of Mass Shooting Events on Twitter

SUMMARY

This paper examines alternative media systems and their role in propagating fake news. This is done by creating network graphs from tweets that propose alternative narratives of mass shooting events, with the domains involved coded by media type, narrative stance and political leaning.
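To make the method concrete for myself, here is a minimal sketch (my own toy example, not the paper’s actual pipeline or data) of how such a network graph could be built: cited domains become nodes, and an edge links two domains whenever the same account has tweeted URLs from both.

```python
# Toy sketch of a domain co-citation network; the tweet records are invented.
from itertools import combinations
from collections import defaultdict

import networkx as nx

tweets = [
    {"user": "u1", "domain": "example-alt-news.com"},
    {"user": "u1", "domain": "another-site.org"},
    {"user": "u2", "domain": "example-alt-news.com"},
    {"user": "u2", "domain": "another-site.org"},
]

# Which domains has each account linked to?
domains_by_user = defaultdict(set)
for t in tweets:
    domains_by_user[t["user"]].add(t["domain"])

# Connect every pair of domains shared by an account, weighting edges
# by the number of accounts that cite both.
G = nx.Graph()
for domains in domains_by_user.values():
    for a, b in combinations(sorted(domains), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

print(G.edges(data=True))
```

A real analysis would, of course, also attach the media-type, stance and leaning labels to each node before visualizing the graph.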

REFLECTION

The paper uses a Twitter dataset that is both somewhat small and limited to a couple of incidents for which people/media have alternative explanations. Examining the r/conspiracy subreddit across more events might give more definitive results. There we may also be able to answer some of the other questions that come to mind –

  1. What kinds of news attract the most popular alternative narratives – science, politics, man-made and natural disasters, or terrorist attacks?
  2. While there may not be sufficient information to determine these links – what else are popular conspiracy theorists interested in? Are those interests in any way a result of their interest in conspiracy theories, or do they fuel those fantasies in some way?

What surprised me in the paper initially was the inclusion of mainstream media in the network graphs at such a large scale – The Washington Post, The New York Times, etc. However, this made me realize that anything could seem like a clue to a person who is willing to believe in something. Though this would be tough and highly subjective, we could study whether people are predisposed to believing conspiracy theories – do they already believe them before they join these communities, i.e., before they start using social media to read about them? Or is the belief a result of influence from joining and reading many alternative narratives? How about the timing of these conspiracy theories? Does it take a while for people to come up with alternative narratives, or are these theories generated as the events unfold? Another interesting thing to find out is whether these theories have begun appearing at shorter and shorter intervals after the events they build narratives around. We would then know whether theorists are ‘learning’ from or being influenced by past theories and coming up with new ones in shorter periods of time.
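The timing question above could be made measurable with something as simple as the following toy sketch (all timestamps are invented placeholders, not data from the paper): for each event, compute how long it took for the first alternative-narrative tweet to appear, and see whether that lag shrinks over time.

```python
# Hypothetical (event time, first alternative-narrative tweet time) pairs.
from datetime import datetime

events = [
    (datetime(2014, 1, 10, 9, 0),  datetime(2014, 1, 11, 1, 0)),
    (datetime(2016, 6, 3, 14, 0),  datetime(2016, 6, 3, 23, 30)),
    (datetime(2018, 2, 20, 20, 0), datetime(2018, 2, 21, 1, 15)),
]

lags_hours = [(theory - event).total_seconds() / 3600 for event, theory in events]
print(lags_hours)  # a shrinking sequence would suggest theorists are getting faster
```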

I next wanted to see how these alternative narratives tie in with other concepts we’ve previously examined in class – anonymity, sockpuppets and the filter bubble. It seems evident that all of the aforementioned issues could cause or inflame cases where people are leaning towards believing in conspiracy theories. Sockpuppets are particularly useful for pushing propaganda, while the filter bubble helps by making sure users are bombarded with “false” narratives. The anonymity of the users who propagate alternative narratives doesn’t seem to be an issue for those willing to subscribe, since the ‘news sells itself’.


Reflection #9 – [09/27] – [Deepika Rama Subramanian]

Dr. Talia Stroud spoke at length about partisanship in the media.  I’ve jotted down the following things that occurred to me as I watched the lecture:

  1. To keep a check on the degree of partisanship, I would want to study whether it has increased since the 2016 elections. The American public has been more divided than ever after 2016. If we study the degree of incivility in discussion forums before and after the 2016 elections, we may be able to tell conclusively whether the political climate has led to a greater divide in public opinion. For a fixed set of active users, we can check whether they have flipped their stance (in public at least) on any major issue – gay marriage, abortion, etc. – between before and after the elections.
  2. One could imagine that there would be a lot of trolls in the mix just to cause confusion and draw more people into (sometimes) pointless conversations. If we were able to efficiently and automatically identify trolls and sockpuppets in the comments, we might be able, to a degree, to control the off-topic conversations in the forums. Would this reduce the amount of polarization on the forums? If it does, it may imply that many people want to put their points forth civilly but are incited into wars in the comments section. We must note that not all news houses have the resources to dedicate 13 people to weeding out unwanted comments.
  3. While this may not be possible for larger news organizations, small organizations with left- or right-leaning partisan followings could feature stories from an opposition news organization of similar scale. We could have the partisans visit their preferred websites and check whether this improves tolerance between the two sides. I would like to mention a study undertaken by the Duke University Polarization Lab (https://www.washingtonpost.com/science/2018/09/07/bursting-peoples-political-bubbles-could-make-them-even-more-partisan/) which suggests that attempts to burst people’s political bubbles could make them even more partisan. This means that pushing people to acknowledge that they’re living in a bubble could be counter-productive.
  4. There was something Dr. Stroud mentioned that caught my attention – that we might have to feed information to the general populace without them being aware of it. The Emmy-nominated sitcom ‘Blackish’, which currently streams on Hulu, deals with an African American family and their take on America’s social and political fabric. While the show received a lot of flak for its presentation of these issues, it is a way to inject information into the public without it being explicitly called ‘news’. After I binge-watched the series, I realised that they tried their best to give the most balanced information they could, and it was quite effective.
  5. Design-wise, we could have a ‘The Balancer’-style widget that displays the bias of a comment as the user writes it. By gently asking users whether they really want to post a heavily polarizing comment, we may be able to guilt some of them into not posting it (a minimal sketch of this idea follows below).
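Here is a rough sketch of what such a nudge could look like; this is purely my own illustration, and the tiny lexicon of charged terms is an invented placeholder for a real bias or toxicity model.

```python
# Toy "Balancer"-style nudge: score a draft comment against a small
# (hypothetical) lexicon of charged terms and ask for confirmation
# before posting when the score crosses a threshold.
CHARGED_TERMS = {"idiot", "traitor", "sheep", "corrupt", "liar"}  # assumed lexicon

def polarization_score(comment: str) -> float:
    words = [w.strip(".,!?").lower() for w in comment.split()]
    if not words:
        return 0.0
    return sum(w in CHARGED_TERMS for w in words) / len(words)

def confirm_before_posting(comment: str, threshold: float = 0.1) -> bool:
    if polarization_score(comment) > threshold:
        answer = input("This comment reads as heavily polarizing. Post anyway? [y/N] ")
        return answer.strip().lower() == "y"
    return True
```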


Reflection #8 – [09/25] – [Deepika Rama Subramanian]

R. Kelly Garrett. “Echo chambers online?: Politically motivated selective exposure among Internet news users.”

Paul Resnick et al. “Bursting your (filter) bubble: strategies for promoting diverse exposure.”

The assigned readings for this week speak about the filter bubble we’ve previously discussed in this class. Garrett’s paper examines the likelihood that an individual will pick a news article, and the amount of time they will spend on it, depending on their ideological standpoint. He hypothesised and concluded that individuals were more likely to look at opinion-reinforcing news and would spend more time reading it if it agreed strongly with their viewpoint. He also concluded that the more opinion-challenging information a reader anticipates in a story, the less likely they are to read it, although opinion-challenging information had less effect than opinion-reinforcing information. Resnick’s work discusses various ways to get around the filter bubble – to become aware of it and to overcome its effects.

Many of Resnick’s proposed methods involve keeping individuals informed of the kind of news they are reading, whether it leans left or right. In other cases, where motivated information processing is at work, his methods encourage us to identify and understand the arguments posed by another individual with opposing views. This still does not give us a way to successfully pass on all the information that is available to us. I wonder if the most effective way to deliver such news is to present it through mediums that are not yet seen as partisan. Social and political commentary is often offered by popular sitcoms.

A dent in our hopes of eliminating partisanship through more exposure comes from a recent study at the Duke University Polarization Lab [1]. The researchers designed an experiment to disrupt people’s echo chambers on Twitter by having Republicans and Democrats follow automated accounts that retweeted messages from the opposition. After a month, they found that the Republicans exposed to the Democratic account had become substantially more conservative, while the Democrats exposed to the Republican account had become only slightly more liberal.

 

[1] https://www.washingtonpost.com/science/2018/09/07/bursting-peoples-political-bubbles-could-make-them-even-more-partisan/


Reflection #6 – [09/13] – [Deepika Rama Subramanian]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.”
  2. Hannak, Aniko, et al. “Measuring personalization of web search.”
These papers deal with the algorithms that, in some sense, govern our lives. Sandvig et al. talk about how biases are programmed into algorithms for the benefit of the algorithm’s owner. Even organizations that wish to keep their algorithms open to the public cannot do so entirely because of miscreants. The authors describe the various ways algorithms can be audited – code audit, non-invasive user audit, scraping audit, sock puppet audit, and collaborative crowdsourced audit. Each of these methods has its upsides and downsides, the worst of the downsides being legal issues. It does seem like good PR to have an external auditor come by and audit your algorithm for biases; O’Neil Risk Consulting & Algorithmic Auditing calls its logo the ‘organic’ sticker for algorithms.

While I wonder why the larger tech giants tend not to do this, I realize we are already severely reliant on their services. For example, a Google search for online shopping redirects us through one of Google’s ad services to a website, or we search for a location and have it right there in Maps and News. They have made themselves so indispensable to our lives that the average user doesn’t seem to mind that their algorithms may be biased towards their own services. Google was also recently slapped with a 5 billion dollar fine by the European Union for breaking anti-trust laws – amongst other things, it was bundling its search engine and Chrome apps into the Android OS.

While in the case of SABRE the bias was programmed into the system, many of today’s systems acquire their bias from the environment and the kinds of conversations they are exposed to. Tay was a ‘teen girl’ Twitter bot set up by Microsoft’s Technology and Research division that went rogue in under 24 hours: because people kept tweeting offensive material at her, she transformed into a Hitler-loving, racially abusive, profane bot that had to be taken offline. Auditing and controlling bias in systems like this will require a different line of study.

Hannak et al. speak about the personalization of search engines. Search engines are such an integral part of our lives that there is a real possibility we lose some information simply because the engine is trying to personalize results for us. One of the main reasons (it seems) this study was carried out was to avoid filter bubble effects. However, the first thing I felt about personalization in search engines is that it is indeed very useful – especially with respect to our geographical location. When we’re looking for shops or outlets, or even a city, the search engine points us to the closest one, the one we’re most likely looking for. The fact that it correlates searches with previous searches also makes a fair bit of sense. The study also shows that searches whose answers are more strictly right or wrong (medical pages, technology, etc.) don’t show a great degree of personalization. But as far as search engine results go, personalization should be the last thing factored into ranking pages, after trust, completeness of information, and popularity of the page.
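As a thought experiment on how such an audit could be quantified (this is my own minimal sketch, not Hannak et al.’s actual methodology or data), one could compare the results a personalized account receives against those of a fresh control account for the same query:

```python
# Compare two ranked result lists for the same query; the URLs are invented.
def jaccard(a, b):
    # Set overlap between two lists of result URLs (1.0 = identical sets).
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

personalized = ["url1", "url2", "url3", "url5"]  # assumed results for a logged-in user
control = ["url1", "url2", "url4", "url6"]       # assumed results for a fresh account

print(f"Result overlap: {jaccard(personalized, control):.2f}")
```

A lower overlap for a query would suggest heavier personalization, which fits the paper’s finding that fact-like queries (medical, technology) tend to be personalized less.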


Reflection #5 – [09/10] – [Deepika Rama Subramanian]

Eslami, Motahhare, et al. “I always assumed that I wasn’t really that close to [her]: Reasoning about Invisible Algorithms in News Feeds.”

Bakshy, Eytan, Solomon Messing, and Lada A. Adamic. “Exposure to ideologically diverse news and opinion on Facebook.”

SUMMARY

Both papers deal with Facebook’s algorithm and how it influences people in their everyday lives.

The first paper deals with the ‘hidden’ news feed curation algorithm employed by Facebook. Through a series of interviews, they do a qualitative analysis of:

  • Algorithm Awareness – Whether users are aware that an algorithm is behind what they see on their news feed and how they found out about this
  • Evaluation (user) of the algorithm – The study tested if the users thought that the algorithm was providing them with what they needed/wanted to see
  • Algorithm Awareness to Future Behaviour – The study also asked users whether, after discovering the algorithm and its possible parameters, they tried to manipulate it in order to personalise their own view or to boost their posts on the platform

The second paper deals with how the bias of Facebook’s newsfeed algorithm leads to the platform becoming an echo chamber, i.e., a place where your ideas are reinforced without challenge because you tend to engage with posts you already believe in.

REFLECTION

Eslami et al.’s work shows that a majority of users are unaware that an algorithm controls what they see on their newsfeed. In turn, they may believe that either Facebook or their friends are blocking them out. It is possible to personalize the Facebook feed extensively under News Feed Preferences – prioritizing what we see first and choosing to unfollow people and groups. The issue with the feed algorithm is that the ‘unaware participants’, who form a large chunk of the population, don’t know that they can tailor their experience. If Facebook made it known, through more than a small header under settings, that an algorithm is tailoring the newsfeed, it would be more helpful and less likely to cause outrage among users. Placing the News Feed Preferences alongside the newsfeed itself would be a good option.

There was a recent rumour in January that led users to believe Facebook was limiting their feed to posts from 25 friends. Many users were urged to copy-paste a message protesting this so that Facebook would take notice and alter its algorithm. Twitter presents newsfeed posts from followed accounts in reverse-chronological order, occasionally mixed with suggested tweets liked by someone else you follow. Reddit has two newsfeeds of sorts – ‘best’ and ‘hot’. ‘Best’ contains posts tailored to your tastes based on how you have engaged with posts, while ‘hot’ shows the posts trending worldwide. This gives an eclectic and transparent mix, helping to ensure it doesn’t become an echo chamber.

Most recently, Zuckerberg announced that Facebook’s goal was no longer ‘helping you find relevant content’ but ‘having more meaningful interactions’. Facebook tried a Reddit-style two-feed model in an experiment: it removed posts from established media houses and placed them in a separate Explore feed. This was meant to ensure that the social media site promoted interactions, i.e., to increase organic content (not just material shared from other sites), and also to keep the platform from acting like an echo chamber. The experiment was run in six small countries – Sri Lanka, Guatemala, Bolivia, Cambodia, Serbia and Slovakia. Following this, major news sites in these countries (especially in Bolivia and Guatemala) showed a sharp decrease in traffic. Unfortunately, this shows that Facebook has become one of the biggest sources of news, making it a ripe platform for spreading fake news (for which, currently, it has limited or no checks).

However, I wonder to what extent Facebook is now responsible for providing complete news, with views from both sides. It began purely as a way to support interactions between individuals and has evolved into its current form; its role as a news provider is not entirely clear yet. As far as echo chambers go, though, this isn’t new. Print media, TV, talk show hosts – their ideologies influence the content they provide, and people tend only to watch and enjoy shows that generally agree with them.


Reflection #4 – [09/06] – [Deepika Rama Subramanian]

Kumar, Srijan, et al. “An army of me: Sockpuppets in online discussion communities.”

SUMMARY & REFLECTION

This paper deals with the identification of sockpuppets and groups of sockpuppets. It defines sockpuppets simply as multiple accounts controlled by a single user and does not assume that this is always done with malicious intent. The study uses nine different online discussion communities with varied interests. It also helps identify types of sockpuppetry based on deceptiveness (pretenders vs. non-pretenders) and supportiveness (supporters vs. dissenters).

In order to identify sockpuppets, the authors used four factors – the posts should come from the same IP address, appear in the same discussion, be similar in length, and be posted close together in time. However, they eliminated the top 5% of users posting from shared IP addresses, which could correspond to traffic coming from behind a nationwide proxy. If a puppetmaster were backed by an influential group able to set up such a proxy in order to propagate false information, those cases would be eliminated right up front. Is it possible that the most incriminating evidence is being excluded?
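To keep the heuristic straight in my head, here is a minimal sketch of the pairing idea described above; the post records and thresholds are invented for illustration and are not the paper’s actual features or cut-offs.

```python
# Flag account pairs whose posts share an IP and a discussion, have similar
# lengths, and are posted close together in time (all thresholds hypothetical).
from itertools import combinations

posts = [
    {"account": "a1", "ip": "10.0.0.5", "discussion": 42, "length": 120, "time": 1000},
    {"account": "a2", "ip": "10.0.0.5", "discussion": 42, "length": 110, "time": 1080},
    {"account": "a3", "ip": "10.0.0.9", "discussion": 42, "length": 400, "time": 5000},
]

def likely_sockpuppet_pair(p, q, max_gap=900, max_len_diff=50):
    return (
        p["account"] != q["account"]
        and p["ip"] == q["ip"]
        and p["discussion"] == q["discussion"]
        and abs(p["length"] - q["length"]) <= max_len_diff
        and abs(p["time"] - q["time"]) <= max_gap
    )

candidates = {
    tuple(sorted((p["account"], q["account"])))
    for p, q in combinations(posts, 2)
    if likely_sockpuppet_pair(p, q)
}
print(candidates)  # {('a1', 'a2')}
```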

Further, the linguistic traits identified and considered in this study are largely those used in our previous discussions about antisocial behaviour in online communities. Even the posting frequency of sockpuppets versus ordinary users, and the fact that they participate in more discussions than they start, make these accounts look similar to trolls.

In pairs/groups, sockpuppets tend to interact with one another more than with any other user in terms of replies or upvotes. The paper states at the beginning that it is harder to find the first sockpuppet account, but once one is found, the pair or the group is easily identified. Cheng et al., in their paper ‘Antisocial Behaviour in Online Communities’, have already described a model that can weed out anti-social users early in their lifetime. Once such users have been identified, we could apply the non-linguistic criteria outlined in this paper to identify the rest of the sockpuppets.

A restrictive way of solving this issue of sockpuppet accounts could be to have users tie their discussion board accounts not only to an email id but also to a phone number. The process of obtaining a phone number is longer and also involves submitting documentation that can tie the account to the puppetmaster firmly. This would discourage multiple accounts being spawned by the same individuals.

The authors’ classification of sockpuppets gives us some insight into the motives of puppetmasters. While supporters are in the majority, they don’t seem to have much credibility, most supporters being pretenders. However, how could puppetmasters use dissent effectively to build consensus on issues they care about? One way could be to have their sockpuppets disagree with one another until the dissenter gets ‘convinced’ by the opinion of the puppetmaster. Of course, this would require longer posts, which are uncharacteristic of sockpuppets in general. So why do people jump through such hoops when they are highly likely to be flagged by the community over time? I wonder if the work on sockpuppets is a sort of introduction to work on spambots, because a human puppetmaster could hardly wreak the same havoc that bots can on online platforms.


Reflection #3 – [9/4] – [Deepika Rama Subramanian]

  1. T. Mitra, G. P. Wright, and E. Gilbert, “A Parsimonious Language Model of Social Media Credibility Across Disparate Events”

SUMMARY:

This paper proposes a model that aims to classify the credibility level of a post/tweet as Low, Medium, High or Perfect. The model is based on 15 linguistic measures, including lexicon-based ones such as modality and subjectivity and non-lexicon-based ones such as questions and hashtags. The study uses the CREDBANK corpus, which contains events, tweets and crowdsourced credibility annotations. It takes into account not only the original tweet but also retweets and replies to the original tweet, as well as control parameters such as tweet length. The penalized ordinal regression model shows that several linguistic factors affect perceived credibility, most of all subjectivity, followed by positive and negative emotion.
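To get a feel for the modeling setup, here is a rough sketch with made-up numbers; it is not the paper’s implementation, and I substitute an L1-penalized multinomial logistic regression for the true penalized ordinal regression the authors use (a library such as mord would get closer to the ordinal formulation).

```python
# Toy stand-in for the paper's penalized regression over linguistic measures.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: hypothetical linguistic measures, e.g. subjectivity, positive emotion,
# negative emotion, number of question marks, number of hashtags.
X = np.array([
    [0.8, 0.1, 0.6, 1, 2],
    [0.2, 0.5, 0.1, 0, 0],
    [0.5, 0.3, 0.3, 2, 1],
    [0.1, 0.7, 0.0, 0, 0],
])
# Credibility classes: 0 = Low, 1 = Medium, 2 = High, 3 = Perfect (invented labels).
y = np.array([0, 2, 1, 3])

model = LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=5000)
model.fit(X, y)
print(model.coef_)  # one (sparse) weight vector of linguistic features per class
```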

REFLECTION:

  1. The first thing I was concerned about was tweet length, which was set as a control. We have, however, discussed in the past how shorter tweets tend to be perceived as truthful because the tweeter wouldn’t have much time to type while in the middle of a major event. Here, the original tweet length itself correlated negatively with perceived credibility.
  2. Language itself is constantly evolving; wouldn’t we have to keep retraining with newer lexicons as time goes by? Ten years ago, the words ‘dope’ and ‘swag’ (nowadays used interchangeably with ‘amazing’ or ‘wonderful’) would have meant some very different things.
  3. A well-known source is one of the most credible ways of getting news offline. Perhaps combining the model with one that tests perceived credibility based on source could give us even better results. Twitter has some select verified accounts that have higher credibility than others. The platform could look to assign something akin to karma points for accounts that have in the past given out only credible information.
  4. This paper has clearly outlined that some words evoke the sense of a certain tweet being credible more than some others. Could these words be intentionally used by miscreants to seem credible and spread false information? Since this model is lexicon based, it is possible that the model cannot automatically adjust for it.
  5. One observation that initially irked me in this study was that negative emotion was tied to low credibility. This seems about right when we recall that the first step of the Kübler-Ross model is denial. If that is the case, I first wondered how anyone would ever be able to deliver bad news to the world. However, taking a closer look, the words with a negative correlation are specifically those that sound accusatory (cheat, distrust, egotist) rather than sad (missed, heartbroken, sobbed, devastate). While we may be able to get the word out about, say, a tsunami and be believed, outing someone as a cheat may be a little more difficult.


Reflection #2 – [8/30] – [Deepika Rama Subramanian]

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. “Antisocial Behavior in Online Discussion Communities.”

Summary
In this paper, Cheng et al. exhaustively study antisocial behaviour in online communities. They classify users in their dataset into Future Banned Users (FBUs) and Never Banned Users (NBUs) in order to compare differences in their activities along the following factors – post content, user activity, community response and the actions of community moderators. The paper suggests that the content of posts by FBUs tends to be difficult to understand and full of profanity, and that FBUs tend to attract attention to themselves and engage in or instigate pointless arguments. With such users, even tolerant communities over time begin to penalise FBUs more harshly than they did in the beginning. This may be because the quality of the FBUs’ posts has degraded, or simply because the community no longer wants to put up with the user. The paper points out, after extensive quantitative analysis, that it is possible for FBUs to be identified as early as 10 posts into their contribution to a discussion forum.
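The early-detection claim is the part I find most striking, so here is a small sketch of what that setup could look like in practice; the features, values and labels below are invented for illustration and are not Cheng et al.’s actual feature set or model.

```python
# Toy early-detection example: summarize each user's first 10 posts with a few
# simple features and train a classifier to separate FBUs from NBUs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-user features over the first 10 posts (all hypothetical): fraction of posts
# deleted by moderators, mean readability proxy, fraction of posts that are
# replies, mean downvote ratio.
X = np.array([
    [0.4, 0.3, 0.9, 0.7],  # user who was later banned
    [0.0, 0.8, 0.5, 0.1],  # never banned
    [0.3, 0.4, 0.8, 0.6],  # later banned
    [0.1, 0.7, 0.4, 0.2],  # never banned
])
y = np.array([1, 0, 1, 0])  # 1 = Future Banned User, 0 = Never Banned User

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[0.35, 0.35, 0.85, 0.65]]))  # -> [1], i.e. likely FBU on toy data
```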

Reflection
As I read this paper, there are a few questions that I wondered about:
1. What was the basis for the selection of their dataset? While trolling is prevalent in many communities, I wonder if Facebook or Instagram might have been a better place to study it, because trolling is at its most vitriolic when the perpetrator has some access to your personal data.
2. One of the bases for the classification was the quality of the text. There are several groups of people who have reasons other than trolling for poor text quality, viz. non-native speakers of English and teens who have taken to unsavoury variations of words like ‘lyk’ (like), ‘wid’ (with), etc.
3. Another characteristic of anti-social users online is leading other users of the community into pointless and meaningless discussions. I have been part of a group that was frequently led into pointless discussions by legitimate, well-meaning members of the community. In this community, ‘Adopt a Pet’, users are frequently outraged by the enthusiasm people show for adopting pedigrees over local mutts. Every time there is a post about pedigree adoptions, a number of users are invariably outraged. Are these users considered anti-social?
4. The paper mentions that some NBUs started out being deviant but improved over time. If, as this paper proposes, platforms begin banning members based on a couple of posts soon after they join, wouldn’t we lose these users? And, as the paper suggests, users who believe they have been wrongly singled out (their posts deleted while other posts with similar content were not) tend to become more deviant. When people feel they have been wrongly characterised based on a few posts, wouldn’t they come back with a vengeance to create more trouble on the site?
5. Looking back at the discussion in our previous class, how would this anti-social behaviour be managed on largely anonymous websites like 4chan? It isn’t really possible to ‘ban’ a member of that community. However, perhaps because of the ephemerality of the website, if the community ignores a troll, the post may disappear on its own.
6. What about communities where deviant behaviour is welcome? If a visitor to, say, r/watchpeopledie reports a post to the mods, would the moderator have to delete the post, given that such content is the norm on that discussion board?
