Reflection #6 – [09/13] – [Neelma Bhatti]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.” Data and discrimination: converting critical concerns into productive inquiry (2014): 1-23.
  2. Hannak, Aniko, et al. “Measuring personalization of web search.” Proceedings of the 22nd international conference on World Wide Web. ACM, 2013.

Summary

Both papers set out to explore the invisibility of recommendation, search, and curation algorithms and the resulting personalisation (and, in some cases, discrimination).

Sandvig et al. map traditional audit studies for detecting racial discrimination in housing onto finding the recommendation and search bias faced by users of e-commerce, social, and search websites. Hannak et al. develop and apply a methodology for measuring personalization in the web search results presented to users, and the features driving that personalization.

Reflection:

  • Having read the “Folk theories” [1] paper, one can’t help but wonder if search engines also use “narcissism” as a personalisation feature, in addition to the several features identified by Hannak et al. The narcissism itself could be based on inferring similarities between the web traces, search histories, demographics, etc. of different users.
  • It would be interesting to quantitatively assess whether the level or type of personalisation differs based on the device used to log into a certain account (be it Amazon, Google, or Facebook). I know for a fact that it does.
  • Algorithmic audits can also be used to investigate artificial shortages of products on e-commerce websites created to generate false “hype” about certain products.
  • As the saying goes: “If you’re not paying for it, you become the product”. So are we (the products) even eligible to question what is being displayed to us, especially while using free social media platforms? Nothing that we consume in this world, from food to news to entertainment content, comes without a cost; in the case of these services, the cost may be our personal data and our right to access information deemed irrelevant by the algorithm.
  • A seemingly harmless, even beneficial, personalization strategy can result in a more serious problem than filter bubbles. Relevant products being shown in the newsfeed based on what we talk about with friends (I once mentioned going to Starbucks and had offers related to Starbucks all over my newsfeed) or what we look at (the exact products from an aisle I stood in front of for more than 5 minutes showing up in my newsfeed) invade user privacy. If algorithmic audits need to adhere to the Terms and Conditions of a website, I wonder if any Terms and Conditions exist about not invading the personal space of a user to the point of creepiness.
  • A study to determine whether an algorithm is possibly rigged¹ could have users tweak the publicly available settings that change how the algorithm works and see whether the results still favor the owner.

¹which probably all algorithms are, to a certain extent (quoting “Crandall’s complaint”: “Why would you build and operate an expensive algorithm if you can’t bias it in your favor?”)

[1] Eslami, M., Karahalios, K., Sandvig, C., Vaccaro, K., Rickman, A., Hamilton, K., & Kirlik, A. (2016, May). First I like it, then I hide it: Folk theories of social feeds. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 2371-2382). ACM.

 


Reflection #6 – [09/13] – [Lindah Kotut]

  • Sandvig et al. “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
  • Hannak et al. “Measuring Personalization of Web Search”

The two papers can be connected by using the categorization provided by Sandvig to describe the work done by Hannak. Sandvig proposed a “systematic statistical investigation” (which is what Hannak carries out) in order to audit search algorithms. Two of the proposed categorizations fit Hannak’s work: the crowdsourced audit (with their use of Amazon Mechanical Turk workers to probe for volatility in search results) and the sock puppet audit (using browser agents to mimic actual users and allow scalable, repeated perturbation of the search results).
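
To make the sock puppet idea concrete, the skeleton of such an audit is small: several synthetic profiles that differ in exactly one attribute (plus a pair of identical controls to capture ordinary result churn) issue the same query at the same time, and their result lists are stored for comparison. The sketch below is only illustrative and is not the setup from either paper; `fetch_results()` is a hypothetical stub standing in for whatever browser agent or HTTP client actually retrieves the ranked result URLs.

```python
# Illustrative skeleton of a sock puppet audit (not the papers' actual setup).
# fetch_results() is a hypothetical stub: it would drive a browser agent or HTTP
# client configured with the given profile and return the ranked result URLs.

def fetch_results(query, profile):
    """Hypothetical: issue `query` under `profile`, return ordered result URLs."""
    raise NotImplementedError

# Each puppet differs from the controls in exactly one attribute;
# two identical controls measure the baseline noise between queries.
PROFILES = {
    "control_a": {"user_agent": "Firefox/62.0", "logged_in": False, "geo": "US"},
    "control_b": {"user_agent": "Firefox/62.0", "logged_in": False, "geo": "US"},
    "logged_in": {"user_agent": "Firefox/62.0", "logged_in": True,  "geo": "US"},
    "mobile_ua": {"user_agent": "Mobile Safari", "logged_in": False, "geo": "US"},
}

def run_audit(queries):
    """Collect one snapshot of results per (query, puppet) pair for later comparison."""
    snapshots = {}
    for query in queries:
        snapshots[query] = {name: fetch_results(query, profile)
                            for name, profile in PROFILES.items()}
    return snapshots
```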

1. User Preference
If I were to probe this question of auditing algorithms, I would start with a simple question/premise:
Why do people prefer Google as a search engine? This is an especially pertinent question to ask, since both papers agree on Google’s dominance as a search engine. Unlike previous work on social media, where there is an involved investment by the user in creating and maintaining an account, no such burden is required to use a search engine, and Bing and Yahoo, among other search engines, are worthy competitors.

Understanding users’ preferences, whether overt or hidden, would add depth when considering how the search engine either panders to or accounts for personal… foibles in presenting results, and, in Hannak’s case, what other axes of measurement are available to determine this.

As Hannak points out, Google constantly tweaks how its algorithm works. It is not an unreasonable deduction that part of the reason it does this is to account for hidden patterns in search habits that scale across the user base.

This question can be asked retroactively as well: would users’ perception of search engine results change if proof of filter bubbles were presented? Or would they simply be grateful for receiving only relevant search results? If a key to a successful business is knowing your customer, then Google really knows its customers.

2. Volatility, Stability, Relevance … and censorship.
Both papers consider results returned by web search and claim that their approaches scale to other web searches. Does that include image search? Image search differs from web search. Case in point: the campaign by comedian John Oliver to replace a tobacco company’s mascot with a… less glamorous version, which pushed the “new” mascot to the top of image search results (web search remained largely unchanged except for news articles).

Hannak’s work also notes that its scope is limited to the US version of the search engine and the English language. This version can, however, be served in another country (by manually changing the extension back to .com). If we compare the volatility of the same search engine across different countries (one with censorship laws), can this be used to measure censorship (and is censorship a form of bias)? A measure of censorship could reveal which features (other than keywords) are used in the decision to censor, and we could extrapolate from this to consider other forms of bias, intentional or not.

3.  Search Engine Optimization (and Bias)
SEO, the process by which a web presence is “optimized” to appear high in search rankings (through tags, mobile-friendly design, etc.) so that the page gets ranked favorably/contextually, is a layman’s form of auditing algorithms. Sandvig’s example of the YouTube “reply girls” fits this description.

Thus, this deductive knowledge can be misused by those with the expertise to shape their websites to target a particular demographic, or, as has been proven, to do so successfully (and unethically) with targeted advertisements. This raises the question of:

4. Who bears the responsibility?

Sandvig’s “reply girls” example was used to showcase how an algorithm can be co-opted into being an agent of discrimination. If that is proven to be the case, who is to be punished? If the EU’s intention of assigning blame to platforms for users who upload copyrighted content is anything to go by, then the blame will be laid on the algorithm owners in our case. But there is a trade-off to this, and it circles back to the first point in this reflection: does the user care?


Reflection #6 – [09/13] – [Bipasha Banerjee]

Readings assigned

[1] Sandvig, Christian et al.  (2014) – “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms” – Paper presented at the “Data and Discrimination: Converting Critical Concerns into Productive Inquiry,” a preconference at the 64th Annual Meeting of the International Communication Association.

[2] Hannak, Aniko et al. (2013) – “Measuring Personalization of Web Search” – Proceedings of the International World Wide Web Conference Committee. (527-537)

Summary

The assigned readings mainly talk about algorithmic audits. The first paper discusses how algorithms can be rigged and used to favor or bias results in many instances. The authors discuss the example of SABRE, a computer system created by American Airlines and IBM to ease airline reservations. It was later found that the results were often biased, with American Airlines flights given unfair priority in the search results. This led the government to intervene to make the system more transparent and accessible to other airlines when information was sought. Algorithms can also be gamed, which is how the “reply girls” of YouTube became popular. The authors propose five different algorithmic audit methods and discuss their respective effectiveness and limitations. The second paper, on the other hand, focuses mainly on the effects that personalisation of an account or profile has on web search results. The authors conducted an experiment that examined features like basic cookie tracking, browser user-agent, geolocation, and Google account attributes. They recruited Amazon Mechanical Turk workers and observed that about 11.7% of search results differed due to personalisation.

Reflection

The need to understand how algorithms are designed and how they affect our interactions on social media, and the internet in general, is immense. The paper by Motahhare Eslami [3] discussed how algorithms are in place to produce a personalized news feed for Facebook. Algorithms are used in all possible ways, be it sorting search results, prioritizing results, filtering, etc. However, to those outside a particular company, the term “algorithm” is just another ambiguous one. The non-disclosure clauses of many companies keep their algorithms mysterious to the outside world.

The authors of the first paper highlighted several algorithmic audits, namely the code audit, the non-invasive user audit, the scraping audit, the sock puppet audit, and the crowdsourcing audit, and mentioned the challenges of each method. In my opinion, looking at only one method is not going to be optimal. If we combine two or more audit methods, give a weight to each method (a sort of weighted mean), and at the end compute an “audit score”, with the weight assigned to each audit adjusted according to the priority of that audit method, then the resulting audit score would give a comprehensive idea of whether the algorithm in place is a “rigged” one or not. Algorithmic audits can also be used to demonstrate the fairness of businesses [4].
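
As a toy illustration of this weighted “audit score” idea (the per-method findings and weights below are invented placeholders, not values from either paper), the combination could look something like this:

```python
# Toy sketch of combining several audit methods into one weighted "audit score".
# Per-method findings and weights are invented placeholders.

audit_findings = {            # 0.0 = no bias detected, 1.0 = strong bias detected
    "code_audit": 0.2,
    "sock_puppet_audit": 0.7,
    "crowdsourced_audit": 0.6,
}
weights = {                   # priority assigned to each audit method
    "code_audit": 0.5,
    "sock_puppet_audit": 0.3,
    "crowdsourced_audit": 0.2,
}

audit_score = (sum(audit_findings[m] * weights[m] for m in audit_findings)
               / sum(weights.values()))
print(f"combined audit score: {audit_score:.2f}")   # 0.43 for these toy inputs
```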

The second paper talks about the factors that might lead to search results being altered. Although personalisation does alter search results, it was also found that the majority of the time the results were not altered. This result seems quite surprising to me; I believe a global data sample would tell a completely different story, and comparing the two would give a better picture of exactly how personalisation is applied. Web search personalisation is an important and effective way to ensure a great user experience. However, users need to be aware of exactly how and where their data is being used. This is where companies need to be open and transparent.

[3] Eslami, Motahhare et al. – “‘I always assumed that I wasn’t really that close to [her]’: Reasoning about invisible algorithms in the news feed”

[4] https://www.wired.com/story/want-to-prove-your-business-is-fair-audit-your-algorithm/


Reflection #6 – [09/13] – [Deepika Rama Subramanian]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.”
  2. Hannak, Aniko, et al. “Measuring personalization of web search.”
These papers deal with the algorithms that, in some sense, govern our lives. Sandvig et al. talk about how biases are programmed into algorithms for the benefit of the algorithm’s owner. Even organizations that wish to keep their algorithms open to the public cannot do so in their entirety because of miscreants. The authors describe the various ways algorithms can be audited: code audit, non-invasive user audit, scraping audit, sock puppet audit, and collaborative crowdsourced audit. Each of these methods has its upsides and downsides, the worst of them being legal issues. It does seem like good PR to have an external auditor come by and audit your algorithm for biases; O’Neil Risk Consulting & Algorithmic Auditing calls its logo the ‘organic’ sticker for algorithms.

While I wonder why the larger tech giants tend not to do this, I realize we are already severely reliant on their services. For example, a Google search for online shopping redirects us through one of Google’s ad services to a website, or we search for a location and we have it on Maps and News. They have made themselves so indispensable to our lives that the average user doesn’t seem to mind that their algorithms may be biased towards their own services. Google was also recently slapped with a 5 billion dollar fine by the European Union for breaking anti-trust laws; amongst other things, it was bundling its search engine and Chrome apps into the Android OS.

While in the case of SABRE the bias was programmed into the system, many of today’s systems acquire bias from the environment and the kinds of conversations they are exposed to. Tay was a ‘teen girl’ Twitter bot set up by Microsoft’s Technology and Research division that went rogue in under 24 hours. Since people kept tweeting offensive material to her, she transformed into a Hitler-loving, racially abusive, profane bot that had to be taken offline. Auditing and controlling bias in systems such as this will require a different line of study.

Hannak et al. speak about the personalization of search engines. Search engines are such an integral part of our lives that there is a real possibility we lose some information simply because the engine is trying to personalize results for us. One of the main reasons this study was carried out, it seems, was to investigate filter bubble effects. However, the first thing I felt about personalization in search engines is that it is indeed very useful, especially with respect to our geographical location: when we are looking for shops or outlets, or even a city, the search engine points us to the closest one, the one we are most likely looking for. The fact that searches are correlated with previous searches also makes a fair bit of sense. The study also shows that searches where the answers are strictly right or wrong (medical pages, technology, etc.) don’t have a great degree of personalization. But as far as search engine results go, personalization should be the last thing factored into ranking pages, after trust, completeness of information, and popularity of the page.


Reflection #6 – [09/13] – [Prerna Juneja]

Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms by Sandvig et al

Measuring Personalization of Web Search by Hannak et al

Summary:

In the first paper, the authors argue that every algorithm deserves scrutiny, as it might be manipulated and discriminatory. They introduce the social-scientific audit study used to test for discrimination in housing and elsewhere, and propose the similar idea of “algorithm audits” to uncover algorithmic bias. They then outline five algorithmic audit designs, namely the code audit, non-invasive user audit, scraping audit, sock puppet audit, and collaborative or crowdsourced audit, and discuss the advantages and disadvantages of each. They find the crowdsourced audit technique to be the most promising.

In the second paper, the authors study personalization in algorithms using the “crowdsourced audit” technique described in the first paper. They propose a methodology to measure personalized web results and apply it with 200 Mechanical Turk workers, observing that 11.7% of search results show differences due to personalization. Only two factors, being logged in to a Google account and geographic location, lead to measurable personalization. Queries related to ‘politics’ and ‘companies’ were associated with the highest personalization.
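
Concretely, personalization is detected by issuing the same query from a treatment account and a control account and comparing the two ranked result lists. Metrics of this general kind underlie the paper’s comparisons, but the exact implementations below are mine and the URLs are invented; the sketch shows a Jaccard index over result URLs plus a simple average rank displacement as an illustrative stand-in for a rank-order measure.

```python
# Sketch: compare a control result list against a "treatment" (personalized) list.
# Illustrative metrics only; the URLs are invented.

def jaccard(list_a, list_b):
    """Overlap in result content, ignoring order (1.0 = identical sets of URLs)."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def rank_displacement(list_a, list_b):
    """Average absolute rank shift of URLs appearing in both lists (0 = same order)."""
    pos_b = {url: i for i, url in enumerate(list_b)}
    shifts = [abs(i - pos_b[url]) for i, url in enumerate(list_a) if url in pos_b]
    return sum(shifts) / len(shifts) if shifts else 0.0

control   = ["a.com", "b.com", "c.com", "d.com"]
treatment = ["b.com", "a.com", "c.com", "e.com"]

print(jaccard(control, treatment))            # 0.6  (3 shared URLs out of 5 distinct)
print(rank_displacement(control, treatment))  # ~0.67 (shared URLs shifted slightly)
```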

Reflection:

Companies have always been altering algorithms to their benefit. Google takes advantage of its market share to promote its own services, like Google Maps; one will almost never find search results containing URLs to MapQuest, Here WeGo, or Yahoo News. Market monopoly can be a dangerous thing: it kills competition. Who knows whether Google will start charging for its free services in the future once we are all used to its products.

A case of gender discrimination was found on LinkedIn, where a search for a female contact name prompts male versions of that name. For example, “a search for ‘Stephanie Williams’ brings up a prompt asking if the searcher meant to type ‘Stephen Williams’ instead” [1]. While the Google example shows an intentional bias that does not harm users directly but kills market competition, the LinkedIn incident seems to be an unintentional bias that crept into their algorithm because it depended on the relative frequencies of words appearing in queries; probably ‘Stephen’ was searched more often than ‘Stephanie’. Citing their spokesperson: “The search algorithm is guided by relative frequencies of words appearing in past queries and member profiles, it is not anything to do [with] gender.” So the authors are right when they say that no algorithm should be considered unbiased.
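
A frequency-driven suggestion mechanism of the kind the spokesperson describes can produce exactly this effect without any explicit gender feature. The sketch below is purely illustrative: the query log and counts are invented, and it simply suggests the most frequently queried name that is a close string match to the input.

```python
# Illustrative only: how a frequency-based "did you mean" feature can look gender-biased
# without using any gender feature. The query counts below are invented.
import difflib

query_counts = {"stephen williams": 900, "stephanie williams": 120, "stefan williams": 80}

def suggest(query):
    query = query.lower()
    # Candidate names that are close string matches to the query (excluding itself).
    candidates = difflib.get_close_matches(query, list(query_counts), n=3, cutoff=0.8)
    candidates = [c for c in candidates if c != query]
    if not candidates:
        return None
    # Suggest the close match that was queried most often in the past.
    return max(candidates, key=query_counts.get)

print(suggest("Stephanie Williams"))  # -> 'stephen williams', purely because it is queried more
```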

Some companies are building tools to detect bias in their AI algorithms, like Facebook (Fairness Flow) [2], Microsoft [3], and Accenture [4]. But the problem is that, just like their algorithms, these tools will be a black box to us, and we will never know whether these companies found bias in their algorithms.

Privacy vs. personalization/convenience: shouldn’t users have control over their data and over what they share with companies? Google read our mail for almost a decade for personalised advertisements before it stopped doing so in 2017 [5]. It still reads them, though: it knows about our flight schedules and restaurant reservations. My phone number gets distributed to so many retailers that I wonder who is selling them this data.

In the second paper the authors mention that once a user logs in to one of the Google services, they are automatically logged in to all of them. Does that mean my YouTube searches affect my Google search results?

According to an article [6], Google’s autocomplete feature is contributing to the spread of misinformation: the first suggestion that comes up when you type “climate change is” turns out to be “climate change is a hoax”. How are misinformation and conspiracy theories ranking so highly on these platforms?

Determining bias seems like a very complex problem, with online algorithms changing every day, and there could be multiple dimensions to bias: gender, age, economic status, language, geographical location, etc. Collaborative auditing seems to be a good way of collecting data, provided it is done systematically and testers are chosen properly. But then again, how many turkers should one hire? Can a few hundred represent the billions of people using the internet?
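
On the “how many turkers” question, a rough lower bound comes from the standard sample-size formula for estimating a proportion: to pin down a personalization rate of roughly 12% to within a few percentage points, a few hundred samples is actually in the right ballpark, although representativeness (who the turkers are, and which queries they run) is a separate problem from sample size. A back-of-the-envelope calculation, treating each tester as one sample:

```python
# Back-of-the-envelope sample size for estimating a proportion (e.g. the share of
# personalized results) to within a margin of error e at ~95% confidence.
import math

def sample_size(p=0.117, e=0.03, z=1.96):
    # Standard formula for a proportion estimate: n = z^2 * p * (1 - p) / e^2
    return math.ceil(z**2 * p * (1 - p) / e**2)

print(sample_size())        # 441 samples for a ±3% margin around an 11.7% rate
print(sample_size(e=0.05))  # 159 samples for a ±5% margin
```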

[1] https://www.seattletimes.com/business/microsoft/how-linkedins-search-engine-may-reflect-a-bias/

[2] https://qz.com/1268520/facebook-says-it-has-a-tool-to-detect-bias-in-its-artificial-intelligence/

[3] https://www.technologyreview.com/s/611138/microsoft-is-creating-an-oracle-for-catching-biased-ai-algorithms/

[4] https://techcrunch.com/2018/06/09/accenture-wants-to-beat-unfair-ai-with-a-professional-toolkit/

[5] https://variety.com/2017/digital/news/google-gmail-ads-emails-1202477321/

[6] https://www.theguardian.com/technology/2016/dec/16/google-autocomplete-rightwing-bias-algorithm-political-propaganda

 


Reflection #6 – [09/13] – [Subil Abraham]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.” Data and discrimination: converting critical concerns into productive inquiry (2014): 1-23.
  2. Hannak, Aniko, et al. “Measuring personalization of web search.” Proceedings of the 22nd international conference on World Wide Web. ACM, 2013.

Both papers deal with the topic of examining the personalization and recommendation algorithms that lie at the heart of many online businesses, be it travel booking, real estate, or your plain old web search engine. The first paper, “Auditing algorithms”, brings up the potential for bias creeping in, or even intentional manipulation of the algorithms to advantage or disadvantage someone. It talks about the need for transparency, proposes that algorithms be examined via audit studies, and describes several methods for doing so. The second paper attempts to identify the triggers of, and measure the amount of, personalization in a search engine, while controlling for other aspects that could change results but are not relevant to personalization.

I think the first problem one might run into when attempting an audit study of the algorithms of an enormous entity like Google, Facebook, or YouTube is the sheer scale of the task. When you are talking about algorithms serving millions, even billions, of people across the world, you have algorithms working with thousands or tens of thousands of variables, trying to find optimum values for each individual user. I speculate that a slight change in a user’s behavior might set off a chain reaction of variable changes in the algorithm. At this scale, human engineers are no longer in the picture, the algorithm is evolving on its own (thanks, machine learning!), and it is possible that even the people who created the algorithm no longer understand how it works. Why do you think Facebook and YouTube are constantly fighting PR fires? They don’t have as much knowledge of or control over their algorithms as they might claim. Even the most direct method, a code audit, might see the auditors make some progress before they lose it all because the algorithm changed out from under them. How do you audit an ever-shifting algorithm of that size and complexity? The only thing I can think of is to use another algorithm that audits the first algorithm, since humans can’t do it at scale. But now you run into the problem of possible bias in the auditor algorithm. It’s turtles all the way down.

Even if we are talking about auditing something of a smaller scale, an audit study is still not a perfect solution because of the possibility of things slipping through the cracks. Linus’s law, “Given enough eyeballs, all bugs are shallow”, doesn’t really hold even when everything is out in the open for scrutiny. OpenSSL was open source and a critical piece of infrastructure, but the Heartbleed bug lay there unnoticed for two years despite many people looking for bugs. What can we do to improve audit study methods to catch all instances of bias without allowing the study to become impractically expensive?

Coming to the second paper, I find the vast difference between how much the lower-ranked results change compared to rank 1 fascinating. What I want to know is: why are the rank 1 results so relatively stable? Is it simply a matter of having a very high PageRank and being of maximum relevance? Are there cases where a result is hard-coded in for certain search queries (like how you often see a link to Wikipedia as the first or second result)? I think focusing specifically on the rank 1 results would be an interesting experiment: tracking the search results over a longer period of time, looking at the average time between changes in the rank 1 result, and looking at what kinds of search queries see the most volatility at rank 1.
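
A longitudinal version of that experiment would be fairly cheap to run: snapshot the rank 1 URL for a fixed set of queries on a schedule, then compute how long each rank 1 result survives and which queries churn the most. The sketch below covers only the analysis step and assumes the snapshots (day, query, rank 1 URL) have already been collected somehow; the data shown is invented.

```python
# Sketch of analyzing rank-1 volatility from periodic search snapshots.
# Each record is (day, query, rank-1 URL); the data here is invented.
from collections import defaultdict

snapshots = [
    (0, "climate change", "wikipedia.org/Climate_change"),
    (1, "climate change", "wikipedia.org/Climate_change"),
    (2, "climate change", "nasa.gov/climate"),
    (0, "best laptop", "store-a.example"),
    (1, "best laptop", "store-b.example"),
    (2, "best laptop", "store-a.example"),
]

histories = defaultdict(list)
for day, query, url in sorted(snapshots):
    histories[query].append((day, url))

for query, history in histories.items():
    changes = sum(1 for (_, prev), (_, cur) in zip(history, history[1:]) if prev != cur)
    span = history[-1][0] - history[0][0]
    avg_gap = span / changes if changes else float("inf")
    print(f"{query!r}: {changes} rank-1 change(s), ~{avg_gap:.1f} days between changes")
```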


Reflection #6 – [09/12] – [Vibhav Nanda]

Readings:

[1] Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms

Summary:

The central question this paper addresses is what mechanisms are available to scholars and researchers to determine how algorithms operate. The authors start by discussing the traditional reasons audits were carried out, how they were traditionally conducted, and why it was considered acceptable to cross some ethical borders to find answers for the greater good of the public. They go on to detail how overly restrictive laws (the CFAA) and scholarly guidelines are a serious impediment today to studies that would not have been bound by such laws and guidelines in the 1970s, consequently hindering social science researchers from finding answers to the problems they need to solve. Throughout the paper, the authors profile and detail five algorithm audit designs: the code audit, noninvasive user audit, scraping audit, sock puppet audit, and collaborative or crowdsourced audit.

Reflection/Questions:

Throughout the paper, the authors addressed algorithms as something that has a consciousness, and this way of talking about algorithms bothered me; for instance, the last question the authors pose is “how do we as a society want these algorithms to behave?”. The word “behave” is not apropos in my view; a better-fitting word would have been “function”, as in “how do we as a society want these algorithms to function?” The authors also addressed various issues regarding algorithmic transparency that I brought up in my previous blog and in class: “On many platforms the algorithm designers constantly operate a game of cat-and-mouse with those who would abuse or ‘game’ their algorithm. These adversaries may themselves be criminals (such as spammers or hackers) and aiding them could conceivably be a greater harm than detecting unfair discrimination in the platform itself.” Within the text of the paper the authors contradict themselves: they first say that audits are carried out to find trends and not to punish any one entity, yet later they say that algorithmic audits of a wide array of algorithms will not be possible, and therefore researchers will have to resort to targeting individual platforms. I disagree that algorithms can incur any sort of bias, since biases arise from emotions and pre-conceived notions, which are part of human consciousness, and algorithms don’t have emotions. On that note, let’s say research finds a specific algorithm on a platform to be biased: who is accountable? The company? The developer, or the developers who created the libraries? The manager of the team? Lastly, in my view Google’s “screen science” was perfectly acceptable: one portion of the corporation supporting another portion, just like the concept of a donor baby.

 

[2] Measuring Personalization of Web Search

Summary:

In this paper the authors detail their methodology for measuring personalization in web searches, apply the methodology to numerous users, and finally dive into the causes of personalization on the web. The methodology revealed that 11.7% of searches were personalized, mainly due to the geographic location of the user and the user’s account being logged in. The method for detecting personalization also controlled for various noise sources, hence delivering more accurate results. The authors acknowledge a drawback of their methodology: it only identifies positive instances of personalization and cannot establish the absence of personalization.
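
The noise control can be summarized as a simple decision rule: run each query from a pair of identical control accounts alongside the experimental account, and attribute a difference to personalization only when the experiment-vs-control difference exceeds the control-vs-control difference. Below is a minimal sketch of that rule, with an illustrative set-overlap difference function and invented result lists (not the paper’s exact metrics).

```python
# Sketch of the noise-control idea: count a result difference as personalization only
# if it exceeds the baseline difference between two identical control accounts.
# The difference metric and result lists are illustrative, not the paper's exact ones.

def difference(results_a, results_b):
    """Fraction of result URLs not shared between the two lists (0 = identical)."""
    a, b = set(results_a), set(results_b)
    return 1 - len(a & b) / len(a | b) if a | b else 0.0

control_1  = ["a.com", "b.com", "c.com"]
control_2  = ["a.com", "b.com", "d.com"]   # differs slightly: ordinary noise
experiment = ["e.com", "f.com", "a.com"]   # e.g. a logged-in or geolocated account

noise_floor = difference(control_1, control_2)
signal      = difference(control_1, experiment)

personalized = signal > noise_floor
print(f"noise={noise_floor:.2f} signal={signal:.2f} personalized={personalized}")
```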

Reflection/Questions:

Filter bubbles and media go hand in hand: people consume what they want to consume. Like I have previously said, personalizing search output isn’t the root of all societal problems. It almost seems to me as if personalization is being equated with manipulation, which is not the same thing. If search engines do not personalize, users get frustrated and find a place that will deliver the content they want. I would say there are two different types of searches: factual searches and personal searches. Factual searches have a factual answer and there is no way that can be manipulated/personalized, whereas personal searches involve feelings, products, ideas, perceptions, etc., and those results are personalized, which I think is right. The authors also write that there is a “possibility that certain information may be unintentionally hidden from users,” which is not a drawback of personalization but reflective of real life, where a person is never exposed to all the information on one topic. However, the big questions I have about personalization are: what is the threshold of personalization? At what point is the search engine a reflection of our personality and not an algorithm anymore? At what point does the predictive analysis of searches become creepy?


Reflection #6 – [09/12] – [Shruti Phadke]

Paper 1: Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.” Data and discrimination: converting critical concerns into productive inquiry (2014): 1-23.

Paper 2: Hannak, Aniko, et al. “Measuring personalization of web search.” Proceedings of the 22nd international conference on World Wide Web. ACM, 2013.

It is safe to say that algorithmic bias and personalization were once celebrated as evidence of the “smartness” and high service quality of internet information providers. In a study by Lee [1], research participants attributed algorithms’ fairness and trustworthiness to their perceived efficiency and objectivity. This means that reputed and widely acknowledged algorithms such as Google search appear more trustworthy, which makes it particularly severe when algorithms give out discriminatory information or make decisions on the user’s behalf. Sandvig et al.’s paper reviews the implications of discrimination on internet platforms, along with the audit methods researchers can use to detect its prevalence and effect. Among the several methods proposed, Hannak et al. in the second paper utilize the sock puppet audit and crowdsourced audit methods to measure personalization in Google search. From the review of the two papers, it can be said that bias in algorithms can come either from the code or from the data.

  1. Bias in code can be attributed to financial motives. The examples given in paper 1 about American Airlines, Google Health, and product ranking highlight this fact. But how is it different from getting a store brand at a low price in any supermarket? On the surface, both enterprises are using the platform they created to best sell their products. However, the findings in paper 2 suggest that result ranking (having AA flights at the top, or Google service links at the top) is what separates fair algorithms from unfair ones; unlike biased information providers, Kroger displays its store brand on the same shelf as other brands. There is a clear difference in the change in search rank between the AMT results and the control results.
  2. Bias in data, I believe, is mainly caused by the user’s personal history and the dominance of a particular type of available information. Getting similar types of information based on history can lead to ideological echo chambers, as seen in the previous paper. There is also another type of data bias that informs algorithms, in the form of word embeddings in automatic text processing. For example, in the paper “Man is to Computer Programmer as Woman is to Homemaker” [2], Bolukbasi et al. state that the historical evidence of computer science embeddings being closer to male names than female names will make search engines rank male computer scientists’ web pages higher than female scientists’ (a toy sketch of this kind of embedding measurement follows this list). This type of discrimination cannot be blamed on a single entity, but rather on the prevalence of biased corpora and the entire human history!
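
A toy sketch of the kind of measurement Bolukbasi et al. describe: project words onto a gender direction (for example, the vector from “she” to “he”) and see which way they lean. The three-dimensional vectors below are invented stand-ins purely to show the computation; real experiments use pretrained embeddings such as word2vec.

```python
# Toy sketch of measuring gender association in word embeddings.
# The 3-d vectors are invented stand-ins; real studies use pretrained embeddings.
import numpy as np

emb = {
    "he":         np.array([ 1.0, 0.2, 0.0]),
    "she":        np.array([-1.0, 0.2, 0.0]),
    "programmer": np.array([ 0.4, 0.9, 0.1]),
    "homemaker":  np.array([-0.5, 0.8, 0.2]),
}

gender_direction = emb["he"] - emb["she"]          # axis pointing from "she" towards "he"
gender_direction /= np.linalg.norm(gender_direction)

for word in ("programmer", "homemaker"):
    v = emb[word] / np.linalg.norm(emb[word])
    lean = float(np.dot(v, gender_direction))      # > 0 leans "he", < 0 leans "she"
    print(f"{word}: {lean:+.2f}")
```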

Further, I would like to comment briefly on the other two research design methods suggested in paper 1. A scraping audit can have unwanted consequences: what happens to data that is scraped and later blocked (or moderated) by the service provider? Recently, Twitter suspended Alex Jones’s profile, but his devoted followers were able to rebuild a fake profile with real tweets based on data collected from web crawlers and scrapers. Also, noninvasive user audits, even though completely legal, can be ineffective with a poor choice of experts.

Finally, given recent events, it would be valuable to research how algorithms share information across platforms. It is common to see ads for hotels and restaurants on Facebook after booking flight tickets with Google Flights. Is “Google personalization” limited only to Google?

[1] Lee, Min Kyung. “Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management.” Big Data & Society 5.1 (2018): 2053951718756684.

[2] Bolukbasi, Tolga, et al. “Man is to computer programmer as woman is to homemaker? Debiasing word embeddings.” Advances in Neural Information Processing Systems. 2016.


Reflection #5 – [9/11] – [Karim Youssef]

The amount and sources of information, news, and opinions that we get exposed to every day have significantly increased thanks to online social media platforms. With these platforms serving as mediators for sharing information between their users, multiple questions arise regarding how their design and function affect the information that reaches an end-user.

In designing an online platform with an overwhelming amount of information flowing every day, it may make sense to build in some personalization techniques to optimize the user’s experience. It might also make sense, for example, for an advertising platform to optimize the display of ads in a way that affects what an end-user sees. There are many design goals that may result in the end-user receiving filtered information; however, the lack of transparency of these techniques to end-users, as well as the effect of filtering on the quality and diversity of the content that reaches a user, are significant concerns that need to be addressed.

In their work Exposure to ideologically diverse news and opinion on Facebook, Eytan Bakshy et al. study the factors affecting the diversity of the content Facebook users are exposed to. They took a data-driven approach to analyze the proportion of content from a different ideology versus content from an aligning ideology that a user sees in their Facebook newsfeed. They inferred that the factors contributing most to limiting the diversity of a user’s newsfeed content are the structure of the user’s friend network and what the user chooses to interact with. The study found that the newsfeed ranking algorithm affects the diversity of the content that reaches a user; however, this algorithm adapts to the behavior and interactions of the user. From this perspective, they concluded that “the power to expose oneself to perspectives from the other side in social media lies first and foremost with individuals”, as stated in the paper.

I agree to some extent with the findings and conclusions of the study discussed above. However, one major concern is to what extent Facebook users are aware of these newsfeed ranking algorithms. Eslami et al. try to answer this critical question in their work “I always assumed that I wasn’t really that close to [her]”: Reasoning about invisible algorithms in the news feed. They conducted a qualitative study that gathered information from 40 Facebook users about their awareness of the newsfeed curation algorithms. The study showed that the majority are not aware of these algorithms and that there is widespread misinterpretation of their effects among users. Although, after becoming aware that Facebook controls what they see, a majority of users appreciated the importance of these algorithms, the initial response to learning that these algorithms exist was highly negative. The study also revealed how people make wrong assumptions when, for example, not seeing a post from a friend for a while.

I’ll imagine myself as a mediator between the users and designers of a Facebook-like social platform, trying to close the gap. I totally agree that every user has the complete right to know how their newsfeed works, and every user should feel that they are in full control of what they see and that any hidden algorithm is only helping them personalize their newsfeed. On the other hand, it is a hard design problem for platform designers to reveal all their techniques to end-users, simply because the more complex the platform becomes to use, the more likely users are to abandon it for simpler platforms.

If I imagine being hired to alter the design of a social platform to make users more aware of any hidden techniques, I would start with a very simple message conveyed through an animated video that raises users’ awareness of how their newsfeed works. This could simply say, “we are working to ensure you the best experience by personalizing your newsfeed; we would appreciate your feedback”. To gather user feedback, users could see occasional messages that ask them simple questions like “you’ve been interacting with X recently; to see more posts from X, you can go to settings and set this and that”. After a while, users would become more aware of how to control what they see in their newsfeed. Also, taking continuous feedback from users on their satisfaction with the platform would help improve the design over time.

I understand that addressing such a problem is more complex and challenging, and that there may be hundreds of other reasons why hidden algorithms control what an end-user receives. However, ensuring a higher level of transparency is crucial to the health of online social platforms and to users’ satisfaction with them.

 


Reflection #5 – [09/10] – [Viral Pasad]

PAPERS:

  1. Eslami et al., “‘I always assumed that I wasn’t really that close to [her]’: Reasoning about invisible algorithms in the news feed,” 2015.

  2. Bakshy et al., “Exposure to ideologically diverse news and opinion on Facebook,” Science, vol. 348, no. 6239, pp. 1130–1133, 2015.

 

SUMMARY:

The first paper deals with Facebook’s ‘algorithm’: how its operations are opaque, and yet how much power it possesses in influencing users’ feeds, and thereby virtual interactions and perhaps opinions. Using FeedVis, the authors conduct a longitudinal study of the algorithm and of how awareness of the algorithm influences the way people interact with Facebook and whether they feel satisfied with it.

The second paper deals with the algorithm’s tendency to turn the platform into an echo chamber, owing to homophily patterns among users with certain ideologies or opinions.

 

REFLECTION:

  • The paper itself mentions, and I agree, that the results from the FeedVis study were longitudinal, and that more knowledge about user patterns could be gained by taking ethnography into account and looking at how the algorithm works differently for different users.
  • Another factor the paper briefly touches upon is that users speculate about this ‘opaque’ algorithm and find ways to understand and ‘hack’ it, and thus the respective feeds of their followers.
    • One such example is the entire YouTube and Instagram community constantly trying to figure out the algorithms of their respective platforms and adjusting their online activities accordingly.
    • Further, the lack of communication about such algorithms often erodes the feeling of community among users, thereby affecting user ‘patriotism’ towards the platform.
      • This was observed in the YouTube demonetization case, where several YouTubers, due to a lack of effective communication from YouTube, felt less important and thus changed their online activities.
  • Furthermore, I would have liked these studies to be conducted in today’s times, addressing Dark Posts or Unpublished Posts, how the ‘algorithm’ treats them, and how it bolsters the (often political) homophily among users.
  • The use of Dark Posts is very unethical as it promotes ‘echo chambers’ on social media sites. Not only that, users whose ideologies differ from a certain demographic will not even see the post organically, due to the “unpublished” nature of the post. Allegedly, even a link to that post will not take a user to it if the user’s interests have been profiled as different from the targeted audience of the Dark Post. Dark Posts can be used not only for targeted marketing but also for certain other unethical purposes. [1]

 

[1] – http://adage.com/article/digital/facebook-drag-dark-posts-light-election/311066/
