Reflection #7 – [09/18] – [Shruti Phadke]

Paper 1: Erickson, Thomas, and Wendy A. Kellogg. “Social translucence: an approach to designing systems that support social processes.” ACM transactions on computer-human interaction (TOCHI) 7.1 (2000): 59-83.

Paper 2: Donath, Judith, and Fernanda B. Viégas. “The chat circles series: explorations in designing abstract graphical communication interfaces.” Proceedings of the 4th conference on Designing interactive systems: processes, practices, methods, and techniques. ACM, 2002.

Social cues and signals are the most important part of any communication. Erickson et al.'s paper discusses how subtle details and peculiarities of offline communication can improve the coherence of online communication. They categorize present digital conversational systems into three parts: realist, mimetic, and abstract. Donath et al.'s paper presents one such abstract concept, the Chat Circles series, to illustrate how a graphical interface can make online systems more engaging and sociable. Since both papers are considerably old, I am structuring my review as a comparison between the points they make and current systems.

  1. Realist Platforms: Realist systems consist of a combination of physical and digital signals, for example video conferencing and teleconferencing. The authors mention that such systems are expensive and require infrastructure. But robust and dependable systems such as Zoom, Google Meet, Skype, Facebook Messenger, and WhatsApp video calling are now available for free and can be used on almost any platform for individual or group conferencing. Some of them also include presentation, side-chatting, and commenting tools. Features such as sharing emoticons (hand raising, requests for increasing volume, thumbs-up) mimic social cues. Additionally, cross-domain realist systems such as watching movies together remotely (https://www.rabb.it) or listening to music remotely (https://letsgaze.com/#/) provide a real-time multimedia sharing experience. One concern that still remains is that, due to the quality of such video/audio calls, subtle expressions and poses often go unnoticed.
  2. Mimetic Platforms: These platforms include online personas and virtual reality systems where the user has to set up their own avatar to conduct conversations. The authors mention that it takes a conscious and continuous effort on the user's part to manipulate such systems. With advances in sensor fusion and augmented reality, mimetic systems have come a long way. For example, systems like Room2Room (https://www.microsoft.com/en-us/research/publication/room2room-enabling-life-size-telepresence-in-a-projected-augmented-reality-environment/) are able to facilitate life-sized telepresence for conversations.
     
    Such systems can be very impactful in establishing realistic social cues and interactions, digitally.
  3. Abstract Systems: Perhaps the least explored area in interactive systems is abstract systems. Donath et al.'s paper describes the Chat Circles series, which is designed to increase the expressiveness of digital communication. The key element of such designs is to enable users to form impressions of other users based on additional features provided in the graphical interface. Although the design is innovative and takes insights from the real world, such designs are not widely used, at least not yet.

 

[Note: I wanted to write more about using Chat Circles, but the link has expired and the system is no longer accessible.]


Reflection #7 – [09/18] – [Neelma Bhatti]

  1. Donath, Judith, and Fernanda B. Viégas. “The chat circles series: explorations in designing abstract graphical communication interfaces.” Proceedings of the 4th conference on Designing interactive systems: processes, practices, methods, and techniques. ACM, 2002.
  2. Erickson, Thomas, and Wendy A. Kellogg. “Social translucence: an approach to designing systems that support social processes.” ACM transactions on computer-human interaction (TOCHI) 7.1 (2000): 59-83.

Reading reflections

Both papers are significantly old, and there has been advancement in terms of social translucence. Applications, specifically social systems, have made their communication interfaces significantly "visible", resulting in a more "aware" user. Examples include adaptive chat windows where one can see if someone is typing, has read our message, or whether the message is still pending or failed to deliver. The idea that "given clues that useful knowledge is present, interested parties could request summaries of the topic, petition for admission to the community, or simply converse with some of the community members" is also effectively implemented in Facebook groups now.
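A minimal way to picture these delivery cues is as a small state machine; the states and icon names below are assumptions made for the sketch, not any particular messenger's API.

```python
from enum import Enum, auto

class MessageStatus(Enum):
    """Hypothetical delivery states behind 'translucent' chat cues."""
    PENDING = auto()    # sent by the client, not yet accepted by the server
    DELIVERED = auto()  # accepted and pushed to the recipient's device
    READ = auto()       # the recipient has opened the conversation
    FAILED = auto()     # delivery gave up; surfaced back to the sender

def status_cue(status: MessageStatus) -> str:
    """Map each state to the visual cue a sender would see."""
    return {
        MessageStatus.PENDING: "clock",
        MessageStatus.DELIVERED: "single tick",
        MessageStatus.READ: "double tick",
        MessageStatus.FAILED: "exclamation mark",
    }[status]

print(status_cue(MessageStatus.READ))  # double tick
```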

The idea of digital spaces having graphical wear showing who has been doing what while being there seemed really novel. But come to think of it, the internet is a haven for introverts, and the ability to interact privately is one of the reasons why. Such participants won't be fond of a social system maintaining their conversation history for transparency or transforming the temporal dimension into depth.

Some thoughts while reading the papers are as follows:

  • Quoting from the paper by Erickson et al.: "in the digital world we are socially blind." I tend to disagree with this statement, as we are now more socially aware in the digital world than ever. In a physical setting, it is hard to locate a restaurant, a phone booth, or a grocery store that is out of our sight unless we have been there already. The digital world not only helps us locate the service of our choice, but also helps us find alternatives, displays a list of people who have used the service and what they say about it, and flags any perceived obstacle (bad weather the next day, an alternate route in effect because of construction, a traffic jam, working hours, etc.), all of which help us reach a conclusion. It not only makes us better sighted, but helps us reach a decision well ahead of time, unlike the "crowded parking lot" example quoted in the paper.
  • Users in the digital world have the liberty to initiate and carry on multiple conversations simultaneously without one interrupting the other, unlike the real world. Having textual conversations with several people simultaneously in a digital space doesn't hinder communication, since their voices don't overlap; nor does it offend the participants in a conversation if someone turns away from them temporarily, since most of the time it is unnoticeable. It also has to do with the fact that users tend to make the most of their time in the digital world, and it doesn't require them to be physically present in one place. Although the whole concept of depicting real-world interactions in terms of hearing range, action traces, speaking rhythms, and other behavioral representations is appealing, it only allows the user to strike up one conversation at a time.

 


Reflection #7 – [09/18] – [Subil Abraham]

  1. Donath, Judith, and Fernanda B. Viégas. “The chat circles series: explorations in designing abstract graphical communication interfaces.” Proceedings of the 4th conference on Designing interactive systems: processes, practices, methods, and techniques. ACM, 2002.
  2. Erickson, Thomas, and Wendy A. Kellogg. “Social translucence: an approach to designing systems that support social processes.” ACM transactions on computer-human interaction (TOCHI) 7.1 (2000): 59-83.

 

We have finally arrived at talking about design and its effects on interaction in social systems. The two papers deal closely with the idea of social translucence – making visible the social actions that are typically unseen in digital interactions – and the effects it has on participants. Erickson et al. introduce the idea and propose a system for knowledge sharing driven by conversations. Its aims are more formal, targeting knowledge gathering and interaction in organizations, and it also introduces a social proxy that maps how conversations are going. Donath et al. cover similar ideas of translucence through the lens of an informal chat system, which uses animated circles in a space and allows moving around and joining conversations, somewhat simulating real-life interactions.

I think the chat circles are interesting in that they seem to simulate a real-life social function while stripping away all customization, which means you can't be judged by your appearance but only by your words and actions. But there is still a section of people who won't want to use it: the introverts. Imagine you're at a party. You arrive somewhat late, so almost everyone is already in groups engaged in conversation. You don't want to be that outcast standing alone at the back, but you don't want to try joining a conversation because you start thinking: what if they don't want me in? What should I even say? Will I just end up standing there and not get a word out and look dumb? Can I even hold a conversation? Will they think I'm weird? Are they thinking that now? They probably are, aren't they? I don't want to be here anymore. I want to go home…

The chat circle space is not that different from a party. But you want everyone to engage and have fun. How can the system serve the shy ones better? Perhaps allow circles to change their color to green or something similar to show that they are open to conversation. Perhaps encourage circles in the space to go talk to circles who seem to be by themselves. There are any number of ways to make the place more welcoming.
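As a rough sketch of those two suggestions – an explicit "open to conversation" state that recolors a circle, and a way to spot participants sitting by themselves – here is a hypothetical model. Chat Circles itself exposes no such interface; all names below are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class ChatCircle:
    """Hypothetical participant in a chat-circles-style space."""
    name: str
    open_to_chat: bool = False

    @property
    def color(self) -> str:
        # Green signals "come talk to me"; grey is the neutral default.
        return "green" if self.open_to_chat else "grey"

def lonely_circles(circles: list[ChatCircle], groups: list[set[str]]) -> list[str]:
    """Names of participants who are in no conversation group, so the
    system could nudge others toward them."""
    grouped: set[str] = set().union(*groups) if groups else set()
    return [c.name for c in circles if c.name not in grouped]

people = [ChatCircle("ada", open_to_chat=True), ChatCircle("bob"), ChatCircle("eve")]
print([c.color for c in people])                 # ['green', 'grey', 'grey']
print(lonely_circles(people, [{"ada", "bob"}]))  # ['eve']
```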

Another problem we might encounter is excessive gatekeeping. This could happen in both the knowledge system and chat circles. Erickson's knowledge system already has features for requesting entry into a community. At the same time, you don't want those protections to be used to keep out, say, newbies who are just trying to gain knowledge or are merely interested. You don't want the admins to throw their weight around on the powerless. StackOverflow is already suffering from the problem of being so unwelcoming to newcomers and old hands alike that they are trying to fix it [1]. The same problem could occur among congregations in chat circles, where the circles in a group can tell well-meaning people off. How to be more welcoming is a very broad question that affects a lot of social systems, not just the ones described in the papers. I don't think there is an algorithmic solution, so the best solution right now is to have community guidelines and enforce them well.

One last thing I'd like to point out is that the idea of teledirection was very prophetic. It describes to a T what happens in IRL streaming, a genre of live streaming popularized on Twitch where the streamer (the tele-actor) goes about their day in real life outside their home and the chat (the tele-directors) makes requests or gives directions on what the streamer should do. The limits on what can be done need to be enforced by the streamer. A famous example is IcePoseidon, an IRL streamer with a rabid fanbase that causes havoc wherever he goes. His presence at any place triggers the fanbase to start disturbing the business, making prank calls, and attacking the business on review sites. I find it fascinating how the paper managed to predict this so well.

[1] https://stackoverflow.blog/2018/04/26/stack-overflow-isnt-very-welcoming-its-time-for-that-to-change/

 


Reflection #6 – [09/13] – [Karim Youssef]

Personalization in online platforms could be considered a double-edged sword. At first sight, personalization looks beneficial to individual users, giving them a better experience when surfing the web. It also seems to make sense, since the amount of accessible information is overwhelming. Taking a deeper look, however, personalization and other hidden filtering algorithms raise many questions. From the fear of filter bubbles to potential implicit discrimination, it has become a public-interest issue to scrutinize these black boxes that decide on our behalf what we may want to see online.

Revealing the functionality of hidden filtering algorithms is a challenging process. In their work Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms, Christian Sandvig et al. propose multiple research methods for auditing online algorithms. These methods include auditing the algorithm's code, surveying users of a specific online platform, scraping data, using sockpuppets, and crowdsourcing. Every proposed technique faces a specific set of challenges, ranging from ethical and legal issues to insufficient data or knowledge. The work closely analyses every technique and the challenges it faces, which inspires a lot of research in this area.

A few general takeaways from the work presented by Christian Sandvig et al. are:

  • As mentioned in the paper, it is hard to generalize the results of an audit beyond the specific platform being studied. Such platform-specific audit studies could give advantages to some competitors of the studied platform unless there is regulation that ensures fairness in studies across different platforms providing the same service.
  • There are a lot of legal restrictions on performing such studies. Whether workarounds are considered ethical or not depends on the importance of the results and the right of the wide base of users to know what happens behind the scenes.
  • Combining two or more techniques from those mentioned in the paper could lead to more beneficial results, such as combining crowdsourcing with sockpuppet accounts to design more controlled experiments, or, if possible, combining code auditing with crowdsourcing to help reverse-engineer the parts that are not clear.
  • Finally, algorithm auditing is becoming highly important, and it is necessary to open the way, relaxing some restrictions, to allow for more efficient auditing that ensures the transparency of different online platforms.

One of the valuable algorithm audit studies is the one performed by Aniko Hannak et al., presented in their work Measuring Personalization of Web Search. This work presents a well-designed study that analyses how the Google search engine personalizes search results. The beauty of this work lies in the interpretability of the experimental setup and results, as well as the generality of the approach. The study examines the factors that contribute to the personalization of Google search results. The authors analyzed the similarity between search results for queries made at the same time using the same keywords and studied the effect of multiple factors such as geolocation, demographics, search history, and browsing history. They quantified the personalization for these different factors as well as for different search categories.
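To make "quantified the personalization" concrete, here is a minimal sketch of two plausible list-comparison measures: set overlap and average rank displacement between the results two accounts see for the same query issued at the same time. These particular metrics are illustrative assumptions; the paper's exact measures may differ.

```python
def jaccard(results_a: list[str], results_b: list[str]) -> float:
    """Overlap of two result lists, ignoring order (1.0 = identical sets)."""
    a, b = set(results_a), set(results_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def rank_displacement(results_a: list[str], results_b: list[str]) -> float:
    """Average absolute shift in rank for URLs common to both lists
    (0.0 = common results appear at identical positions)."""
    common = set(results_a) & set(results_b)
    if not common:
        return float("nan")
    pos_a = {url: i for i, url in enumerate(results_a)}
    pos_b = {url: i for i, url in enumerate(results_b)}
    return sum(abs(pos_a[u] - pos_b[u]) for u in common) / len(common)

# Example: a logged-in profile vs. a fresh browser issuing the same query.
personalized = ["a.com", "b.com", "c.com", "d.com"]
baseline = ["b.com", "a.com", "c.com", "e.com"]
print(jaccard(personalized, baseline), rank_displacement(personalized, baseline))
```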

Some of the takeaways from this work could be:

  • This work serves as a first step towards building a methodology that measures web search personalization quantitatively. Although there could be more parameters and conditions to look at, the method presented by this work is a guiding step.
  • The generality of their approach backs the previous point. Their method could be applied to different online platforms to reveal initial traits of hidden ranking algorithms, such as searching for products on e-commerce websites or displaying news in a newsfeed.
  • As they mention, their findings reflect the most obvious factors that drive personalization algorithms. Starting from their work, a deeper analysis could be done to reveal other hidden traits that may carry any sort of discrimination or limit the exposure to some information.

As we mentioned at the beginning, there could be various benefits from a personalization algorithm; however, auditing these algorithms is necessary to ensure that no undesirable effects result from using them.


Reflection #6 – [09/13] – [Nitin Nair]

  1. Sandvig et. al. “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
  2. Hannak et. al. “Measuring Personalization of Web Search”

Our lives have become so intertwined with systems like web search that we fail to consider the potential foibles of such systems. The trust we place in them is not questioned until a major flaw is uncovered. But given the black-box nature of these systems, how does one even understand their shortcomings? One way this could be achieved is through a process called an "algorithmic audit." Such audits are all the more important given the way of the world at the moment.
In paper [1], the author first talks about the concept of "screen science," coined by the American Airlines team that created the SABRE flight reservation system, to refer to human biases in the way we choose items presented in a list. He then goes on to conclude, rightly, how important "algorithmic audits" are, while introducing the concept. He argues that understanding the systems we interact with daily is paramount given how firms have acted in the past. The author points out various types of audits, namely code audits, noninvasive user audits, scraping audits, sock puppet audits, and collaborative crowdsourced audits. He finally examines what would be needed for us to conduct such audits in terms of legal or financial support, while advocating a shift of perspective to "accountability by auditing."
In paper [2], the author tries to measure the basis and extent of personalization in modern search engines through benign sock-puppet auditing. The major contributions are as follows: a methodology is created to measure personalization in web search, this methodology is then used to measure personalization on Google, and the causes behind the personalization are investigated. A couple of issues the methodology mitigates are temporal changes in the search index, inconsistency across distributed search indices, and A/B testing by the search provider.
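A minimal sketch of that paired design, assuming hypothetical client objects with a `.search()` method (stand-ins for whatever browser automation actually issues the queries): every trial fires the same query from a treated account and two identical control accounts at the same moment, so index churn and A/B tests hit all of them equally, and only differences beyond the control-vs-control noise floor are attributed to personalization.

```python
import datetime as dt

def overlap(a: list[str], b: list[str]) -> float:
    """Jaccard overlap of two result lists (1.0 = same URLs)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def run_trial(query: str, treatment, control_a, control_b) -> dict:
    """One paired trial: all three (hypothetical) clients issue the same
    query at nearly the same moment."""
    trial = {
        "issued_at": dt.datetime.now(dt.timezone.utc).isoformat(),
        "treatment": treatment.search(query),
        "control_a": control_a.search(query),
        "control_b": control_b.search(query),
    }
    # Noise floor: how much two identical control accounts already differ.
    trial["noise"] = overlap(trial["control_a"], trial["control_b"])
    # Candidate signal: treated account vs. a control; call it
    # personalization only if it departs from the noise floor.
    trial["signal"] = overlap(trial["treatment"], trial["control_a"])
    return trial
```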
One of the first thoughts that popped up in my mind is how the results of the audit might differ if conducted today. The push towards more personalization through machine learning models by all major internet firms, and in this case Google, might yield different results. This would be an interesting project to explore.
Are certain language constructs, favored by particular groups, crudely surfaced in search results by optimizing on click rate? This is an interesting question one could try to answer by extending the insights that [2] gives. It would also be interesting to compare the description of a person constructed by the system to the actual person, to see how optimizing on different factors affects the system's ability to create a true-to-life description.
Although the author shows that the carry-over effect becomes negligible after 10 minutes, the long-term effects of profiling, in which the system comes to understand the user's behaviour and preferences thoroughly, are not explored in the work. The challenges in identifying this would be the same issues the author pointed out: changing search indices and not being able to switch off personalization during searches.
Given how important understanding these systems is, and their impact on our understanding of the world, it would be worthwhile to have unbiased agencies conduct such audits on algorithms and on the data used to create ML models, tracking the reliability and biases of such systems while giving providers enough room to keep their algorithms private. Having checks on these systems will ultimately ground the expectations we users place on them. If any malevolent actions are found, legal action could be called for against these service providers to foster accountability.


Reflection #6 – [09/13] – [Eslam Hussein]

  1. Christian Sandvig, Kevin Hamilton, Karrie Karahalios, and Cedric Langbort, “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms
  2. Anikó Hannák, Piotr Sapiezynski, Arash Molavi Kakhki, David Lazer, Alan Mislove, and Christo Wilson, “Measuring Personalization of Web Search

 

Summary:

Both papers are about essentially the same topic: algorithmic audits. In the first paper, the authors discuss different research methods used to audit algorithms on the internet in order to detect misbehavior such as discrimination and misinformation. They give a brief history of the term audit study, then describe five different auditing methods:

  • Code Audits
  • Noninvasive User Audit
  • Scraping Audit
  • Sock Puppet Audit
  • Crowdsourced Audit / Collaborative Audit

They describe examples of each method along with some of their drawbacks and limitations.

 

In the second paper, the authors used two of the auditing methods described in the first paper to measure personalization in web search engines, specifically Google Web Search, Microsoft Bing, and DuckDuckGo. Those methods are:

  • Sock Puppet Audit: creating artificial accounts on each platform and manually crafting them. They used those accounts as controls when experimenting with different features that might affect the search engine under investigation.
  • Crowdsourced Audit: using the services of Amazon Mechanical Turk, they employed 300 workers to act as real users of those platforms.

They discovered an average of about 11.7% variation in results on Google and 15.8% on Bing. These differences in search results are due to parameters such as account features, IP address, search history, cookies, etc.

 

Reflection:

  • Although many researchers do a very good job designing their experiments and surveys on crowdsourcing platforms such as Amazon Mechanical Turk, finding a truly representative sample on those platforms is still questionable. What if the study requires samples from different socioeconomic statuses or educational levels? We cannot find people across the financial spectrum on such platforms, because the low pay mostly attracts low-income workers (about 52% of workers earn less than $5 per hour).
  • Another issue raised is auditing the special category of algorithms that depend heavily on machine learning models. These algorithms may misbehave and produce harmful results such as misinformation and discrimination. Who should be held accountable for them? The machine learning engineer, the decision makers, or the algorithm itself?
  • Personalization in web search engines could result in a filter-bubble effect, which might lead to discrimination in the quality of information provided to users based on their location, language, or preferences. Some might receive correct, legitimate information while others get flooded with rumors and misinformation. How could we prevent such behavior? I guess we first need to detect the features that most affect the results returned by these systems. Then we could find a mechanism to help the user reach more accurate information, for instance by providing the user with results for the same search query without any filtering or personalization after running that query in a sandbox environment.
  • From another point of view, we could consider that personalization produces better results for the user rather than being a source of information discrimination. How could we tell which is which: discrimination and misinformation, or a better user experience? I guess the best judges are the users themselves; offer them the results both before and after personalization and let them decide which is better. I think this would be an interesting study (a rough sketch of the idea follows this list).
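A rough sketch of that last suggestion, with both session objects and their `.search()` method as assumed stand-ins rather than a real API: the same query is issued once through the user's normal session and once through a clean, logged-out "sandbox" session, and both result lists are handed back so the user can judge which serves them better.

```python
def compare_for_user(query: str, personal_session, clean_session) -> dict:
    """Fetch the same query with and without personalization so the user
    can compare. Both session objects and .search() are hypothetical."""
    return {
        "query": query,
        "personalized": personal_session.search(query),
        "unpersonalized": clean_session.search(query),
    }
```

A caller could render the two lists side by side and log which one the user prefers, turning "better results vs. information discrimination" into something measurable.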

 


Reflection #6 – [09/13] – [Neelma Bhatti]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.” Data and discrimination: converting critical concerns into productive inquiry (2014): 1-23.
  2. Hannak, Aniko, et al. “Measuring personalization of web search.” Proceedings of the 22nd international conference on World Wide Web. ACM, 2013.

Summary

Both papers set out to explore the invisibility and resulting personalisation (in some cases, discrimination) of recommendation, search, and curation algorithms.

Sandvig et al. map traditional auditing studies for determining racial discrimination in housing onto finding recommendation and search bias faced by users of e-commerce, social, and search websites. Hannak et al. develop and apply a methodology for measuring personalization in the web search results presented to users, and the features driving that personalization.

Reflection:

  • Having read the "folk theories" paper [1], one can't help but wonder if search engines also use "narcissism" as a personalisation feature, in addition to the several identified by Hannak et al. The narcissism itself can be based on inferring similarities between the web traces, search history, demographics, etc., of different users.
  • It would be interesting to quantitatively assess whether the level or type of personalisation differs based on the device used to log into a certain account (be it Amazon, Google, or Facebook). I know for a fact that it does.
  • Algorithmic audits can also be used to investigate fabricated shortages of products on e-commerce websites intended to generate false "hype" about certain products.
  • As the saying goes: "If you're not paying for it, you become the product." So are we (the products) even eligible to question what is being displayed to us, especially while using free social media platforms? Everything we consume in this world, from food to news to entertainment content, has a cost associated with it; maybe in the case of these services, the cost is our personal data and our right to access information the algorithm deems irrelevant.
  • A seemingly harmless, in fact beneficial, personalization strategy can result in a more serious problem than filter bubbles. Relevant products being shown in the newsfeed based on what we talk about with friends (I once mentioned going to Starbucks and had offers related to Starbucks all over my newsfeed) or what we see (the exact products from an aisle I stood in front of for more than five minutes appearing in my newsfeed) invade user privacy. If algorithmic audits need to adhere to a website's Terms and Conditions, I wonder if any Terms and Conditions exist about not invading the personal space of a user to the point of creepiness.
  • A study to determine whether an algorithm is possibly rigged¹ could have users tweak the publicly available settings that change the working of the algorithm and see if the results still favor the owner.

¹which probably all algorithms are, to a certain extent (quoting “Crandall’s complaint”:” Why would you build and operate an expensive algorithm if you can’t bias it in your favor?”)

[1] Eslami, M., Karahalios, K., Sandvig, C., Vaccaro, K., Rickman, A., Hamilton, K., & Kirlik, A. (2016, May). First I like it, then I hide it: Folk theories of social feeds. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 2371-2382). ACM.

 


Reflection #6 – [09/13] – [Lindah Kotut]

  • Sandvig et. al. “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
  • Hannak et. al. “Measuring Personalization of Web Search”

The two papers can be connected by using the categorization provided by Sandvig to describe the work done by Hannak. Sandvig proposed a "systematic statistical investigation" (which is what Hannak does) in order to audit search algorithms. Two of the proposed categories fit Hannak's work: the noninvasive user audit (their use of Amazon Mechanical Turk workers to probe for volatility in search results) and the sock puppet audit (using browser agents to mimic actual users and allow for scalable, repeated perturbation of the search results).

1. User Preference
If I were to probe this question of auditing algorithms, I would start with a simple question/premise:
Why do people prefer Google as a search engine? This is an especially pertinent question to ask since both papers agree on the dominance of Google as a search engine. Unlike previous work considering social media, where there is an involved investment by the user in creating and maintaining an account, there is no such burden required to use a search engine, with Bing and Yahoo, among other search engines, being worthy competitors.

Understanding users' preferences, whether overt or hidden, would add depth when considering how the search engine either panders to or accounts for personal… foibles in presenting results, and, in Hannak's case, what other axes of measurement are available to determine this.

As Hannak points out, Google constantly tweaks how its algorithm works; it is not an unreasonable deduction that part of the reason it does this is to account for hidden patterns in search habits that scale across the user base.

This question can be asked retroactively as well: would user perception of search engine results change if proof of filter bubbles were presented? Or would users be grateful for receiving only relevant search results? If a key to a successful business is knowing your customer, then Google really knows its customers.

2. Volatility, Stability, Relevance … and censorship.
Both papers consider results returned by web search and claim that their approaches scale to other web searches. Does that include image search? For image search differs from web search. Case in point: the campaign by comedian John Oliver to subsume a tobacco company's mascot with a… less glamorous version, which led to the "new" mascot rising to the top of image search (web search remained largely unchanged except for news articles).

Hannak's work also notes that its scope is limited to the US version of the search engine and the English language. This version can, however, be served in another country (by manually changing the extension back to .com). If we compare the volatility of the same search engine across different countries (one with censorship laws), can this case be used to measure censorship (and is censorship a form of bias)? A measure of censorship could reveal which features (other than keywords) are used in the decision to censor, and we could use this extrapolation to also consider other forms of bias, intentional or not.

3.  Search Engine Optimization (and Bias)
SEO, the process by which a web presence can be "optimized" to appear high in search rankings (through tags, mobile-friendly design, etc.) so as to ensure that a page gets ranked favorably and contextually, is a layman's way of auditing algorithms. Sandvig's example of the YouTube "reply girls" fits this description.

Knowing that this deductive knowledge can be misused by those with the expertise to shape their websites to fit a particular demographic — or, as has been proven, to do this successfully (and unethically) with targeted advertisements — raises the question:

4. Who bears the responsibility?

Sandvig's "reply girls" example was used to showcase how an algorithm can be co-opted into being an agent of discrimination. If that is proven to be the case, who is to be punished? If the EU's intention of assigning blame to platforms for users who upload copyrighted content is anything to go by, then the blame will be laid on the algorithm owners. But there is a trade-off to this, and it rounds back to the first point in this reflection: does the user care?


Reflection #6 – [09/13] – [Bipasha Banerjee]

Readings assigned

[1] Sandvig, Christian et al.  (2014) – “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms” – Paper presented at the “Data and Discrimination: Converting Critical Concerns into Productive Inquiry,” a preconference at the 64th Annual Meeting of the International Communication Association.

[2] Hannak, Aniko et al. (2013) – "Measuring Personalization of Web Search" – Proceedings of the International World Wide Web Conference (527-537).

Summary

The readings assigned to us mainly talk about algorithmic audits. The first paper discusses how algorithms can be rigged and used to favor or bias results in many instances. The authors discuss the example of the computer system named SABRE, which was created by American Airlines and IBM to ease airline reservations. It was later found that the results were often biased, with American Airlines flights given unfair priority in the search results. This led the government to intervene and make the system more transparent and accessible to other airlines when information was sought. Algorithms can also be gamed, which is how the "reply girls" of YouTube became popular. The authors propose five different algorithmic audit methods and discuss their respective effectiveness and limitations. The second paper, on the other hand, focuses mainly on the effects that personalisation of an account or profile has on web search results. The authors conducted an experiment that focused on features like basic cookie tracking, the browser user-agent, geolocation, and Google account attributes. They recruited Amazon Mechanical Turk workers and observed that about 11.7% of the search results differed due to personalisation.

Reflection

The need to understand how algorithms are designed and how they affect our interactions on social media and the internet in general is immense. The paper by Motahhare Eslami [3] discussed how algorithms are in place to provide a personalized news feed on Facebook. Algorithms are used in all possible ways, be it sorting search results, prioritizing results, filtering, etc. However, to those outside a particular company, the term "algorithm" is just another ambiguous one. The non-disclosure clauses of many companies make these algorithms mysterious to the outside world.

The authors of the first paper highlighted several algorithmic audits, namely the code audit, the non-invasive user audit, the scraping audit, the sock puppet audit, and the crowdsourcing audit. They mentioned the challenges each method faces. In my opinion, looking at only one method is not optimal. If we combine two or more audit methods, give a weight to each method (a sort of weighted mean), and at the end compute an overall "audit score" (a minimal sketch of this idea follows below), with the weight assigned to each audit adjusted according to its priority, the resulting score would give a comprehensive idea about whether the algorithm in place is "rigged" or not. Algorithmic audits can also be used to demonstrate the fairness of businesses [4].
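A minimal sketch of that weighted audit score; the method names, 0–1 suspicion scores, and weights below are purely illustrative assumptions, and neither paper defines such a score.

```python
def audit_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-method audit results. `scores` maps each audit
    method to a 0-1 suspicion score from that method; `weights` reflect how
    much each method is trusted for the platform at hand."""
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight

# Example: sock-puppet evidence is weighted highest because it was the most
# controlled experiment in this hypothetical audit.
methods = {"code": 0.2, "scraping": 0.5, "sock_puppet": 0.8, "crowdsourced": 0.6}
weights = {"code": 1.0, "scraping": 1.5, "sock_puppet": 3.0, "crowdsourced": 2.0}
print(f"audit score: {audit_score(methods, weights):.2f}")  # ~0.61
```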

The second paper talks about the factors that might lead to search results being altered. Although personalisation leads to altered search results, it was also found that the majority of the time the results were not altered. This result seems quite shocking to me. I believe a global data sample would tell a completely different story, and comparing the two would give a better picture of exactly how personalisation is taken into account. Web search personalisation is an important and effective way to ensure a great user experience. However, the user needs to be aware of exactly how and where their data is being used. This is where companies need to be open and transparent.

[3] Eslami, Motahhare et al.- “I always assumed that I wasn’t really that close to [her]”: Reasoning about invisible algorithms in the news feed”

[4]https://www.wired.com/story/want-to-prove-your-business-is-fair-audit-your-algorithm/


Reflection #6 – [09/13] – [Deepika Rama Subramanian]

  1. Sandvig, Christian, et al. “Auditing algorithms: Research methods for detecting discrimination on internet platforms.”
  2. Hannak, Aniko, et al. “Measuring personalization of web search.”
These papers deal with the algorithms that, in some sense, govern our lives. Sandvig et al. talk about how biases are programmed into algorithms for the benefit of the algorithm's owner. Even organizations that wish to keep their algorithms open to the public cannot do so in their entirety because of miscreants. The authors describe the various ways algorithms can be audited – code audit, non-invasive user audit, scraping audit, sock puppet audit, and collaborative crowdsourced audit. Each of these methods has its upsides and downsides, the worst of them being legal issues. It does seem like good PR to have an external auditor come by and audit your algorithm for biases; O'Neil Risk Consulting & Algorithmic Auditing calls their logo the 'organic' sticker for algorithms.

While I wonder why the larger tech giants tend not to do this, I realize we are already severely reliant on their services. For example, a Google search for online shopping redirects us through one of Google's ad services to a website, or we search for a location and have it on Maps and News. They've made themselves so indispensable to our lives that the average user doesn't seem to mind that their algorithms may be biased towards their own services. Google was also recently slapped with a 5 billion dollar fine by the European Union for breaking anti-trust laws – amongst other things, it was bundling its search engine and Chrome apps into the Android OS.

While in the case of SABRE the bias was programmed into the system, many of today's systems acquire bias from the environment and the kind of conversations they are exposed to. Tay was a 'teen girl' Twitter bot set up by Microsoft's Technology and Research division that went rogue in under 24 hours. Because people kept tweeting offensive material at her, she transformed into a Hitler-loving, racially abusive, profane bot who had to be taken offline. Auditing and controlling bias in systems such as this will require a different line of study.

Hannak et al. speak about the personalization of search engines. Search engines are such an integral part of our lives that there is a real possibility we may lose some information simply because the engine is trying to personalize the results for us. One of the main reasons this study was carried out (it seems) was to avoid filter-bubble effects. However, the first thing I felt about personalization in search engines is that it is indeed very useful – especially with respect to our geographical location. When we're looking for shops, outlets, or even a city, the search engine points us to the closest one, the one we're most likely looking for. The fact that it correlates searches with previous searches also makes a fair bit of sense. This study also shows that searches where the answers are somewhat strictly right or wrong (medical pages, technology, etc.) don't have a great degree of personalization. But as far as search engine results go, personalization should be the last thing factored into ranking pages, after trust, completeness of information, and popularity of the page.
