- [1] Sandvig et al., “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
- [2] Hannak et al., “Measuring Personalization of Web Search”
Our lives have become so intertwined with systems like web search that we fail to notice their potential foibles. The trust we place in such systems is not questioned until a major flaw is uncovered. But given their black-box nature, how does one even understand their shortcomings? One way is through a process called an “algorithmic audit,” and such audits are all the more important given the way of the world at the moment.
In paper [1], the author first discusses the concept of “screen science,” coined by the American Airlines team that built the SABRE flight-reservation system to describe human biases in how we choose items from a ranked list. He then rightly concludes how important “algorithmic audits” are, while introducing the concept. He argues that understanding the systems we interact with daily is paramount, given how firms have acted in the past. The author distinguishes several types of audits: code audits, noninvasive user audits, scraping audits, sock puppet audits, and collaborative crowdsourced audits (a minimal sketch of a scraping audit is given below). Finally, he examines what legal and financial support would be needed to conduct such audits, while advocating a shift of perspective towards “accountability by auditing.”
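To make the scraping-audit idea concrete, here is a minimal sketch: a script that repeatedly issues fixed probe queries to a platform and archives the raw responses for offline analysis. The endpoint, the “q” parameter, and the probe queries are hypothetical placeholders rather than any real platform’s interface, and a real audit would need to respect terms of service and rate limits, a legal tension [1] itself discusses.

```python
# Minimal sketch of a scraping audit: repeatedly issue fixed probe
# queries and archive the raw responses for offline analysis.
# SEARCH_URL, the "q" parameter, and the queries are hypothetical.
import json
import time
from datetime import datetime, timezone

import requests  # third-party: pip install requests

SEARCH_URL = "https://example.com/search"            # hypothetical endpoint
QUERIES = ["mortgage rates", "apartments for rent"]  # hypothetical probes

def run_audit(rounds: int = 3, pause_s: float = 60.0) -> None:
    with open("audit_log.jsonl", "a", encoding="utf-8") as log:
        for _ in range(rounds):
            for query in QUERIES:
                resp = requests.get(SEARCH_URL, params={"q": query}, timeout=10)
                record = {
                    "ts": datetime.now(timezone.utc).isoformat(),
                    "query": query,
                    "status": resp.status_code,
                    "body": resp.text,  # raw HTML; parse offline
                }
                log.write(json.dumps(record) + "\n")
                time.sleep(pause_s)  # throttle to stay polite

if __name__ == "__main__":
    run_audit()
```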
In paper [2], the authors try to measure the basis and extent of personalization in modern search engines through benign sock-puppet auditing. Their major contributions are as follows: a methodology for measuring personalization in web search, an application of that methodology to Google, and an investigation of the causes behind the personalization observed. The methodology mitigates several confounds: temporal changes in the search index, inconsistencies arising from the index being distributed across data centers, and noise introduced by the provider’s own A/B testing. The core trick is to issue the same query at the same moment from a treated account and from control accounts, using the difference between two controls as a noise baseline (a sketch of the comparison step is given below).
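A minimal sketch of that comparison step, assuming result lists have already been collected as ranked lists of URLs; the Jaccard index and edit distance here are in the spirit of the metrics used in [2], and the example lists are hypothetical:

```python
# Sketch of the comparison step of a sock-puppet audit: given two ranked
# result lists for the same query, issued at the same moment by a treated
# account and a control account, quantify how much they differ.

def jaccard(a: list[str], b: list[str]) -> float:
    """Set overlap of two result lists (1.0 = identical sets of URLs)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance over result lists; unlike Jaccard, it is
    sensitive to reordering, not just to which URLs appear."""
    dp = list(range(len(b) + 1))  # one-row dynamic programme
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (x != y))  # substitution
    return dp[len(b)]

# Hypothetical top-5 results for one query from two accounts.
control = ["u1", "u2", "u3", "u4", "u5"]
treated = ["u2", "u1", "u3", "u6", "u5"]
print(jaccard(control, treated))        # 0.666...: sets differ only in u4/u6
print(edit_distance(control, treated))  # 3: the u1/u2 swap costs 2, u4->u6 costs 1
```

Running the same comparison on two control accounts gives the noise floor; only differences above that floor count as personalization.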
One of the first thoughts that came to mind is how the results of the audit might differ if it were conducted today. The push towards more personalization through machine learning models by all major internet firms, Google in this case, might yield different results, which would make this an interesting project to replicate.
Are certain language constructs, favored by particular groups and picked up crudely by optimizing for click-through rate, being surfaced in search results? This is an interesting question one could try to answer by extending the insights of [2]. It would also be interesting to compare the description of a person constructed by the system against the actual person, to see how optimizing for different factors affects the system’s ability to build a true-to-life profile.
Although the authors show that the carry-over effect becomes negligible after 10 minutes, the long-term effects of profiling, once the system has thoroughly learned a user’s behaviour and preferences, are not explored in the work. Identifying these would face the same obstacles the authors point out: changing search indices and the inability to switch off personalization during searches (a sketch of the carry-over measurement itself is given below).
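For completeness, here is a sketch of how the carry-over measurement might be scheduled. The `search()` function is a hypothetical stand-in for a real collection harness, not an actual API, and [2]’s real infrastructure is considerably more involved:

```python
# Sketch of the carry-over experiment in [2]: a treated account issues
# query A, waits d minutes, then issues query B; a control account issues
# only B at the same moment. If the overlap between the two B-result
# lists climbs back toward the control/control baseline as d grows,
# the carry-over effect is decaying.
import time

def jaccard(a: list[str], b: list[str]) -> float:
    return len(set(a) & set(b)) / len(set(a) | set(b))

def search(account: str, query: str) -> list[str]:
    """Hypothetical: issue `query` as `account`, return ranked URLs."""
    raise NotImplementedError("plug in a real collection harness")

def carryover(query_a: str, query_b: str, delays_min: list[int]) -> dict[int, float]:
    overlap = {}
    for d in delays_min:  # in practice, fresh accounts per trial
        search("treated", query_a)            # prime with query A
        time.sleep(d * 60)                    # let d minutes elapse
        treated = search("treated", query_b)  # possibly contaminated by A
        control = search("control", query_b)  # clean baseline, same moment
        overlap[d] = jaccard(treated, control)
    return overlap
```

The long-term profiling question above would instead require priming sustained over weeks, which is exactly where the changing-index confound bites hardest.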
Given how important understanding these systems is, and how much they shape our understanding of the world, it would be worthwhile to have unbiased agencies audit such algorithms and the data used to train ML models, tracking their reliability and biases while leaving providers enough room to keep their algorithms confidential. Having checks on these systems would ultimately ground users’ expectations of them. If any malevolent behaviour were found, legal action could be brought against the service providers to foster accountability.