- Christian Sandvig, Kevin Hamilton, Karrie Karahalios, and Cedric Langbort, “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
- Anikó Hannák, Piotr Sapiezynski, Arash Molavi Kakhki, David Lazer, Alan Mislove, and Christo Wilson, “Measuring Personalization of Web Search”
Summary:
Both papers revolve around the same topic: algorithmic audits. In the first paper, the authors discuss research methods for auditing algorithms on internet platforms in order to detect misbehavior such as discrimination and misinformation. They give a brief history of the term "audit study," then present five auditing methods:
- Code Audits
- Noninvasive User Audit
- Scraping Audit
- Sock Puppet Audit
- Crowdsourced Audit / Collaborative Audit
For each method, they describe representative examples along with some of its drawbacks and limitations.
In the second paper, the authors use two of the auditing methods described in the first paper to measure personalization in web search engines, specifically Google Web Search, Microsoft Bing, and DuckDuckGo. Those methods are:
- Sock Puppet Audit: they create artificial accounts on each platform and manually craft their attributes. These accounts are used as controls when experimenting with the different features that might affect the search engine under investigation (see the sketch after this list).
- Crowdsourced Audit: they recruit workers through Amazon Mechanical Turk, employing 300 workers to capture the searches of a diverse set of real users of those platforms.
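The core of the sock-puppet design is a controlled experiment: identical queries are issued at (roughly) the same time from a treatment account, which differs in exactly one feature such as search history, and from fresh control accounts. Differences among the controls measure baseline noise (A/B tests, load balancing), while additional differences between treatment and controls suggest personalization. Below is a minimal Python sketch of that logic; `fetch_results`, the session objects, and the use of Jaccard overlap as the comparison metric are illustrative assumptions, not the paper's exact implementation.

```python
def jaccard(a, b):
    """Set overlap between two ranked result lists (order ignored)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def measure_personalization(query, treatment, controls, fetch_results):
    """Compare a treatment session against two or more identical control sessions.

    `fetch_results(query, session)` is a hypothetical helper returning the
    ranked list of result URLs for `query` under `session`.
    """
    treatment_results = fetch_results(query, treatment)
    control_results = [fetch_results(query, s) for s in controls]

    # Noise floor: how much do identical control accounts differ from
    # each other (A/B tests, load balancing, data-center effects)?
    noise = [jaccard(control_results[i], control_results[j])
             for i in range(len(control_results))
             for j in range(i + 1, len(control_results))]

    # Signal: how much does the treatment account differ from the controls?
    signal = [jaccard(treatment_results, c) for c in control_results]

    avg = lambda xs: sum(xs) / len(xs)
    # Personalization is suggested when treatment-vs-control overlap is
    # noticeably lower than the control-vs-control overlap.
    return {"control_overlap": avg(noise), "treatment_overlap": avg(signal)}
```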
They find that, on average, about 11.7% of search results on Google and 15.8% on Bing show differences due to personalization. These differences arise from parameters such as account attributes, IP address, search history, and cookies.
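To make figures like 11.7% concrete, here is a rough sketch of one way a "percent of results that differ" number can be computed from pairs of ranked result lists. The paper also uses rank-aware measures, so treat this position-by-position comparison as a simplification rather than their exact metric.

```python
def percent_changed(list_a, list_b):
    """Fraction of top-k positions holding a different URL in the two lists."""
    k = min(len(list_a), len(list_b))
    if k == 0:
        return 0.0
    return sum(1 for i in range(k) if list_a[i] != list_b[i]) / k

# Toy example: one pair with a reordered tail, one identical pair.
pairs = [
    (["a.com", "b.com", "c.com"], ["a.com", "c.com", "b.com"]),
    (["a.com", "b.com", "c.com"], ["a.com", "b.com", "c.com"]),
]
avg_variation = sum(percent_changed(a, b) for a, b in pairs) / len(pairs)
print(f"average variation: {avg_variation:.1%}")  # 33.3% on this toy data
```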
Reflection:
- Although many researchers do a good job designing their experiments and surveys on crowdsourcing platforms such as Amazon Mechanical Turk, finding a truly representative sample on those platforms remains questionable. What if a study requires samples from different socioeconomic statuses or educational levels? We can hardly find people across the financial spectrum on such platforms, because their low pay (about 52% of workers earn less than $5 per hour) mostly attracts low-income workers.
- Another issue raised is auditing the special category of algorithms that depend heavily on machine learning models. These algorithms may misbehave and produce harmful results such as misinformation and discrimination. Who should be held accountable for them: the machine learning engineer, the decision makers, or the algorithm itself?
- Personalization in web search engines can create a filter-bubble effect, which may lead to discrimination in the quality of information delivered to users based on their location, language, or preferences. Some might receive correct, legitimate information while others get flooded with rumors and misinformation. How could we prevent such behavior? I think we first need to detect the features that most strongly affect the results these systems return. We could then build a mechanism that helps users reach more accurate information, for example by re-running the same query in a sandboxed environment and showing the user the results without any filtering or personalization (a hypothetical sketch of this idea follows below).
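As a purely hypothetical sketch of the sandbox idea above: re-run the user's query from a clean session carrying no cookies, no account, and generic headers, then surface which results appear or disappear under personalization. `fetch_results` and the session objects are again assumed helpers, not a real API; search engines restrict automated querying, so this is illustrative only.

```python
def baseline_diff(query, user_session, clean_session, fetch_results):
    """Contrast personalized results with a sandboxed, history-free baseline.

    `clean_session` is assumed to carry no cookies, account, or history;
    `fetch_results` is the same hypothetical helper as in the earlier sketch.
    """
    personalized = fetch_results(query, user_session)
    baseline = fetch_results(query, clean_session)
    return {
        # Results the user only sees because of personalization.
        "added_by_personalization": [u for u in personalized if u not in baseline],
        # Results personalization hides from the user.
        "hidden_by_personalization": [u for u in baseline if u not in personalized],
    }
```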
- From another point of view, we could consider personalization a source of better results for the user rather than a source of information discrimination. How could we tell which is which: discrimination and misinformation, or a better user experience? I think the users themselves are best placed to judge; offering them the results both before and after personalization and letting them decide which is better would make for an interesting study.