02/26/2020 – Bipasha Banerjee – Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment

Summary 

The paper highlights one of the major problems the current digital world faces: algorithmic bias and fairness in AI. The authors point out that ML models are often trained on data that is itself biased and may therefore amplify the existing bias, which often leads people to distrust AI models. The paper is a good step towards explainable AI and towards making models more transparent to the user. The authors used a dataset for predicting the risk of re-offending that is known to have a racial bias. Global and local explanations were considered across four explanation styles, namely demographic-based, sensitivity-based, influence-based, and case-based. For measuring fairness, they considered racial discrimination and tried to measure case-specific impact. Cognition and an individual's prior perception of the fairness of algorithms were considered as individual difference factors. Both qualitative and quantitative methods were used in the evaluation. They concluded that there is no general solution; what works depends on the user profile and the fairness issue at hand.
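To make the fairness measurement concrete, here is a minimal sketch (my own, not the authors' code) of how disparity between racial groups could be quantified for a recidivism-style classifier; the records and field names are hypothetical.

    # Hypothetical sketch: quantifying group disparity for a recidivism-style classifier.
    # Each record is (sensitive attribute value, true label, predicted label); the data are made up.
    from collections import defaultdict

    predictions = [
        ("group_a", 0, 1), ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 0, 1),
        ("group_b", 0, 0), ("group_b", 1, 1), ("group_b", 0, 0), ("group_b", 1, 0),
    ]

    def rates_by_group(records):
        """Return the positive-prediction rate and false-positive rate per group."""
        stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "neg": 0, "false_pos": 0})
        for group, y_true, y_pred in records:
            s = stats[group]
            s["n"] += 1
            s["pred_pos"] += y_pred
            if y_true == 0:               # among truly non-re-offending individuals...
                s["neg"] += 1
                s["false_pos"] += y_pred  # ...count how many were still predicted high risk
        return {
            g: {
                "positive_rate": s["pred_pos"] / s["n"],
                "false_positive_rate": s["false_pos"] / s["neg"] if s["neg"] else 0.0,
            }
            for g, s in stats.items()
        }

    rates = rates_by_group(predictions)
    gap = abs(rates["group_a"]["positive_rate"] - rates["group_b"]["positive_rate"])
    print(rates)
    print(f"Demographic parity gap: {gap:.2f}")

In this toy data, group_a receives high-risk predictions three times as often as group_b, which is the kind of case-specific disparity the explanation styles in the paper are meant to surface for the human judging fairness.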

Reflection 

The paper by Dodge et al. is a commendable effort towards making algorithms and their processing clearer to humans. They take into account not only algorithmic fairness but also humans' perception of the algorithm, various fairness problems, and individual differences in their experiments. The paper was an interesting read, but a better presentation of the results would make it easier for readers to comprehend.

In the model fairness section, the authors consider fairness in terms of racial discrimination. Later in the paper, they mention that the re-offending prediction classifier includes features such as age. Additionally, features like gender might play an important role too. It would be interesting to see how age and other features perform as the fairness issue, perhaps on other datasets where such biases are dominant.

The authors mentioned that a general solution cannot be developed. However, is it possible for the solution to be domain-specific? For example, if we change the dataset to include other features for fairness, we should be able to plug in the new data without having to change the model.
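As a rough illustration of that point, the sketch below (again my own, not from the paper) parameterizes the fairness check by the sensitive attribute, so a dataset where gender or age is the concern could be plugged in without changing the evaluation code; all field names are hypothetical.

    # Hypothetical sketch: a fairness check parameterized by the sensitive attribute,
    # so a different attribute (e.g. gender or age) or a new dataset can be plugged in
    # without rewriting the evaluation logic. All field names are made up.

    def positive_rate_gap(rows, sensitive_attr, prediction_key="predicted_high_risk"):
        """Largest gap in positive-prediction rate across the groups of sensitive_attr."""
        totals, positives = {}, {}
        for row in rows:
            group = row[sensitive_attr]
            totals[group] = totals.get(group, 0) + 1
            positives[group] = positives.get(group, 0) + int(row[prediction_key])
        rates = {g: positives[g] / totals[g] for g in totals}
        return max(rates.values()) - min(rates.values()), rates

    records = [
        {"race": "a", "gender": "f", "predicted_high_risk": 1},
        {"race": "a", "gender": "m", "predicted_high_risk": 1},
        {"race": "b", "gender": "f", "predicted_high_risk": 0},
        {"race": "b", "gender": "m", "predicted_high_risk": 1},
    ]

    print(positive_rate_gap(records, "race"))    # disparity across racial groups
    print(positive_rate_gap(records, "gender"))  # same code, gender as the fairness issue

The design choice is simply that the sensitive attribute is data rather than code, so swapping datasets or fairness issues does not require touching the model or the check itself.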

The study was done using crowd workers rather than domain experts who are familiar with the jargon and accustomed to being unbiased. Humans are prone to bias, whether intentional or not. However, people in the legal domain, such as judges, attorneys, paralegals, court reporters, and law enforcement officers, are more likely to be impartial, either because they are under oath or because of years of practice and training in an unbiased setting. So, including them in the evaluation and utilizing them as expert crowd workers might yield better results.

Questions

  1. Could a general solution be developed for a specific domain, rather than a one-size-fits-all approach?
  2. Only racial discrimination is considered as a fairness issue. Are other factors used only as features for the classifier? How would the model perform on a different dataset with another attribute, such as gender, treated as the fairness issue?
  3. The authors used a dataset from the judicial system, and they mentioned that their goal was not to study the users. I am curious how the data were anonymized and how the privacy and security of the individuals involved were handled.

One thought on “02/26/2020 – Bipasha Banerjee – Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment”

  1. Apart from considering a different sensitive attribute, I feel the researchers should also take into account multiple sensitive attributes at the same time. It would be interesting to observe how the analysis changes.
