02/26/2020 – Nurendra Choudhary – Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment

Summary

In this paper, the authors study how explanations of machine learning model decisions shape people’s fairness judgments. Specifically, they study COMPAS, a model that predicts a criminal defendant’s likelihood of reoffending. They explain the drawbacks and fairness issues with COMPAS (it overestimates that likelihood for certain communities) and analyze how much Explainable AI (XAI) can change perceptions of this fairness issue. They generate automatic explanations for COMPAS using previously developed templates (Binns et al. 2018). The explanations are based on four templates: Sensitivity, Case, Demographic, and Input-Influence.
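
To make the template idea concrete, below is a minimal Python sketch of how the four explanation styles might be instantiated for a single COMPAS-style prediction. The toy features, the logistic-regression stand-in model, and the helper functions are illustrative assumptions of mine, not the authors' implementation or the actual COMPAS system.

```python
# Hypothetical sketch of the four explanation templates (Input-Influence,
# Demographic, Sensitivity, Case) applied to a COMPAS-style risk model.
# The model, feature names, and data below are illustrative stand-ins.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Toy training data: [age, prior_counts, charge_degree (0=misdemeanor, 1=felony)]
X_train = np.array([[25, 3, 1], [45, 0, 0], [31, 5, 1], [52, 1, 0], [19, 2, 1]])
y_train = np.array([1, 0, 1, 0, 1])  # 1 = predicted to reoffend
feature_names = ["age", "prior_counts", "charge_degree"]

model = LogisticRegression().fit(X_train, y_train)
neighbors = NearestNeighbors(n_neighbors=1).fit(X_train)

def input_influence(x):
    """Input-Influence template: each feature's contribution (weight * value)."""
    contributions = model.coef_[0] * x
    return {name: round(float(c), 2) for name, c in zip(feature_names, contributions)}

def sensitivity(x, feature, new_value):
    """Sensitivity template: would the prediction flip if one feature changed?"""
    x_alt = x.copy()
    x_alt[feature_names.index(feature)] = new_value
    return model.predict([x])[0] != model.predict([x_alt])[0]

def demographic(x, feature):
    """Demographic template: outcome rate among training cases sharing this attribute."""
    idx = feature_names.index(feature)
    mask = X_train[:, idx] == x[idx]
    return float(y_train[mask].mean())

def case_based(x):
    """Case template: the most similar training case and its recorded outcome."""
    _, nn_idx = neighbors.kneighbors([x])
    i = nn_idx[0][0]
    return dict(zip(feature_names, X_train[i])), int(y_train[i])

x = np.array([27, 4, 1])  # a hypothetical defendant
print("Input-influence:", input_influence(x))
print("Sensitivity (prior_counts -> 0):", sensitivity(x, "prior_counts", 0))
print("Demographic (same charge_degree):", demographic(x, "charge_degree"))
print("Most similar case:", case_based(x))
```

Each helper corresponds to one of the paper's explanation styles; in the study the text of such explanations, rather than the raw numbers, is what is shown to the crowd workers.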

The authors hire 160 Mechanical Turk (MT) workers who meet certain criteria, such as US residence and MT expertise. The worker pool is diverse, but this diversity shows no significant impact on the variance of the results. The experimental setup is a questionnaire that elicits each worker’s criteria for making fairness judgements. The results show that workers have heterogeneous criteria for making fairness judgements. Additionally, the experiment highlights two fairness issues: “unfair models (e.g., learned from biased data), and fairness discrepancy of different cases (e.g., in different regions of the feature space)”.

Reflection

AI often works in an opaque manner: most algorithms detect patterns in a latent space that is incomprehensible to humans. The idea of using automatically generated standard templates to construct explanations of AI behaviour should be generalized to other AI research areas. The experiments show how human behavior changes in response to explanations. I believe such explanations could not only improve the general population’s understanding but also help researchers narrow down the limitations of these systems.

The case of COMPAS makes me question the future roles that interpretable AI makes possible. If an AI system can give explanations for its predictions, then I think it could play the role of an unbiased judge better than humans can. Societal biases are embedded in humans and might subconsciously affect our choices, and interpreting these choices in ourselves is a complex exercise in self-criticism. For AI, by contrast, systems such as the one in this paper can generate human-comprehensible explanations to validate their predictions. This could make AI an objectively fairer judge than humans.

Additionally, I believe evaluation metrics for AI lean towards improving overall predictive performance. Comparable models that emphasize interpretability should be given more importance. A drawback of such metrics, however, is that evaluating interpretability requires human judgment, which will slow the pace of AI development. We need better evaluation strategies for interpretability. In this paper, the authors hired 160 MT workers; because it is a one-time evaluation, such a study is feasible. However, if this kind of evaluation needs to be included in the regular AI development pipeline, we need more scalable approaches to avoid prohibitively expensive evaluation costs. One method could be to rely on a less diverse test set during the development phase and increase diversity to match the real-world problem setting.

Questions

  1. How difficult is it to provide such explanations across all AI fields? Would it help advance AI understanding and development?
  2. How should we balance the explainability and effectiveness of AI models? Is it acceptable to lose effectiveness in return for interpretability?
  3. Would interpretability lead to the adoption of AI systems in sensitive domains such as the judiciary and politics?
  4. Can we develop evaluation metrics around the suitability of AI systems for real-world scenarios?

Word Count: 567
