02/26/20 – Lee Lisle – Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment

Summary

Dodge et al. cover a critically important issue with artificial intelligence programs: the biases they absorb from historical datasets, and how to mitigate the inherent racism and other biases within them. They also work to understand how to better communicate how and why AIs reach the recommendations they do. In an experiment, they look at communicating outcomes from a known-biased ML model for predicting recidivism among released prisoners, COMPAS. They cleaned the ML model to make race less impactful to the final decision, and then produced four ways of explaining the model’s result to 160 mTurk workers: Sensitivity, Input-influence, Case, and Demographic. “Input-influence” emphasizes how much each input affected the result, “Demographic” describes how each demographic group affects the results, “Sensitivity” shows whether flipping a demographic attribute would have changed the result, and “Case” finds the most similar cases and details their outcomes. They found that the local explanations (Case and Sensitivity) had the largest impact on perceived fairness.
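To make the explanation styles concrete, here is a minimal sketch of how a sensitivity-style (counterfactual) check could be produced. The logistic-regression model, the feature set, and the data below are hypothetical stand-ins for illustration only, not the cleaned COMPAS model the authors used.

    # Minimal sketch of a sensitivity-style explanation: flip one binary
    # attribute and report whether the model's prediction changes.
    # The model, features, and data are hypothetical, not from the paper.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical features: [age, priors_count, race_flag]
    X = np.array([[25, 3, 1], [40, 0, 0], [33, 5, 1], [22, 1, 0]])
    y = np.array([1, 0, 1, 0])  # 1 = predicted to re-offend

    model = LogisticRegression().fit(X, y)

    def sensitivity_explanation(model, case, feature_idx, feature_name):
        """Report whether flipping one binary feature flips the prediction."""
        original = model.predict(case.reshape(1, -1))[0]
        flipped_case = case.copy()
        flipped_case[feature_idx] = 1 - flipped_case[feature_idx]
        flipped = model.predict(flipped_case.reshape(1, -1))[0]
        if original != flipped:
            return f"If {feature_name} were flipped, the prediction would change."
        return f"Flipping {feature_name} would not change the prediction."

    print(sensitivity_explanation(model, X[0], feature_idx=2, feature_name="race"))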

Personal Reflection

This study was quite interesting to me because it actually tries to adjust for the biases of the input data as well as to understand how to better convey insights from less-biased systems. I am still unsure that the authors removed all bias from the COMPAS system, but seeing that they lowered the race coefficient significantly shows they were making progress on it. In this vein, the paper made me want to read the work they cited on how to mitigate biases in these algorithms.

I found their various methods for communicating how the algorithm came to its recommendation rather incisive. I wasn’t surprised that when the sensitivity explanation stated that flipping the individual’s race would have flipped the decision, participants perceived more issues with the ML decision. That style of communication seems to lead people to spot issues with the dataset more easily in general.

The last notable part of the experiment is that they didn’t give a confidence value for each case – they stated that they could not control for it and so did not present it to participants. That seems like an important input for making a decision based on the algorithm. If the algorithm is really on the fence but has to recommend one way or the other, knowing that might make it easier to claim the algorithm is biased.

Questions

  1. Would removing the race (or other non-controllable biases) coefficient altogether affect the results too much? Is there merit in zeroing out the coefficient of these factors? (A small sketch of this idea follows the list.)
  2. Having an attention check in the mTurk workflow is, in itself, not surprising. However, the fact that all of the crowdworkers passed the check is surprising. What does this mean for other work that ignores a subset of data assuming the crowdworkers weren’t paying attention? (Like the paper last week that ignored the lowest quartile of results)
  3. What combination of the four different types would be most effective? If you presented more than one type, would it have affected the results?
  4. Do you think showing the confidence value for the algorithm would impact the results significantly? Why or why not?
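On question 1, here is a hypothetical illustration of what zeroing out a sensitive feature’s coefficient in a simple logistic-regression model could look like; this is a sketch of the idea only, not the mitigation procedure Dodge et al. actually used. Comparing the predicted risks before and after hints at how much the sensitive feature was contributing.

    # Hypothetical sketch: zero out the race coefficient in a fitted
    # logistic regression and compare predicted risk before and after.
    # This is not the authors' mitigation procedure.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical features: [age, priors_count, race_flag]
    X = np.array([[25, 3, 1], [40, 0, 0], [33, 5, 1], [22, 1, 0]], dtype=float)
    y = np.array([1, 0, 1, 0])

    model = LogisticRegression().fit(X, y)
    race_idx = 2

    before = model.predict_proba(X)[:, 1]
    model.coef_[0, race_idx] = 0.0  # remove race's contribution entirely
    after = model.predict_proba(X)[:, 1]

    print("Predicted risk before:", np.round(before, 3))
    print("Predicted risk after: ", np.round(after, 3))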

2 thoughts on “02/26/20 – Lee Lisle – Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment”

  1. Hello, Lee.

    Your reflection aligns with a lot of my own reflection on the same paper. I did, however, think that the sensitivity method is doing more than helping people see the issues. To me, it felt like the explanation was nudging users toward the opposite decision by making them think the system is biased.

    I agree with you on the problem of not having confidence scores stated for each case. This relates to the problem with sensitivity: if the confidence levels were stated instead of straight-up stating that the AI would have made a different choice, it would have been a much subtler way of conveying the same information.

    Regarding your question 1, while the race feature (after its coefficient was adjusted) would not directly affect the decision, I think it may still have had some effect on the neural network nodes here and there. So I think the adjustment procedure is trying to capture that small detail without bias.

  2. To your fourth question, I believe the authors addressed it in their follow-up work, “Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making”. They show that a confidence score can help calibrate people’s trust in an AI model. However, I believe whether showing the confidence score creates more bias or makes the whole decision-making process bias-free depends on individual biases and the task.
