The paper explores how people make fairness judgments of ML systems and how different styles of explanation affect those judgments. It also examines whether personalized and adaptive explanations can better support such judgments. Ensuring algorithmic fairness is critically important, and there is a need to consciously avoid amplifying existing biases. In this context, explanations are beneficial in two ways: they expose implementation details that would otherwise remain a “black box” to the user, and they facilitate better human-in-the-loop experiences by enabling people to identify fairness issues. The study used the COMPAS recidivism data and examined four explanation styles: input-influence based, demographic-based, sensitivity-based, and case-based. It highlights that there is no one-size-fits-all solution for an effective explanation; the dataset, context, kinds of fairness issues, and user profiles vary and need to be addressed individually. The paper proposes hybrid explanations as a solution, providing both an overview of the ML model and information about specific cases to support accurate fairness judgments.
While much research has focused on developing non-discriminatory ML algorithms, this paper specifically addresses the human side of identifying and remedying fairness issues. I feel that this is equally important and often overlooked. It was also interesting to note that, unlike previous studies, the explanations were auto-generated.
With respect to the different explanation styles used, I found the sensitivity-based explanation particularly interesting since it clearly shows how the prediction would change if certain attributes were modified. In my view, this form of explanation, out of the four proposed, is extremely effective at surfacing any bias that may be present in the ML system.
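To make this concrete, below is a minimal sketch of how a sensitivity-based explanation might be auto-generated: train a toy classifier, perturb one attribute of a case, and report whether the prediction flips. The model, feature names, and data here are hypothetical stand-ins for illustration, not the paper's actual pipeline.

```python
# Minimal sketch of a sensitivity-based explanation (hypothetical model/data).
from sklearn.linear_model import LogisticRegression
import numpy as np

# Toy training data: [age, priors_count, race_encoded] -> reoffend (0/1)
X = np.array([[25, 0, 0], [35, 3, 1], [22, 5, 1], [45, 1, 0],
              [30, 2, 0], [28, 4, 1], [50, 0, 0], [23, 6, 1]])
y = np.array([0, 1, 1, 0, 0, 1, 0, 1])
model = LogisticRegression().fit(X, y)

def sensitivity_explanation(case, feature_idx, alt_value, feature_name):
    """Report whether changing a single attribute flips the model's prediction."""
    original = model.predict([case])[0]
    perturbed = case.copy()
    perturbed[feature_idx] = alt_value
    changed = model.predict([perturbed])[0]
    if original != changed:
        return (f"If {feature_name} were {alt_value}, the prediction "
                f"would change from {original} to {changed}.")
    return f"Changing {feature_name} to {alt_value} does not change the prediction."

case = np.array([24.0, 4.0, 1.0])  # hypothetical defendant
print(sensitivity_explanation(case, 2, 0, "race_encoded"))
```

If flipping a protected attribute alone changes the verdict, that is exactly the kind of signal this explanation style makes visible to a lay user.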
I felt that the input-influence based explanation was also effective: its +/- markers on the features that match the particular case give users a clearer picture of which attributes specifically influenced the result, thereby exposing the implementation details to a certain extent.
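As a rough illustration of this style, the sketch below renders signed per-feature contributions with +/- markers, assuming a simple linear scoring model; the feature names, weights, and case are invented for illustration and are not taken from the paper.

```python
# Rough sketch of an input-influence style explanation with +/- markers
# (weights and case are assumed, purely for illustration).
import numpy as np

feature_names = ["age", "priors_count", "race_encoded"]
weights = np.array([-0.03, 0.45, 0.60])   # assumed linear-model coefficients
case = np.array([24.0, 4.0, 1.0])         # hypothetical defendant

def influence_explanation(case, weights, names):
    """List each feature with a +/- marker based on its signed contribution."""
    lines = []
    for name, value, contrib in zip(names, case, weights * case):
        marker = "+" if contrib > 0 else "-"
        lines.append(f"[{marker}] {name} = {value:g} (influence {contrib:+.2f})")
    return "\n".join(lines)

print(influence_explanation(case, weights, feature_names))
```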
The study results document various insights from participants, and I found some of them fascinating. While some participants believed certain predictions were biased, others considered the same verdict a reasonable prediction. This truly captured the diversity of opinions and perspectives on the same ML system depending on the explanation provided.
- This study reveals that the perception of bias is not uniform and is highly subjective. Given this lack of agreement on the definition of moral concepts, how can a truly unbiased ML system be achieved?
- What practices can ML model developers follow to ensure that bias in the input dataset is identified and removed?
- Apart from gender and ethnic bias, what other prevalent biases in existing ML systems need to be eradicated?
I agree with your deduction that the sensitivity-based explanation is effective at bringing out biases inherent in the ML system. In the figure shown in the paper, the sensitivity-based explanation covers race, age, and gender, which, in my opinion, are the three factors that contribute most to bias.