Summary
Bansal et al.’s paper “Updates in Human-AI Teams” explores an interesting problem: how updates to an AI system affect overall team performance. AI systems are now deployed to support human decision making in high-stakes domains such as criminal justice and healthcare. In a team of humans and AI systems, humans make decisions with reference to the AI’s inferences. A successful partnership requires that the human develop an understanding of the AI system’s performance, especially its error boundary. Updating to a higher-performing algorithm can increase the AI’s predictive accuracy. However, it may also force humans to gain new experience with the system and rebuild their confidence in it, and this adjustment process can actually hurt team performance. The authors introduce the concept of compatibility between an AI update and prior user experience and present methods for studying the role of compatibility in human-AI teams. Extensive experiments on three high-stakes classification tasks (recidivism, credit risk, and mortality) demonstrate that current AI updates are often not compatible, which can decrease team performance after updating. To improve the compatibility of an update, the authors propose a re-training objective that penalizes new errors, that is, mistakes the updated system makes on cases the previous version handled correctly. Their proposed compatible updates strike a good balance between performance and compatibility across the different tasks.
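To make the notion of compatibility more concrete, here is a minimal Python sketch. It is my own illustration, not the authors' code. It assumes compatibility is measured as the fraction of examples the old model classified correctly that the updated model also classifies correctly; the function names and the toy labels are hypothetical, and the paper's exact definition and training objective may differ.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(pred, y_true)) / len(y_true)


def new_errors(y_true: Sequence[int], old_pred: Sequence[int],
               new_pred: Sequence[int]) -> List[int]:
    """Indices where the old model was right but the updated model is wrong."""
    return [i for i, (y, o, n) in enumerate(zip(y_true, old_pred, new_pred))
            if o == y and n != y]


def compatibility_score(y_true: Sequence[int], old_pred: Sequence[int],
                        new_pred: Sequence[int]) -> float:
    """Share of the old model's correct predictions that the update preserves.

    Assumed definition for illustration only. Returns 1.0 when the old model
    had no correct predictions, since there is nothing to break.
    """
    old_correct = sum(o == y for o, y in zip(old_pred, y_true))
    if old_correct == 0:
        return 1.0
    broken = len(new_errors(y_true, old_pred, new_pred))
    return (old_correct - broken) / old_correct


if __name__ == "__main__":
    # Toy labels, invented for this example.
    y_true = [1, 0, 1, 1, 0, 0]
    old_pred = [1, 0, 0, 1, 0, 1]   # old model
    new_pred = [1, 1, 1, 1, 0, 0]   # updated model
    print("old accuracy:", accuracy(y_true, old_pred))
    print("new accuracy:", accuracy(y_true, new_pred))
    print("new errors at:", new_errors(y_true, old_pred, new_pred))
    print("compatibility:", compatibility_score(y_true, old_pred, new_pred))
```

In this toy case the update is more accurate overall (5 of 6 correct versus 4 of 6), yet its compatibility is only 0.75 because it breaks one prediction the user had learned to trust. A re-training objective that penalizes such new errors would push the update toward a higher score, possibly at some cost in raw accuracy.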
Reflection
I think teaming AI with humans to take full advantage of their collaboration is a pretty neat idea. Humans are born with the ability to adapt to an uncertain and adverse world and with the capacity for logical reasoning. Machines do not perform well in those areas, but they compute efficiently and can free people for higher-level tasks. Understanding how machines can enhance what humans do best, and how humans can extend what machines are able to do, is the key to rethinking and redesigning current decision-making systems.
What I find interesting about the research problem discussed in this paper is that, when recommending updates, the authors focus on the joint decision made by humans and machines rather than merely on the machine’s task performance. In machine learning with no human involved, the goal is usually to achieve better and better performance, as evaluated by metrics such as accuracy, precision, and recall. Compatible updates can be seen as machine learning algorithms with similar decision boundaries but better performance, which seems to be an even more difficult goal to accomplish. To get there, humans need to play several crucial roles. First, humans must train machines to achieve good performance on certain tasks. Next, humans need to understand and be able to explain the outcomes of those tasks, especially where AI systems fail. That requires an interpretability component in the system. As AI systems increasingly draw conclusions through opaque processes (the so-called black-box problem), there is a large demand for human experts in the field who can explain model behavior to non-expert users. Last but not least, humans need to sustain the responsible use of AI systems by, for example, updating them for better decision making, as discussed in the paper. That would require a large body of human experts who continually work to ensure that AI systems are functioning properly, safely, and responsibly.
The above discussion is one side of the coin, focusing on how humans can extend what machines can achieve. The other side is comparatively less discussed in the current literature. Beyond extending physical capabilities, how humans can learn from interacting with AI systems and enhance their own abilities is an interesting question to explore. I would imagine that, in an advanced human-AI team, humans and AI systems communicate in a more interactive way that allows each to learn from its own mistakes and from the rationale behind the correct decisions made by the other. That leads to another question: if AI systems can rival or exceed humans in high-stakes decision making such as recidivism prediction and underwriting, how risky is it to hand those tasks over to machines? How can we decide when to let humans take control?
Discussion
I think the following questions are worthy of further discussion.
- What can humans do that machines cannot and vice versa?
- What is the goal of decision making and what factors are stopping humans or machines from making good decisions?
- In the human-AI teams discussed in the paper, how can humans benefit from interacting with the AI systems?
- The partnership introduced by the authors is more like a human-assisting-machine approach. Can you provide some examples of machine-assisting-human approaches?
3. Humans are not capable of identifying complex patterns, but AI is. The human mental model tries to adapt to the process by which the AI identifies patterns. Current AI systems, for example, recognize patterns in a statistical space that is incomprehensible to humans. Once the AI identifies these patterns, humans use their own abilities to make sense of why such a prediction was made. Basically, humans try to recognize the AI’s patterns. But improvements to the AI might change those patterns, and learning such dynamic patterns might improve human cognition.
I like the questions you posed for this paper. I believe that your second question was addressed in the CAJA experimental design: neither party has the full information. In a real-world example, doctors would know things like “the patient is a flake about taking their medication,” whereas the machine would assume the patient follows all instructions completely. While this example focuses on mistrust and/or profiling, these are areas to which the AI will be blind.