02/19/2020 – Nan LI – Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff

Summary:

In this paper, the authors observe that it is now common for humans and AI to form a team to make decisions: the AI provides recommendations, and the human decides whether or not to trust them. In these cases, successful team collaboration rests largely on the mental model humans build of the AI system's past performance. The authors therefore argue that although an update to the AI system may improve its predictive accuracy, it can hurt team performance, because the updated version is often not compatible with the mental model humans developed with the previous system. To address this problem, the authors introduce the concept of the compatibility of an AI update with prior user experience. To examine the role of this compatibility in human-AI teams, they designed a platform called CAJA to measure the impact of updates on team performance. The results show that team performance can be harmed even when the updated system's predictive accuracy improves. Finally, the paper proposes a re-training objective that promotes compatible updates. In conclusion, to avoid diminishing team performance, developers should build more compatible updates without surrendering performance.
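The core ideas in the summary can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's exact formulation: the function names are my own, and the paper's re-training objective uses a differentiable loss term rather than the simple additive penalty shown here. The compatibility measure below follows the intuition in the paper: among the examples the old model got right, the fraction the updated model also gets right.

```python
def compatibility(old_preds, new_preds, labels):
    """Fraction of the old model's correct predictions that the
    updated model preserves (1.0 = fully compatible update)."""
    kept = sum(1 for o, n, y in zip(old_preds, new_preds, labels)
               if o == y and n == y)
    total = sum(1 for o, y in zip(old_preds, labels) if o == y)
    return kept / total if total else 1.0

def dissonance_loss(base_loss, old_pred, new_pred, label, lam=0.5):
    """Toy compatibility-aware training loss: add an extra penalty
    (weighted by lam) whenever the update introduces a *new* error
    on an example the old model handled correctly. lam trades off
    raw accuracy against compatibility."""
    new_error = (old_pred == label) and (new_pred != label)
    return base_loss + (lam if new_error else 0.0)

# Example: the update fixes nothing but breaks one case the old
# model got right, so only 2 of 3 old successes are preserved.
labels    = [1, 0, 1, 1]
old_preds = [1, 0, 0, 1]
new_preds = [1, 1, 0, 1]
print(compatibility(old_preds, new_preds, labels))  # 2/3
```

Under this view, a standard re-training objective corresponds to `lam = 0`, and increasing `lam` steers the updated model toward preserving the old model's correct behavior, which is exactly the performance/compatibility tradeoff in the paper's title.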

Reflection:

This paper focuses on a specific form of interaction, AI-advised human decision making, as illustrated by the patient-readmission example in the paper. In such cases, an incompatible update to the AI system can indeed harm team performance. However, I think the extent of the impact largely depends on the degree of interdependence between the human and the AI system.

If the system and the user are highly interdependent (neither is a specialist on the task, so the system's prediction accuracy and the user's knowledge carry equal weight in the decision), then an incompatible update will weaken team performance. Even though this effect can be eliminated later as the user and the system adjust to each other, in a high-stakes domain the cost of the interim decisions can be very large.

However, if the system interacts with users frequently but its predictions are only one of several inputs to the human's decision and cannot directly determine the outcome, then the impact of incompatible updates on team performance will be limited.

Besides, if humans have more expertise on the task and can promptly validate the correctness of a recommendation, then neither the system's performance improvements nor the new errors introduced by an update will have much impact on the results. In that case, since update-induced errors do not affect team performance, we need not consider compatibility when updating the system; we only need to consider improving system performance. In conclusion, if there is not enough interaction between the user and the system, if the degree of interdependence is low, or if the system serves only as an auxiliary or a double check, then a system update will not have a great impact on team performance.

A compatible update does help users adapt quickly to the new system, but I think the impact of an update depends largely on the correlation between the user and the system, or on how large a role the system plays in the teamwork.

Besides, designing a compatible update also incurs extra cost. Therefore, I think we should focus on minimizing the impact of system errors on the decision-making process when designing the system and establishing the human-AI interaction.

Question:

  1. What do you think about the concept of compatibility of AI updates?
  2. Do you have any examples of human-AI systems to which the authors' theory applies?
  3. Under what circumstances do you think the authors' theory is most useful, and when is it not applicable?
  4. When we need to update a system frequently, do you think it is better to build compatible updates or to address the human adaptation cost some other way?
  5. In my opinion, humans' capacity to adapt is very high, and the cost for humans to adapt is much smaller than the cost of developing a compatible update. What do you think?

Word Count: 682

2 thoughts on “02/19/2020 – Nan LI – Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff”

  1. Great discussion Nan! Here I mainly address your questions 1, 4, and 5. I think the concept of compatibility of AI updates is a neat idea, but it places high demands on AI system development. To achieve the desired compatibility, an update needs to meet at least two conditions: 1. the algorithm achieves higher performance or better interpretability (if not, there seems to be no need to update the system); 2. the algorithm has a similar decision boundary (if not, it may take longer to adjust to new errors). I agree with you that it may be easier for humans to adjust to a new system than to update the system in the desired way. Another question to explore is what we expect the AI's role in the teamwork to be. If we expect to rely on the AI's inferences for decision making most of the time, then improving the system's performance (e.g., accuracy, F1 score) is key. In that case, we may allow more time and effort for humans to adjust to the updated system in order to achieve the goal of freeing humans from the task.

  2. I agree with you. For the fourth question, I think this method is feasible, and a compatible alternative eliminates the need for repeated retraining. In fact, I think the best use case for compatible updates is exactly such frequent-update scenarios.