02/19/2020 – Ziyao Wang – Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff

The authors show that in a human-AI hybrid decision-making system, updates that aim to improve the accuracy of the AI component can actually harm teamwork. Experienced workers who are advised by an AI system have built a mental model of it, and that mental model improves the correctness of the team's results: they know which of the system's recommendations to trust. An update that improves the AI system's standalone accuracy may create a mismatch between the updated model and the worker's mental model, so that the user can no longer make appropriate decisions with the AI's help. In this paper, the researchers propose a platform named CAJA that can be used to evaluate the compatibility between the AI and the human. With results from experiments using CAJA, developers can learn how to make updates compatible while still maintaining high accuracy.
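As a rough sketch of the idea (my own simplification, not necessarily the paper's exact formulation), update compatibility can be quantified as the fraction of cases the old model got right that the new model still gets right; a score of 1.0 means the update introduces no new errors on cases users have learned to trust.

```python
import numpy as np

def compatibility(old_correct: np.ndarray, new_correct: np.ndarray) -> float:
    """Fraction of examples the old model got right that the new model
    still gets right; lower values mean the update breaks more of the
    behavior users have learned to rely on."""
    preserved = np.logical_and(old_correct, new_correct).sum()
    return preserved / old_correct.sum()

# Hypothetical correctness vectors over 10 test cases: the update
# raises standalone accuracy (7/10 -> 8/10) but gets two previously
# correct cases wrong.
old = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0], dtype=bool)
new = np.array([1, 1, 1, 1, 1, 0, 0, 1, 1, 1], dtype=bool)
print(new.mean())               # 0.8   (higher standalone accuracy)
print(compatibility(old, new))  # ~0.71 (only 5 of 7 trusted cases preserved)
```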

Reflection:

Before reading this paper, I thought it was always good to have an AI system with higher accuracy. This paper gave me a new point of view: instead of considering only a system's performance, we should also consider the cooperation between the system and its human workers. As the paper shows, an update to the AI system can destroy the mental model the human has built. Experienced workers have developed a good working relationship with their AI tools: they know which pieces of advice should be taken and which ones may contain errors. If a patch makes the system more accurate overall while reducing its correctness on the part the human trusts, the accuracy of the whole hybrid system will also be reduced. Humans may not trust the updated system until they reach a new equilibrium with it, and during that period the performance of the hybrid team can drop below what it would have been had the previous, un-updated system been kept. For this reason, developers should try to maximize the performance of the system before releasing the application to users; then later updates will not make large changes to the system, and humans can stay familiar with it. A toy calculation after this paragraph makes the point concrete.
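Here is that toy calculation (with hypothetical numbers, not taken from the paper), showing how an update that is more accurate in isolation can still lower team performance while the human's mental model lags behind:

```python
def team_accuracy(ai_acc_on_trusted: float, human_acc: float,
                  trusted_frac: float) -> float:
    """Expected team accuracy when the human defers to the AI on the
    fraction of cases they have learned to trust, and decides alone
    on the rest."""
    return trusted_frac * ai_acc_on_trusted + (1 - trusted_frac) * human_acc

human_acc = 0.80     # human working alone on untrusted cases
trusted_frac = 0.60  # share of cases where the human takes the AI's advice

# Old model: 95% accurate on the cases the human defers on.
print(team_accuracy(0.95, human_acc, trusted_frac))  # ~0.89

# Updated model: better standalone accuracy overall, but its remaining
# errors now fall inside the trusted subset (85% accurate there), so
# the team does worse until trust is recalibrated.
print(team_accuracy(0.85, human_acc, trusted_frac))  # ~0.83
```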

From this we can learn that we should never ignore the interaction between humans and an AI system. A good design of the interaction can improve the performance of the whole system, while poor human-AI interaction can harm it. When we implement a system that needs both human affordances and AI affordances, we should pay close attention to the cooperation between human and AI: we should leverage the affordances from both sides instead of focusing only on the AI component. We should put ourselves in the position of the designer of the whole system, with a view of the overall situation, rather than consider ourselves mere programmers who focus only on the program.

Questions:

What are the criteria for deciding whether an update is compatible or not?

Would releasing instructions to users along with each update help reduce the harm that updates cause?

If a new version of the system greatly improves accuracy, but the users' mental model is totally different from it, how can we reach a balance that maximizes the performance of the whole hybrid system?

3 thoughts on “02/19/2020 – Ziyao Wang – Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff”

  1. As for the first question, I think the most intuitive criterion is the accuracy rate. If the accuracy rate does not drop too much after the update, then the update was successful, and it also keeps people familiar with the new version.

  2. I also was of the opinion that it is “always good to have an AI system with higher accuracy” and before this paper, had never considered the system as a whole. I like the point that you make that a good design can help facilitate better interaction and ensure that the system as a whole (human and machine) will have better performance.

    With respect to your second question, I do believe that having some summary of updates made and releasing a set of easy instructions for each update might in fact help reduce the harm that the updates to the AI model may have caused.

  3. To answer your first question, I think the criterion would be the user acceptance rate. If the user acceptance rate is 95% on the old system and drops to 90% with the new one, that is an indicator that the update is not well compatible.
