Summary
Improvements in artificial intelligence (AI) systems are normally measured in isolation, without taking the human element into account. In this paper, the authors measure and evaluate human-AI team performance by designing an interactive visual conversational agent that requires both a human and an AI to solve a specific problem. The AI system is assigned a secret image with a caption that is not known to the human, and the human asks rounds of questions to guess the correct image from a pool of images. The agent maintains an internal memory of questions and answers to help sustain the conversation.
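The cooperative game described above can be sketched as a simple loop: the AI holds the secret image, the human asks questions over several rounds, each question-answer pair is appended to the agent's memory, and the human finally guesses. This is a minimal illustrative sketch, not the paper's implementation; the class and callback names (`GuessWhichGame`, `ask`, `answer`, `guess`) are my own assumptions.

```python
class GuessWhichGame:
    """Toy sketch of the cooperative image-guessing game.

    The AI side knows `secret` (an image with its caption); the human
    side only sees the pool and the dialog history.
    """

    def __init__(self, pool, secret_index, num_rounds=5):
        self.pool = pool                    # pool of (image_id, caption) pairs
        self.secret = pool[secret_index]    # known to the AI, hidden from the human
        self.num_rounds = num_rounds
        self.memory = []                    # internal memory of (question, answer)

    def play(self, ask, answer, guess):
        # `ask(memory)`      -> human's next question given the dialog so far
        # `answer(secret, q, memory)` -> AI's answer about the secret image
        # `guess(pool, memory)`       -> human's final pick from the pool
        for _ in range(self.num_rounds):
            q = ask(self.memory)
            a = answer(self.secret, q, self.memory)
            self.memory.append((q, a))      # memory sustains the conversation
        return guess(self.pool, self.memory)


# Toy run with stub human/AI policies:
pool = [("img0", "a dog on grass"), ("img1", "a cat on a sofa")]
game = GuessWhichGame(pool, secret_index=1, num_rounds=3)
final_guess = game.play(
    ask=lambda mem: f"question {len(mem) + 1}?",
    answer=lambda secret, q, mem: f"something about '{secret[1]}'",
    guess=lambda pool, mem: 1,
)
```

Measuring team performance then amounts to checking how often (or after how many rounds) the final guess matches the secret image.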
The authors use two versions of the AI system: the first is trained using supervised learning and the second using reinforcement learning. The second system outperforms the first in isolation, but the improvement does not translate well when interacting with humans, which shows that advances in an AI system do not necessarily mean advances in human-AI team performance.
Reflection
I found the idea of running two AI systems with the same human evaluation setup to be very interesting. Normally we assume that advances in an AI system lead to better usage by humans, but the study shows that this is not always the case. Putting the human in the loop while improving the AI system gives us the real performance of the system.
I also found the use of reinforcement learning in conversational agents very interesting. Using online learning with positive and negative rewards can help improve the conversation between the human and the AI system, for example by preventing the system from getting stuck on the same answer when the human asks the same question.
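The reward idea can be illustrated with a toy policy-gradient (REINFORCE-style) update: an answer that helps the conversation gets +1, a repetitive answer gets -1, and the update nudges the policy toward rewarded answers. This is only a two-action sketch under my own simplifying assumptions; the paper's agent is a neural dialog model, not a bandit.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_answer_policy(steps=300, lr=0.1, seed=0):
    """Toy REINFORCE loop: action 0 = 'helpful answer' (+1 reward),
    action 1 = 'repeat previous answer' (-1 reward)."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]  # start with both answers equally likely
    for _ in range(steps):
        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 0 else -1.0
        # REINFORCE update: logits += lr * reward * grad(log pi(action))
        for a in range(2):
            indicator = 1.0 if a == action else 0.0
            logits[a] += lr * reward * (indicator - probs[a])
    return softmax(logits)


final_probs = reinforce_answer_policy()
```

After training, the probability of the rewarded (non-repetitive) answer dominates, which is the mechanism by which negative rewards can stop the system from getting stuck on one answer.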
The work is somewhat similar to the concept of compatibility, where the human forms a mental model of the AI system. Advances in the AI system might not translate into better usage by the human, and this is what the authors demonstrated: they used two AI systems, one better than the other, yet the improvement did not necessarily translate into better performance by the users.
Questions
- The authors showed that improving an AI system alone does not necessarily lead to better performance when the system is used by a human. Can we involve the human in the process of improving the AI system, so that improvements in the AI system do translate into better team performance?
- The authors use a single secret image known to the AI system but not to the human. Can we make the image unknown to the AI system as well by providing a pool of images from which the AI system selects the appropriate one? And can we do that with acceptable response latency?
- If we have to use conversational agents such as bots in a production setting, do you think an AI system trained with supervised learning can respond faster than one trained with reinforcement learning, given that the reinforcement learning system will need to adjust its behavior based on the reward or feedback?