The variability of current AI designs, together with failures of automated inference that range from the disruptive or confusing to the more serious, calls for creating more effective and intuitive user experiences with AI. The paper "Guidelines for Human-AI Interaction" enriches the ongoing conversation on heuristics and guidelines for the human-centered design of AI systems. In this paper, Amershi et al. identified more than 160 potential recommendations for human-AI interaction from respected sources ranging from scholarly research papers to blog posts and internal documents. Through a four-phase framework, the research team systematically distilled and validated these candidates into a unified set of 18 guidelines. This work empowers the community by providing a resource for designers working with AI, and it facilitates future research into the refinement and development of principles for human-AI interaction.
The 18 proposed guidelines are grouped into four sections that prescribe how an AI system should behave upon initial interaction, as the user interacts with the system, when the system is wrong, and over time. As far as I can see, the central question underlying them is how to keep automated inferences under some degree of user control when they are performed under uncertainty. Scenarios in which humans are unable to intervene when the AI makes incorrect decisions could be extremely dangerous. Take autonomous vehicles, for example: the AI may behave abnormally in real-world situations it has not encountered during training. How to integrate efficient dismissal or correction is therefore an important question to consider in the initial design of such autonomous systems, as the sketch below illustrates.
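To make the dismissal-and-correction point concrete, here is one way a system might keep a human in the loop precisely when the AI is least certain. This is a minimal, hypothetical sketch in Python, not an implementation from the paper; all names (`Inference`, `handle_inference`, `console_confirm`) and the 0.95 threshold are my own illustrative assumptions.

```python
# Hypothetical sketch: act autonomously only when confidence is high,
# and otherwise expose the inference so the user can accept, dismiss,
# or correct it before it takes effect.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Inference:
    action: str        # what the AI proposes to do
    confidence: float  # the model's estimated probability of being right

def handle_inference(
    inference: Inference,
    confirm: Callable[[Inference], Optional[str]],
    threshold: float = 0.95,
) -> Optional[str]:
    """Return the action to execute, or None if the user dismissed it.

    Below the confidence threshold, the user is asked to confirm,
    dismiss, or substitute a corrected action, keeping the human in
    the loop exactly when the AI is most likely to be wrong.
    """
    if inference.confidence >= threshold:
        return inference.action  # confident enough to act autonomously
    return confirm(inference)    # defer the decision to the user

def console_confirm(inference: Inference) -> Optional[str]:
    """Console stand-in for a UI prompt with dismiss/correct affordances."""
    reply = input(f"AI suggests '{inference.action}' "
                  f"(confidence {inference.confidence:.2f}). "
                  "[Enter]=accept, d=dismiss, or type a correction: ")
    if reply == "":
        return inference.action  # user accepts the suggestion
    if reply.lower() == "d":
        return None              # user dismisses it outright
    return reply                 # user supplies a corrected action
```

The design choice here mirrors the paper's Guidelines 8 and 9 (support efficient dismissal and efficient correction): overriding the AI costs a single keystroke, so user control does not depend on the model always being right.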
Also, we need to be aware that while the guidelines are developed to support design decisions, they are not intended to be used as a simple checklist. One of their important intentions is to support and stimulate conversations between user experience and engineering practitioners that lead to better AI design. Another takeaway from the paper is that there will always be situations where AI designers must consider trade-offs among guidelines and weigh the importance of some over others.

Beyond the four-phase framework presented in the paper, I think there are at least two points worth discussing. First, the framework is essentially a narrowing-down process, and no open-ended questions are raised in its feedback cycle. The functions and goals of applications in different categories vary, and emerging capabilities and use cases may call for additional guidelines. As AI design advances, we may need more innovative ideas about future designs rather than constraining ourselves to the existing set. Second, it seems that all the evaluators who participated in the user study work in HCI, and many of them have years of experience in the field. I wonder whether the opinions of end users without an HCI background should be considered as well, and how such wider involvement would affect the final results. I think the following questions are worth further discussion.
- Which of the 18 proposed design guidelines are comparatively difficult to apply in AI designs, and why?
- Beyond the proposed set, are there design guidelines worth attention that the paper does not discuss?
- Some guidelines seem more important than others for the user experience in specific domains. Do you think the guidelines need to be tailored to specific categories of applications?
- For the user study, do you think it would be important to include end users who actually use the applications but have no background in HCI?