Paper: Ewa Luger and Abigail Sellen. 2016. “Like Having a Really Bad PA”: The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). Association for Computing Machinery, New York, NY, USA, 5286–5297.
Summary: This paper presents findings from 14 semi-structured interviews with users of existing conversational agent (CA) systems, and highlights four key areas where current systems fail to support user interaction. It details how conversational agents are increasingly deployed in online services, such as banking, as well as by large companies like Google, Facebook, and Apple. The paper seeks to understand end users, their experiences with conversational agents, and the challenges they face, from both the user's and the agent's side. The findings show that end users turn to conversational agents for play, for hands-free input when they are unable to type, for specific and formal queries, and for simple tasks such as checking the weather. The paper also finds that, in most instances, conversational agents fail to bridge the gap between users' expectations and the agents' actual behaviour, and that incorporating playfulness may help. Finally, the paper draws on Norman's gulfs of execution and evaluation to offer implications for designing future systems.
Reflection:
This paper is very interesting, and I have had similar thoughts when using conversational agents in day-to-day life. I also appreciate the use of semi-structured interviews to get at users' actual experiences of using conversational agents and how those experiences differed from the expectations users held before adopting these CAs.
This work also builds on prior work, confirming the existence of this gap, or gulf, between expectations and reality, and showing that users consistently expect more from CAs than CAs are capable of providing. The paper also speaks to the importance of designing conversational agents that set user expectations, rather than leaving users to form their own, as we saw in some papers from previous weeks. The authors additionally discuss emphasizing ongoing interaction and regular updates to the CA as a way to recalibrate end-user expectations.
The paper also suggests ways to hold researchers and developers accountable for the promises they make when designing such systems, and to overhaul those systems based on user feedback.
However, rather than focusing only on where conversational agents fail to support user interaction, I wish the paper had also examined where these systems succeed. Further, I wish the authors had sampled not only novice users but also experts, who might hold different expectations. It could be interesting to scale this work up as a survey to see how users' expectations differ depending on the conversational agent being used.
Questions:
- How would you work to reduce the gulf between expectation and reality?
- What are the challenges to building useful and usable conversational AIs?
- Why are conversational AIs sometimes so limited? What affects their performance?
- Where do you think humans can play a role, i.e., as humans-in-the-loop?
Hi Sukrit, great point! Here I mainly address your second question. Conversational AI currently faces several challenges. The first that comes to mind is communicating with an understanding of emotion. Just as in human-human communication, not only what content is delivered but also how it is delivered plays an important role in successful communication. To achieve that, conversational AI agents need to be trained to recognize different sentiments and respond to them in an appropriate way. Secondly, robustness is another challenge. When a person gives instructions to a conversational AI, surrounding noise and other people's voices may interfere with the system's recognition of those instructions, so the agent must be able to separate the command or question from everything else. Last but not least, the data transmitted during a conversation must be securely processed and stored, especially when it involves confidential personal information.
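To make the first point a bit more concrete, here is a minimal, purely illustrative Python sketch of a sentiment-conditioned reply selector. The word lists, labels, and reply templates are all invented for illustration; a real agent would use a trained sentiment model rather than a hand-built lexicon.

```python
# Toy sketch only: lexicon-based sentiment detection driving reply choice.
# NEGATIVE_WORDS / POSITIVE_WORDS and the templates below are hypothetical.

NEGATIVE_WORDS = {"angry", "frustrated", "useless", "hate", "broken"}
POSITIVE_WORDS = {"great", "thanks", "love", "good", "awesome"}

def classify_sentiment(utterance: str) -> str:
    """Return a crude sentiment label: 'negative', 'positive', or 'neutral'."""
    tokens = set(utterance.lower().split())
    neg = len(tokens & NEGATIVE_WORDS)
    pos = len(tokens & POSITIVE_WORDS)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"

def respond(utterance: str) -> str:
    """Pick a reply template conditioned on the detected sentiment."""
    sentiment = classify_sentiment(utterance)
    if sentiment == "negative":
        return "Sorry this has been frustrating. Let me try that again."
    if sentiment == "positive":
        return "Glad that worked! What would you like to do next?"
    return "Okay. What would you like to do?"

print(respond("this assistant is useless"))  # -> apologetic reply
```

The point of the sketch is only that how the agent replies can be conditioned on how the user spoke, which is exactly where the paper's "bad PA" frustration shows up.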
Hi Sukrit, to answer your third question, I think there are technical limitations preventing conversational agents from interacting naturally with humans. These agents need to respond to users' queries immediately, and contacting remote AI systems to process each request can result in even more frustration if the response time is more than a few seconds.
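As a rough illustration of that latency concern, below is a hedged Python sketch that enforces a time budget on a remote recognition call and falls back when the budget is exceeded, so the user gets a quick acknowledgement instead of a long silence. The endpoint URL, request payload, and response field are all hypothetical placeholders, not any real service's API.

```python
# Sketch only: bound a remote speech-recognition request by a latency budget.
# REMOTE_ENDPOINT and the JSON response shape are hypothetical placeholders.

import json
import urllib.error
import urllib.request

REMOTE_ENDPOINT = "https://example.com/recognize"  # hypothetical service
LATENCY_BUDGET_S = 2.0  # answer within ~2 s or give up and fall back

def recognize_remote(audio_bytes: bytes) -> str:
    """Send audio to the (hypothetical) remote recognizer with a hard timeout."""
    req = urllib.request.Request(
        REMOTE_ENDPOINT,
        data=audio_bytes,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(req, timeout=LATENCY_BUDGET_S) as resp:
        return json.load(resp)["transcript"]

def recognize(audio_bytes: bytes) -> str:
    """Try the remote recognizer; on timeout or error, return an empty
    transcript so the agent can apologize quickly rather than stall."""
    try:
        return recognize_remote(audio_bytes)
    except (urllib.error.URLError, TimeoutError, KeyError, ValueError):
        return ""
```

Keeping the budget small trades recognition quality for responsiveness, which matches the frustration you describe when responses take more than a few seconds.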