Paper: Ewa Luger and Abigail Sellen. 2016. “Like Having a Really Bad PA”: The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). Association for Computing Machinery, New York, NY, USA, 5286–5297.
Summary: This paper presents findings from 14 semi-structured interviews with users of existing conversational agent (CA) systems and highlights four key areas where current systems fail to support user interaction. It notes that conversational agents are increasingly deployed within online services, such as banking systems, as well as by large companies like Google, Facebook, and Apple. The paper seeks to understand end-users' experiences of interacting with conversational agents and the challenges that arise on both the user's and the agent's side. The findings show that end-users turn to conversational agents for play, for hands-free use when they are unable to type, for specific and formally phrased queries, and for simple tasks such as checking the weather. The paper also observes that, in most instances, conversational agents fail to bridge the gap between users' expectations and the agents' actual behavior, and suggests that incorporating playfulness may help. Finally, the paper draws on Norman's gulfs of execution and evaluation to offer implications for designing future systems.
Reflection:
This paper is very interesting, and I have had similar thoughts when using conversational agents in day-to-day life. I also appreciate the use of semi-structured interviews to capture users' actual experiences of using conversational agents and how those experiences differed from their expectations prior to using these CAs.
This work also builds on prior work, confirming the existence of a gulf between expectation and reality: users consistently expect more from CAs than CAs are capable of providing. The paper also speaks to the importance of designing conversational agents that set user expectations explicitly, rather than leaving users to form their own, as we saw in some papers from previous weeks. The authors further suggest emphasizing ongoing interaction and regular updates to the CA to better calibrate end-user expectations.
The paper also suggests ways to hold researchers and developers accountable for the promises they make when designing such systems, and to overhaul systems based on user feedback.
However, rather than focusing only on where conversational agents fail to support user interaction, I wish the paper had also examined where these systems succeed. Further, I wish the authors had sampled not only novice users but also experts, who might have had different expectations. It might be interesting to scale this work up as a survey to see how users' expectations differ depending on which conversational agent is being used.
Questions:
- How would you work to reduce the gulf between expectation and reality?
- What are the challenges to building useful and usable conversational AIs?
- Why are conversational AIs sometimes so limited? What affects their performance?
- Where do you think humans can play a role (e.g., as humans-in-the-loop)?