03/25/20 – Lulwah AlKulaib – AllWorkNoPlay

Summary

The paper studies a field deployment of a question-and-answer (QA) chatbot in the human resources (HR) domain, focusing on users’ conversational interactions with the chatbot. The HR chatbot provided company-related information assistance to 377 new employees for 6 weeks. The authors’ motivation was that conversational interactions carry rich signals for inferring user status, and that these signals could be utilized to develop agents that adapt in both functionality and interaction style. By contrasting the signals, they show the various functions of conversational interactions. The authors discuss design implications for conversational agents and directions for developing adaptive agents based on users’ conversational behaviors. In their paper, they address two main research questions:

• RQ1: What kinds of conversational interactions did users have with the QA agent in the wild?

• RQ2: What kinds of conversational interactions can be used as signals for inferring user satisfaction with the agent’s functional performance, and for identifying playful interactions?

They answer RQ1 by presenting a characterization of the users’ conversational input and high-level conversational acts. After characterizing the conversational interactions, the authors study what signals exist in them for inferring user satisfaction (RQ2).

Reflection

In the paper, the authors analyze conversations as signals of user satisfaction (RQ2). I found that part the most interesting, as their results show that users were fairly divided in their opinions of the chatbot’s functionality and playfulness. This suggests a need to adapt system functions and interaction styles for different users.

This observation makes me think of other systems with human-in-the-loop interaction, and how system functions and interaction styles would affect user satisfaction there. In systems that aren’t chatbot-based, how is that satisfaction measured? Also, would it be different in systems that handle a substantial amount of interaction? Does it matter if satisfaction is self-reported by the user, or would it be better to measure it based on their interaction with the system?

The paper acknowledges as a limitation that the results are based on survey data. The authors mention that they had a response rate of 34%, which means they cannot rule out self-selection bias. They also acknowledge that some observations might be specific to the workplace context and the user sample of the study.

The results in this paper provide some understanding of the functions that conversational behaviors, derived from human conversation, serve in interactions with conversational agents. I would love to see similar resources for non-conversational systems and how user satisfaction is measured there.

Discussion

  • Is user satisfaction an important factor/evaluation method in your project?
  • How would you quantify user satisfaction in your project?
  • Would you measure satisfaction using a self-reported survey, or based on the user’s interaction with the system? And why?
  • Did you notice any other limitations in this paper other than the ones mentioned?

3 thoughts on “03/25/20 – Lulwah AlKulaib – AllWorkNoPlay”

  1. In response to your third question, if I needed to measure satisfaction, I would try to measure it indirectly, without asking the participants. I think that people are oftentimes unreliable sources of information, even when that information is their own thoughts on something. People may not remember their interactions correctly, they may be influenced by the way in which you ask the question, they may lie, or they may just not be sure of their own feelings. I have taken many surveys and have found it hard to put my own feelings into numbers or words. I just think it is much more reliable to find other ways to measure satisfaction.

  2. Hi Lulu, of course user satisfaction is an important factor in our project. In the microblog retrieval task, user satisfaction can be measured by a self-reported relevance score for the retrieved results, which can be seen as explicit feedback. Analyzing the user’s interaction with the system is more of an implicit feedback approach, which is widely used to improve AI systems; examples include watch history and likes/dislikes of certain movies in a movie recommender system. In short, user satisfaction plays an important role in AI systems, and the appropriate method for measuring it is shaped by the function and design of the AI system itself.

  3. It would be good if we could measure user satisfaction indirectly, but unfortunately, due to the limitations of automated systems, that might not convey a clear picture. I think a combination of both might help.
