Summary
The paper by Liao et al. studies conversational agents (CAs) and addresses two research questions: first, how users interact with CAs, and second, what kinds of conversational interactions could be used to gauge user satisfaction. For this work, the authors developed a conversational agent called Cognitive Human Interface Personality (CHIP). The agent's primary function is to provide HR assistance to new hires at a company. In the study, 377 new employees used the agent, which supported them for six weeks. CHIP answered company-related questions of the kind newly hired individuals naturally have. The IBM Watson Dialog package was used to incorporate conversations collected from past CA usage. Development was iterative, with each round taking 20-30 user interactions into account. The CA was intended to be conversational and social, and users were assisted with regular reminders. Participants were asked to use a specific hashtag, #fail, to provide feedback, and they consented to the study. The analysis used classifiers to characterize user input. The authors concluded that signals in conversational interactions can be used to infer user satisfaction and to further develop chat platforms that exploit such information.
Reflection
The paper does a decent job of investigating conversational agents and identifying the different forms of interaction users have with the system. This work gave insight into how these conversations could be used to identify user satisfaction. I was particularly interested to see the kinds of interactions the users had with the system. The paper also notes that the frequency of CA usage declined within two weeks. This is natural for an HR system, since new hires have fewer questions over time. In industries like banking, however, where 24-hour assistance is needed and desired, user traffic would be more consistent. Additionally, it is essential to consider how the security of users is maintained while such systems handle human data. HR data, for example, is sensitive. The paper does not mention how personal data is protected from being transferred to or used by any unauthorized application or person.
Another important factor, in my opinion, is the domain. I understand why the HR domain was selected: new hires are bound to have questions, and a CA is a natural way to answer such frequently asked questions. However, how would the feasibility of such an agent change in other potential applications? I believe the model's performance would decrease if the system were more complex. Here the CA mostly had to anticipate and answer questions from a finite range; a more open-ended application could face an endless variety of questions, which would be far more challenging to handle.
The paper also uses only basic machine learning classifiers to answer its first research question. I think deep learning techniques, such as those mentioned in [1], could classify the questions more accurately.
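As a rough illustration of what a "basic" classifier for characterizing user utterances might look like, the sketch below implements a bag-of-words Naive Bayes model using only the Python standard library. The utterance categories ("question", "feedback") and the training examples are hypothetical, chosen only to mirror the HR-onboarding setting; they are not taken from the paper.

```python
# Minimal bag-of-words Naive Bayes classifier (stdlib only).
# A sketch of the kind of basic classifier that could label CA
# utterances; categories and examples are hypothetical.
from collections import Counter, defaultdict
import math

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> document count
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for w in tokenize(text):
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best, best_lp = None, float("-inf")
        for label in self.label_counts:
            # log prior + log likelihood with Laplace (add-one) smoothing
            lp = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokens:
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

clf = NaiveBayes()
clf.fit(
    ["how do I enroll in benefits", "where is the payroll form",
     "you are useless #fail", "this answer did not help"],
    ["question", "question", "feedback", "feedback"],
)
print(clf.predict("where do I find the benefits form"))  # prints "question"
```

A deep learning alternative would replace the bag-of-words features with learned embeddings, which could capture paraphrases that word overlap misses, at the cost of needing far more training data than a six-week deployment might yield.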
Questions
- How would the model perform in domains where continuous usage is necessary, such as the banking sector?
- How was security handled in their CA setup?
- Would the performance and feasibility change according to the domain?
- Could deep learning techniques improve the performance of the model?
References