03/25/20 – Myles Frantz – Evaluating Visual Conversational Agents via Cooperative Human-AI Games

Summary

Regardless of anyone’s personal perception of chatbots, with around 1.4 billion people using them (smallbizgenius), their impact cannot be ignored. Because they are intended to answer rudimentary (and often duplicated) questions, many of these chatbots are focused on the Question-and-Answer (QA) domain. Across these chatbots, usage and feelings toward them vary, usually depending on the user, the chatbot, and the overall interaction. Focusing on the human-centric aspect of the conversation, the team proposed a Conversational Agent (CA, a chatbot within QA) together with a method to inspect sections of a conversation and determine whether the user enjoyed it. By introducing a hierarchy of specific natural language classifiers (NLC), the team was able to use certain classifications, or signals, to derive a high-level abstraction of a message or conversation. While the CA did its job sufficiently, the team determined through their signal methodology that approximately 84% of people engaged in some sort of conversation (outside of a normal question-and-answer scenario) with the CA.

Reflection

I am surprised at the results gleaned from this survey. Perhaps I should not be, since it is reasonable to assume that the more human-like CAs (and AI in general) appear, the better the interaction will be. Still, the percentage of “playfulness” or conversational messages seemed relatively high. This may be due to the experience level of the participants (new hires from college), though it is a promising sign of the progress being made.

I appreciate the angle this research took. Having a strong technical background, my immediate thought is to ensure all the questions are answered correctly and to investigate how the CA can be integrated with other systems (like a Jenkins Slack bot, polling the survey of a project). The reach of a project (I believe) depends not only on how usable it is, but also on how user-friendly it is. Take MySpace and Facebook as an example: Facebook created a much easier-to-use and more user-centric experience (based on connecting people), while MySpace suffered from a lack of improvement in both of these aspects and is currently declining in usage.

Questions

  • With only 34% of the participants responding to the survey, do you think a higher response rate would have reinforced and backed up the data currently collected?
  • Given the general maturity and time allocations of a new hire from college, do you think current employees (who have been with the company for a while) would have this percentage of conversation? In short, do you think the normally busy or higher-up employees would have given similar conversational time to the CA?
  • Given the percentage of new hires that responded, and responded conversationally, to the CA, the opportunity arises for a user to immerse themselves in a conversation and disregard current work (potentially as a form of escapism). Do you think that if this kind of CA were implemented throughout companies, these capabilities would be abused or overused?
