02/26/20 – Fanglan Chen – Will You Accept an Imperfect AI? Exploring Designs For Adjusting End-user Expectations of AI Systems

Summary

Kocielnik et al.’s paper “Will You Accept an Imperfect AI?” explores approaches for shaping end-users’ expectations before their first interaction with an AI system and studies how appropriate expectations affect users’ acceptance of the system. Prior work has shown that end-user expectations of AI-powered technologies are influenced by various factors, such as external information, knowledge and understanding, and first-hand experience. The researchers note that expectations vary among users and that users’ perception and acceptance of AI systems may be negatively impacted when their expectations are set too high. To fill the gap in understanding how end-user expectations can be directly and explicitly influenced, the researchers use a Scheduling Assistant – an AI system for automatically detecting meeting requests in email – to study the impact of several expectation-shaping methods. Specifically, they explore two system versions with the same classifier accuracy, each tuned to avoid a different type of error (False Positives or False Negatives). Their study shows that the type of error strongly affects users’ subjective perceptions of accuracy and their acceptance of the system. The paper proposes expectation-adjustment techniques that make users fully aware of AI imperfections and thereby enhance their acceptance of AI systems.
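To make this setup concrete, below is a minimal sketch (the confusion-matrix counts are made up for illustration, not taken from the paper) of how two versions of a meeting-request detector can both sit at 50% accuracy while one mostly produces False Negatives and the other mostly produces False Positives:

    # Hypothetical counts: 100 emails, 50 of which actually contain a meeting request.
    # Both systems reach 50% accuracy, but they distribute their errors very differently.
    def summarize(name, tp, fp, fn, tn):
        total = tp + fp + fn + tn
        accuracy = (tp + tn) / total
        recall = tp / (tp + fn)  # share of real meeting requests that were caught
        print(f"{name}: accuracy={accuracy:.2f}, recall={recall:.2f}, "
              f"false positives={fp}, false negatives={fn}")

    summarize("Avoid-False-Positives version", tp=10, fp=10, fn=40, tn=40)  # misses many meetings
    summarize("Avoid-False-Negatives version", tp=40, fp=40, fn=10, tn=10)  # flags many non-meetings

Even though the headline accuracy is identical, users of the first version mainly experience missed meetings while users of the second mainly experience spurious suggestions, which is the contrast the authors use to probe subjective perceptions of accuracy.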

Reflection

We need to be aware that AI-based technologies cannot be perfect, just like nobody is perfect. Hence, there is no point in setting a goal that requires AI systems to make no mistakes. Realistically defining what success and failure look like when working with AI-powered technologies is essential if we want AI to improve on the imperfections of today’s solutions. That calls for accurately positioning where AI sits in the bigger picture. I feel the paper mainly focuses on how to set appropriate expectations but lacks a discussion of how user expectations vary across different scenarios. For example, expectations of the same AI system vary greatly across decision-making frameworks: in a human-centric decision-making process, expectations of the AI component are comparatively low because the AI acts more like a counselor who is allowed to make some mistakes; in a machine-centric system, all decisions are made by algorithms, which leaves users with a low tolerance for errors. Simply put, some AIs will require more attention than others because the impact of errors or the cost of failures is higher. Expectations of AI systems vary not only among different users but also across usage scenarios.

To generate positive user experiences, AI needs to exceed expectations. One simple way to achieve this is to not over-promise the performance of AI at the outset. That relates to the researchers’ intention in designing the Accuracy Indicator component of the Scheduling Assistant. In their study, they set the accuracy to 50%, which is quite low for AI-based applications. I am interested in whether the evaluation results would change with AI systems of higher performance (e.g., 70% or 90% accuracy). I think it would be worthwhile to conduct a survey of users’ general expectations of AI-based systems.

Interpretability of AI is another key component that shapes user experiences. If people cannot understand how an AI works or how it arrives at its solutions, and in turn do not trust it, they will probably not choose to use it. As people accumulate more positive experiences, they build trust in AI. In this respect, easy-to-interpret models seem more promising for delivering success than complex black-box models.

To sum up, by being fully aware of AI’s potential but also its limitations, and by developing strategies to set appropriate expectations, users can create positive AI experiences and build trust in algorithmic approaches to decision making.

Discussion

I think the following questions are worthy of further discussion.

  • What is your expectation of AI systems in general? 
  • How would users’ expectations of the same AI system vary in different usage scenarios?
  • What negative impacts can inflated expectations bring? Please give some examples.
  • How can we determine which type of error is more severe in an AI system?

4 thoughts on “02/26/20 – Fanglan Chen – Will You Accept an Imperfect AI? Exploring Designs For Adjusting End-user Expectations of AI Systems”

  1. I believe this is affected by whether users know that the system is using AI internally, as well as by the type of system itself. When people use a standard system, they expect it to work, depend on the results being predictable, and expect the results to be reproducible. Fortunately or not, AI has built up an image of being relatively unstable, and people expect as much when they hear it is powering a system. Conversely, if a critical system is supported by AI (e.g., an ATM), people would have high expectations since it is being depended on; at the same time, people would likely avoid using an AI-powered system for critical tasks because of the reputation AI has acquired for being relatively unstable.

  2. I agree with your comment that one should not ‘over-promise the performance of AI in the beginning’. If the bar is set more realistically (given that AI is still far from perfect), it would enable AI systems to generate positive experiences. I also agree that conducting a survey on users’ expectations of AI systems would be interesting and would facilitate the task of not ‘over-promising’.

  3. To your fourth question, I believe it boils down to the error recovery costs, the cognitive workload involved, and the criticality of the consequences of the task at hand. If the consequences are innocuous, I would be okay with errors (e.g., random keyboard suggestions vs. expensive flight suggestions).

  4. Hi Fanglan, you mention an interesting point:

    “We need to be aware that AI-based technologies cannot be perfect, just like nobody is perfect. Hence, there is no point setting a goal that involves AI systems making no mistake.”

    I think even though AI is not perfect, it doesn’t make sense to _not_ set a goal. Just as we strive to be better at everything we do, we also need to make AI systems better and less biased. Otherwise, AI will continue to further social inequalities, which I hope is not something you want. I think the severity of the error types also depends on the application. As I said in my blog post, if it is movie recommendations, we won’t notice false negatives (a relevant movie that was not recommended), but with cancer screenings, false negatives are much worse (if someone has cancer and we don’t detect it). That explains why Geoffrey Hinton’s tweet was so controversial: he didn’t think about issues like bias and interpretability. https://twitter.com/geoffreyhinton/status/1230592238490615816
