Define successful use
Why does this matter?
- Align stakeholders. Each stakeholder perceives the success of an AI product differently. Each definition of success can be characterized by many metrics that capture that stakeholder’s priorities, reference points, and evaluation resources.
- Set priorities. Every evaluation holds some value as long as it is meaningfully informative, but it is difficult to know which success metrics should weigh most heavily during AI integration. Aligning the efforts of varied teams and meaningfully addressing the challenges for which the technology was introduced requires a considered and explicit definition of success for the AI product.
- Don’t skip the basics. Even when adopting broader value-based definitions of success, it remains important to assess the most basic claims of functionality made about the AI product. Integrating the AI product into healthcare needs to be justified with evidence that performance improves on, or at least matches, the pre-integration environment.
How to do this?
Step 1: Identify the stakeholders involved in determining metrics for success
- Defining success is a multi-stakeholder effort. Success commonly relates to one or more of four aims:
  - Reducing cost
  - Improving patient health
  - Improving patient experience
  - Improving healthcare provider experience
- Definitions should be led by the problem and its various dimensions. This also includes soliciting external perspectives (including community viewpoints and regulatory entities) on what the successful implementation of AI products can mean.
Step 2: Include metrics informed by the intended use case and context for deployment
- A successful AI product for clinical practice requires contextual assessment. It is entirely possible for an AI product with mediocre technical performance to deliver significant clinical benefits.
Step 3: Consider performance beyond benchmarks in AI product evaluations
- Incorporate notions of success that reflect the hopes and expectations of various stakeholder populations, including institutional decision-makers, clinical end-users, and impacted parties such as patients.
“I think folks who are really obsessed with AUC don’t understand that you can have a crappy or modestly performing model that actually drives a huge amount of outcome benefit.”
Technical Leader
Step 4: Assess available evaluation resources
- Consider the dependencies required to measure each metric, such as which datasets, documentation, or personnel must be involved to measure it successfully.
- Ensure that required resources are readily available.
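One way to make these dependencies explicit is to record, for each metric, the resources it needs and compare them against what is actually on hand. The sketch below is illustrative only: the metric names, resource labels, and the `missing_resources` helper are hypothetical and do not come from any particular tool or from the source guidance.

```python
from dataclasses import dataclass, field


@dataclass
class MetricPlan:
    """A success metric and the resources needed to measure it (all hypothetical)."""
    name: str
    required: set = field(default_factory=set)


def missing_resources(plans, available):
    """Map each metric name to the required resources that are not yet available."""
    gaps = {}
    for plan in plans:
        gap = plan.required - set(available)
        if gap:
            gaps[plan.name] = sorted(gap)
    return gaps


# Hypothetical example values, for illustration only.
plans = [
    MetricPlan("time_to_treatment", {"ehr_timestamps", "data_analyst"}),
    MetricPlan("clinician_satisfaction", {"survey_instrument", "clinical_champion"}),
]
available = {"ehr_timestamps", "data_analyst", "survey_instrument"}

print(missing_resources(plans, available))
# → {'clinician_satisfaction': ['clinical_champion']}
```

A metric that appears in the result cannot yet be measured, so either the missing resource needs to be secured or the metric should be reconsidered before it is adopted as a definition of success.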
Step 5: Define the unsuccessful use of an AI product
- This is an important first step toward defining how an unintended use of an AI product would be measured and what thresholds for action, such as decommissioning, should be set.
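Thresholds for action are easier to agree on when they are written down before deployment. The following is a minimal sketch, assuming a single metric where higher values are better and two pre-agreed limits. The `ActionThresholds` class, the action labels, and the numeric values are hypothetical placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionThresholds:
    """Agreed limits for one metric where higher values are better (hypothetical)."""
    review_below: float        # trigger a stakeholder review below this value
    decommission_below: float  # consider decommissioning below this value


def recommended_action(observed: float, limits: ActionThresholds) -> str:
    """Return the pre-agreed action for an observed metric value."""
    if observed < limits.decommission_below:
        return "decommission"
    if observed < limits.review_below:
        return "review"
    return "continue"


# Hypothetical limits for a metric such as positive predictive value.
ppv_limits = ActionThresholds(review_below=0.30, decommission_below=0.15)

for value in (0.42, 0.22, 0.10):
    print(value, recommended_action(value, ppv_limits))
```

With these placeholder limits, the three observed values map to "continue", "review", and "decommission" respectively. In practice the limits, and the decision about who acts on them, would come from the stakeholder consensus described in the next step.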
Step 6: Seek stakeholder consensus on definitions of successful and unsuccessful use of the AI product
- If clinical, administrative, and technical stakeholders do not have an opportunity to contribute to these definitions up front, they will be more likely to disagree on downstream decisions related to continued AI use. Losing buy-in from these important stakeholder groups makes AI product integrations more likely to fail.
- Prioritize success metrics to identify key factors to focus on at various stages of AI product integration.
References
Raji, Inioluwa Deborah, et al. “The fallacy of AI functionality.” 2022 ACM Conference on Fairness, Accountability, and Transparency. 2022.