Sustain improved outcomes
Why does this matter?
The performance of health AI products is not always steady. Improvements achieved through AI products can vanish rapidly, even when initial results are impressive. AI products are notoriously brittle, and their performance can be undermined by the downstream effects of decisions made by a broad range of stakeholders. Monitoring the performance of the technology and the work environment is essential, but it is only a matter of time before something changes and you find yourself trying to make up lost ground.
How to do this?
Step 1: Prerequisites to ensure that you can sustain improved outcomes
- The outcome you are trying to restore must be sustainable. External forces or internal changes can alter the opportunity for improvement, and in some scenarios the problem addressed by the AI product is transient, such as a temporary public health emergency or supply chain constraint. For more about ways to define outcome measures related to AI product use, see define performance targets and define successful use.
- The AI product must have previously achieved outcomes that meet or exceed performance targets specified in identify and mitigate risks. If the rollout of the AI product does not lead to outcomes that surpass pre-specified target goals, then you may need to consider decommissioning the product.
- Monitoring or audits must continue after you intervene, so that you can assess whether the actions you took to sustain outcomes were effective.
- There must be open and transparent communication channels between the AI product developer, front-line clinicians who use the AI product in practice, and operational leaders responsible for outcomes in the implementation setting. If the AI product is no longer effective, it’s easy for people to point fingers. Tension between these three stakeholder groups (AI developers, end users, and managers) will make it harder to address problems. Ultimately, these groups of stakeholders must be aligned to work together.
- No single stakeholder group can guarantee continued outcomes of an AI product used in clinical care. There are too many moving parts, and stakeholders must be prepared and equipped to step in and support improvement efforts when needed.
Step 2: Make changes to the AI product integration based on pre-specified performance targets
- While developing measures of success for product integration, the healthcare delivery organization should pre-specify performance targets and objectives for success, as described in define performance targets and define successful use. During monitoring, if AI product performance fails to meet these targets, action is needed.
- When evaluating product safety and efficacy prior to clinical use, the organization should set monitoring thresholds, as described in identify and mitigate risks. During monitoring, if AI product performance fails to meet these thresholds, action is needed.
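
The check described in Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Target` class, the `failing_targets` function, the metric names, and every numeric value below are hypothetical and do not come from this guide. Your own targets and thresholds should be the ones your organization pre-specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A pre-specified performance target (hypothetical structure)."""
    name: str
    threshold: float
    higher_is_better: bool = True


def failing_targets(observed, targets):
    """Return the names of targets that the observed metrics fail to meet.

    A metric that is missing from `observed` is treated as a failure,
    because an unmonitored target cannot be shown to be met.
    """
    failures = []
    for target in targets:
        value = observed.get(target.name)
        if value is None:
            failures.append(target.name)
            continue
        if target.higher_is_better:
            meets = value >= target.threshold
        else:
            meets = value <= target.threshold
        if not meets:
            failures.append(target.name)
    return failures


# Illustrative values only; these are assumptions, not recommendations.
targets = [
    Target("sensitivity", 0.80),
    Target("positive_predictive_value", 0.30),
    Target("median_hours_to_treatment", 3.0, higher_is_better=False),
]
observed = {
    "sensitivity": 0.84,
    "positive_predictive_value": 0.22,
    "median_hours_to_treatment": 2.5,
}

print(failing_targets(observed, targets))  # ['positive_predictive_value']
```

A non-empty result is the signal that action is needed. It does not say which action, which is the subject of Step 3.
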
Step 3: Determine what type of shift changed AI product performance
- The action required to sustain performance or prevent potential harm depends on the cause of the problem. Given the dynamic nature of healthcare delivery, there is no guarantee that a change in performance can be attributed to a single cause, so you must evaluate all types of shifts. Use the sustain improved outcomes table below to identify potential causes of changed performance and determine the relevant actions.
Type of shift | Description and example | Approach to identify | Monitoring dependency | AI product intervention | Work environment intervention |
---|---|---|---|---|---|
Population drift | Characteristics of the target patient population change. Example: the COVID-19 pandemic affected the cause of inpatient deterioration. | Observe changes in the values and distribution of AI model inputs. Confirm that there are no environmental changes that alter the way individuals in the target population utilize healthcare (e.g., changes in location of service delivery, policy changes at population level); identified through interviews and surveys of AI product users and individuals within the target population. Confirm that there are no changes in the way the condition targeted by the AI product is diagnosed or managed; identified through interviews and surveys of AI product users and domain experts. Confirm that there are no changes in the way that data elements ingested by the AI product are represented in the data sources; identified through metadata monitoring across data sources. | Identified through monitoring of AI solution (see the guide on monitoring AI performance). | If AI product performance is within pre-specified performance bounds, continue using. If AI product performance falls out of pre-specified performance bounds, consider updating or decommissioning (see the guide to determine if updating or decommissioning is necessary). Identify the source of population drift and monitor the new population state. When the population state has stabilized, as confirmed by stable values and distribution of model inputs, revisit the appropriateness of AI as an approach to address the problem (see the guide on determining the suitability of technical approaches). | |
Patient behavior drift | Systemic factors alter the way that patients seek healthcare. Example: the COVID-19 pandemic caused many individuals with chronic disease to stop seeking care. | Confirm environmental changes that alter the way individuals in the target population utilize healthcare (e.g., changes in location of service delivery, policy changes at population level); identified through interviews and surveys of AI product users and individuals within the target population. | Identified through monitoring of AI solution (see the guide on monitoring AI performance). | If AI product performance is within pre-specified performance bounds, continue using. If AI product performance falls out of pre-specified performance bounds, consider updating or decommissioning (see the guide to determine if updating or decommissioning is necessary). | May need to adapt approach to service delivery to ensure that patients are able to appropriately access care. If new patient behavior is optimal (in terms of improving patient outcomes), may need to adapt workflow (see the guide to design and test workflow for clinicians) or change scope of use of AI product (see the guide to define the role of AI). |
Practice of medicine drift | The way that medicine is practiced changes, with a shift from a prior standard of care to a new standard of care. Example: rather than waiting 48 hours for blood culture results to return before narrowing antibiotic treatment, clinicians adapt treatment rapidly using novel Biofire PCR testing. | Confirm that there are changes in the way the condition targeted by the AI product is diagnosed or managed; identified through interviews and surveys of AI product users and domain experts. | Identified through monitoring of AI solution (see the guide on monitoring AI performance). | If AI product performance is within pre-specified performance bounds, continue using. Start gathering data for newly available data elements that reflect new clinical practice. Consider updating the model when sufficient data is available (see the guide to determine if updating or decommissioning is necessary). If AI product performance falls out of pre-specified performance bounds, consider updating or decommissioning (see the guide to determine if updating or decommissioning is necessary). | If the change in standard of care is not optimal, revert to prior clinical practice. Ensure that AI product users are trained and educated on best clinical practice and consider re-initiating information dissemination regarding AI product use (see the guide on disseminating information to end users). If the change in standard of care is optimal, update scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians). Conduct a new round of model validation to ensure that the AI product should remain clinically integrated (see the guide on determining if AI should be integrated). |
Data source drift | The way that data is represented in an AI product data source changes. Example: a new way of ordering or measuring the same clinical concept (e.g., a new device for measuring blood pressure). | Confirm that there are changes in the way that data elements ingested by the AI product are represented in the data sources; identified through metadata monitoring across data sources. | Identified through monitoring of AI solution (see the guide on monitoring AI performance). | First, update metadata mappings to ensure that new representations of data elements are appropriately normalized and fed into the AI product. Then, if AI product performance is within pre-specified performance bounds, continue using. If AI product performance falls out of pre-specified performance bounds, consider updating or decommissioning (see the guide to determine if updating or decommissioning is necessary). | |
Clinician staffing drift | Characteristics of the clinicians who provide services using the AI product change. Example: the same type of service may be performed in a non-invasive fashion (shifting service delivery from surgery to IR or cardiology, etc.), or the same type of service may be shifted from a physician to an advanced practice provider. | Confirm that there are changes in the types of clinicians who directly use or are affected by the AI product; identified through interviews and surveys of AI product users. | Identified through monitoring of work environment (see the guide on monitoring work environment). | If the staffing change is not optimal or is unintended, revert to prior scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians). If the staffing change is optimal, update scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians). Conduct a new round of model validation to ensure that the AI product should remain clinically integrated (see the guide on determining if AI should be integrated). | |
Clinician workflow drift | The way in which clinicians use the AI product in the workflow changes. Example: rather than making a phone call to ED physicians, RRT nurses using Sepsis Watch sent asynchronous text messages. | Confirm that there are changes in the workflow affecting the way that the AI product is used in practice; identified through interviews and surveys of AI product users. | Identified through monitoring of work environment (see the guide on monitoring work environment). | If the workflow change is not optimal or is unintended, revert to prior scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians). Ensure that AI product users are trained and educated on scope of use and workflow and consider re-initiating information dissemination regarding AI product use (see the guide on disseminating information to end users). If the workflow change is optimal, update scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians). Conduct a new round of model validation to ensure that the AI product should remain clinically integrated (see the guide on determining if AI should be integrated). | |
Clinician management drift | Incentive structures in the environment in which clinicians use the AI product change. Example: incentive payments are added to help increase use of advanced care planning by hospitalists. | Confirm that there are changes in the incentive structures or workplace priorities that affect clinicians who use the AI product; identified through interviews and surveys of AI product users and the managers of AI product users. | Identified through monitoring of work environment (see the guide on monitoring work environment). | If the management change is not optimal or is unintended, revert to prior incentive structures or workplace priorities. If the management change is optimal, update scope of use (see the guide to define the role of AI) and workflow (see the guide to design and test workflow for clinicians) to align with new incentive structures and workplace priorities. Conduct a new round of model validation to ensure that the AI product should remain clinically integrated (see the guide on determining if AI should be integrated). | |
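
Several rows of the table rely on observing "changes in values and distribution of AI model inputs". One common way to quantify such a change for a single numeric input is the population stability index (PSI), sketched below. This is a technique swapped in for illustration; the guide does not prescribe it. The function name, the bin count, the smoothing constant, and the sample heart-rate values are all assumptions.

```python
import math


def population_stability_index(baseline, recent, n_bins=10):
    """Compare two samples of one numeric model input using PSI.

    Bins are equal-width over the baseline range; recent values outside
    that range are counted in the first or last bin. A small smoothing
    constant keeps empty bins from producing log(0).
    """
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread; PSI is undefined")
    width = (hi - lo) / n_bins

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        total = len(values)
        return [(c + 0.5) / (total + 0.5 * n_bins) for c in counts]

    p = proportions(baseline)
    q = proportions(recent)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical heart-rate inputs: a baseline period and a shifted recent period.
baseline_hr = [60 + (i % 40) for i in range(400)]
recent_hr = [75 + (i % 40) for i in range(400)]

print(population_stability_index(baseline_hr, baseline_hr))  # 0.0
print(population_stability_index(baseline_hr, recent_hr))
```

A PSI near zero suggests the input distribution is stable, and larger values suggest a shift. Cut-offs such as 0.1 and 0.25 are sometimes quoted as rules of thumb, but they are conventions rather than values from this guide. A high PSI also does not say which type of shift occurred: distinguishing population drift from data source drift still requires the interviews, surveys, and metadata monitoring listed in the table.
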
“There’s probably a notion that if the model starts to drift, or you realize you need to retune it, that you’ve gone through a lot of analysis to understand that it’s not natural drift, or even if it is natural drift, then you have to reset your threshold. Ultimately, you need to get the band back together, you need the people who designed the model, you need the people who implemented the model, you need the people who signed off on it. They need to get together and take a look [to evaluate if] the model is not drifting, the scores are drifting”
Technical Expert