Mitigating Bias to Improve Fairness in Predictive Risk Modeling Using Healthcare Data: An Analysis of Long COVID Risk
- Our results demonstrate that applying bias mitigation techniques can improve fairness while maintaining model performance, as observed through monitoring key performance and fairness metrics.
- When building predictive risk models in healthcare, researchers should carefully consider the inclusion of protected attributes, monitor key performance and fairness metrics, and implement strategies for mitigating bias where needed.
- Testing and improving algorithmic fairness will help ensure that predictive models contribute to more equitable healthcare outcomes in settings where algorithmic findings, such as those from long COVID predictive risk models, directly influence patient care, clinical decision-making, and policy.
Background
Algorithmic bias in healthcare predictive risk models can worsen existing health inequities, making bias mitigation crucial for responsible model development and implementation. Our study examined ways to improve fairness across both single and multiple protected attributes using leading bias mitigation methods and measures of performance and fairness, aiming to give researchers guidance on how to test and improve algorithmic fairness. We conducted our analysis using predictive risk models for long COVID, an area of significant societal interest, as a case study to demonstrate effective strategies for addressing bias in predictive modeling.
Data sources
Our study used previously developed long COVID machine learning models applied to a sample of 1.23 million participants from the National COVID Cohort Collaborative (N3C), a longitudinal electronic health record (EHR) data repository from 80 sites in the United States with more than 8 million COVID-19 patients.
Methods
We analyzed model fairness for the protected attributes of sex, race, and ethnicity by comparing performance and fairness metrics before and after applying bias mitigation techniques. Our evaluation focused on three leading algorithmic bias mitigation methods: reweighting, MAAT (Mitigating Algorithmic Bias with Adversarial Training), and FairMask. The analysis included both single and multiple protected attributes, using performance metrics (AUROC [area under the receiver operating characteristic curve] and PRAUC [area under the precision-recall curve]) and common fairness metrics (equal opportunity, predictive equality, and disparate impact).
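The three fairness metrics named above can be illustrated with a minimal sketch. It assumes binary labels and binary predictions (scores already thresholded), a single protected attribute, and one group designated as the reference ("privileged") group. The function names, the difference-versus-ratio conventions, and the toy data are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def group_rates(y_true, y_pred, mask):
    """Return (TPR, FPR, selection rate) for the rows selected by mask.

    A group with no positives (or no negatives) yields NaN for TPR (or FPR).
    """
    yt, yp = y_true[mask], y_pred[mask]
    tpr = yp[yt == 1].mean()  # share of actual positives that were flagged
    fpr = yp[yt == 0].mean()  # share of actual negatives that were flagged
    sel = yp.mean()           # share of the whole group that was flagged
    return tpr, fpr, sel

def fairness_report(y_true, y_pred, group, privileged):
    """Compare the non-reference rows against the reference group (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    priv = group == privileged
    tpr_p, fpr_p, sel_p = group_rates(y_true, y_pred, priv)
    tpr_u, fpr_u, sel_u = group_rates(y_true, y_pred, ~priv)
    return {
        # Equal opportunity: gap in true positive rates (0 means parity).
        "equal_opportunity_diff": tpr_u - tpr_p,
        # Predictive equality: gap in false positive rates (0 means parity).
        "predictive_equality_diff": fpr_u - fpr_p,
        # Disparate impact: ratio of selection rates (1 means parity).
        "disparate_impact": sel_u / sel_p,
    }

# Toy data, made up purely for illustration.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(fairness_report(y_true, y_pred, group, privileged="a"))
```

Computing the same report before and after a mitigation step, once per protected attribute, is one way to carry out the before-and-after comparison described in this section.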
Findings
Our results demonstrate that applying bias mitigation techniques can improve fairness while maintaining model performance, as observed through monitoring key performance and fairness metrics. Of the bias mitigation techniques evaluated, FairMask achieved the largest gains in fairness for a single protected attribute, with minor trade-offs for other attributes. Reweighting was more effective at boosting predictive performance metrics, but when performance or fairness was optimized for one specific protected attribute, performance and fairness for the other attributes varied.
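To make the reweighting idea concrete, the sketch below computes per-sample weights in the style commonly called reweighing: each (group, label) cell is weighted by its expected frequency under independence divided by its observed frequency. The source does not describe which reweighting variant the study used, so this formulation, the function name, and the toy data are assumptions. The weights are defined relative to one protected attribute at a time, which gives some intuition for why tuning for one attribute can leave the others behaving differently.

```python
import numpy as np

def reweighing_weights(group, y):
    """Per-sample weights w(g, label) = P(g) * P(label) / P(g, label).

    Hypothetical helper: under-represented (group, label) cells get
    weights above 1 and over-represented cells get weights below 1.
    """
    group = np.asarray(group)
    y = np.asarray(y)
    weights = np.ones(len(y), dtype=float)
    for g in np.unique(group):
        for label in np.unique(y):
            cell = (group == g) & (y == label)
            if not cell.any():
                continue  # empty cell: no rows to weight
            expected = (group == g).mean() * (y == label).mean()
            observed = cell.mean()
            weights[cell] = expected / observed
    return weights

# Toy data, made up purely for illustration: group "a" has a 75%
# positive rate and group "b" a 25% positive rate.
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
y = np.array([1, 1, 1, 0, 1, 0, 0, 0])
w = reweighing_weights(group, y)

# After weighting, the positive rate is the same in both groups.
for g in ("a", "b"):
    m = group == g
    print(g, np.average(y[m], weights=w[m]))
```

Weights like these are typically passed to a model's training routine; many scikit-learn estimators, for example, accept a `sample_weight` argument in `fit`.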
Conclusion
When building predictive risk models in healthcare, researchers should carefully consider the inclusion of protected attributes, monitor key performance and fairness metrics, and implement strategies for mitigating bias where needed. Testing and improving algorithmic fairness will help ensure that predictive models contribute to more equitable healthcare outcomes in settings where algorithmic findings, such as those from long COVID predictive risk models, directly influence patient care, clinical decision-making, and policy.