NCHR Comment on Regulating Artificial Intelligence/Machine Learning-Based Software

National Center for Health Research, June 3, 2019


National Center for Health Research’s Public Comments on “Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) – Discussion Paper and Request for Feedback”

The National Center for Health Research is a nonprofit think tank that conducts, analyzes, and scrutinizes research, policies, and programs on a range of issues related to health and safety.  We do not accept funding from companies that make products that are the subject of our work.

This discussion paper describes the FDA’s proposed framework for regulating Software as a Medical Device (SaMD) containing Artificial Intelligence/Machine Learning (AI/ML).  AI/ML-based SaMD is a type of software device that incorporates computer techniques (algorithms) capable of detecting relevant medical patterns from large amounts of data.  These sophisticated pattern recognition capabilities (AI/ML) have broad potential healthcare applications, including making recommendations to healthcare providers and patients about the diagnosis, prognosis, treatment, or prevention of disease.1  The complexity, scalability, and broad scope of these new technologies thus raise important issues regarding their safety and effectiveness as they pertain to patients’ health.

Recommendations and Issues of Concern:

Ensure that Good Machine Learning Practices (GMLP) clearly describe the datasets being used to build AI/ML-based SaMD.  In general, AI/ML-based SaMD products are created by using large datasets to “train” software to find relevant patterns, and then applying that knowledge to new patients.  As a result, the safety and effectiveness of AI/ML-based SaMD products will depend strongly on the quality of these datasets.  There are growing concerns that the datasets used to build SaMD products are not representative of the diverse populations they are meant to help, and that underlying biases could manifest as inaccurate (or even harmful) clinical recommendations made by the software.1,2  Indeed, the data and analyses supporting medical device approval often do not fully reflect the diversity of the target patient populations,3 and it is unlikely that this would be different for the data used to build AI/ML-based SaMD products.  The discussion paper mentions that GMLP should include an “appropriate level of transparency (clarity) of the output and the algorithm aimed at users.”  However, a focus solely on SaMD outputs and algorithms is not sufficient.  Just as the ingredients and nutrition information listed on food packaging inform consumers, detailed information about the data used as inputs for AI/ML-based SaMD will help patients and providers make better decisions about these products.  We therefore recommend that Good Machine Learning Practices also require transparency of the underlying data used to build or refine any AI/ML-based SaMD product.
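For illustration only, the minimal sketch below uses simulated, hypothetical data and the open-source scikit-learn library to show how a training dataset that underrepresents one patient subgroup can yield a model whose accuracy is noticeably lower for that subgroup.  The subgroup labels, features, and sample sizes are assumptions made for this example and are not drawn from any actual SaMD product.

```python
# Minimal sketch (hypothetical, simulated data): an unrepresentative training
# set can produce uneven model performance across patient subgroups.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Simulate one subgroup: the relationship between the two features and
    the outcome differs slightly between subgroups (the 'shift')."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + shift * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

# Training data: subgroup A is heavily overrepresented relative to subgroup B.
X_a, y_a = make_group(5000, shift=0.2)
X_b, y_b = make_group(100, shift=1.5)
X_train = np.vstack([X_a, X_b])
y_train = np.concatenate([y_a, y_b])

model = LogisticRegression().fit(X_train, y_train)

# Evaluate on balanced, held-out data from each subgroup.
for name, shift in [("subgroup A", 0.2), ("subgroup B", 1.5)]:
    X_test, y_test = make_group(2000, shift)
    print(name, "accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 3))
```

In this sketch, the model performs well for the overrepresented subgroup and measurably worse for the underrepresented one, which is why transparency about the composition of the underlying data matters to patients and providers.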

Clearly explain how the proposed regulatory framework will protect patients given the scale, scope, and complexity of recommendations made by AI/ML-based SaMD.  When designed correctly, AI/ML-based SaMD can be deployed to help a diverse group of providers, patients, and organizations.  However, several key concerns must be addressed about what happens when software problems cause AI/ML-based recommendations to become inaccurate or even dangerous to patients:

  • Prior research has shown that even simple problems can transform low-risk software into serious safety threats that lead to high-risk product recalls.4 Highly scalable technologies like SaMD thus have the potential to quickly help, or swiftly harm, large populations of patients.  We recommend that the framework’s real-world performance monitoring require specific mechanisms to identify flawed AI/ML-based SaMD and stop its continued use by healthcare providers once risks are detected (see the illustrative sketch following this list).
  • Most clinical decisions are documented in the Electronic Health Record (EHR), yet current EHRs have known usability issues and data interoperability challenges, and they are no longer under the regulatory authority of the FDA (per the 21st Century Cures Act).4 Within this context, we recommend that the proposed framework require manufacturers to explain how, if their AI/ML-based SaMD later proves to be a safety risk, healthcare providers can quickly track down past flawed AI/ML product recommendations and determine whether their patients were affected.
  • The framework proposes a total product lifecycle regulatory approach, which is “particularly important for AI/ML-based SaMD due to its ability to adapt and improve from real-world use.” However, it is unclear whether the proposed framework for AI/ML-based SaMD would be compatible with existing recall procedures or whether new processes must be developed.  We recommend that the framework clearly describe how a typical “recall” would be practically implemented for AI/ML-based SaMD products found to pose a serious risk to patients.
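To make the first concern above more concrete, the minimal sketch below illustrates one possible form a real-world performance monitoring mechanism could take: tracking how often a deployed AI/ML-based SaMD’s recommendations match confirmed patient outcomes and flagging the product for review or withdrawal when agreement drops.  The threshold, window size, and class design are assumptions made for this example; they are not part of the FDA’s proposal.

```python
# Minimal sketch (hypothetical thresholds): a rolling check that flags a
# deployed AI/ML-based SaMD whose recommendations drift below an acceptable
# agreement level with confirmed outcomes.
from dataclasses import dataclass, field

@dataclass
class PerformanceMonitor:
    min_accuracy: float = 0.90   # assumed acceptable accuracy threshold
    window_size: int = 500       # number of most recent confirmed cases tracked
    _matches: list = field(default_factory=list)

    def record(self, predicted: int, confirmed: int) -> None:
        """Store whether the model's recommendation matched the confirmed outcome."""
        self._matches.append(int(predicted == confirmed))
        if len(self._matches) > self.window_size:
            self._matches.pop(0)

    def should_halt(self) -> bool:
        """Flag the product for review/withdrawal if rolling accuracy falls too low."""
        if len(self._matches) < self.window_size:
            return False  # not enough confirmed cases yet to judge
        accuracy = sum(self._matches) / len(self._matches)
        return accuracy < self.min_accuracy

# Example: a safety process could call monitor.should_halt() after each
# confirmed outcome and suspend the SaMD's recommendations when it returns True.
monitor = PerformanceMonitor()
```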

 The National Center for Health Research can be reached at info@center4research.org.

 

References

  1. Rajkomar A, Dean J, Kohane I. Machine learning in medicine. New England Journal of Medicine. 2019;380:1347–1358. https://www.nejm.org/doi/full/10.1056/NEJMra1814259
  2. Char DS, Shah NH, Magnus D. Implementing machine learning in health care — Addressing ethical challenges. New England Journal of Medicine. 2018;378:981–983. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5962261/
  3. Fox-Rawlings S, Gottschalk L, Doamekpor L, et al. Diversity in medical device clinical trials: Do we know what works for which patients? Milbank Quarterly. 2018;96:499–529. https://onlinelibrary.wiley.com/doi/abs/10.1111/1468-0009.12344
  4. Ronquillo JG, Zuckerman DM. Software-related recalls of health information technology and other medical devices: Implications for FDA regulation of digital health. Milbank Quarterly. 2017;95:535–553. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5594275/