Viceconti, M., Lanubile, F., Carbonaro, A., Mellone, S., Curreli, C., Aldieri, A., et al. (2025). Extending Credibility Assessment of In Silico Medicine Predictors to Machine Learning Predictors. IEEE Journal of Biomedical and Health Informatics, early access, 1-9. doi: 10.1109/JBHI.2025.3552320.
Extending Credibility Assessment of In Silico Medicine Predictors to Machine Learning Predictors
Marco Viceconti; Antonella Carbonaro; Sabato Mellone; Cristina Curreli; Alessandra Aldieri; Saverio Ranciati; Angela Montanari
2025
Abstract
There are several situations where it would be convenient if a quantity of interest essential to support a medical or regulatory decision could be predicted as a function of other measurable quantities rather than measured experimentally. To do so, we need to ensure that, in all practical cases, the predicted value does not differ from what we would measure experimentally by more than an acceptable threshold, defined by the context in which that quantity of interest is used in the decision-making process. This is called Credibility Assessment. Initial work, which guided the elaboration of the first technical standard on the topic (ASME VV-40:2018), focused on predictive models built from available mechanistic knowledge of the phenomenon of interest. For this class of predictive models, sometimes called biophysical models, a credibility assessment practice based on the so-called Verification, Validation, Uncertainty Quantification, and Applicability (VVUQA) analysis is accepted. Through theoretical considerations, this position paper aims to summarise a complex debate on whether such an approach can be extended to predictive models built without any mechanistic knowledge (machine learning (ML) predictors). We conclude that the VVUQA approach can be extended to ML-based predictors; however, since there is no certainty that the features used to predict the quantity of interest are necessary and sufficient, such credibility assessment, under the VVUQA framework, is limited to the test sets used in the validation studies. This calls for a Total Product Life Cycle approach, where periodic retesting of ML-based predictors is part of post-marketing surveillance to ensure that no “unknown bias” may play a role.
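To make the acceptance criterion described in the abstract concrete, the sketch below (not taken from the paper; all variable names, values, and the threshold are hypothetical placeholders) illustrates the basic check implied by the text: on a held-out test set, the predicted values must not differ from the experimentally measured ones by more than an acceptable threshold fixed by the context of use.

```python
# Minimal illustrative sketch, assuming a numeric quantity of interest.
# All data and the threshold below are hypothetical placeholders.
import numpy as np

def within_acceptable_error(y_predicted, y_measured, threshold):
    """Return (passes, worst_error): passes is True if every absolute
    prediction error on the test set is within `threshold`."""
    errors = np.abs(np.asarray(y_predicted) - np.asarray(y_measured))
    return bool(np.all(errors <= threshold)), float(errors.max())

# Hypothetical test-set values for a quantity of interest (arbitrary units).
y_measured = np.array([10.2, 11.8, 9.7, 12.4])   # experimental measurements
y_predicted = np.array([10.0, 12.1, 9.9, 12.0])  # ML predictor outputs
acceptable_threshold = 0.5  # set by the decision-making context, not by the model

ok, worst = within_acceptable_error(y_predicted, y_measured, acceptable_threshold)
print(f"credible on this test set: {ok} (worst-case error = {worst:.2f})")
```

Under the Total Product Life Cycle view sketched in the abstract, the same check would be re-run periodically on newly collected test data as part of post-marketing surveillance, since the conclusion only holds for the test sets actually examined.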