The advent of epigenetic age estimation through DNA methylation analysis has transformed our understanding of biological aging, offering a more refined perspective than traditional chronological measures. Current research in DNA methylation primarily focuses on developing epigenetic clocks, which estimate biological age from DNA methylation patterns (a.k.a. methylage). Discrepancies between chronological and biological age, known as age acceleration, have been identified as early indicators of diseases such as cancer and neurodegenerative disorders [1]. Once properly estimated, age acceleration has the potential to serve as a biomarker for risk factors in many common diseases. However, the precise determination of biological age and age acceleration remains a significant challenge in this field, owing to both technical limitations and variability in methylation patterns across populations. To date, while simple models of epigenetic age provide valuable insights, they often lack the reliability needed for clinical applications. Indeed, the literature reports that highly accurate epigenetic clocks (i.e., those able to properly recapitulate chronological age) typically fail to detect significant age accelerations [2]. This suggests that traditional epigenetic clocks tend to capture broad trends while missing critical details essential for translational medicine. More recently, these limitations have been addressed by shifting the focus from accurate biological/chronological age prediction to the more ambitious goal of predicting mortality risk and healthspan. This second generation of DNA methylation-based epigenetic clocks incorporates additional lifestyle-associated indicators, such as smoking pack-years, and proves to be more sensitive to age acceleration than traditional biological age predictions [3,4].
While these newer models are still often based on linear frameworks, the additional covariates help new-generation clocks identify factors impacting age acceleration that may be obscured in linear models relying solely on DNA methylation data. However, this increased sensitivity to age acceleration comes at the cost of reduced accuracy in predicting chronological age, as well as diminished reliability, since lifestyle metadata are often incomplete or unavailable. Upon closer inspection, this performance is not entirely surprising, as all such clocks are based on linear models, which are limited in their ability to capture complex relationships within the data. Interestingly, recent work [5] has shown that the non-linear epigenetic pacemaker (EPM) clock is able to identify significant associations between polybrominated biphenyl (PBB) exposure and accelerated epigenetic aging, a validated association completely overlooked by linear clocks. This suggests that non-linear models may be better suited to capturing the complexity of the relationship between aging and methylation. Building on these observations, we believe that exploiting novel non-linear regression approaches, such as generative artificial intelligence (AI) models, along with the integration of lifestyle-related features, holds significant promise for enhancing the predictive power of DNA methylation-based models. These models can capture complex, non-linear relationships within the data that traditional linear models may overlook, and have the potential to identify subtle patterns and interactions between DNA methylation and various lifestyle factors, leading to more accurate predictions of biological age and age acceleration. Incorporating lifestyle-related features, such as diet, physical activity, smoking habits and environmental exposures, can provide a more holistic view of the factors influencing epigenetic aging.
By integrating these variables into AI models, researchers can develop more sophisticated clocks that not only predict biological age but also offer insights into specific lifestyle modifications that could mitigate age acceleration and reduce disease. Moreover, AI-driven models can continuously learn and improve as more data become available. This adaptive learning capability is crucial for personalized medicine, as it allows the models to stay up to date with the latest scientific findings and population-specific trends. For instance, AI models can be trained on diverse datasets from different demographic groups, helping to keep predictions relevant and accurate across various populations. To date, the main limitations in this direction are twofold: the opacity of deep-learning algorithms (requiring explainable AI to be developed, as in a recent immunological clock [6]) and the lack of sufficiently large and diverse datasets with associated lifestyle metadata to robustly train AI-based models. High-quality, longitudinal datasets that track individuals’ DNA methylation patterns and lifestyle factors over time are essential for developing and validating these advanced models [7]. Further novel approaches, such as generative models and transfer learning [8], can be leveraged in this research area. Generative models, by learning features from existing populations, can simulate new data that mimic the characteristics of the original population: should this reference population be small, generative models offer a means to enlarge the representative dataset. Transfer learning, on the other hand, allows AI models to leverage data from populations or systems that are similar to the one of interest (proxies) but much more abundant, thus, again, increasing the available dataset. Generally, however, data from proxy systems are used during the initial training phase.
In this way, the limited amount of data from the real system can be reserved for fine-tuning the algorithm, for which it is generally sufficient, thereby improving performance despite constraints on data diversity. In any case, at the moment, collaborative efforts between research institutions, healthcare providers, and public health organizations are needed to gather and share multi-longitudinal data to build proper training datasets. Addressing these challenges will be critical for translating cutting-edge research into practical tools for personalized medicine, with the ambition to improve health outcomes and extend healthspan. Beyond the limitations observed above and discussed at length in the literature [9], other applications deserve discussion. In addition to prevention, methylage could be tested to assess the effectiveness of therapies. In particular, physical therapies exploit the ability of our cells to transduce physical (mechanical, optical, magnetic, electrical) signals, or combinations thereof, into biochemical signals and effectors. For reasons that are not fully elucidated, the effort devoted to obtaining evidence on the effectiveness of such therapies is quite different from that of the vast majority of pharmacological research [10], despite the recent birth of disciplines like bioelectronic medicine [11] or mechanopharmacology [12]. Physical therapies are known to interfere with wound healing [13] and epithelial-mesenchymal transition, where numerous modifications occur, including systemic methylation [14]. Yet, to the best of our knowledge, although the effect of exercise has been explored [15], the use of this approach to assess reproducibility and dosage of physical therapies (beyond lifestyle habits, hence including physical exercise) is only in its infancy.
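The transfer-learning scheme described above (initial training on abundant proxy data, then fine-tuning on the scarce data from the system of interest) can be illustrated with a deliberately small sketch. This is not a method from the cited works: the one-feature linear model, the synthetic noise-free data, the function names (`train`, `mse`) and all hyperparameters are hypothetical choices made only to show the mechanics of a warm start, under the assumption that the proxy and target relationships are similar but not identical.

```python
# Toy sketch of "pretrain on proxy data, fine-tune on target data".
# Everything here (data, names, step counts, learning rate) is hypothetical.

def mse(params, xs, ys):
    """Mean squared error of the model y = w*x + b on (xs, ys)."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(params, xs, ys, steps, lr=0.1):
    """Plain gradient descent on the MSE, starting from `params`."""
    w, b = params
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Abundant proxy system: similar to, but not the same as, the target.
proxy_x = [i / 99 for i in range(100)]
proxy_y = [2.0 * x + 1.0 for x in proxy_x]

# Scarce target system: only five training samples.
target_x = [0.1, 0.3, 0.5, 0.7, 0.9]
target_y = [2.2 * x + 1.3 for x in target_x]
test_x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
test_y = [2.2 * x + 1.3 for x in test_x]

# Initial training phase on proxy data, then a short fine-tuning phase.
pretrained = train((0.0, 0.0), proxy_x, proxy_y, steps=5000)
fine_tuned = train(pretrained, target_x, target_y, steps=20)

# Baseline: the same short training budget on target data, no pretraining.
from_scratch = train((0.0, 0.0), target_x, target_y, steps=20)

print("fine-tuned test MSE:  ", mse(fine_tuned, test_x, test_y))
print("from-scratch test MSE:", mse(from_scratch, test_x, test_y))
```

In this toy setting the warm-started model reaches a lower held-out error than the model trained from scratch with the same small budget. Real methylation data are high-dimensional and noisy, so the size of any such benefit would depend on how closely the proxy population resembles the target one.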
Physical therapies lack systematic exploration, and exploiting cutting-edge approaches like the assessment of methylage could help establish both evidence and limitations, as well as provide a means to quantify stimuli and dose–response relationships. Finally, it is also worth noting that choices on the computational aspects of epigenetic age are likely to have an impact on other research and application areas, beyond the clinic. In particular, given the interest in the correlation of methylomic changes with a variety of factors, the relevance of methylage should be promoted (and cautioned about) accordingly. Indeed, a growing number of correlations are being drawn in a great variety of research areas. For instance, the importance of methylation changes under psychological [16], socioeconomic [17] and environmental stress [18] is well recognized in the literature, but is not always accompanied by information on chronological age as a confounding factor [19]. Multi-modal analyses are therefore likely to demand close attention in upcoming methylomics research. Given the political potential behind these types of analyses, robustness of the biological data first, and strong interdisciplinary interpretation second, are crucial to ensure that evidence-based policies are designed [20]. Methylage is therefore a research topic with ample room to continue challenging scientists in search of optimal transdisciplinary solutions, a worthwhile quest given the socioeconomic and, specifically, clinical potential behind its application.
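To make the notion of age acceleration and the role of chronological age as a confounder more concrete, the following minimal sketch contrasts two ways of computing it. It is an illustration under stated assumptions, not the procedure of any clock cited here: the raw difference (epigenetic minus chronological age) and the residual of a linear regression of epigenetic age on chronological age, one commonly used definition that is uncorrelated with chronological age by construction. The function names and the toy numbers are hypothetical.

```python
# Illustrative sketch only: toy numbers, not data from any cited study.

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def age_acceleration(chronological, epigenetic):
    """Return (raw differences, regression residuals) per individual."""
    raw = [e - c for c, e in zip(chronological, epigenetic)]
    intercept, slope = fit_line(chronological, epigenetic)
    residual = [
        e - (intercept + slope * c)
        for c, e in zip(chronological, epigenetic)
    ]
    return raw, residual

# Hypothetical cohort: chronological ages and clock outputs, in years.
chronological = [30.0, 40.0, 50.0, 60.0, 70.0]
epigenetic = [33.0, 41.0, 49.0, 63.0, 69.0]

raw, residual = age_acceleration(chronological, epigenetic)
for c, r, res in zip(chronological, raw, residual):
    print(f"age {c:.0f}: raw difference {r:+.1f}, residual {res:+.2f}")
```

A positive value indicates an individual who appears epigenetically older than expected. Because the residual version removes the linear dependence on chronological age within the cohort, it is less likely that an association with an exposure merely reflects age differences between groups, although it does not address non-linear age effects.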
Nardini, C., Di Lena, P. (2024). Predictive power of epigenetic age – opportunities and cautions. EPIGENOMICS, 1, 1-3 [10.1080/17501911.2024.2433409].
Predictive power of epigenetic age – opportunities and cautions
Nardini, Christine
Co-first author
Writing – Review & Editing
Di Lena, Pietro
Co-first author
Writing – Review & Editing
2024