Synthetic data generation in genomic cancer medicine: a review of global research trends in the last ten years

Graziosi, Agnese; Torre, Davide La
2025

Abstract

This paper reviews the global evolution of synthetic data (SD) generation in the field of genomic cancer medicine, with an analysis of research trends from the past decade. The use of artificial intelligence, particularly machine learning and deep learning techniques, has transformed this area, providing solutions to overcome the limited availability of real clinical data. Through a bibliometric analysis of a wide sample of scientific articles from Scopus, this study highlights the adoption of SD generation techniques in oncological applications, focusing on major methodologies and challenges. Key application areas, such as multi-omics integration (genomics, transcriptomics, and proteomics) and tumor genomic heterogeneity, emerge as fields of growing interest. Despite noise management and performance optimization challenges, advanced machine learning techniques prove essential for generating high-quality SD that reflects biological complexity. The study also identifies key open challenges, such as simulation accuracy and noise control, offering insights into future applications of SD in personalized medicine and cancer therapy.
De Nicoló, V., Frasca, M., Graziosi, A., Gazzaniga, G., Torre, D.L., Pani, A. (2025). Synthetic data generation in genomic cancer medicine: a review of global research trends in the last ten years. DISCOVER ARTIFICIAL INTELLIGENCE, 5(148), 1-31 [10.1007/s44163-025-00384-9].
De Nicoló, Valentina; Frasca, Maria; Graziosi, Agnese; Gazzaniga, Gianluca; Torre, Davide La; Pani, Arianna
Files in this record:
File: unpaywall-bitstream-314229443.pdf
Access: open access
Type: Publisher's version (PDF) / Version of Record
License: Open Access license. Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND)
Size: 2.73 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1043447