The potential of deep learning for medical imaging is often constrained by limited data availability. Generative models can unlock this potential by generating synthetic data that reproduces the statistical properties of real data while being more accessible for sharing. In this study, we investigated the influence of training set size on the performance of a state-of-the-art generative adversarial network, the StyleGAN2-ADA, trained on a cohort of 3,227 subjects from the OpenBHB dataset to generate 2D slices of brain MR images from healthy subjects. The quality of the synthetic images was assessed through qualitative evaluations and state-of-the-art quantitative metrics, which are provided in a publicly accessible repository. Our results demonstrate that StyleGAN2-ADA generates realistic and high-quality images, deceiving even expert radiologists while preserving privacy, as it did not memorize training images. Notably, increasing the training set size led to slight improvements in fidelity metrics. However, training set size had no noticeable impact on diversity metrics, highlighting the persistent limitation of mode collapse. Furthermore, we observed that diversity metrics, such as coverage and beta-recall, are highly sensitive to the number of synthetic images used in their computation, leading to inflated values when synthetic data significantly outnumber real ones. These findings underscore the need to carefully interpret diversity metrics and the importance of employing complementary evaluation strategies for robust assessment. Overall, while StyleGAN2-ADA shows promise as a tool for generating privacy-preserving synthetic medical images, overcoming diversity limitations will require exploring alternative generative architectures or incorporating additional regularization techniques.
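The sensitivity of coverage to the number of synthetic samples can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common k-nearest-neighbour definition of coverage (the fraction of real samples whose k-NN ball contains at least one synthetic sample) and uses toy Gaussian vectors in place of image embeddings. The function names, the feature dimension, and the value of `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every row of `a` and every row of `b`."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def coverage(real, fake, k=5):
    """Fraction of real samples with at least one fake sample inside their k-NN ball."""
    d_real = pairwise_distances(real, real)
    # After sorting, column 0 is the zero self-distance, so column k is the
    # distance to the k-th nearest *other* real sample.
    radii = np.sort(d_real, axis=1)[:, k]
    nearest_fake = pairwise_distances(real, fake).min(axis=1)
    return float((nearest_fake < radii).mean())


def coverage_vs_sample_size(real, fake_pool, sizes, k=5):
    """Coverage computed on nested subsets (the first n rows) of one fake pool."""
    return {n: coverage(real, fake_pool[:n], k=k) for n in sizes}


rng = np.random.default_rng(0)
# Toy stand-ins for image embeddings (assumption: 8-D Gaussian features).
real_features = rng.normal(0.0, 1.0, size=(100, 8))
# A deliberately narrower distribution mimics a generator with reduced diversity.
fake_features = rng.normal(0.0, 0.6, size=(2000, 8))

scores = coverage_vs_sample_size(real_features, fake_features, sizes=[100, 500, 2000])
for n, value in scores.items():
    print(f"{n:5d} synthetic samples -> coverage = {value:.3f}")
```

Because each larger synthetic set contains the smaller one, the nearest synthetic neighbour of every real sample can only get closer, so coverage cannot decrease as more synthetic samples are added. This is one plausible reading of why such metrics are best compared at a fixed ratio of synthetic to real samples.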

Lai, M., Mascalchi, M., Tessa, C., Diciotti, S. (2025). Generating Brain MRI with StyleGAN2-ADA: The Effect of the Training Set Size on the Quality of Synthetic Images. Journal of Imaging Informatics in Medicine, in press, 1-11 [10.1007/s10278-025-01536-0].

Generating Brain MRI with StyleGAN2-ADA: The Effect of the Training Set Size on the Quality of Synthetic Images

Lai M. (first author); Diciotti S. (last author)
2025


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1050834
