Purpose: To test whether internal memory states from a medical foundational segmentation model can serve as compact, mask-aware embeddings for predicting progression-free survival (PFS) in multiple myeloma (MM) from whole-body [18F]FDG PET/CT, and how late fusion of PET, CT, and clinical data enhances prognostic performance. Methods: We analyzed 227 newly diagnosed MM patients with PET/CT and clinical data. For two regions of interest (spine-dilated and full skeleton), we prompted MedSAM2 slice-wise using mask-derived bounding boxes and cached the final spatio-temporal memory tensor per modality. We compared two downsampling strategies to obtain per-study embeddings: channel×memory averaging with a small CNN head, and depth-attention pooling. PET and CT embeddings were combined by late fusion and passed to a DeepSurv head. We evaluated image-only and multimodal (image+clinical) models with stratified 5-fold cross-validation. The primary endpoint was Harrell's c-index (mean ± SE across folds). Results: Image-only models using the averaging downsampler achieved a c-index of up to 0.659 ± 0.015 (PET, spine-dilated), comparable to baseline radiomics results. Multimodal models improved discrimination to 0.710 ± 0.032 (CT, spine-dilated), with similar performance for other PET/CT+clinical variants (0.703-0.710), improving on clinical-only baselines by approximately 6.5%. Averaging consistently outperformed depth-attention; concatenation and gated fusion performed comparably. PET outperformed CT within the same mask in image-only settings. Conclusion: Mask-aware memory embeddings extracted from a foundational segmentation model provide effective, data-efficient imaging biomarkers for MM PFS and, when fused with routine clinical covariates, significantly improve risk stratification over clinical-only or radiomics baselines. This offers a practical path to prognostic modeling on small medical cohorts without feature design.
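
The primary endpoint named in the abstract, Harrell's c-index reported as mean ± SE across folds, can be illustrated with a minimal sketch. This is not the authors' code: the function names `harrell_c_index` and `mean_and_se` are hypothetical, pairs with tied follow-up times are simply skipped, and the SE is assumed to be the sample standard deviation divided by the square root of the number of folds, which the abstract does not specify.

```python
import math
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative).

    times  : observed follow-up times
    events : 1 if progression was observed, 0 if censored
    risks  : predicted risk scores (higher = earlier expected progression)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplifying assumption: skip tied times. A pair is comparable only
        # if the shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk progressed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of per-fold scores."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


if __name__ == "__main__":
    # Toy values invented for illustration only; not data from the study.
    toy_times = [5, 8, 12, 3]
    toy_events = [1, 0, 1, 1]
    toy_risks = [0.9, 0.2, 0.4, 0.3]
    print(harrell_c_index(toy_times, toy_events, toy_risks))
```

In a cross-validation setting, `harrell_c_index` would be evaluated once per held-out fold and the five resulting values passed to `mean_and_se`.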

Guinea-Pérez, J., Uribe, S., Peluso, S., Castellani, G., Nanni, C., Álvarez, F. (2026). Mask-aware foundational-model embeddings for 18F-FDG-PET/CT prognosis in multiple myeloma. COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 130, --- [10.1016/j.compmedimag.2026.102752].

Mask-aware foundational-model embeddings for 18F-FDG-PET/CT prognosis in multiple myeloma

Peluso, Sara; Castellani, Gastone; Nanni, Cristina
2026

Guinea-Pérez, Javier; Uribe, Silvia; Peluso, Sara; Castellani, Gastone; Nanni, Cristina; Álvarez, Federico

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1055771