Balestri, R., Degli Esposti, M., Pescatore, G. (2026). “Previously On…”: Toward Automating Episodic Recaps through LLM-Based Semantic Narrative Analysis in Medical Drama. Milano : Vita e Pensiero.
“Previously On…”: Toward Automating Episodic Recaps through LLM-Based Semantic Narrative Analysis in Medical Drama
Roberto Balestri; Mirko Degli Esposti; Guglielmo Pescatore
2026
Abstract
Long-running serialized television series present audiences with a significant cognitive challenge: tracking complex, interwoven storylines across hundreds of hours of content. While "Previously on..." recaps serve as essential narrative devices for managing this cognitive load, their creation remains a labor-intensive manual process requiring substantial artistic and editorial judgment. This challenge is particularly evident in medical dramas, which are often long-running and contain narrative arcs that span multiple seasons. This paper presents a comprehensive, end-to-end system for automated episodic video recap generation, moving beyond single-episode summarization to a model of cross-episodic memory retrieval. Our system directly processes raw audiovisual content through a multi-stage pipeline that first performs speaker attribution from audio data, then employs a multi-agent system to populate a long-term, structured narrative memory. Finally, a dedicated generative stage queries this memory to select and assemble the most salient historical Events needed to understand the context of a new episode. Building upon our previous work, this research turns our text-only analytical model into a practical, multimodal generative engine. By simulating narrative memory, our system offers a scalable solution for automating recap generation in serialized media.



