
Revisiting put-that-there, context aware window interactions via LLMs

Pasquale Cascarano
2025

Abstract

We revisit Bolt's classic Put-That-There concept for modern head-mounted displays by pairing Large Language Models (LLMs) with the XR sensor and technology stack. The agent fuses (i) a semantically segmented 3-D environment, (ii) live application metadata, and (iii) users' verbal, pointing, and head-gaze cues to issue JSON window-placement actions. As a result, users can manage a panoramic workspace through: (1) explicit commands ("Place Google Maps on the coffee table"), (2) deictic speech plus gestures ("Put that there"), or (3) high-level goals ("I need to send a message"). Unlike traditional explicit interfaces, our system supports one-to-many action mappings and goal-centric reasoning, allowing the LLM to dynamically infer relevant applications and layout decisions, including interrelationships across tools. This enables seamless, intent-driven interaction without manual window juggling in immersive XR environments.
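The abstract describes the agent emitting JSON window-placement actions grounded in fused speech, pointing, and gaze context. A minimal sketch of what such an action message might look like — the schema, field names, and `place_window` action type are illustrative assumptions, not the paper's actual format:

```python
import json

def make_placement_action(app, anchor, position, source_cue):
    """Build a hypothetical JSON window-placement action.

    The field names below are assumptions for illustration; the
    paper's actual message schema is not given in this record.
    """
    action = {
        "action": "place_window",
        "app": app,                   # application window to (re)position
        "anchor": anchor,             # semantic surface from the 3-D scene
        "position": position,         # target pose in room coordinates
        "resolved_from": source_cue,  # which cue(s) grounded the reference
    }
    return json.dumps(action)

# Example: the deictic command "Put that there", with "that" and
# "there" resolved from pointing and head-gaze cues.
msg = make_placement_action(
    app="Google Maps",
    anchor="coffee_table",
    position={"x": 0.4, "y": 0.9, "z": -1.2},
    source_cue="pointing+head_gaze",
)
print(msg)
```

Serializing the action as a small, flat JSON object is one plausible way an LLM could emit placements that a rendering layer then executes against the segmented scene.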
2025 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), pp. 512–517
Bovo, R., Giunchi, D., Cascarano, P., Gonzalez, E., & Gonzalez-Franco, M. (2025). Revisiting put-that-there, context aware window interactions via LLMs. doi:10.1109/ISMAR-Adjunct68609.2025.00104
Bovo, Riccardo; Giunchi, Daniele; Cascarano, Pasquale; Gonzalez, Eric; Gonzalez-Franco, Mar
Files in this record:
File: Revisiting_put_that_there__context_aware_window_interactions_via_LLMs.pdf
Embargoed until 31/12/2027
Description: paper
Type: Postprint / Author's Accepted Manuscript (AAM) - version accepted for publication after peer review
License: free open-access license
Size: 430.77 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1026455
Citations:
  • Scopus: 0