Bovo, R., Giunchi, D., Cascarano, P., Gonzalez, E., & Gonzalez-Franco, M. (2025). Revisiting Put-That-There: Context-aware window interactions via LLMs. In IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). https://doi.org/10.1109/ISMAR-Adjunct68609.2025.00104
Revisiting Put-That-There: Context-Aware Window Interactions via LLMs
Bovo, R.; Giunchi, D.; Cascarano, P.; Gonzalez, E.; Gonzalez-Franco, M.
2025
Abstract
We revisit Bolt's classic Put-That-There concept for modern head-mounted displays by pairing Large Language Models (LLMs) with the XR sensing and application stack. The agent fuses (i) a semantically segmented 3-D environment, (ii) live application metadata, and (iii) users' verbal, pointing, and head-gaze cues to issue JSON window-placement actions. As a result, users can manage a panoramic workspace through: (1) explicit commands ("Place Google Maps on the coffee table"), (2) deictic speech plus gestures ("Put that there"), or (3) high-level goals ("I need to send a message"). Unlike traditional explicit interfaces, our system supports one-to-many action mappings and goal-centric reasoning, allowing the LLM to dynamically infer relevant applications and layout decisions, including interrelationships across tools. This enables seamless, intent-driven interaction without manual window juggling in immersive XR environments.
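To make the abstract's JSON window-placement actions concrete, here is a minimal sketch of what such an action might look like. The paper does not publish its schema, so every field name below (`action`, `app`, `anchor`, `position`, `source`) is a hypothetical illustration of how an LLM response covering the three interaction modes could be structured, not the authors' actual format.

```typescript
// Hypothetical schema for an LLM-emitted window-placement action.
// Field names and values are illustrative assumptions, not the paper's format.
interface WindowPlacementAction {
  action: "open" | "move" | "close";        // what to do with the window
  app: string;                              // target application, e.g. "Google Maps"
  anchor: string;                           // semantic surface from the segmented 3-D scene
  position?: [number, number, number];      // optional world-space pose override
  source: "explicit" | "deictic" | "goal";  // which interaction mode produced the action
}

// "Place Google Maps on the coffee table" -> a single explicit action.
const explicitAction: WindowPlacementAction = {
  action: "open",
  app: "Google Maps",
  anchor: "coffee_table",
  source: "explicit",
};

// "I need to send a message" -> the LLM may infer several related windows
// at once, reflecting the one-to-many, goal-centric mapping described above.
const goalDriven: WindowPlacementAction[] = [
  { action: "open", app: "Messages", anchor: "wall_front", source: "goal" },
  { action: "open", app: "Contacts", anchor: "desk", source: "goal" },
];
```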
| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| Revisiting_put_that_there__context_aware_window_interactions_via_LLMs.pdf (embargoed until 31/12/2027) | paper | Postprint / Author's Accepted Manuscript (AAM), version accepted for publication after peer review | Free open-access license | 430.77 kB | Adobe PDF |


