CRIS Current Research Information System

Liu, S., Huang, S.C., Wang, S., Muhammad, K., Bellavista, P., Del Ser, J. (2023). Visual tracking in complex scenes: A location fusion mechanism based on the combination of multiple visual cognition flows. Information Fusion, 96, 281-296 [10.1016/j.inffus.2023.02.005].

Visual tracking in complex scenes: A location fusion mechanism based on the combination of multiple visual cognition flows

Liu, S; Huang, SC; Wang, S; Muhammad, K; Bellavista, P; Del Ser, J

2023

Abstract

In recent years, deep learning has revolutionized computer vision and has been widely used for monitoring in diverse visual scenes. However, in aspects such as complexity and explainability, deep learning is not always preferable to traditional machine-learning methods. Traditional visual tracking approaches have shown advantages in data collection efficiency, computing requirements, and power consumption, and they are generally easier to understand and explain than deep neural networks. At present, traditional feature-based techniques relying on correlation filtering (CF) have become common for understanding complex visual scenes. However, current CF algorithms use a single feature to describe the target and locate it accordingly. A single feature cannot fully express a changeable target appearance in a complex scene, which can easily lead to inaccurate target locations in time-varying visual scenes. Moreover, owing to the complexity of surveillance scenes, monitoring algorithms can lose their target. The original template update strategy takes a frame at each fixed interval as the new template, which may lead to unreliable feature extraction and low tracking accuracy. To overcome these issues, in this work, we introduce an original location fusion mechanism based on multiple visual cognition processing streams to achieve real-time and efficient visual monitoring in complex scenes. First, we propose a process for extracting multiple forms of visual cognitive information, which is applied periodically to extract multiple feature information flows of a target of interest. Subsequently, a cognitive information fusion process is employed to fuse the positioning results of the different visual cognitive information flows to achieve high-quality visual monitoring and positioning. Finally, a novel feature template memory storage and retrieval strategy is adopted: when the location result is unreliable, the target is retrieved from memory to ensure robust and accurate tracking. In addition, we provide an extensive set of performance results showing that our proposed approach exhibits more robust performance at a lower computational cost compared with 36 state-of-the-art algorithms for visual tracking in complex scenes.
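
The fusion and memory-retrieval steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion rule or the reliability test, so the confidence-weighted average, the `min_confidence` threshold, and the names `Estimate`, `fuse_locations`, and `locate` are all assumptions made for illustration, and the memory is reduced to a single stored location.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Estimate:
    """Location proposed by one feature stream (hypothetical structure)."""
    x: float
    y: float
    confidence: float  # assumed reliability score; higher means more reliable


def fuse_locations(estimates: List[Estimate]) -> Tuple[float, float]:
    """Confidence-weighted average of per-stream locations (assumed rule)."""
    total = sum(e.confidence for e in estimates)
    if total <= 0:
        raise ValueError("at least one estimate needs positive confidence")
    x = sum(e.x * e.confidence for e in estimates) / total
    y = sum(e.y * e.confidence for e in estimates) / total
    return x, y


def locate(
    estimates: List[Estimate],
    memory_location: Tuple[float, float],
    min_confidence: float = 0.3,  # assumed threshold, not from the paper
) -> Tuple[Tuple[float, float], str]:
    """Return the fused location, or fall back to the remembered one.

    The result is treated as unreliable when no stream reaches
    min_confidence; in that case the stored location is returned.
    """
    if not estimates or max(e.confidence for e in estimates) < min_confidence:
        return memory_location, "memory"
    return fuse_locations(estimates), "fused"


if __name__ == "__main__":
    streams = [Estimate(10.0, 20.0, 0.75), Estimate(14.0, 24.0, 0.25)]
    print(locate(streams, memory_location=(0.0, 0.0)))
```

Weighting by confidence lets a stream that currently describes the target well dominate the result, while the fallback keeps a single bad frame from moving the track; the paper's actual mechanism may differ on both points.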

Year	2023
Journal	Information Fusion
DOI	https://dx.doi.org/10.1016/j.inffus.2023.02.005
All authors	Liu, S; Huang, SC; Wang, S; Muhammad, K; Bellavista, P; Del Ser, J
Appears in types	1.01 Journal article

Files in this item:

File	Size	Format
Revised Manuscript_INFFUS-D-22-00883_25 Jan 2023_Accepted(1).pdf	2.58 MB	Adobe PDF

Access: Open Access from 04/02/2025
Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review
License: Open Access license, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/952079
