Kick-started by deployment of the well-known KinectFusion, recent research on the task of RGBD-based dense volume reconstruction has focused on improving different shortcomings of the original algorithm. In this paper we tackle two of them: drift in the camera trajectory caused by the accumulation of small per-frame tracking errors and lack of semantic information within the output of the algorithm. Accordingly, we present an extended KinectFusion pipeline which takes into account per-pixel semantic labels gathered from the input frames. By such clues, we extend the memory structure holding the reconstructed environment so to store per-voxel information on the kinds of object likely to appear in each spatial location. We then take such information into account during the camera localization step to increase the accuracy in the estimated camera trajectory. Thus, we realize a SemanticFusion loop whereby perframe labels help better track the camera and successful tracking enables to consolidate instantaneous semantic observations into a coherent volumetric map.
Cavallari, T., Di Stefano, L. (2016). SemanticFusion: Joint labeling, tracking and mapping. Springer Verlag [10.1007/978-3-319-49409-8_55].
SemanticFusion: Joint labeling, tracking and mapping
CAVALLARI, TOMMASO;DI STEFANO, LUIGI
2016
Abstract
Kick-started by deployment of the well-known KinectFusion, recent research on the task of RGBD-based dense volume reconstruction has focused on improving different shortcomings of the original algorithm. In this paper we tackle two of them: drift in the camera trajectory caused by the accumulation of small per-frame tracking errors and lack of semantic information within the output of the algorithm. Accordingly, we present an extended KinectFusion pipeline which takes into account per-pixel semantic labels gathered from the input frames. By such clues, we extend the memory structure holding the reconstructed environment so to store per-voxel information on the kinds of object likely to appear in each spatial location. We then take such information into account during the camera localization step to increase the accuracy in the estimated camera trajectory. Thus, we realize a SemanticFusion loop whereby perframe labels help better track the camera and successful tracking enables to consolidate instantaneous semantic observations into a coherent volumetric map.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.