Abstract

= PDF Reprint, = BibTeX entry, = Online Abstract

S. Filipe, L. Itti, L. A. Alexandre, BIK-BUS: Biologically Motivated 3D Keypoint Based on Bottom-Up Saliency, IEEE Transactions on Image Processing, Vol. 24, No. 1, pp. 163-175, Jan 2015. [2013 impact factor: 3.111] (Cited by 32)

Abstract: One of the major problems found when developing a 3D recognition system involves the choice of keypoint detector and descriptor. To help solve this problem, we present a new method for the detection of 3D keypoints on point clouds and we perform benchmarking between each pair of 3D keypoint detector and 3D descriptor to evaluate their performance on object and category recognition. These evaluations are done in a public database of real 3D objects. Our keypoint detector is inspired by the behavior and neural architecture of the primate visual system. The 3D keypoints are extracted based on a bottom-up 3D saliency map, that is, a map that encodes the saliency of objects in the visual environment. The saliency map is determined by computing conspicuity maps (a combination across different modalities) of the orientation, intensity, and color information in a bottom-up and in a purely stimulus-driven manner. These three conspicuity maps are fused into a 3D saliency map and, finally, the focus of attention (or keypoint location) is sequentially directed to the most salient points in this map. Inhibiting this location automatically allows the system to attend to the next most salient location. The main conclusions are: with a similar average number of keypoints, our 3D keypoint detector outperforms the other eight 3D keypoint detectors evaluated by achieving the best result in 32 of the evaluated metrics in the category and object recognition experiments, when the second best detector only obtained the best result in eight of these metrics. The unique drawback is the computational time, since biologically inspired 3D keypoint based on bottom-up saliency is slower than the other detectors. Given that there are big differences in terms of recognition performance, size and time requirements, the selection of the keypoint detector and descriptor has to be matched to the desired task and we give some directions to facilitate this choice.

Themes: Model of Bottom-Up Saliency-Based Visual Attention, Computational Modeling, Computer Vision