Abstract




J. Zhao, C. Siagian, L. Itti, Fixation Bank: Learning to Reweight Fixation Candidates, In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, pp. 3174-3182, Jun 2015. [2015 acceptance rate: 28%] (Cited by 14)

Abstract: Predicting where humans will fixate in a scene has many practical applications. Biologically-inspired saliency models decompose visual stimuli into feature maps across multiple scales, and then integrate the different feature channels, e.g., in a linear, MAX, or MAP fashion. However, to date there is no universally accepted feature integration mechanism. Here, we propose a new data-driven solution: We first build a 'fixation bank' by mining training samples, which maintains the association between local patterns of activation around a given location, in 4 feature channels (color, intensity, orientation, motion), and the corresponding human fixation density at that location. During testing, we decompose feature maps into blobs, extract local activation patterns around each blob, match those patterns against the fixation bank by group lasso, and determine the weight of each blob from its reconstruction error. Our final saliency map is the weighted sum of all blobs. Our system thus incorporates some amount of spatial and featural context information into the location-dependent weighting mechanism. Tested on two standard data sets (DIEM for training and test, and CRCNS for test only; 23,670 training and 15,793 + 4,505 test frames in total), our model slightly but significantly outperforms 7 state-of-the-art saliency models.
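As a rough illustration of the matching step described above, the sketch below solves a group-lasso problem by proximal gradient descent and derives a blob weight from the resulting reconstruction error. All names are hypothetical; in particular, the dictionary layout, the grouping of bank atoms, and the exp(-error) weight mapping are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def group_soft_threshold(c, groups, thresh):
    # Proximal operator of the group-lasso penalty: shrink each group's
    # coefficient block toward zero by its L2 norm.
    out = c.copy()
    for g in groups:
        norm = np.linalg.norm(c[g])
        out[g] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * c[g]
    return out

def group_lasso(D, x, groups, lam=0.1, n_iter=200):
    # Solve min_c 0.5*||x - D c||^2 + lam * sum_g ||c_g||_2 via proximal
    # gradient (ISTA). D is the (d x k) fixation-bank dictionary, x the
    # (d,) local activation pattern, groups a partition of D's columns.
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1/L, L = Lipschitz constant
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - x)
        c = group_soft_threshold(c - step * grad, groups, step * lam)
    return c

def blob_weight(D, x, groups, lam=0.1):
    # Weight a blob by how well the bank reconstructs its local pattern;
    # exp(-error) is one plausible monotone mapping (an assumption here).
    c = group_lasso(D, x, groups, lam)
    err = np.linalg.norm(x - D @ c) ** 2
    return np.exp(-err)

# Toy usage: 12 bank patterns in 3 groups, one blob pattern of dimension 20.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 12))
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
x = rng.standard_normal(20)
w = blob_weight(D, x, groups)
# The final saliency map would then be the weighted sum of the blob maps.
```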

Themes: Model of Bottom-Up Saliency-Based Visual Attention, Human Eye-Tracking Research, Computational Modeling, Computer Vision

 
