PDF Reprint, BibTeX entry, Online Abstract
Z. Li, L. Itti, Gist Based Top-Down Templates for Gaze Prediction, In: Proc. Vision Science Society Annual Meeting (VSS09), May 2009. (Cited by 2)
Abstract: People use focal visual attention and rapid eye movements to analyze complex visual inputs in a manner that depends strongly on the current scene's properties. Here we propose a top-down attention model that exploits visual templates associated with different types of scenes. During training, an image set is manually classified into several scene categories, and for each category we define a corresponding top-down map that empirically highlights locations likely to be of interest. 'Gist' feature vectors of each category's images are then computed to fit a Gaussian gist feature distribution, the signature of that category. During testing, the input image's gist feature vector is computed first; based on this vector and the previously learned categories' gist feature distributions, a set of corresponding weights is computed from the probability density functions. The top-down map is then the weighted summation of the pre-defined templates. Finally, the top-down map is combined with a bottom-up saliency map (Itti & Koch, 2001) to generate a final attention guidance map. In eye-tracking validation experiments, two types of video are used as test data: an original set of captured video clips, and a second set built by cutting the original clips into 1-3 s segments and re-assembling them. Results show that on the original clips, the area under curve (AUC) score and the KL distance of the standard bottom-up saliency map are 0.665 and 0.185 (higher is better), while the attention guidance map yields 0.688 and 0.242, respectively; on the re-assembled clips, the standard bottom-up model yields 0.648 and 0.145, while the combined model yields 0.718 and 0.327. Our results suggest that attention selection can be more accurate with the proposed top-down component.
[1] Itti, L. and Koch, C., 2001. Computational Modeling of Visual Attention. Nature Reviews Neuroscience, 2(3), 194-203.
Acknowledgement: The authors gratefully acknowledge the contribution of NSF, HFSP, NGA, DARPA, and the China Scholarship Council.
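The pipeline described above (per-category templates, Gaussian gist signatures, PDF-derived weights, weighted template summation, fusion with bottom-up saliency) can be summarized in a short sketch. This is a minimal illustration, not the authors' code: the function names, the use of scipy's multivariate Gaussian, the weight normalization, and the convex combination rule with parameter alpha are all assumptions.

    # A minimal sketch (assumed, not the authors' implementation) of the
    # gist-weighted top-down map described in the abstract.
    import numpy as np
    from scipy.stats import multivariate_normal

    def top_down_map(gist_vec, templates, gist_means, gist_covs):
        # One weight per scene category: the likelihood of the input
        # image's gist vector under that category's Gaussian signature.
        weights = np.array([
            multivariate_normal.pdf(gist_vec, mean=m, cov=c)
            for m, c in zip(gist_means, gist_covs)
        ])
        weights /= weights.sum() + 1e-12   # normalize for comparability
        # Top-down map = weighted summation of the pre-defined templates.
        return np.tensordot(weights, np.stack(templates), axes=1)

    def attention_guidance(bottom_up, top_down, alpha=0.5):
        # Fuse with the bottom-up saliency map (Itti & Koch, 2001); the
        # convex mix and the value of alpha are illustrative assumptions.
        return alpha * bottom_up + (1.0 - alpha) * top_down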
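The abstract reports AUC and KL scores against eye-tracking data but does not spell out their formulations. A common choice in saliency evaluation, shown here purely as an assumed example, is an ROC AUC over map values at human fixations versus random locations, and a KL distance between the histograms of those two value sets.

    # Assumed metric formulations; the paper's exact definitions are not
    # given in the abstract.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    def auc_and_kl(guidance_map, fixations, n_random=1000, n_bins=10, seed=0):
        # fixations: integer array of (row, col) human eye positions.
        rng = np.random.default_rng(seed)
        h, w = guidance_map.shape
        fix_vals = guidance_map[fixations[:, 0], fixations[:, 1]]
        rand_vals = guidance_map[rng.integers(0, h, n_random),
                                 rng.integers(0, w, n_random)]
        # AUC: how well map values separate fixated from random locations.
        labels = np.concatenate([np.ones_like(fix_vals),
                                 np.zeros_like(rand_vals)])
        auc = roc_auc_score(labels, np.concatenate([fix_vals, rand_vals]))
        # KL distance between the two value histograms (epsilon avoids log 0).
        edges = np.linspace(guidance_map.min(), guidance_map.max(), n_bins + 1)
        p, _ = np.histogram(fix_vals, bins=edges)
        q, _ = np.histogram(rand_vals, bins=edges)
        p = p / p.sum() + 1e-9
        q = q / q.sum() + 1e-9
        kl = float(np.sum(p * np.log(p / q)))
        return auc, kl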
Themes: Model of Bottom-Up Saliency-Based Visual Attention, Model of Top-Down Attentional Modulation, Computational Modeling, Computer Vision
Copyright © 2000-2007 by the University of Southern California, iLab and Prof. Laurent Itti.
This page generated by bibTOhtml on Tue 09 Jan 2024 12:10:23 PM PST