Abstract

R. J. Peters, A. Iyer, C. Koch, L. Itti, Components of Bottom-Up Gaze Allocation in Natural Scenes, In: Proc. Vision Science Society Annual Meeting (VSS05), May 2005. (Cited by 30)

Abstract: A model of bottom-up visual attention ("baseline salience model", based on local detectors with coarse global surround inhibition) has been shown (Parkhurst et al., 2002) to account in part for the spatial locations fixated by people while free-viewing complex natural and artificial scenes. Here, we tested the additional roles in bottom-up gaze allocation played by several visual cortical mechanisms. In each case, we added a component to the salience model: non-linear interactions among orientation-tuned units both at short spatial ranges (for clutter reduction) and long ranges (for contour facilitation), and a detailed model of eccentricity-dependent changes in visual processing. Subjects free-viewed naturalistic and artificial images while their eye movements were recorded, and we used a metric called the Normalized Scanpath Salience (NSS) to compare the resulting fixation locations with the different models' predicted salience maps. NSS values indicate, on average, how many standard deviations above or below the mean salience the model-predicted salience was at human-fixated locations. Thus the expected NSS value when the model and human behavior are unrelated is 0; the theoretical maximum NSS value is given by the ability of one observer's fixations to be predicted by the remaining observers' fixations, which in practice fell in the range 1.1–1.3 for different image categories. The baseline salience model predicted fixations at 39–57 percent of the maximum NSS level. Adding short-range orientation interactions increased this range to 50–65 percent, contour facilitation further increased it to 53–74 percent, and eccentricity-dependent processing increased it to 84–95 percent. Thus the proposed cortical interactions indeed appear to play a significant role in the spatiotemporal deployment of attention in natural scenes. This suggests that bottom-up attentional guidance does not depend solely on local visual features, but must also include the effects of non-local interactions.
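
The NSS metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nss` and `percent_of_ceiling`, the (row, col) pixel format for fixations, and the use of the population standard deviation are all assumptions made for illustration. The sketch z-scores a salience map, averages the z-scored values at the fixated pixels, and expresses a model's score as a percentage of an inter-observer ceiling.

```python
import numpy as np


def nss(salience_map, fixations):
    """Mean z-scored salience at fixated pixels (illustrative sketch).

    salience_map: 2-D array of model-predicted salience (assumed format).
    fixations: iterable of (row, col) integer pixel coordinates (assumed format).
    """
    s = np.asarray(salience_map, dtype=float)
    std = s.std()  # population standard deviation (an assumption)
    if std == 0:
        raise ValueError("salience map is constant; NSS is undefined")
    # Normalize the map to zero mean and unit standard deviation.
    z = (s - s.mean()) / std
    rows, cols = zip(*fixations)
    # Average the normalized salience over all fixated locations.
    return float(z[list(rows), list(cols)].mean())


def percent_of_ceiling(model_nss, inter_observer_nss):
    """Express a model's NSS as a percentage of the inter-observer NSS."""
    return 100.0 * model_nss / inter_observer_nss


if __name__ == "__main__":
    # Toy 4x4 map with a single salient pixel (hypothetical data).
    toy_map = np.zeros((4, 4))
    toy_map[1, 2] = 1.0
    print(nss(toy_map, [(1, 2)]))          # fixation on the salient pixel
    print(nss(toy_map, [(0, 0), (3, 3)]))  # fixations away from it
    print(percent_of_ceiling(0.6, 1.2))    # hypothetical NSS values
```

In this sketch, fixations that land on above-average salience give a positive score and fixations on below-average salience give a negative one, while fixations spread uniformly over the map average to 0. How the inter-observer ceiling map is built from the remaining observers' fixations is not specified in the abstract, so it is left as an input here.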

Themes: Computational Modeling, Model of Bottom-Up Saliency-Based Visual Attention, Human Eye-Tracking Research