What draws people's attention and gaze when they view complex visual displays? What are the underlying neural computations that enable us to detect objects of interest amidst clutter? My research aims to develop a behavioral and computational understanding of how factors such as task demands, visual salience and the reward value of objects compete to control attention and gaze. To do so, I use a mix of experimental techniques (psychophysics, eye tracking) and theoretical/modeling techniques (signal detection theory, Bayesian theory, neurally plausible algorithms).
Selected publications
V. Navalpakkam, C. Koch, P. Perona, Homo Economicus in Visual Search, In: Journal of Vision, 9(1):31, 1-16, http://journalofvision.org/9/1/31/, doi:10.1167/9.1.31
Abstract: How do reward outcomes affect early visual performance? Previous studies found a suboptimal influence, but they ignored the non-linearity in how subjects perceived the reward outcomes. In contrast, we find that when the non-linearity is accounted for, humans behave optimally and maximize expected reward. Our subjects were asked to detect the presence of a familiar target object in a cluttered scene. They were rewarded according to their performance. We systematically varied the target frequency and the reward/penalty policy for detecting/missing the targets. We find that 1) Decreasing the target frequency will decrease the detection rates, in accordance with the literature. 2) Contrary to previous studies, increasing the target detection rewards will compensate for target rarity and restore detection performance. 3) A quantitative model based on reward-maximization accurately predicts human detection behavior in all target frequency and reward conditions; thus, reward schemes can be designed to obtain desired detection rates for rare targets. 4) Subjects quickly learn the optimal decision strategy; we propose a neurally plausible model that exhibits the same properties. Potential applications include designing reward schemes to improve detection of life-critical, rare targets (e.g., cancers in medical images).
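As a rough illustration of the reward-maximization idea in the abstract above, the following sketch uses the textbook equal-variance Gaussian signal detection model. It is not the model from the paper: the function names, the payoff values and the sensitivity `d_prime` are assumptions chosen for illustration, and the non-linearity in perceived reward that the abstract mentions is ignored here.

```python
import math
from statistics import NormalDist


def optimal_criterion(p_target, d_prime, v_hit, v_miss, v_cr, v_fa):
    """Reward-maximizing criterion on the decision axis (equal-variance Gaussian SDT).

    Assumes v_hit > v_miss and v_cr > v_fa (hypothetical payoff values).
    """
    # Likelihood-ratio threshold: prior odds against the target times the payoff ratio.
    beta = ((1 - p_target) / p_target) * ((v_cr - v_fa) / (v_hit - v_miss))
    # Noise ~ N(0, 1), target ~ N(d_prime, 1): respond "present" when x > criterion.
    return d_prime / 2 + math.log(beta) / d_prime


def hit_rate(criterion, d_prime):
    """Probability of responding "present" when the target is present."""
    return 1 - NormalDist(mu=d_prime, sigma=1).cdf(criterion)


d = 2.0  # assumed sensitivity, for illustration only
frequent = optimal_criterion(0.5, d, v_hit=1, v_miss=-1, v_cr=1, v_fa=-1)
rare = optimal_criterion(0.1, d, v_hit=1, v_miss=-1, v_cr=1, v_fa=-1)
rare_high_reward = optimal_criterion(0.1, d, v_hit=17, v_miss=-1, v_cr=1, v_fa=-1)

for label, c in [("frequent", frequent), ("rare", rare), ("rare, high reward", rare_high_reward)]:
    print(f"{label}: criterion={c:.2f}, hit rate={hit_rate(c, d):.2f}")
```

Under these assumptions, lowering the target frequency raises the optimal criterion and so lowers the hit rate, while a sufficiently larger reward for hits moves the criterion back to where it sits for frequent targets.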
V. Navalpakkam, L. Itti, Search goal tunes visual features optimally, In: Neuron, Vol. 53, No. 4, pp. 605-617, Feb 2007.
Abstract: How does a visual search goal modulate the activity of neurons encoding different visual features (e.g., color, direction of motion)? Previous research suggests that goal-driven attention enhances the gain of neurons representing the target's visual features. Here, we present mathematical and behavioral evidence that this strategy is suboptimal and that humans do not deploy it. We formally derive the optimal feature gain modulation theory, which combines information from both the target and distracting clutter to maximize the relative salience of the target. We qualitatively validate the theory against existing electrophysiological and psychophysical literature. A surprising prediction is that it is sometimes optimal to enhance nontarget features. We provide experimental evidence toward this through psychophysics experiments on human subjects, thus suggesting that humans deploy the optimal gain modulation strategy.
Also see preview entitled Paying Attention to Neurons with Discriminating Taste by A. Pouget and D. Bavelier, In: Neuron 2007, Vol. 53, No. 4, pp. 473-475, Feb 2007.
Also see Faculty of 1000 Biology evaluation
V. Navalpakkam, L. Itti, Top-down Attention Selection is Fine-grained, In: Journal of Vision, Vol. 6, No. 11, pp. 1180-1193, Oct 2006.
Abstract: Although much is known about the sources and modulatory effects of top-down attentional signals, the information capacity of these signals is less known. Here, we investigate the granularity of top-down attentional signals. Previous theories in psychophysics have provided conflicting evidence on whether top-down guidance is coarse grained (i.e., one gain control term per feature dimension) or fine grained (i.e., multiple gain control terms per dimension). We resolve the conflict by designing new experiments that disentangle top-down from bottom-up contributions, thereby avoiding confounds existing in previous studies. The results of our eye-tracking experiments show that subjects can selectively saccade to items belonging to the relevant feature interval compared with irrelevant intervals within a dimension. This suggests that top-down signals can specify not only the relevant feature dimension but also the relevant feature interval within a dimension. We conclude that top-down signals are fine grained and can specify multiple gain control terms per dimension.
V. Navalpakkam, L. Itti, An Integrated Model of Top-down and Bottom-up Attention for Optimal Object Detection, In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
Abstract: Integration of goal-driven, top-down attention and image-driven, bottom-up attention is crucial for visual search. For instance, in robot navigation, it is important to detect goal-relevant targets like road signs and landmarks, and to simultaneously notice unexpected visual events like sudden obstacles and accidents. Yet, previous research has mostly focused on models that are purely top-down or bottom-up. Here, we propose a new model that combines both. The bottom-up component computes the visual salience of scene locations in different feature maps extracted at multiple spatial scales. The top-down component uses accumulated statistical knowledge of the visual features of the desired search target and background clutter to optimally tune the bottom-up maps so as to maximize target detection speed. The results of testing on 600 artificial search arrays and 300 natural scenes show that the model's predictions are consistent with a large body of available literature on human psychophysics of visual search. The promising results suggest that our model may provide a good approximation to how humans combine bottom-up and top-down cues so as to optimize visual search behavior.
V. Navalpakkam, L. Itti, Optimal cue selection strategy, In: Neural Information Processing Systems (NIPS), 2005.
V. Navalpakkam, M. A. Arbib, L. Itti, Attention and Scene Understanding, In: Neurobiology of Attention, (L. Itti, G. Rees, J. K. Tsotsos, Eds.), pp. 197-203, San Diego, CA: Elsevier, 2005.

V. Navalpakkam, L. Itti, Modeling the influence of task on attention, Vision Research, Vol. 45, No. 2, pp. 205-231, 2005.
Abstract: We propose a computational model for the task-specific guidance of visual attention in real-world scenes. Our model emphasizes four aspects that are important in biological vision: determining task-relevance of an entity, biasing attention for the low-level visual features of desired targets, recognizing these targets using the same low-level features, and incrementally building a visual map of task-relevance at every scene location. Given a task definition in the form of keywords, the model first determines and stores the task-relevant entities in working memory, using prior knowledge stored in long-term memory. It attempts to detect the most relevant entity by biasing its visual attention system with the entity's learned low-level features. It attends to the most salient location in the scene, and attempts to recognize the attended object through hierarchical matching against object representations stored in long-term memory. It updates its working memory with the task-relevance of the recognized entity and updates a topographic task-relevance map with the location and relevance of the recognized entity. The model is tested on three types of tasks: single-target detection in 343 natural and synthetic images, where biasing for the target accelerates target detection over twofold on average; sequential multiple-target detection in 28 natural images, where biasing, recognition, working memory and long-term memory contribute to rapidly finding all targets; and learning a map of likely locations of cars from a video clip filmed while driving on a highway. The model's performance on search for single features and feature conjunctions is consistent with existing psychophysical data. These results of our biologically-motivated architecture suggest that the model may provide a reasonable approximation to many brain processes involved in complex task-driven visual behaviors.
V. Navalpakkam, L. Itti, A Goal Oriented Attention Guidance Model, Lecture Notes in Computer Science, Vol. 2525, pp. 453-461, Nov 2002.
Conference abstracts / posters
V. Navalpakkam, C. Koch, A. Rangel, & P. Perona (2009). Where to look? Dissociating the effect of reward, salience and attention. In: Vision Science Society, May 2009. (talk)
W. J. Ma, V. Navalpakkam, J. Beck, & A. Pouget (2009). Optimal integration of information across space in homogeneous and heterogeneous search displays: data and neural implementation. In: Vision Science Society, May 2009. (poster)
M. Milosavljevic, V. Navalpakkam, C. Koch, & A. Rangel (2009). The role of visual saliency and subjective-value in rapid decision making. In: Vision Science Society, May 2009. (poster)
R. Pedersini, V. Navalpakkam, T. Horowitz, P. Perona, & J. Wolfe (2009). Quitting rules in visual search. In: Vision Science Society, May 2009. (poster)
V. Navalpakkam, C. Koch, A. Rangel, & P. Perona (2009). Where to look? Dissociating the effect of reward, salience and attention. Frontiers in Systems Neuroscience. Conference Abstract: Computational and systems neuroscience. doi: 10.3389/conf.neuro.06.2009.03.038
V. Navalpakkam, C. Koch, P. Perona, Homo economicus in visual search, In: Vision Science Society, May 2008. (talk)
J. Beck, V. Navalpakkam, W. J. Ma, Bayesian Theory of Visual Search, In: Vision Science Society, May 2008. (poster)
R. Pedersini, V. Navalpakkam, T. Horowitz, & J. Wolfe (2009). Monetary reward and target prevalence in a baggage-screening task. In: Vision Science Society, May 2008. (poster)
V. Navalpakkam, C. Koch, P. Perona, Homo economicus in visual search, In: Computational and Systems Neuroscience (Cosyne), Mar 2008. (poster)
W. J. Ma, V. Navalpakkam, J. Beck, Neural Bayesian Theory of Visual Search, In: Computational and Systems Neuroscience (Cosyne), Mar 2008. (poster)
J. Beck, W. J. Ma, V. Navalpakkam, A. Pouget, Exact and approximate solutions for marginalization with probabilistic population codes, In: Computational and Systems Neuroscience (Cosyne), Mar 2008. (poster)
V. Navalpakkam, L. Itti, Attentional modulation of tuning width, preferred features and gains during visual search, In: Vision Science Society (VSS), May 2007. (poster)
V. Navalpakkam, L. Itti, Optimal feature gain modulation during visual search, In: Vision Science Society (VSS), May 2006. (poster)
V. Navalpakkam, L. Itti, Attention can be guided to the relevant feature category, In: Vision Science Society (VSS), May 2005. (poster)
V. Navalpakkam, L. Itti, A mathematical framework for the design and analysis of feature biasing strategies, In: 11th Joint Symposium on Neural Computation, May 2004. (poster)
V. Navalpakkam, L. Itti, A Biologically-Inspired Scene-based Question Answering Agent, In: Proc. 9th Joint Symposium on Neural Computation (JSNC'02), Pasadena, California, May 2002. (poster)