Bottom-Up Visual Attention Home Page

We are developing a neuromorphic model that simulates which elements of a visual scene are likely to attract the attention of human observers. Given an image or video sequence, the model computes a saliency map, which topographically encodes the conspicuity (or "saliency") of every location in the visual input. The model predicts human performance on a number of psychophysical tasks. In addition, because it includes a detailed visual processing front-end, the model is widely applicable to machine vision problems, including automated target detection in natural scenes, smart image compression, fast guidance of object recognition systems, and even high-level scene analysis with application to the validation of advertising designs.
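
As a rough illustration of the saliency-map idea, the minimal C++ sketch below (hypothetical code, not our actual implementation, which uses multiscale center-surround filtering, iterative normalization, and a winner-take-all neural network) combines two toy conspicuity maps into a single saliency map and selects the most salient location; all names in it are invented for this example.

    // Minimal illustrative sketch (hypothetical code, not the iLab sources):
    // combine normalized per-feature conspicuity maps into a saliency map,
    // then select the most salient location with a crude winner-take-all.
    #include <cstddef>
    #include <iostream>
    #include <utility>
    #include <vector>

    typedef std::vector<std::vector<float> > Map;  // simple 2D float map

    // Scale a map to [0,1] so that no single feature channel dominates.
    void normalize(Map& m)
    {
        float lo = m[0][0], hi = m[0][0];
        for (std::size_t y = 0; y < m.size(); ++y)
            for (std::size_t x = 0; x < m[y].size(); ++x) {
                if (m[y][x] < lo) lo = m[y][x];
                if (m[y][x] > hi) hi = m[y][x];
            }
        if (hi <= lo) return;
        for (std::size_t y = 0; y < m.size(); ++y)
            for (std::size_t x = 0; x < m[y].size(); ++x)
                m[y][x] = (m[y][x] - lo) / (hi - lo);
    }

    // Average several conspicuity maps (taken by value so the originals
    // are left untouched) into a single saliency map.
    Map combine(std::vector<Map> maps)
    {
        Map sal(maps[0].size(),
                std::vector<float>(maps[0][0].size(), 0.0f));
        for (std::size_t i = 0; i < maps.size(); ++i) {
            normalize(maps[i]);
            for (std::size_t y = 0; y < sal.size(); ++y)
                for (std::size_t x = 0; x < sal[y].size(); ++x)
                    sal[y][x] += maps[i][y][x] / float(maps.size());
        }
        return sal;
    }

    // Winner-take-all: the most salient location is attended first.
    std::pair<std::size_t, std::size_t> winner(const Map& sal)
    {
        std::pair<std::size_t, std::size_t> best(0, 0);
        for (std::size_t y = 0; y < sal.size(); ++y)
            for (std::size_t x = 0; x < sal[y].size(); ++x)
                if (sal[y][x] > sal[best.first][best.second])
                    best = std::make_pair(y, x);
        return best;
    }

    int main()
    {
        // Two toy 3x3 conspicuity maps; the center location is active
        // in both channels, so it wins over the corner location that is
        // active in only one.
        Map intensity(3, std::vector<float>(3, 0.0f));
        intensity[1][1] = 5.0f;
        Map orientation(3, std::vector<float>(3, 0.0f));
        orientation[1][1] = 1.0f;
        orientation[2][2] = 2.0f;

        std::vector<Map> channels;
        channels.push_back(intensity);
        channels.push_back(orientation);

        Map sal = combine(channels);
        std::pair<std::size_t, std::size_t> w = winner(sal);
        std::cout << "First attended location: row " << w.first
                  << ", column " << w.second << std::endl;
        return 0;
    }

In the full model, the saliency map instead feeds a winner-take-all neural network with inhibition of return, which successively yields the attentional trajectories shown in the movies and interactive demo below.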

This project was started at Caltech with Prof. Christof Koch. It is actively being pursued both here at USC and at Caltech, jointly as well as in independent directions.

[Theory]
The Theory
Details about the trainable model of bottom-up, task-independent visual attention under development in our laboratory.
[Images]
The Images
A short overview of example images and the corresponding attentional trajectories: test images, psychophysical stimuli, target detection images, natural scenes, artwork, etc.
[Movie]
The Movies
Several MPEG movies showing attentional trajectories and the temporal dynamics of the saliency map for test, psychophysical, artistic, and natural images. Also shown are 3D warpings of the original image onto the evolving saliency map.
[javaDemo]
The Interactive Demo
An interactive demonstration of the dynamic behavior of our attentional model on a variety of complete image databases. A recent Java™-capable Web browser is required.
[publications]
The Publications
Preprint versions of our papers describing this research are available in HTML, PostScript, and PDF formats.
[ongoing]
The Ongoing Projects
New! Previews of a few of our ongoing projects, with preliminary screenshots. These include our SaliencyVehicle off-road muscle car, our real-time SaliencyCam, which computes attentional deployment on live video feeds at 15 frames/s, our SaliencyAgent, which detects salient pedestrians in natural color scenes, and other exciting projects.
[source code]
The C++ Source Code
The C++ source code and associated doxygen documentation are available through our CVS server. To compile it, you will need a recent version of g++ (3.x) and several non-standard packages installed on your Linux distribution (e.g., the IEEE1394 development libraries); please see the README in the source tree for details. The code is released under the GPL. You may want to check the reference manual for an idea of what is included.

Copyright © 2000 by the University of Southern California, iLab and Prof. Laurent Itti