Bottom-Up Visual Attention Home Page
We are developing a neuromorphic model that simulates which
elements of a visual scene are likely to attract the attention of
human observers. Given an image or video sequence, the model computes
a saliency map, which topographically encodes conspicuity
(or "saliency") at every location in the visual input. The model
predicts human performance on a number of psychophysical tasks. In
addition, because it includes a detailed visual processing front-end,
our model is widely applicable to machine vision problems,
including automated target detection in natural scenes, smart image
compression, fast guidance of object recognition systems, and even
high-level scene analysis with application to the validation of
advertising designs.
This project was started at Caltech with Prof. Christof Koch. It is
being actively pursued both here at USC and at Caltech (jointly and in
separate directions).
The Theory
- Details about the trainable model of bottom-up,
task-independent visual attention under development in our laboratory.
The Images
- A short overview of example images and the corresponding
attentional trajectories, including test images, psychophysical
stimuli, target detection images, natural scenes, and artwork.
The Movies
- Several MPEG movies showing attentional trajectories and
the temporal dynamics of the saliency map for test, psychophysical,
artistic and natural images. Also shown are 3D warpings of the original
image onto the evolving saliency map.
The Interactive Demo
- An interactive demonstration of the dynamic behavior
of our attentional model, for a variety of complete image databases.
Requires a recent Java™-aware Web browser.
The Publications
- Preprints of our papers describing this research
are available in HTML, PostScript and PDF formats.
The Ongoing Projects
- New! Previews of a few of our ongoing projects and
preliminary screenshots. These include our SaliencyVehicle
off-road muscle car, our real-time SaliencyCam which computes
attentional deployment on live video feeds (15 frames/s), our
SaliencyAgent which detects salient pedestrians in natural color
scenes, and other exciting projects.
The C++ Source Code
- The C++ source code and associated doxygen documentation are
available through our CVS server. You will need the latest version of
g++ (3.x) and several non-standard packages installed on your Linux
distribution (e.g., IEEE1394 development libraries) in order to
compile it. Please see the README in the source tree for details. The
code is released under the GPL. You may want to check the reference
manual for an idea of what is included.
Copyright © 2000 by the University of Southern California, iLab and
Prof. Laurent Itti