A Crash Course on Visual Saliency Modeling: Behavioral Findings and Computational Models

Location and Dates
Conference on Computer Vision and Pattern Recognition (CVPR 2013)
The Oregon Convention Center in Portland, Oregon, USA
June 24, 2013, 8:30 - 17:15 



Ali Borji 
University of Southern California (USC)
[primary organizer]



Simone Frintrop 
University of Bonn, Germany


Laurent Itti
University of Southern California (USC)


Neil D. Bruce
University of Manitoba, Canada


Xiaodi Hou
California Institute of Technology (Caltech)

xiaodi.hou@gmail.com

Course Description

Over the last two decades, visual attention and visual saliency have attracted a great deal of interest in computer vision, and CVPR has been one of the main venues for publishing results in this domain. There is a vast literature on visual saliency, spanning biological and behavioral perspectives as well as computational attention modeling. Our main aims in this tutorial are to review major advances in the field and to bring together new researchers and prominent figures. We will provide the theoretical background of saliency concepts and models, and illustrate successful applications of saliency models (in some cases outperforming the state of the art). We expect a broad audience, from experts in the field to undergraduate and graduate students interested in deepening their understanding and discovering open problems and new directions. Our tutorial is one of the first attempts to review and critique the saliency literature at a vision conference.


We will cover the following topics in this course, based on the agenda presented in a recent comprehensive review by the organizers (Borji & Itti, PAMI 2013). We will begin with an introduction to visual attention, saliency, and eye movement strategies, covering the most important (mainly behavioral) discoveries in the field of attention. We will then give a short history of the field, from its beginnings 25 years ago with Koch & Ullman's computational architecture to the present day. Next, we will present selected topics for each category of saliency models, each delivered by an expert in that subfield. These speakers will start from the base models in each category (e.g., spectral saliency models, Bayesian models, information-theoretic models, etc.) and move on to the derived models. We will then have a talk on modeling saliency in the spatio-temporal domain and another on applications of saliency models. We will finish by discussing open problems, remaining issues, and challenges in this field (datasets, scores, center-bias, etc.), followed by an open discussion on the best ways to tackle them.
In summary, this course is composed of the following subjects:
  • Fundamental concepts and theories of visual attention from a behavioral perspective
  • Introduction to visual saliency modeling and review of models based on Koch & Ullman's computational architecture
  • Saliency models based on information theory and Bayesian concepts
  • Spectral analysis saliency models
  • Graphical models
  • Pattern classification models
  • Applications of saliency modeling
  • Spatio-temporal saliency modeling
  • Model comparison, challenges, and open problems for future work
Schedule
8:30 - 8:45 Introduction to the tutorial
8:45 - 9:30 Visual attention: Background material [Ali Borji]
9:30 - 10:15 Attention in daily life [Ali Borji]
10:15 - 10:30 Break  
10:30 - 11:30 Bayesian and information-theoretic models [Neil D. Bruce]
11:30 - 12:00 Applications of saliency modeling [Neil D. Bruce]
12:00 - 13:30 Lunch break  
13:30 - 14:15 Saliency and sparsity [Xiaodi Hou]
14:15 - 15:00 Towards attentive robots [Simone Frintrop]
15:00 - 15:30 Attention for 3D object discovery [Simone Frintrop]
15:30 - 15:45 Break  
15:45 - 16:45 Model comparison and challenges I [Ali Borji]
16:45 - 17:15 Model comparison and challenges II [Xiaodi Hou]
17:15 - 18:00 Open forum  
Course Materials
We plan to distribute the tutorial slides, along with code illustrating the material, including some of the applications and demos (stay tuned).

Speaker Biographies
Ali Borji received the BS and MS degrees in computer engineering from the Petroleum University of Technology, Tehran, Iran, in 2001 and Shiraz University, Shiraz, Iran, in 2004, respectively. He received the PhD degree in cognitive neurosciences from the Institute for Studies in Fundamental Sciences (IPM) in Tehran, Iran, in 2009. He has been a postdoctoral scholar at iLab, University of Southern California, Los Angeles, since March 2010. His research interests include visual attention, visual search, object and scene recognition, machine learning, cognitive sciences, and neurosciences.
Simone Frintrop received the MS and PhD degrees from the University of Bonn, Germany, in 2001 and 2005, respectively. She is a senior researcher in the Computer Science Department at the University of Bonn and currently heads the Cognitive Vision Group. She has worked for more than 10 years in the field of computational visual attention and saliency detection, with applications to mobile vision systems and robotics.
Laurent Itti received the MS degree in image processing from the Ecole Nationale Supérieure des Télécommunications in Paris in 1994 and the PhD degree in computation and neural systems from the California Institute of Technology in 2000. He is an associate professor of computer science, psychology, and neurosciences at the University of Southern California. His research interests include biologically-inspired computational vision, in particular in the domains of visual attention, gist, saliency, and surprise, with technological applications to video compression, target detection, and robotics.
Neil D. Bruce recently joined the Department of Computer Science at the University of Manitoba as an Assistant Professor. Prior to this, he completed two post-doctoral fellowships, one at the Centre for Vision Research at York University and the other at INRIA Sophia Antipolis. Previously, he completed a Ph.D. in the Department of Computer Science and Engineering in 2008 as a member of the Centre for Vision Research at York University, Toronto, Canada. In 2003, Neil completed an M.A.Sc. in Systems Design Engineering at the University of Waterloo, and in 2001 he received an Honours B.Sc. with a double major in Computer Science and Mathematics from the University of Guelph. Neil's current research interests span a variety of topics in human and machine vision, including but not limited to computer vision, computational neuroscience, visual attention, natural image statistics, statistical and Bayesian approaches, information theory, and machine learning.
Xiaodi Hou received the BEng degree in computer science from Shanghai Jiao Tong University, China, in 2008. He has been a PhD student at KLab at Caltech since 2008. His research interests include natural image statistics, visual attention, psychophysics, neurosciences, and mid-level computer/computational vision.