Abstract


J. Wang, A. Borji, C.-C. J. Kuo, L. Itti, Learning a combined model of visual saliency for fixation prediction, IEEE Transactions on Image Processing, Vol. 25, No. 4, pp. 1566-1579, Apr 2016. [2014 impact factor: 3.625] (Cited by 78)

Abstract: A large number of saliency models, each based on a different hypothesis, have been proposed over the past 20 years. In practice, while subscribing to one hypothesis or computational principle makes a model perform well on some types of images, it hinders the model's general performance on arbitrary images and large-scale datasets. One natural approach to improving overall saliency detection accuracy is therefore to fuse different types of models. In this paper, inspired by the success of late-fusion strategies in semantic analysis and multi-modal biometrics, we propose to fuse state-of-the-art saliency models at the score level in a para-boosting learning fashion. First, saliency maps generated by several models are used as confidence scores. Then, these scores are fed into our para-boosting learner (i.e., Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), or Probability Density Estimator (PDE)) to generate the final saliency map. To assess the strength of the para-boosting learners, traditional transformation-based fusion strategies such as Sum, Min, and Max are also explored and compared. To reduce the computational cost of fusing too many models, only a few of them are considered in a subsequent step. Experimental results show that score-level fusion outperforms each individual model and can further reduce the performance gap between current models and the human inter-observer (human IO) model.
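
As an illustration of the transformation-based baselines named in the abstract (Sum, Min, and Max), the following is a minimal sketch of score-level fusion over a set of saliency maps. It is not the authors' implementation: the function names `normalize_map` and `fuse_fixed`, the min-max normalization step, and the final rescaling are assumptions made here for illustration, and the learned para-boosting fusion (SVM, AdaBoost, PDE) is not shown.

```python
import numpy as np


def normalize_map(saliency_map):
    """Rescale one saliency map to [0, 1] (min-max normalization, an assumption)."""
    m = np.asarray(saliency_map, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        # A constant map carries no ranking information; return all zeros.
        return np.zeros_like(m)
    return (m - m.min()) / span


def fuse_fixed(maps, rule="sum"):
    """Fuse same-sized saliency maps pixel by pixel with a fixed rule.

    maps: sequence of 2-D arrays, one per saliency model (hypothetical input).
    rule: "sum", "min", or "max".
    """
    # Normalize each model's scores so that they are comparable, then stack
    # them into an array of shape (n_models, height, width).
    stack = np.stack([normalize_map(m) for m in maps])
    if rule == "sum":
        fused = stack.sum(axis=0)
    elif rule == "min":
        fused = stack.min(axis=0)
    elif rule == "max":
        fused = stack.max(axis=0)
    else:
        raise ValueError(f"unknown fusion rule: {rule!r}")
    # Rescale the fused scores back to [0, 1].
    return normalize_map(fused)


# Toy example: two 2x2 "saliency maps" from two hypothetical models.
map_a = np.array([[0.0, 1.0], [2.0, 4.0]])
map_b = np.array([[4.0, 0.0], [0.0, 4.0]])
fused_sum = fuse_fixed([map_a, map_b], rule="sum")
fused_min = fuse_fixed([map_a, map_b], rule="min")
fused_max = fuse_fixed([map_a, map_b], rule="max")
```

In this toy case both models agree only on the bottom-right pixel, so the Min rule keeps that pixel alone, while Sum and Max also retain locations that a single model scores highly.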

Themes: Model of Bottom-Up Saliency-Based Visual Attention, Computational Modeling, Computer Vision