Abstract




Z. Li, L. Itti, Saliency and Gist Features for Target Detection in Satellite Images, IEEE Transactions on Image Processing, Vol. 20, No. 7, pp. 2017-2029, 2011. [2009 impact factor: 2.848] (Cited by 98)

Abstract: Reliably detecting objects in broad-area overhead or satellite images has become an increasingly pressing need, as image acquisition capabilities are growing rapidly. The problem is particularly difficult in the presence of large intraclass variability, e.g., finding “boats” or “buildings,” where model-based approaches tend to fail because no good model or template can be defined for the highly variable targets. This paper explores an automatic approach to detecting and classifying targets in high-resolution broad-area satellite images, which relies on detecting statistical signatures of targets in terms of a set of biologically-inspired low-level visual features. Broad-area images are cut into small image chips, which are analyzed in two complementary ways: “attention/saliency” analysis exploits local features and their interactions across space, while “gist” analysis focuses on global, nonspatial features and their statistics. Both feature sets are used to classify each chip as containing target(s) or not, using a support vector machine. Four experiments were performed, to find “boats” (Experiments 1 and 2), “buildings” (Experiment 3), and “airplanes” (Experiment 4). In Experiment 1, 14,416 image chips were randomly divided into a training set (300 boat, 300 non-boat) and a test set (13,816 chips), and classification was performed on the test set (ROC area: 0.977 ± 0.003). In Experiment 2, classification was performed on another test set of 11,385 chips from another broad-area image, keeping the same training set as in Experiment 1 (ROC area: 0.952 ± 0.006). In Experiment 3, 600 training chips (300 of each type) were randomly selected from 108,885 chips, and classification was conducted on the remaining chips (ROC area: 0.922 ± 0.005). In Experiment 4, 20 training chips (10 of each type) were randomly selected to classify the remaining 2,581 chips (ROC area: 0.976 ± 0.003).
The proposed algorithm outperformed the state-of-the-art SIFT, HMAX, and hidden-scale salient structure methods, as well as previous gist-only features, in all four experiments. This study shows that the proposed target search method can reliably and effectively detect highly variable target objects in large image datasets.
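The chip-level pipeline described in the abstract (cut image into chips, extract two complementary feature sets, classify each chip with an SVM) can be sketched as follows. This is a minimal illustration assuming scikit-learn; `extract_saliency_features` and `extract_gist_features` are hypothetical stand-ins for the paper's biologically-inspired saliency and gist features, and the "chips" are toy random arrays rather than real satellite imagery.

```python
import numpy as np
from sklearn.svm import SVC

def extract_saliency_features(chip):
    # Hypothetical placeholder for local "attention/saliency" statistics.
    return np.array([chip.mean(), chip.std()])

def extract_gist_features(chip):
    # Hypothetical placeholder for global, nonspatial "gist" statistics.
    return np.array([np.median(chip), chip.max() - chip.min()])

def chip_features(chip):
    # Concatenate the two complementary feature sets into one vector.
    return np.concatenate([extract_saliency_features(chip),
                           extract_gist_features(chip)])

rng = np.random.default_rng(0)
# Toy stand-ins for 16x16-pixel target and non-target training chips.
target_chips = [rng.normal(1.0, 0.5, (16, 16)) for _ in range(30)]
background_chips = [rng.normal(0.0, 0.5, (16, 16)) for _ in range(30)]

# Build the training matrix: one feature vector per chip, with labels.
X = np.array([chip_features(c) for c in target_chips + background_chips])
y = np.array([1] * 30 + [0] * 30)

# Train an SVM to classify chips as containing target(s) or not.
clf = SVC(probability=True).fit(X, y)

# Classify a new chip.
test_chip = rng.normal(1.0, 0.5, (16, 16))
pred = clf.predict([chip_features(test_chip)])[0]
print("target" if pred == 1 else "non-target")
```

In the paper itself, the feature vectors come from the saliency map and gist statistics of each chip, and detection quality is measured by ROC area over tens of thousands of test chips; the sketch above only mirrors the overall structure of that pipeline.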

Themes: Model of Bottom-Up Saliency-Based Visual Attention, Computational Modeling, Computer Vision

 

Copyright © 2000-2007 by the University of Southern California, iLab and Prof. Laurent Itti.