Abstract

Z. Li, L. Itti, Visual attention guided video compression, In: Proc. Vision Science Society Annual Meeting (VSS08), May 2008. (Cited by 5)

Abstract: Human visual characteristics show promise for applications in video coding. Here, we propose, implement, and test a universal visual-attention-based video coding platform (VAVC). This platform has two main parts: a visual attention module and a video coding module. The visual attention module generates saliency maps (or other maps that represent human visual characteristics) according to the human visual system (HVS), while the video coding module compresses the raw video sequence according to the output of the first module. Using this platform, we implement a saliency-based video coding algorithm. The bottom-up methods proposed in Itti et al. (1998) are used to compute the saliency map. We then transform the saliency map into the quantization map used in the latest video coding standard, H.264, to guide residual quantization. For salient regions, we decrease the quantization step to reduce artifacts; for non-salient regions, we increase the quantization step to increase the compression ratio. In our experiment, 18 natural video sequences were encoded with different methods, and 6 subjects evaluated the encoded results. Subjects were asked to rate, on a 1-5 scale, the perceptual quality of 3 variants of the clips: standard H.264, our VAVC (yielding on average 17.37% smaller file sizes), and rate-controlled H.264 matched to the smaller size of the VAVC-encoded clips. The results show that, for 64.9% of samples, the subjective quality of VAVC-encoded clips is equal to or better than that of traditional H.264-encoded clips. For 87.04% of samples, the subjective quality of the proposed VAVC method is equal to or better than that of the rate-controlled H.264 method at equal file size. Our results suggest that exploiting human visual characteristics can lead to better video compression without degrading perceptual quality.
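The abstract does not give the exact mapping from saliency to quantization, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear mapping from per-macroblock mean saliency to a quantization parameter (QP) offset around a base QP; the function name `saliency_to_qp_map` and the parameters `base_qp`, `max_offset`, and `block` are hypothetical. The only H.264 fact relied on is that QP lies in the range 0 to 51 and that a lower QP means a smaller quantization step.

```python
import numpy as np

def saliency_to_qp_map(saliency, base_qp=28, max_offset=6, block=16):
    """Map a per-pixel saliency map to a per-macroblock QP map.

    Assumption (not from the abstract): a linear mapping in which the most
    salient block gets base_qp - max_offset (finer quantization) and the
    least salient block gets base_qp + max_offset (coarser quantization).
    """
    sal = np.asarray(saliency, dtype=np.float64)
    rows, cols = sal.shape[0] // block, sal.shape[1] // block

    # Crop to a whole number of macroblocks, then average within each block.
    sal = sal[:rows * block, :cols * block]
    block_mean = sal.reshape(rows, block, cols, block).mean(axis=(1, 3))

    # Normalize block saliency to [0, 1]; a flat map gets a neutral 0.5.
    lo, hi = block_mean.min(), block_mean.max()
    if hi > lo:
        norm = (block_mean - lo) / (hi - lo)
    else:
        norm = np.full_like(block_mean, 0.5)

    # High saliency -> negative offset (smaller step); low -> positive offset.
    offsets = np.rint(max_offset * (1.0 - 2.0 * norm)).astype(int)

    # Keep the result inside the valid H.264 QP range.
    return np.clip(base_qp + offsets, 0, 51)


# Usage: a 32x32 frame whose top-left macroblock is the only salient region.
frame_saliency = np.zeros((32, 32))
frame_saliency[:16, :16] = 1.0
qp_map = saliency_to_qp_map(frame_saliency)
```

In this sketch the salient macroblock receives a QP below the base value and the remaining macroblocks receive a QP above it, which mirrors the trade-off described above: fewer artifacts where viewers are likely to look, and more compression elsewhere.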

Themes: Model of Bottom-Up Saliency-Based Visual Attention, Scene Understanding