ScaleSpace Class Reference

ScaleSpace is like a Gaussian pyramid but with several images per level. More...

#include <SIFT/ScaleSpace.H>

Collaboration diagram for ScaleSpace:

Public Types
enum	ChannlesDef { LUM_CHANNEL = 0, RG_CHANNEL = 1, BY_CHANNEL = 2 }
Public Member Functions
	ScaleSpace (const ImageSet< float > &in, const float octscale, const int s=3, const float sigma=1.6F, bool useColor=false)
	Constructor.
	~ScaleSpace ()
	Destructor.
Image< float >	getTwoSigmaImage (int channel) const
	Get the image blurred by 2*sigma.
uint	findKeypoints (std::vector< rutz::shared_ptr< Keypoint > > &keypoints)
	Find the keypoints in the previously provided picture.
uint	getGridKeypoints (std::vector< rutz::shared_ptr< Keypoint > > &keypoints)
	Get the keypoints in the previously provided picture.
uint	getNumBlurredImages () const
	Get number of blurred images.
Image< float >	getBlurredImage (const uint idx) const
	Get a blurred image.
uint	getNumDoGImages () const
	Get number of DoG images.
Image< float >	getDoGImage (const uint idx) const
	Get a DoG image.
Image< float >	getKeypointImage (std::vector< rutz::shared_ptr< Keypoint > > &keypoints)
	Get a keypoint image for saliency submap.

Detailed Description

ScaleSpace is like a Gaussian pyramid but with several images per level.

ScaleSpace is a variation of a Gaussian pyramid, where we have several images convolved by increasingly large Gaussian filters for each level (pixel size). So, while in the standard dyadic Gaussian pyramid we convolve an image at a given resolution, then decimate by a factor 2 it to yield the next lower resolution, here we convolve at each resolution several times with increasingly large Gaussians and decimate only once every several images. This ScaleSpace class represents the several convolved images that are obtained for one resolution (one so-called octave). Additionally, ScaleSpace provides machinery for SIFT keypoint extraction. In the SIFT algorithm we use several ScaleSpace objects for several image resolutions.

Some of the code here based on the Hugin panorama software - see http://hugin.sourceforge.net - but substantial modifications have been made, espacially by carefully going over David Lowe's IJCV 2004 paper. Some of the code also comes from http://autopano.kolor.com/ but substantial debugging has also been made.

Definition at line 68 of file ScaleSpace.H.

Constructor & Destructor Documentation

ScaleSpace::ScaleSpace	(	const ImageSet< float > &	in,
		const float	octscale,
		const int	s = `3`,
		const float	sigma = `1.6F`,
		bool	useColor = `false`
	)

Constructor.

Take an input image and create a bunch of Gaussian-convolved versions of it as well as a bunch of DoG versions of it. We obtain the following images, given 's' images per octave and a baseline Gaussian sigma of 'sigma'. In all that follows, [*] represents the convolution operator.

We want that once we have convolved the image 's' times we end up with an images blurred by 2 * sigma. It is well known that convolving several times by Gaussians is equivalent to convolving once by a single Gaussian whose variance (sigma^2) is the sum of all the variances of the other Gaussians. It is easy to show that we then want to convolve each image (n >= 1) by a Gaussian of stdev std(n) = sigma(n-1) * sqrt(k*k - 1.0) where k=2^(1/s) and sigma(n) = sigma * k^n, which means that each time we multiply the total effective sigma by k. So, in summary, for every n >= 1, std(n) = sigma * k^(n-1) * sqrt(k*k - 1.0).

For our internal Gaussian imageset, we get s+3 images; the reason for that is that we want to have all the possible DoGs for that octave:

blur[ 0] = in [*] G(sigma) blur[ 1] = blur[0] [*] G(std(1)) = in [*] G(k*sigma) blur[ 2] = blur[1] [*] G(std(2)) = in [*] G(k^2*sigma) ... blur[ s] = blur[s-1] [*] G(std(s)) = in [*] G(k^s*sigma) = in[*]G(2*sigma) blur[s+1] = blur[s] [*] G(std(s+1)) = in [*] G(2*k*sigma) blur[s+2] = blur[s+1] [*] G(std(s+2)) = in [*] G(2*k^2*sigma)

For our internal DoG imageset, we just take differences between two adjacent images in the blurred imageset, yielding s+2 images such that:

dog[ 0] = in [*] (G(k*sigma) - G(sigma)) dog[ 1] = in [*] (G(k^2*sigma) - G(k*sigma)) dog[ 2] = in [*] (G(k^3*sigma) - G(k^2*sigma)) ... dog[ s] = in [*] (G(2*sigma) - G(k^(s-1)*sigma)) dog[s+1] = in [*] (G(2*k*sigma) - G(2*sigma))

The reason why we need to compute dog[s+1] is that to find keypoints we will look for extrema in the scalespace, which requires comparing the dog value at a given level (1 .. s) to the dog values at levels immediately above and immediately below.

To chain ScaleSpaces, you should construct a first one from your original image, after you have ensured that it has a blur of 'sigma'. Then do a getTwoSigmaImage(), decimate the result using decXY(), and construct your next ScaleSpace directly from that. Note that when you chain ScaleSpace objects, the effective blurring accelerates: the first ScaleSpace goes from sigma to 2.0*sigma, the second from 2.0*sigma to 4.0*sigma, and so on.

Parameters:

	in	the input image.
	octscale	baseline octave scale for this scalespace compared to the size of our original image. For example, if 'in' has been decimated by a factor 4 horizontally and vertically relatively to the original input image, octscale should be 4.0. This is used later on when assembling keypoint descriptors, so that we remember to scale the coordinates to the keypoints from our coordinates (computed in a possibly reduced input image) back to the coordinates of the original input image.
	s	number of Gaussian scales per octave.
	sigma	baseline Gaussian sigma to use.

Definition at line 58 of file ScaleSpace.C.

References ASSERT, CONV_BOUNDARY_CLEAN, sepFilter(), sqrt(), and sum().

ScaleSpace::~ScaleSpace ( )

Destructor.

Definition at line 117 of file ScaleSpace.C.

Member Function Documentation

uint ScaleSpace::findKeypoints ( std::vector< rutz::shared_ptr< Keypoint > > & keypoints )

Find the keypoints in the previously provided picture.

The ScaleSpace will be searched for good SIFT keypoints. Any keypoint found will be pushed to the back of the provided vector. Note that the vector is not cleared, so if you pass a non-empty vector, new keypoints will just be added to is. Returns the number of keypoints added to the vector of keypoints.

Definition at line 156 of file ScaleSpace.C.

References EXT, ImageSet< T >::size(), and ZEROS.

Referenced by VisualObject::computeKeypoints(), and SIFTChannel::doInput().

Image< float > ScaleSpace::getBlurredImage ( const uint idx ) const

Get a blurred image.

Definition at line 144 of file ScaleSpace.C.

References ImageSet< T >::getImage().

Image< float > ScaleSpace::getDoGImage ( const uint idx ) const

Get a DoG image.

Definition at line 152 of file ScaleSpace.C.

References ImageSet< T >::getImage().

uint ScaleSpace::getGridKeypoints ( std::vector< rutz::shared_ptr< Keypoint > > & keypoints )

Get the keypoints in the previously provided picture.

The ScaleSpace will extract keypoints in a grid setup. Those keypoints will be pushed to the back of the provided vector. Note that the vector is not cleared, so if you pass a non-empty vector, new keypoints will just be added to it. Returns the number of keypoints added to the vector of keypoints.

Definition at line 271 of file ScaleSpace.C.

Image< float > ScaleSpace::getKeypointImage ( std::vector< rutz::shared_ptr< Keypoint > > & keypoints )