GistEstimatorBeyondBoF Class Reference

Gist estimator for ``Beyond Bags of Features ...'' by Lazebnik, et al.

#include <Neuro/GistEstimatorBeyondBoF.H>


Classes

struct  SiftDescriptor

Public Types

typedef Image< SiftDescriptor > SiftGrid
typedef std::list< SiftDescriptor > Vocabulary
typedef void(* TrainingHook )(const SiftGrid &)

Public Member Functions

 GistEstimatorBeyondBoF (OptionManager &mgr, const std::string &descrName="GistEstimatorBeyondBoF", const std::string &tagName="GistEstimatorBeyondBoF")
void setVocabulary (const Image< float > &)
void setTrainingHook (TrainingHook)
Image< double > getGist ()
 Return the gist vector (useless in training mode).
virtual ~GistEstimatorBeyondBoF ()
 Destructor.

Static Public Member Functions

static int num_channels ()
static int num_levels ()
static int gist_vector_size ()

static void num_channels (int)
static void num_levels (int)

Protected Member Functions

 SIMCALLBACK_DECLARE (GistEstimatorBeyondBoF, SimEventRetinaImage)
 Callback for when a new input (retina) frame is available.

Detailed Description

Gist estimator for ``Beyond Bags of Features ...'' by Lazebnik, et al.

This class computes the gist vector for an input image using the feature extraction and spatial pyramid matching scheme described in sections 4 and 3 (respectively) of ``Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories'' by Lazebnik, et al.

The authors of that paper experiment with both weak features (oriented edge points) and strong features (SIFT descriptors), and with different resolutions of the spatial matching pyramid. This class implements only strong features clustered into 200 categories and a two-level pyramid, because the paper reports that the other configurations either performed worse or offered no significant advantage.

Thus, utilizing the terminology employed by Lazebnik, et al., we have the number of channels M = 200 and the number of levels of the spatial matching pyramid L = 2. This will result in gist vectors of dimensionality:

M * (4^(L+1) - 1)/3 = M * (2^(2L+2) - 1)/3 = 200 * (2^(2*2+2) - 1)/3 = 200 * (2^6 - 1)/3 = 200 * 63/3 = 200 * 21 = 4200

See the paper for the gory details.
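
The dimensionality formula above can be checked with a few lines of C++. This is a minimal sketch: pyramid_cells and gist_size are hypothetical helper names used only for illustration and are not members of GistEstimatorBeyondBoF (the class exposes its own gist_vector_size() accessor, whose implementation is not shown here).

```cpp
#include <cassert>
#include <cstddef>

// Number of grid cells summed over pyramid levels 0..L, i.e.,
// 1 + 4 + ... + 4^L = (4^(L+1) - 1)/3.
// Hypothetical helper, not part of GistEstimatorBeyondBoF.
std::size_t pyramid_cells(int num_levels)
{
   assert(num_levels >= 0);
   std::size_t cells = 0, cells_at_level = 1;
   for (int l = 0; l <= num_levels; ++l) {
      cells += cells_at_level;
      cells_at_level *= 4; // each level quadruples the number of cells
   }
   return cells;
}

// Gist vector dimensionality: one histogram bin per channel per cell.
// Hypothetical helper, not part of GistEstimatorBeyondBoF.
std::size_t gist_size(int num_channels, int num_levels)
{
   assert(num_channels > 0);
   return static_cast<std::size_t>(num_channels) * pyramid_cells(num_levels);
}
```

With M = 200 and L = 2, pyramid_cells(2) gives 1 + 4 + 16 = 21 cells and gist_size(200, 2) gives 4200, matching the derivation above.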

Definition at line 142 of file GistEstimatorBeyondBoF.H.


Member Typedef Documentation

typedef Image< SiftDescriptor > GistEstimatorBeyondBoF::SiftGrid

Like other gist estimators, this one also filters the input image. Its filtering process subdivides the input image into 16x16 pixel patches and runs SIFT on each patch. The filtering results are, therefore, a grid of SIFT descriptors. This type is used to represent those results.

Definition at line 212 of file GistEstimatorBeyondBoF.H.

typedef void(* GistEstimatorBeyondBoF::TrainingHook)(const SiftGrid &)

To assist with training, GistEstimatorBeyondBoF can be configured to operate in a special training mode. In this mode it has no vocabulary from which to form gist vectors; instead, it simply passes the grid of SIFT descriptors for the input image (i.e., the results of the filtering step) back to its client. The client may then store these descriptors, perform the clustering required to create the vocabulary, and finally run the estimator in non-training mode to compute the actual gist vectors.

Training mode is selected by specifying a hook function that accepts the filtering results, i.e., the grid/Image of SIFT descriptors.

Definition at line 248 of file GistEstimatorBeyondBoF.H.
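
As an illustration of the hook mechanism, here is a minimal, self-contained sketch. Descriptor, Grid, Hook, collect_descriptors and deliver_frame are stand-ins invented for this example; the real SiftDescriptor, SiftGrid and TrainingHook are declared in GistEstimatorBeyondBoF.H and their details will differ. Only the shape of the hook, a plain function pointer taking a const reference to the grid, is taken from the typedef above.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Stand-ins for illustration only; not the toolkit's actual types.
typedef std::array<float, 128> Descriptor;
typedef std::vector<Descriptor> Grid;

// Same shape as TrainingHook: a plain function pointer taking the grid.
typedef void (*Hook)(const Grid&);

// Training data accumulated across frames. A function-pointer hook cannot
// capture state, so this sketch keeps the storage at file scope.
static std::vector<Descriptor> g_training_descriptors;

// Hook: append every descriptor of the current frame to the training set.
void collect_descriptors(const Grid& grid)
{
   g_training_descriptors.insert(g_training_descriptors.end(),
                                 grid.begin(), grid.end());
}

// Stand-in for what an estimator in training mode does once per frame:
// hand the filtering results to the client's hook.
void deliver_frame(Hook hook, const Grid& grid)
{
   assert(hook != nullptr);
   hook(grid);
}
```

Once enough frames have been collected, the client would cluster g_training_descriptors to obtain the vocabulary; that clustering step is outside the scope of this sketch.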

typedef std::list< SiftDescriptor > GistEstimatorBeyondBoF::Vocabulary

In order to compute a gist vector, this estimator needs to know the vocabulary associated with the bag of features. This vocabulary is usually obtained by a clustering process as part of the training. Each ``word'' or ``vis-term'' in this vocabulary is also known as a channel. The gist vector essentially represents a ``spatial histogram'' for each of these channels.

The vocabulary itself is merely a collection of 200 SIFT descriptors (the centroids of the clusters) passed in via an Image of floating point numbers. Thus, the size of this Image would be 128x200. (The 128 comes from the dimensionality of a SIFT descriptor.)

Definition at line 226 of file GistEstimatorBeyondBoF.H.
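
To make the role of the vocabulary concrete, the following sketch assigns a descriptor to its nearest centroid, which is how a descriptor is mapped to a channel in the scheme of Lazebnik, et al. It is not the class's implementation. It assumes the vocabulary is held in a flat, row-major buffer in which row i contains the 128 components of centroid i (mirroring the 128x200 layout described above), and nearest_channel and SIFT_DIMS are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

const std::size_t SIFT_DIMS = 128; // dimensionality of a SIFT descriptor

// Index of the vocabulary row (channel) closest to the given descriptor,
// measured by squared Euclidean distance. Assumes a row-major vocabulary:
// row i holds the 128 components of centroid i.
// Hypothetical helper, not part of GistEstimatorBeyondBoF.
std::size_t nearest_channel(const std::vector<float>& vocabulary,
                            const std::vector<float>& descriptor)
{
   assert(descriptor.size() == SIFT_DIMS);
   assert(!vocabulary.empty() && vocabulary.size() % SIFT_DIMS == 0);

   const std::size_t num_channels = vocabulary.size() / SIFT_DIMS;
   std::size_t best = 0;
   float best_dist = std::numeric_limits<float>::max();
   for (std::size_t c = 0; c < num_channels; ++c) {
      float dist = 0.0f;
      for (std::size_t d = 0; d < SIFT_DIMS; ++d) {
         const float diff = vocabulary[c * SIFT_DIMS + d] - descriptor[d];
         dist += diff * diff;
      }
      if (dist < best_dist) { // keep the closest centroid seen so far
         best_dist = dist;
         best = c;
      }
   }
   return best;
}
```

Counting, for each cell of the spatial pyramid, how many descriptors fall into each channel then yields the per-channel spatial histograms that make up the gist vector.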


Constructor & Destructor Documentation

GistEstimatorBeyondBoF::GistEstimatorBeyondBoF ( OptionManager & mgr,
const std::string & descrName = "GistEstimatorBeyondBoF",
const std::string & tagName = "GistEstimatorBeyondBoF"
)

The constructor expects to be passed an option manager, which it uses to set itself up within the INVT simulation framework.

Definition at line 159 of file GistEstimatorBeyondBoF.C.

GistEstimatorBeyondBoF::~GistEstimatorBeyondBoF (  )  [virtual]

Destructor.

Definition at line 216 of file GistEstimatorBeyondBoF.C.


Member Function Documentation

Image< double > GistEstimatorBeyondBoF::getGist (  )  [inline, virtual]

Return the gist vector (useless in training mode).

Implements GistEstimator.

Definition at line 286 of file GistEstimatorBeyondBoF.H.

void GistEstimatorBeyondBoF::num_channels ( int  n  )  [static]

Modifiers for some parameters used internally by the Lazebnik algorithm.

WARNING: These methods should be used with care. Do not call them to change the size of the SIFT vocabulary or of the spatial pyramid in the middle of a complete run. That is, if you use 100 channels and a 4-level pyramid for training, don't switch to 200 channels and a 3-level pyramid during the testing phase!

Definition at line 139 of file GistEstimatorBeyondBoF.C.

static int GistEstimatorBeyondBoF::num_channels (  )  [inline, static]

Accessors for some parameters used internally by the Lazebnik algorithm.

Definition at line 167 of file GistEstimatorBeyondBoF.H.

void GistEstimatorBeyondBoF::setTrainingHook ( GistEstimatorBeyondBoF::TrainingHook  H  )  [inline]

This method should be called once during the client's initialization sequence to specify the hook function that puts GistEstimatorBeyondBoF into training mode. If this hook is not specified, the estimator will run in ``normal'' mode and compute gist vectors from the vocabulary.

It is an error to specify neither the training hook nor the vocabulary. If both are specified, the training hook takes precedence, i.e., the estimator runs in training mode, wherein it passes the filtering results (the grid of SIFT descriptors) back to the client rather than computing gist vectors.

Definition at line 293 of file GistEstimatorBeyondBoF.H.
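
The precedence rules described above amount to a small decision table, sketched below. Mode and select_mode are hypothetical names, and signalling the error case with an exception is an assumption made for this example; how the class itself reports the error is not stated here.

```cpp
#include <cassert>
#include <stdexcept>

enum Mode { TRAINING, NORMAL };

// Decide the operating mode from what the client has configured.
// Hypothetical helper, not part of GistEstimatorBeyondBoF.
Mode select_mode(bool have_training_hook, bool have_vocabulary)
{
   if (have_training_hook)
      return TRAINING; // the hook takes precedence over the vocabulary
   if (have_vocabulary)
      return NORMAL;   // compute gist vectors from the vocabulary
   // Neither was specified: the text above calls this an error.
   throw std::runtime_error("neither training hook nor vocabulary specified");
}
```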

void GistEstimatorBeyondBoF::setVocabulary ( const Image< float > &   ) 

This method should be called once during the client's initialization process prior to attempting to obtain gist vectors for input images. Thus, the clustering phase of the training must be complete before this estimator can be used to compute gist vectors.

GistEstimatorBeyondBoF::SIMCALLBACK_DECLARE ( GistEstimatorBeyondBoF  ,
SimEventRetinaImage   
) [protected]

Callback for when a new input (retina) frame is available.


The documentation for this class was generated from the following files:
GistEstimatorBeyondBoF.H
GistEstimatorBeyondBoF.C
Generated on Sun May 8 08:21:56 2011 for iLab Neuromorphic Vision Toolkit by  doxygen 1.6.3