iLab Neuromorphic Vision C++ Toolkit Screenshots

The following are examples of how to use the programs provided with the iLab Neuromorphic Vision C++ Toolkit. Click on each thumbnail to see the corresponding full-size image or movie clip. These examples are refreshed nightly by running 'make demo'. To run the demos interactively, try the following from your local saliency/ source directory:

./configure             # if not done previously
make core               # if not done previously
cd screenshots
./build.tcl runner
./rundemo.sh

which will ask you interactively which demo to run and will show you the results.

For additional explanations about the meaning of the various command-line options used in these examples, try running the same command with an additional --help command-line option, and look up the definitions of the options used.
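For instance, to see the definitions of the options used in the first demo below, you could run the same command with --help appended (this should print the help text rather than process the image):

ezvision --in=frame000000.png -T --out=display --help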

screenshot for demo #1
Goal: Attend to salient locations in a static image
Command: ezvision --in=frame000000.png -T --out=display -+
Outputs: (none; output is sent to the screen)
Notes: There are several options here. The --in option says to take input from the named raster file; we support several raster file formats, including png and pnm (ppm/pgm/pbm). The -T option says to generate an output showing the attention trajectory superimposed on the input image. The --out option says to send output to the screen. Finally, the -+ option says to keep processing even after the input stream is exhausted (ezvision is a stream-oriented program; without a -+ option, it would quit once it realized that there was no second input frame to be processed) -- thus, the program will keep running until you interrupt it with a Control-C.
screenshot for demo #2
Goal: As before, but save the output to disk
Command: ezvision --in=frame000000.png -T --out=display --out=mpeg --out=raster --output-frames=0-29@30Hz -+
Outputs: T.mpg T[000000-000029].pnm
Notes: This is similar to the first demo, but with two differences. First, we have multiple --out options (this illustrates the general feature that you can send copies of the output to an unlimited number of valid destinations). We send output to the display as before, but we also save the output to an mpeg file (--out=mpeg) and to a series of raster files (--out=raster). In each case the files are named after the particular output type that we've requested, which in this case is 'T'. Thus you'll notice that the onscreen window is titled 'T', the output mpeg file is named 'T.mpg', and the output raster files are named T000000.pnm, T000001.pnm, etc. The second new option here is --output-frames, which gives a range of output frames and a rate at which to generate them. Thus we've requested 30 frames (numbered 0-29), coming at 30 frames per second (30Hz). You can also specify the frame rate as a time interval using an appropriate suffix (e.g. '0.033333s', '33.333ms', '33333.0us' would all be roughly equivalent to '30Hz'). Finally, note that the results you obtain will differ slightly each time you run this command, as we add small amounts of random neural noise at several stages in the model. If you want reproducible results, specify --nouse-random as an additional command-line option.
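For example, an equivalent run that uses a time interval instead of a rate and disables the random noise so that results are reproducible:

ezvision --in=frame000000.png -T --out=display --out=mpeg --out=raster --output-frames=0-29@33.333ms --nouse-random -+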
screenshot for demo #3
Goal: As before, but trigger outputs only when there is a new shift of attention
Command: ezvision --in=frame000000.png -T --out=display --output-frames=0-9@EVENT -+
Outputs: (none; output is sent to the screen)
Notes: By default, the value of the --output-frames option is 0-MAX@30Hz, which says to produce a (virtually) unlimited number of output frames at a 30Hz rate. We have already seen how to restrict the output frame range; it is also possible to request that the output be synced to interesting "events" rather than to a clock. In the case of ezvision, interesting "events" are shifts of attention; so, in this example, we give "@EVENT" rather than "@30Hz" and thus we get a new output frame if and only if there is a new shift of attention.
screenshot for demo #4
Goal: Get the coordinates of the first 5 most salient locations in an image
Command: ezvision --top5 --in=frame000000.png
Outputs: top5.txt T000000.pnm ... T000004.pnm
Notes: The first 5 shifts of attention towards the 5 most salient locations in the image are shown in the output images T000000.pnm ... T000004.pnm. In addition, the (x,y) coordinates of these locations are saved in the text file top5.txt, along with other useful information. Note that --top5 is an option alias, that is, a shorthand for a series of command-line options. Have a look at 'ezvision --help', towards the very bottom of the list, for the definition of --top5, which will also give you hints on how to adjust these initial options towards some slightly different output you may want.
screenshot for demo #5
Goal: Just compute the saliency map for an image and save it to disk
Command: ezvision --just-initial-saliency-map --in=frame000000.png
Outputs: VCO000000.pfm
Notes: Here we just save the initial saliency map (i.e., the output from the VisualCortex) before any shift of attention, and exit. A floating-point version of the map will be saved in our proprietary PFM image format with the prefix 'VCO'; this is good for comparisons across images as it contains absolute saliency values. See saliency/matlab/pfmreadmatlab.m for a simple function to read a PFM image into Matlab, or use the simple program saliency/bin/pfmtopnm. Note that the values in the VCO map are typically very small, as they are in Amperes of synaptic current flowing into the integrate-and-fire neurons of the dynamic saliency map. Typical values are in the nA range (1.0e-9 Amps). Note that --just-initial-saliency-map is an option alias, that is, a shorthand for a series of command-line options. Have a look at 'ezvision --help', towards the very bottom of the list, for the definition of --just-initial-saliency-map, which will also give you hints on how to adjust these initial options towards some slightly different output you may want.
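If you just want to look at the map, you can convert it to a regular raster file with the pfmtopnm program mentioned above; the exact invocation below is an assumption (check the program's usage message), but something along these lines may work:

./bin/pfmtopnm VCO000000.pfm > VCO000000.pnm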
screenshot for demo #6
Goal: Attend to salient locations in a static image, show attended locations + saliency map + feature maps
Command: ezvision -K --in=frame000000.png --out=raster
Outputs: T000000.pnm
Notes: As usual, you can add --out=display to display the results in an X window, and they will be refreshed on each new output frame (though you'll need to add -+ to tell ezvision to keep going after its input is exhausted).
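For instance, the interactive variant suggested above would be:

ezvision -K --in=frame000000.png --out=raster --out=display -+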
screenshot for demo #7
Goal: Same as above, but use long-range competition for salience as in the IEEE PAMI paper
Command: ezvision -K --maxnorm-type=Maxnorm --in=frame000000.png --out=raster -+
Outputs: T000000.pnm
Notes: If the resulting image is too large to fit on your screen, try --rescale-output=<width>x<height> to resize the output images.
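For example, to limit the output images to 800x600 pixels (an arbitrary size chosen for illustration):

ezvision -K --maxnorm-type=Maxnorm --rescale-output=800x600 --in=frame000000.png --out=raster -+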
screenshot for demo #8
Goal: Create a cool 3D output where the original image is warped onto the saliency map, with height proportional to saliency
Command: ezvision -3 --in=frame000000.png --out=raster:3d --out=display -+
Outputs: 3d-T000000.pnm, 3d-T000001.pnm, etc.
Notes: Note that we have given a prefix, "3d", for the raster filenames, so that the files are named 3d-T000000.pnm, etc. instead of just T000000.pnm. Also note that by default the values in the saliency map are autoscaled to the 0..255 range for display purposes. To use a fixed scaling factor instead (which makes more sense when you want to compare several saliency maps), try --display-map-factor=50000. Note how each time attention shifts to the maximum of the saliency map (yellow circle), that location is then suppressed from the saliency map by the inhibition-of-return mechanism; also, after all of the interesting locations have been visited once, subsequent visits to those locations are considered "boring" and are marked with a green circle instead of a yellow one. Try running this command with an additional --ior-type=None to disable inhibition-of-return.
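For instance (attention should then tend to remain at the single most salient location):

ezvision -3 --ior-type=None --in=frame000000.png --out=raster:3d --out=display -+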
screenshot for demo #9
Goal: Get a cropped 100x100 imagelet around each attended location
Command: ezvision -T --crop-foa=100x100 --in=frame000000.png --out=display --out=raster:foa -+
Outputs: foa-T000000.pnm, foa-T000001.pnm, etc.
Notes: For this to be useful, usually you want to also turn off some of the markings that the program writes by default onto the output images (yellow circle, time, yellow square, etc). Try this in conjunction with --nodisplay-time --nodisplay-traj --nodisplay-patch --nodisplay-foa to see the difference.
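For example, to get clean 100x100 imagelets with none of the default markings:

ezvision -T --crop-foa=100x100 --nodisplay-time --nodisplay-traj --nodisplay-patch --nodisplay-foa --in=frame000000.png --out=display --out=raster:foa -+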
screenshot for demo #10
Goal: For the center-surround mechanisms, use center scales of 1 through 3, center-surround scale differences of 2 through 5, and build the saliency map at scale 3; this tweaks the system towards being sensitive to smaller objects than it is by default.
Command: ezvision -K --levelspec=1-3,2-5,3 --in=frame000000.png --out=display --out=raster:levelspec -+
Outputs: levelspec-T000000.pnm, levelspec-T000001.pnm, etc.
Notes: The default levelspec is 2-4,3-4,4.
screenshot for demo #11
Goal: Use twelve orientation channels spanning 0 through 180deg instead of the default 4.
Command: ezvision -K --num-orient=12 --in=frame000000.png --out=display --out=raster:12ori -+
Outputs: 12ori-T000000.pnm, 12ori-T000001.pnm, etc.
Notes: A similar option exists for the number of motion direction channels, --num-directions=XX
screenshot for demo #12
Goal: Use a visual cortex with only intensity and orientation channels (no color, motion, flicker), and give a weight of 3.0 to orientation and 1.0 to intensity.
Command: ezvision -K --vc-chans=O:3I:1 --in=frame000000.png --out=display --out=raster:vctype -+
Outputs: vctype-T000000.pnm, vctype-T000001.pnm, etc.
Notes: You can pick any combination of available channels using this option.
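For example, assuming 'C' is the letter that selects the color channel (see 'ezvision --help' for the full list of channel letters and their meanings), a visual cortex with color, intensity and orientation channels at equal weights might be requested as:

ezvision -K --vc-chans=C:1I:1O:1 --in=frame000000.png --out=display --out=raster:vctype -+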
screenshot for demo #13
Goal: Estimate the rough shape of the attended object instead of having a fixed circular focus of attention.
Command: ezvision -K --shape-estim-mode=FeatureMap --in=frame000000.png --out=display --out=raster:se -+
Outputs: se-T000000.pnm, se-T000001.pnm, etc.
Notes: The shape estimator can work in different modes; see our BMCV'2002 paper for details. Note also that when using a circular focus of attention, its radius is automatically set to max(imageWidth, imageHeight)/12; you can change that using --foa-radius=XX (this will affect inhibition-of-return).
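For example, to keep the default circular focus of attention but force a 64-pixel radius (64 is an arbitrary value chosen for illustration):

ezvision -K --foa-radius=64 --in=frame000000.png --out=display --out=raster -+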
screenshot for demo #14
Goal: Process images frame000000.png through frame000019.png as a movie, with a new frame being input to the Brain every 30ms of simulated time. Write output frames every 3ms of simulated time. Do not use additive displays, that is, show at most one attended location on each frame rather than a cumulative drawing. Do not display output frames, so that the program can run in batch mode until all inputs have been processed.
Command: ezvision -T --nodisplay-additive --input-frames=@30ms --output-frames=@3ms --in=frame#.png --out=raster
Outputs: T000000.pnm ... T000199.pnm
Notes: To use a series of raster files as an input movie, the files must all have the same name except for a six-digit frame number. Then, when you pass the filename to --in, put a '#' in the place where the frame number should go. Thus --in=frame#.png (short for --in=raster:frame#.png, but ezvision can infer the 'raster' from the '.png' extension) will read frame000000.png, frame000001.png, and so on. By default, it will keep reading input files until it encounters a missing raster file. If you want it to stop sooner, you can specify a frame range with the --input-frames option, such as --input-frames=0-19@30ms. Note that ezvision can also read mpeg movies (using the ffmpeg/avcodec library) as input, so you could just as well do --in=movie.mpg. To generate output movies, you can either generate mpegs directly using --out=movie.mpg (straightforward, but you have little control over the movie quality), or you can save a series of raster files using --out=raster:myprefix and then encode those with mpeg_encode or ppmtompeg.
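For example, reading an mpeg movie as input and writing an mpeg movie directly as output (the filenames input.mpg and result.mpg are placeholders, and the stream types are assumed to be inferred from the .mpg extensions, as they are for .png on input):

ezvision -T --nodisplay-additive --input-frames=@30ms --output-frames=@3ms --in=input.mpg --out=result.mpg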
screenshot for demo #15
Goal: Same as above, but now simulate eye movements using a simplistic mass/spring/friction model, and apply a foveation filter (simulating the decay of resolution in the retina with eccentricity) to the input frames around the current eye position.
Command: ezvision -T --nodisplay-additive --input-frames=@24Hz --output-frames=@240Hz --ehc-type=Simple --hsc-type=None --esc-type=Friction --foveate-input-depth=5 --in=frame#.png --out=display --out=raster
Outputs: T000000.pnm ... T000199.pnm
Notes: Once you have chosen an Eye/Head Controller with the --ehc-type option, a Head Saccade Controller with --hsc-type, and an Eye Saccade Controller with --esc-type, a subsequent --help will show additional options that are specific to the controllers you selected (e.g., selecting the Friction eye saccade controller makes available new options specific to that controller, such as --eye-spring-k).
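For example, to list the controller-specific options once those controllers are selected (the pipe through grep is only a convenience, and the exact option names may vary across toolkit versions):

ezvision --ehc-type=Simple --hsc-type=None --esc-type=Friction --help 2>&1 | grep -i eye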
screenshot for demo #16
Goal: Same as above, but do not display the yellow circle, and save both the foveated frames and the saliency map.
Command: ezvision -Y --nodisplay-additive --input-frames=@33Hz --output-frames=@0.005s --ehc-type=Simple --hsc-type=None --esc-type=Friction --foveate-input-depth=5 --nodisplay-foa --display-map-factor=50000 --in=frame#.png --out=display --out=raster
Outputs: T000000.pnm ... T000199.pnm
Notes: Many more movies like this are available on the iLab web site.

That should be a good start. Have a look at the various perl scripts in saliency/bin/ for additional examples of interesting combinations of command-line options.

Copyright © 2002-2004 by the University of Southern California, iLab and Prof. Laurent Itti. Last updated Wednesday, 05-Oct-2016 16:20:31 PDT