For the impatient: if you want to write low-level tests, in C++, of your C++ code, start with the whitebox tests section; if you want to write high-level regression tests for existing executables, start with the blackbox tests section.
The files for the test suite all live in the tests/ subdirectory; these include:

- perl modules (*.pm files) that implement the test suite framework,
- perl scripts (*.pl files) that run the individual test suites,
- input files for the tests (these should go in tests/inputs/, but in practice some still sit directly in tests/ for historical reasons), and
- reference files against which test outputs are compared (in tests/ref).

The test suite is split into two parts: one is the main test suite, and the other is the so-called testlong suite, so named because it contains tests that take much longer to run. The main test suite can be run with:
$ make test
    # OR
$ cd tests && ./run_test_suite.pl
Note that for a number of reasons, you must be sitting in the tests/
directory in order for ./run_test_suite.pl
to work; that is, you can't be sitting in the top-level directory and call ./tests/run_test_suite.pl
(the main reason is that many of the tests use relative, not absolute, paths to refer to input and output files).
Likewise, the testlong suite can be run with:
$ make testlong
    # OR
$ cd tests && ./run_testlong_suite.pl
The ./run_test_suite.pl and ./run_testlong_suite.pl scripts are both implemented with the testrun.pm perl module in tests/; their job is essentially to run a set of individual test scripts and collect status information from each of those scripts. Scripts matching test_*.pl
are run in the main test suite, and scripts matching testlong_*.pl
are run in the testlong suite. The only condition on the individual test scripts is that they must emit a line of the form "MMM of NNN tests succeeded", where MMM and NNN are integers. Specifically, testrun.pm
does that parsing with a block like the following:
if (/^([0-9]+) of ([0-9]+) tests succeeded/)
{
    # increment counters
}
After running all the scripts and parsing the status lines emitted by each script, the test suite drivers print a status summary of all the scripts that were run, and exit with a status code that indicates the number of tests that failed (thus, as usual, 0 indicates success).
SUMMARY: ALL TESTS PASSED (267 of 267)
1 of 1 tests succeeded (./test_Brain_whitebox.pl)
7 of 7 tests succeeded (./test_Component_whitebox.pl)
3 of 3 tests succeeded (./test_ImageEqual.pl)
100 of 100 tests succeeded (./test_Image_whitebox.pl)
1 of 1 tests succeeded (./test_Learn_whitebox.pl)
7 of 7 tests succeeded (./test_LevelSpec_whitebox.pl)
60 of 60 tests succeeded (./test_Pixels_whitebox.pl)
8 of 8 tests succeeded (./test_Raster_whitebox.pl)
6 of 6 tests succeeded (./test_ShapeEstimator_blackbox.pl)
19 of 19 tests succeeded (./test_ezvision_blackbox.pl)
4 of 4 tests succeeded (./test_mpeg2yuv_blackbox.pl)
4 of 4 tests succeeded (./test_retina_blackbox.pl)
19 of 19 tests succeeded (./test_scriptvision_blackbox.pl)
2 of 2 tests succeeded (./test_sformat_whitebox.pl)
17 of 17 tests succeeded (./test_vision_blackbox.pl)
9 of 9 tests succeeded (./test_yuv2ppm_blackbox.pl)
So, in theory, any script that emitted such a line could be integrated into the test suite. However, in practice, most tests fall into either the blackbox or whitebox category, and the job of writing one of those kinds of test scripts is much easier if you use either the whitebox.pm
or the blackbox.pm
perl module.
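To make the "in theory" point concrete, here is a minimal hand-rolled script (a hypothetical example, not part of the actual suite) that the drivers could pick up simply because it matches the test_*.pl naming pattern and prints the expected status line:

#!/usr/bin/perl -w

use strict;

# run a couple of trivial checks by hand...
my $ntotal = 2;
my $npass  = 0;
$npass++ if (1 + 1 == 2);
$npass++ if (lc('ABC') eq 'abc');

# ...then emit the one line that testrun.pm looks for:
print "$npass of $ntotal tests succeeded\n";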
Whitebox tests are so named because they inspect the inner workings of the code (rather than treating the code as a "black box"). So, whitebox tests are written in C++, using the TestSuite class, and are used to test the low-level workings of various C++ classes.
Implementing a whitebox test program means writing a single C++ source file plus a tiny perl script that loads the whitebox perl module and uses that to drive the C++ program, like so:
#!/usr/bin/perl -w

use invt_config;
use whitebox;

whitebox::run("$invt_config::exec_prefix/bin/whitebox-Image");
Internally, that whitebox::run
call first calls the whitebox C++ program with a --perlquery
option to retrieve the list of available tests, and then loops over those tests, calling the C++ program with a --run
option for each test. Along the way, whitebox::run
prints each test name and the success or failure status of running that test. If a test fails, the stderr from that test run is also printed.
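Conceptually, the flow of whitebox::run is something like the following sketch (an illustration only, not the actual code in whitebox.pm; in particular, it assumes that --perlquery prints one test name per line, and it omits the bookkeeping that decides pass/fail from the program's output):

# Rough sketch of the flow of whitebox::run(); the real implementation
# lives in tests/whitebox.pm.
sub whitebox_run_sketch {
    my ($exe) = @_;

    # (1) ask the C++ test program for its list of test names
    my @tests = split(/\n/, `$exe --perlquery`);

    # (2) run each test individually and report on it
    foreach my $test (@tests) {
        print "$test ... ";
        my $output = `$exe --run $test 2>&1`;
        # (the real module inspects the program's output here to decide
        #  whether to print "ok" or "FAILED", and echoes the stderr of
        #  failing tests; that bookkeeping is omitted from this sketch)
        print "\n";
    }
}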
Writing the C++ implementation for a new whitebox test is much like writing any other new program for the toolkit. By convention, whitebox test sources live in src/TestSuite/
and are named like whitebox-MyTestClass.C
, for the whitebox tests of class MyTestClass
. You can make a template for the source file like this:
$ ./devscripts/newclass.tcl src/TestSuite/whitebox-MyTestClass
generated src/TestSuite/whitebox-MyTestClass.H
generated src/TestSuite/whitebox-MyTestClass.C
$ rm src/TestSuite/whitebox-MyTestClass.H   # you don't need the .H file
Then you need to add a line to depoptions.in
telling the dependency calculator how to build an executable from your new source file. You would want to add a line like this:
--exeformat testx , @source@TestSuite/whitebox-MyTestClass.C :@exec_prefix@/bin/whitebox-MyTestClass
This line says that:

- the executable will be named ${exec_prefix}/bin/whitebox-MyTestClass (where ${exec_prefix} is determined by the configure script; by default it is just ./),
- the main source file for that executable is src/TestSuite/whitebox-MyTestClass.C,
- any additional source dependencies will be found by following the #includes from that source file, and
- the executable will be built as part of make testx (itself part of make all).

Inside the source file, you need the following to bring in the TestSuite class:
#include "TestSuite/TestSuite.H"
Then you can start writing tests. Let's look at some of the code from src/TestSuite/whitebox-Image.C
as an example. First, let's look at a portion of the main()
function:
int main(int argc, const char** argv)
{
  TestSuite suite;

  suite.ADD_TEST(Image_xx_type_convert_xx_1);

  suite.parseAndRun(argc, argv);

  return 0;
}
There are four lines within main() there:

- first, we construct the TestSuite object, which will handle the command-line options and run the tests;
- second, we register our test function with ADD_TEST (more on that just below);
- third, we call parseAndRun(), which parses the command-line options and runs the requested tests;
- fourth, we return 0 -- even if a test fails, we still return 0, since the failure of the test is indicated through the program's stdout.

The ADD_TEST
line is actually a macro that calls TestSuite::addTest()
with the address of the test function &Image_xx_type_convert_xx_1
and the name of that function as a string, i.e. "Image_xx_type_convert_xx_1"
. The idea is that the test name should have three parts, separated by _xx_
(the whitebox.pm
perl module will later clean up those ugly _xx_
and convert them to --
): first, the class name being tested, in this case Image
; second, the name of the test or group of tests, in this case type_convert
; third, the number of the test within its group, in this case 1
. A test function itself looks like this:
void Image_xx_type_convert_xx_1(TestSuite& suite)
{
  float array[4] = { -10.9, 3.2, 254.7, 267.3 };
  Image<float> f(array, 4, 1);

  Image<byte> b = f;

  REQUIRE_EQ((int)b.getVal(0), 0);   // clamped
  REQUIRE_EQ((int)b.getVal(1), 3);
  REQUIRE_EQ((int)b.getVal(2), 254);
  REQUIRE_EQ((int)b.getVal(3), 255); // clamped
}
You can make as many test functions as you want; just be sure to ADD_TEST
each of them within main()
. Basically, you can do whatever you want within a test function, and along the way you call one or more of the REQUIRE
macros to verify that things are as they should be. There are several varieties to choose from:
REQUIRE_EQ(lhs, rhs);        // require (lhs == rhs)
REQUIRE_EQFP(lhs, rhs, eps); // require (abs(lhs-rhs) < eps)
REQUIRE(expr);               // require (expr == true)
REQUIRE_NEQ(lhs, rhs);       // require (lhs != rhs)
REQUIRE_LT(lhs, rhs);        // require (lhs < rhs)
REQUIRE_LTE(lhs, rhs);       // require (lhs <= rhs)
REQUIRE_GT(lhs, rhs);        // require (lhs > rhs)
REQUIRE_GTE(lhs, rhs);       // require (lhs >= rhs)
Each of these is a macro that actually calls back to TestSuite::require()
or TestSuite::requireEq()
.
Blackbox tests are so named because they test the external behavior of a program, treating it as a "black box", without requiring any knowledge of its inner workings (although you yourself may use such knowledge to help decide what tests will best exercise the program).
In our toolkit, the blackbox.pm
perl module helps you implement what are essentially regression tests of executables in the toolkit. Unlike whitebox tests, where you write a dedicated piece of C++ code that implements the tests, with blackbox tests you typically have an existing program in place that already does something useful, and you just want to write some tests to verify that that program continues to behave as expected under a variety of conditions (e.g., different inputs, different command-line options). In some cases you may need to tweak an existing program slightly to make it more testable.
Once you have a suitably testable program in place, writing blackbox tests for your program is very easy. Let's look at tests/test_retina_blackbox.pl
as an example. Each blackbox test script has three main parts. In practice, it's probably easiest to get going by copying an existing blackbox test script and then modifying the various segments to fit your needs, but for now let's step through each part. First, it should begin with the following to import the necessary modules:
#!/usr/bin/perl -w

use strict;
use blackbox;
use invt_config;
Second, the core of the test script is a local array of anonymous hashes, where each hash describes one test to be run; for example:
my @tests = (
    # ...

    {
        name  => 'bluecones--1',
        args  => ['-f', 'testpic001.pnm', 'retinaout.ppm'],
        files => ['retinaout.ppm'],
    },

    {
        name  => 'allopt--1',
        args  => ['-bf', 'testpic001.pnm', 'retinaout.ppm', 100, 100],
        files => ['retinaout.ppm'],
    },
);
More on that in a moment. The third and final part is a call to blackbox::run
with the name of the executable to be tested, and our local array of test descriptors:
# Run the black box tests; note that the default executable can be
# overridden from the command-line with "--executable"
blackbox::run("$invt_config::exec_prefix/bin/retina", @tests);
Once you have those elements in place, make sure the script is executable (with chmod +x
), and then you have a fully-functioning test script. Each test script accepts a standard set of command-line options; you can try passing --help
to see a description of the available options.
Now let's return to the test descriptors in more detail. Here's one example again:
my @tests = (
    {
        name  => 'noopt--1',
        args  => ['testpic001.pnm', 'retinaout.ppm'],
        files => ['retinaout.ppm'],
    },
);
Each descriptor is an anonymous hash with three fields:

- name is just a human readable name for the test, using double dashes to separate different parts of the test name;
- args is a list of command line options that should be passed to the test program for this test;
- files is a list of outputs that the test program is expected to produce, and which should be compared against stored reference files.

Internally, blackbox::run
loops over all of the descriptors, and for each one, it runs the test program (bin/retina
in this case) with the desired args
, and then checks each output file in files
against the corresponding reference file. If any test file doesn't match its reference file, then a detailed comparison of the two files is run and a summary of this comparison is printed.
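Conceptually, that loop amounts to something like the following sketch (an illustration only, not the actual code in blackbox.pm, which also handles the command-line options described below, gzipped reference files, and the detailed comparison summaries just mentioned):

# Simplified sketch of the flow of blackbox::run(); the real
# implementation lives in tests/blackbox.pm.
sub blackbox_run_sketch {
    my ($exe, @tests) = @_;

    my $npass = 0;
    foreach my $test (@tests) {
        print "test '$test->{name}' ...\n";

        # run the program with this test's command-line arguments
        system($exe, @{$test->{args}});

        # compare each declared output file against its reference file,
        # which is named ref/<testname>--<filename>
        my $ok = 1;
        foreach my $file (@{$test->{files}}) {
            my $ref = "ref/$test->{name}--$file";
            $ok = 0 if system('cmp', '-s', $file, $ref) != 0;
        }

        if ($ok) { print "ok\n"; $npass++; }
        else     { print "FAILED\n"; }
    }

    print "$npass of " . scalar(@tests) . " tests succeeded\n";
}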
When you first write a new test, obviously the reference files won't be there, but creating them the first time is very easy: just pass a --createref
option to the test script. For example, let's add a new test to the test_retina_blackbox.pl
test script, by adding the following to the array:
{
    name  => 'allopt--2',
    args  => ['-bf', 'testpic001.pnm', 'retinaout.ppm', 90, 90],
    files => ['retinaout.ppm'],
},
Now let's try running the test script and see what happens:
$ ./test_retina_blackbox.pl
...
=========================================================
test 'allopt--1' ...
running command '/path/to/saliency/bin/retina -bf testpic001.pnm retinaout.ppm 100 100'
checking retinaout.ppm ... ok
---------------------------------------------------------
=========================================================
test 'allopt--2' ...
running command '/path/to/saliency/bin/retina -bf testpic001.pnm retinaout.ppm 90 90'
checking retinaout.ppm ... FAILED!
reference file '/path/to/saliency/tests/ref/allopt--2--retinaout.ppm' is missing!
Raster::ReadFrame: reading raster file: testpic001.pnm
PnmParser::PnmParser: PBM Reading RGB Image: testpic001.pnm
retinafilt::main: Using (90, 90) for fovea center
retinafilt::showtypeof: type of pix0 is 6PixRGBIhE
retinafilt::showtypeof: type of pix1 is 6PixRGBIhE
retinafilt::showtypeof: type of pix1-pix0 is 6PixRGBIiE
retinafilt::showtypeof: type of (pix1-pix0)*dd is 6PixRGBIfE
retinafilt::showtypeof: type of pix0 + (pix1-pix0)*dd is 6PixRGBIfE
retinafilt::showtypeof: type of (pix0 + (pix1-pix0)*dd) * blind is 6PixRGBIfE
Raster::WriteFrame: writing raster file: retinaout.ppm
test FAILED (command exited with exit status '1'):
---------------------------------------------------------
4 of 5 tests succeeded
FAILED tests: allopt--2
OK, so it ran our test, but of course the test failed, because the reference file doesn't exist yet. Notice how the name of the missing reference file includes both the test name (allopt--2
) and the test file name (retinaout.ppm
):
/path/to/saliency/tests/ref/allopt--2--retinaout.ppm
Now let's use the --match
option, which lets you specify a regular expression to filter the names of tests to be run, to run just our new test:
$ ./test_retina_blackbox.pl --match allopt--2
=========================================================
test 'allopt--2' ...
...
---------------------------------------------------------
0 of 1 tests succeeded
FAILED tests: allopt--2
You might also try the different --verbosity
levels. The default level is 3, but you can use any of --verbosity
-1, 0, 1, 2, 3, or 4.
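For example, to re-run just our new test with terser output, you could try something like this (output not shown here):

$ ./test_retina_blackbox.pl --match allopt--2 --verbosity 1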
So now let's create the missing reference file for our new test, using the --createref
option:
$ ./test_retina_blackbox.pl --match allopt--2 --createref
=========================================================
test 'allopt--2' ...
running command '/path/to/saliency/bin/retina -bf testpic001.pnm retinaout.ppm 90 90'
checking retinaout.ppm ... (creating reference file from results) ok
---------------------------------------------------------
1 of 1 tests succeeded
Now, having created the reference file, if we re-run the test without --createref
, it should pass:
$ ./test_retina_blackbox.pl --match allopt--2
=========================================================
test 'allopt--2' ...
running command '/path/to/saliency/bin/retina -bf testpic001.pnm retinaout.ppm 90 90'
checking retinaout.ppm ... ok
---------------------------------------------------------
1 of 1 tests succeeded
Now let's make the test break on purpose, to see what happens in that case. Let's change the args for our new test from this:
args => ['-bf', 'testpic001.pnm', 'retinaout.ppm', 90, 90],
to this:
args => ['-bf', 'testpic001.pnm', 'retinaout.ppm', 91, 91],
Now when we re-run the test, it fails, and prints a detailed comparison of how the test file differs from the reference file:
$ ./test_retina_blackbox.pl --match allopt--2
=========================================================
test 'allopt--2' ...
running command '/home/rjpeters/projects/saliency/bin/retina -bf testpic001.pnm retinaout.ppm 91 91'
checking retinaout.ppm ... FAILED check against '/home/rjpeters/projects/saliency/tests/ref/allopt--2--retinaout.ppm.gz'!
comparison statistics:
  magnitude -25: 4 diffs
  magnitude -24: 9 diffs
  magnitude -23: 21 diffs
  magnitude -22: 14 diffs
  magnitude -21: 16 diffs
  ...
  magnitude 17: 10 diffs
  magnitude 18: 17 diffs
  magnitude 19: 10 diffs
  magnitude 20: 6 diffs
  num diff locations: 100334
  file1 length: 196623 bytes
  file2 length: 196623 bytes
  % of bytes differing: 51.0286182186214
  mean offset position: 103514.439880798
  num (file diff location % 2) == 0: 50251
  num (file diff location % 2) == 1: 50083
  num (file diff location % 3) == 0: 33460
  num (file diff location % 3) == 1: 33719
  num (file diff location % 3) == 2: 33155
  num (file diff location % 4) == 0: 25104
  num (file diff location % 4) == 1: 25066
  num (file diff location % 4) == 2: 25147
  num (file diff location % 4) == 3: 25017
  sum of file1 bytes (at diff locations): 11107399
  sum of file2 bytes (at diff locations): 11100625
  mean diff (at diff locations): -0.0675145015647736
  mean abs diff (at diff locations): 2.01233878844659
  mean diff (at all locations): -0.0344517172456935
  mean abs diff (at all locations): 1.02686867762164
  corrcoef: 0.999408
md5sum (test) retinaout.ppm: f7009f3aed7dd4270816b7512d4f89c8
md5sum (ref) allopt--2--retinaout.ppm: 4ffb20c024537328d692aff9309b020d
Raster::ReadFrame: reading raster file: testpic001.pnm
PnmParser::PnmParser: PBM Reading RGB Image: testpic001.pnm
retinafilt::main: Using (91, 91) for fovea center
retinafilt::showtypeof: type of pix0 is 6PixRGBIhE
retinafilt::showtypeof: type of pix1 is 6PixRGBIhE
retinafilt::showtypeof: type of pix1-pix0 is 6PixRGBIiE
retinafilt::showtypeof: type of (pix1-pix0)*dd is 6PixRGBIfE
retinafilt::showtypeof: type of pix0 + (pix1-pix0)*dd is 6PixRGBIfE
retinafilt::showtypeof: type of (pix0 + (pix1-pix0)*dd) * blind is 6PixRGBIfE
Raster::WriteFrame: writing raster file: retinaout.ppm
test FAILED (command exited with exit status '256'):
---------------------------------------------------------
0 of 1 tests succeeded
FAILED tests: allopt--2
In a real situation, you might be able to use the comparison stats to help diagnose why the test failed.
If you are sure that the reason for the test failure is innocuous, or if you have deliberately changed your program to produce different results, you can interactively replace the reference files with the --interactive
command-line option. If you give --interactive 1
, you will just see the textual comparison of the two files; if you give --interactive 2
, then two windows will also pop up showing the two image files for visual comparison (you need to have the program xv installed for that to work). Either way, you will be asked
replace previous reference file (y or n)?
for each non-matching reference file.
If you are absolutely sure you want to update ALL non-matching reference files, you can pass the --replaceref command-line option, which will NON-interactively replace all reference files. Use great care with this option: be sure you know exactly what changes are going to be made before you use it.
Each blackbox test script comes equipped with a number of useful command-line options (some of these we've discussed already).
With --createref
, any reference files that are missing will be instantiated from the corresponding test files generated during the test run. The new reference files will either go in the default ref/
directory or in the location specified by a --refdir
option.
Normally, the blackbox tests use the executable that is hardwired into the blackbox::run()
call. However, it is possible to override that by specifying an alternate executable on the command line with the --executable
option. This might be useful if you have an alternate build that has extra debugging or profiling built in.
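For example, assuming you had an alternate build in a hypothetical directory such as ~/saliency-debug, the invocation might look like this:

$ ./test_retina_blackbox.pl --executable ~/saliency-debug/bin/retina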
The --help option just lists the available command-line options along with a description of their behavior.
As we discussed before, if a non-matching test file is found, --interactive
will cause the test script to ask you whether to replace the reference file with the new test file. With --interactive 1
, you'll just get the textual comparison stats of the two files; with --interactive 2
, you'll also get a visual comparison of the two files if they are image files.
Then there is --list
, which causes the driver to list the names of available tests. For example, let's use this option with the test_ezvision_blackbox.pl
script in the tests/
directory:
$ ./test_ezvision_blackbox.pl --list
ez-trajectory--1
ez-vc-type-entropy--1
ez-vc-type-variance--1
ez-raw-maps--1
ez-feature-maps--1
ez-conspicuity-maps--1
ez-conspicuity-maps--2
ez-saliency-map--1
ez-saliency-map--2
ez-saliency-map--3
ez-saliency-map--4
ez-foa--1
ez-saliency-map--5
ez-saliency-map--6
ez-variable-rate--1
ez-save-everything--1
ez-junction-channels--2
ez-target-mask--1
ez-eye-trace--1
Also useful is --list-refs
, which lists the full paths to all of the reference files involved in the test script. Again, you can combine this with --match
to restrict the output to only matching tests. For example:
$ ./test_ezvision_blackbox.pl --list-refs --match foa
/path/to/saliency/tests/ref/ez-foa--1--test.txt
/path/to/saliency/tests/ref/ez-foa--1--T000000.pnm.gz
/path/to/saliency/tests/ref/ez-foa--1--T000001.pnm.gz
/path/to/saliency/tests/ref/ez-foa--1--T000002.pnm.gz
/path/to/saliency/tests/ref/ez-foa--1--T000003.pnm.gz
/path/to/saliency/tests/ref/ez-foa--1--T000004.pnm.gz
Note how that list mirrors the files
element of the ez-foa--1
test descriptor in test_ezvision_blackbox.pl
:
{
    name  => "ez-foa--1",
    args  => ['-ZT', '--boring-delay=FOREVER', '--boring-sm-mv=0.0',
              '--nodisplay-eye', '--nodisplay-eye-traj',
              '--textlog=test.txt', '--output-frames=0-4@250',
              '--crop-foa=64x64', '--in=raster:testpic001.pnm',
              '-+', '--out=ppm:'],
    files => ['test.txt', 'T000000.pnm', 'T000001.pnm', 'T000002.pnm',
              'T000003.pnm', 'T000004.pnm'],
},
You might want to use the output of --list
to help you select a --match
pattern for a later test run; or in fact you can use --match
along with --list
to just list matching tests.
$ ./test_ezvision_blackbox.pl --list --match sal.*map
ez-saliency-map--1
ez-saliency-map--2
ez-saliency-map--3
ez-saliency-map--4
ez-saliency-map--5
ez-saliency-map--6
If you are interested in using the test suite for benchmarking, for example to compare run times with different build options, or across different machines, you may want to use the --nocomparison
option. This option causes the script to run the program with all the same command-line option sets that it normally would, but all test/reference file comparisons are skipped. That way, the vast majority of the CPU time spent will be due to running the program itself (and not spent running cmp or diff on the test files). For example:
$ ./test_retina_blackbox.pl --match bluecones --nocomparison
=========================================================
test 'bluecones--1' ...
running command '/path/to/saliency/bin/retina -f testpic001.pnm retinaout.ppm'
checking retinaout.ppm ... comparison skipped for benchmarking
---------------------------------------------------------
1 of 1 tests succeeded
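To get rough timing numbers, you could wrap such a run in the standard time command (the actual timings will of course depend on your machine and build):

$ time ./test_ezvision_blackbox.pl --nocomparison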
Normally, the blackbox test scripts run all the tests, continuing even if some tests fail. If you want instead to have the script stop immediately if/when a test fails, you can pass --quit-on-fail
on the command line.
If you are working on a system for which the test suite doesn't pass as-is (this is usually due to differences in floating-point operations between different CPUs and between different compiler versions), you may want to use the --refdir
option to build an alternate set of reference files.
Let's say you are working on such a machine, and you want to make some changes to the source code, and you want to make sure those changes don't break the test suite. Normally, you'd just make your changes and then run the test suite, but on this machine, the test suite doesn't even pass in the first place. To get around that, you can first build a fresh set of reference files by combining --refdir
with --createref
:
$ ./test_ezvision_blackbox.pl --refdir my_alternate_ref --createref
Then make your source code changes, then check that test suite still passes against the NEW set of reference files:
$ ./test_ezvision_blackbox.pl --refdir my_alternate_ref
This is already automated as part of make localtest
, which will build a fresh reference set the first time it is run, and will test against that reference set on subsequent runs.
Use great care with the --replaceref option! It will cause the test script to NON-interactively replace ALL non-matching reference files with the corresponding new test files. If you are going to use this option, it's generally a very good idea to first run the test script once without --replaceref
so you can see exactly which reference files would be replaced.
The --verbosity option controls how much information is printed. Allowable levels are -1, 0, 1, 2, 3, and 4: