4. What is the long-term vision behind the Beobot philosophy?
Animals demonstrate unparalleled abilities to interact with their
natural visual environment, a task that remains embarrassingly
problematic for machines. Vision is obviously computationally
expensive: each optic nerve comprises a million distinct nerve fibers,
and approximately half of the mammalian brain is dedicated more or
less closely to vision. Thus, for a long time, the poor real-time performance
of machine vision systems could be attributed to limitations in
computer processing power. With the recent availability of low-cost
supercomputers, such as so-called Beowulf clusters of standard
interconnected personal computers, however, this excuse is rapidly
losing credibility. What, then, explains the dramatic
discrepancy between animal and machine vision? Too often, computer
vision algorithms are designed with a specific goal and setting in
mind, e.g., detecting traffic signs by matching geometric and
colorimetric models of specific signs to image features. Consequently,
dedicated tuning or algorithmic alterations are typically required to
accommodate novel environments, targets, or tasks. For example, an
algorithm to detect traffic signs from images acquired by a
vehicle-mounted camera will typically not be trivially applicable to
the detection of military vehicles in overhead imagery.
Much progress has been made in the field of visual neuroscience,
using techniques such as single neuron electrophysiology,
psychophysics and functional neuroimaging. Together, these
experimental advances have set the basis for a deeper understanding of
biological vision. Computational modeling has also seen recent
advances, and fairly accurate software models of specific parts or
properties of the primate visual system are now available, which show
great promise of unparalleled robustness, versatility and
adaptability. A common shortcoming of computational neuroscience
models, however, is that they are not readily applicable to real
images. Neuromorphic engineering proposes to address this
problem by establishing a bridge between computational neuroscience
and machine vision. One neuromorphic algorithm developed in
our laboratory is our model of
bottom-up, saliency-based visual attention, which has demonstrated
a strong ability to quickly locate not only traffic signs in 512x384
video frames from a vehicle-mounted camera, but also -- without any
modification or parameter tuning -- artificial targets in
psychophysical visual search arrays, soda cans, emergency triangles,
faces and people, military vehicles in 6144x4096 overhead imagery, and
many other types of targets.
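The core intuition behind such saliency-based attention is that a location is conspicuous when it differs from its local surround. The sketch below is only a crude, single-channel illustration of that center-surround principle, not the actual multi-scale, multi-channel model (which combines color, intensity, and orientation features across an image pyramid); all function names here are made up for the example.

```python
# Crude illustration of center-surround saliency on an intensity image:
# "center" is the pixel value, "surround" is the local mean, and
# conspicuity is their absolute difference. The real model uses
# multi-scale pyramids over several feature channels.

def box_blur(img, radius):
    """Mean filter with the given radius (window clamped at borders)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out

def saliency(img, radius=2):
    """Center-surround difference: |pixel - local mean|."""
    surround = box_blur(img, radius)
    return [[abs(img[y][x] - surround[y][x]) for x in range(len(img[0]))]
            for y in range(len(img))]

# A lone bright spot on a dark background pops out as the most
# salient location, with no target-specific tuning at all.
frame = [[0.0] * 9 for _ in range(9)]
frame[4][4] = 1.0
sal = saliency(frame)
peak = max((sal[y][x], (y, x)) for y in range(9) for x in range(9))
```

The same code finds the odd-one-out regardless of what the target is, which is the property that lets one attention front-end serve traffic signs, faces, and military vehicles alike.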
The new Beobot robotics platform is a test-bed aimed at
demonstrating how neuromorphic algorithms may yield a fresh
perspective upon traditionally hard engineering problems, including
computer vision, navigation, sensorimotor coordination, and decision
making under time pressure. This contrasts with the motivation behind
biorobots, which aim to resemble animal systems physically and
mechanically. To exploit real-time video streams and to base
control effectively on computationally demanding neuromorphic vision
algorithms, our new robots couple a small Beowulf cluster with a
low-cost but agile four-wheel-drive chassis, together forming a
Beowulf-robot, or Beobot.
What will Beobots be capable of that existing robots cannot already
achieve? Most robots have underemphasized the vision component that is
our main focus, and rely instead on dedicated sensors including laser
range finders and sonars. Much progress has been made in developing
very impressive physically capable robots (e.g., Honda
humanoid). In addition, very sophisticated and powerful algorithms
are now available that make robots intelligent (e.g., teams
of robots collaborating towards a common goal at the USC robotics
lab). However, we believe robots can still be made much more
visually capable: current systems often rely on simplified,
low-computation visual tricks, which greatly limit
their autonomy and restrict their applicability to perfectly
well-defined, prototypical environments.
5. How does a Beobot work?
The Beobot is basically a high-load R/C car carrying a small
cluster of PC computers on top of it. The PCs use flash
memory in place of hard drives and run a standard
Linux OS. iLab's neuromorphic vision software, which
includes subsystems for focal visual attention, object recognition,
rapid computation of scene gist and layout, and for scene
understanding, runs on the PC cluster. The PCs use serial ports
to communicate with the R/C car engine, telling it how much to turn
and how fast to go. A FireWire interface connects to the cameras,
which constantly feed video data that the navigation software can
decipher and act on.
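The cluster-to-car serial link amounts to periodically writing small drive commands to a serial port. The sketch below shows one plausible way to frame such a command; the byte layout, function names, and device path are illustrative assumptions, not iLab's actual protocol.

```python
# Hypothetical sketch of the PC-to-R/C-car serial link described above.
# The frame format (start marker, speed byte, steering byte, XOR
# checksum) is an assumption made for illustration.

def encode_drive_command(speed, steering):
    """Pack speed and steering (each in -1.0 .. 1.0) into a 4-byte
    frame: start marker, speed byte, steering byte, XOR checksum."""
    def to_byte(v):
        v = max(-1.0, min(1.0, v))          # clamp to valid range
        return int(round((v + 1.0) * 127.5))  # map [-1, 1] -> [0, 255]
    start = 0xFF
    s, t = to_byte(speed), to_byte(steering)
    checksum = start ^ s ^ t
    return bytes([start, s, t, checksum])

# With pyserial (one possible serial API; the port name is a guess),
# the frame would be written to the port wired to the car's controller:
#   import serial
#   port = serial.Serial("/dev/ttyS0", 9600)
#   port.write(encode_drive_command(0.5, -0.2))  # half throttle, slight left
```

A checksum of some kind matters here because the link carries steering commands to a fast-moving vehicle, where a corrupted byte should be dropped rather than acted on.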