CS-599: Computational Architectures in Biological Vision
Course Syllabus
1/9: Course Overview and Fundamentals of Neuroscience.
Overview of course syllabus: Topics covered and their relevance to
neuroscience, computer vision, psychology and visual psychophysics,
and signal and image processing. Overview of the major challenges in
biological and computer vision: Why is vision hard when it seems so
effortless? Why is half of our brain primarily concerned with vision?
Computation and complexity; towards domestic robots: how far along are
we today? What can be learned from the interplay between biology and
computer science?
1/16: Neuroscience basics.
The brain and its gross anatomy; major
anatomical and functional areas; the spinal cord and nerves; neurons;
action potentials; demonstration of spike propagation in axons using
the NEURON simulator; different types of neurons; support machinery
and glial cells; synapses and inter-neuron communication;
neuromodulation; power consumption and supply; adaptability and
learning.
1/23: Experimental techniques in visual neuroscience.
Recording from single neurons: electrophysiology; multi-unit recording
using electrode arrays; stimulating while recording; anesthetized vs.
awake animals; single-neuron recording in awake humans;
probing the limits of vision: visual psychophysics;
functional neuroimaging; Positron Emission Tomography (PET) and
Single-Photon Emission Computed Tomography (SPECT); functional Magnetic Resonance
Imaging (fMRI) and the Blood Oxygen Level Dependent (BOLD) effect;
BOLD is not the end: faster response signals; experimental design issues;
blocked vs. event-related paradigms; optical imaging; Transcranial
Magnetic Stimulation (TMS).
1/30: Introduction to vision. Biological eyes
compared to cameras and VLSI
sensors; different types of eyes; optics;
theoretical signal processing limits in eyes and cameras; introduction
to Fourier transforms and their applicability to biological and
artificial vision; the Sampling Theorem; experimental probing of
theoretical limits (acuity and hyperacuity); phototransduction; organization of
photoreceptors in primate retina; processing layers in the retina;
adaptability and gain control.
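As a concrete preview of this lecture, the Sampling Theorem can be seen directly in a few lines of code. This is an illustrative sketch, not assigned material; the function name and all parameter values are invented for the demo. A sinusoid sampled below twice its frequency aliases to a lower frequency, while above the Nyquist rate it is recovered intact:

```python
# Illustrative demo (invented parameters): sampling a sinusoid above vs.
# below the Nyquist rate and reading off the dominant FFT frequency.
import numpy as np

def dominant_frequency(signal_hz, sample_rate_hz, duration_s=1.0):
    """Sample a sinusoid, then return the frequency of the largest FFT peak."""
    n = int(sample_rate_hz * duration_s)
    t = np.arange(n) / sample_rate_hz
    samples = np.sin(2 * np.pi * signal_hz * t)
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.arange(n // 2 + 1) * sample_rate_hz / n
    return freqs[np.argmax(spectrum)]

# Sampled at 100 Hz (well above Nyquist), a 10 Hz tone is recovered intact:
print(dominant_frequency(10, 100))   # -> 10.0
# Sampled at only 12 Hz (below Nyquist), the same tone aliases to 2 Hz:
print(dominant_frequency(10, 12))    # -> 2.0
```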
2/6: Introduction to Vision (continued). Leaving the eyes: the optic tracts and optic chiasm; associated
pathology and signal processing; the lateral geniculate nucleus of the
thalamus: the first relay station to cortical processing; image
processing in the LGN; notion of receptive
field; primary visual cortex; cortical magnification; retinotopic
mapping; overview of higher visual areas; visual processing
pathways.
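The notion of a receptive field introduced here has a compact computational form. A hedged sketch (the parameter values below are arbitrary choices, not course data): the classic model of an LGN center-surround receptive field is a Difference of Gaussians, which responds weakly to uniform illumination but strongly to a small spot over its center.

```python
# Arbitrary demo parameters: a Difference-of-Gaussians (DoG) kernel as the
# classic center-surround receptive-field model of an LGN neuron.
import numpy as np

def dog_kernel(size=21, sigma_center=1.0, sigma_surround=3.0):
    """Excitatory center Gaussian minus a broader inhibitory surround Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_center**2)) / (2 * np.pi * sigma_center**2)
    surround = np.exp(-r2 / (2 * sigma_surround**2)) / (2 * np.pi * sigma_surround**2)
    return center - surround

kernel = dog_kernel()
# Uniform illumination: center and surround nearly cancel (response ~ 0).
uniform_response = np.sum(kernel * np.ones((21, 21)))
# A bright spot over the center drives the model cell strongly.
spot = np.zeros((21, 21))
spot[10, 10] = 1.0
spot_response = np.sum(kernel * spot)
```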
2/13: Low-level processing and feature detection. Basis
transforms; introduction to wavelet transforms; optimal coding; jets;
texture segregation; satisfying the constraints of both texture
segregation and grouping; edges and boundaries; optimal filters for
edge detection; Markov random fields and their relevance to
biological vision; simple and complex cells; cortical gain control;
columnar organization of cortex, hypercolumns, and short-range
interactions; long-range horizontal connections and non-classical
surround modulation; how can artificial vision systems benefit from
these recent advances in neuroscience?
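The simple cells discussed in this lecture are commonly modeled with oriented Gabor filters. A minimal sketch under that standard assumption (all sizes, wavelengths, and sigmas are invented for the demo): a grating at the filter's preferred orientation yields a far larger response than an orthogonal one.

```python
# Invented parameters throughout: an even-symmetric Gabor filter as a model
# V1 simple cell, probed with gratings at two orientations.
import numpy as np

def oriented_pattern(size, wavelength, theta, sigma=None):
    """Cosine grating at orientation theta; if sigma is given, apply a
    Gaussian envelope (turning the grating into a Gabor filter)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    xr = xx * np.cos(theta) + yy * np.sin(theta)    # rotated coordinate
    pattern = np.cos(2 * np.pi * xr / wavelength)
    if sigma is not None:
        pattern = pattern * np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return pattern

cell = oriented_pattern(31, 8.0, theta=0.0, sigma=5.0)           # the Gabor "cell"
preferred = np.sum(cell * oriented_pattern(31, 8.0, 0.0))        # matched grating
orthogonal = np.sum(cell * oriented_pattern(31, 8.0, np.pi / 2)) # 90 degrees off
# The response at the preferred orientation dwarfs the orthogonal one.
```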
2/20: Coding and representation. Spiking vs. mean-rate neurons;
spike timing analysis; autocorrelation and power spectrum; population
coding; neurons as random variables; optimal methods for reading out
population codes; statistically efficient estimators, the Cramér-Rao bound
and Fisher Information; entropy; mutual information; principal
component analysis (PCA); independent component analysis (ICA);
application of these neuroscience analysis tools to engineering
problems where data is inherently noisy (e.g., consumer-grade video
cameras, VLSI implementations, computationally efficient approximate
implementations).
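One of the readout methods covered here, the population vector, is easy to sketch. Assuming idealized rectified-cosine tuning curves (a toy model invented for this demo, not recorded data), each neuron votes for its preferred direction with a weight given by its firing rate:

```python
# Toy model, not recorded data: idealized rectified-cosine tuning curves
# and the "population vector" readout of direction from firing rates.
import numpy as np

def population_vector_decode(stimulus_direction, n_neurons=64, peak_rate=50.0):
    """Decode a direction (radians) from a model population's firing rates."""
    preferred = np.linspace(0, 2 * np.pi, n_neurons, endpoint=False)
    # Cosine tuning, rectified so firing rates stay non-negative:
    rates = peak_rate * np.maximum(0.0, np.cos(stimulus_direction - preferred))
    # Each neuron votes for its preferred direction, weighted by its rate:
    x = np.sum(rates * np.cos(preferred))
    y = np.sum(rates * np.sin(preferred))
    return np.arctan2(y, x) % (2 * np.pi)

decoded = population_vector_decode(1.0)   # stimulus at 1 radian
```

With enough evenly spaced neurons the decoded angle lands very close to the true one; the Cramér-Rao bound covered in lecture sets the limit that any unbiased decoder can achieve.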
2/27: Stereoscopic vision. Challenges in stereo vision
and depth perception; the Correspondence Problem; inferring depth from
several 2D views; several cameras vs. one moving camera; brief
overview of epipolar geometry and depth computation; neurons tuned for
disparity; size constancy; do we segment objects first and then match
their projections in the two eyes to infer distance? Random-dot
stereograms ("magic eye" images): how do they work, and what do they
tell us about the brain?
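The Correspondence Problem can be sketched in its simplest rectified form: for a patch in the left image, search along the same row of the right image for the best-matching patch; the winning horizontal offset is the disparity, which is inversely related to depth. The synthetic 1-D example below is illustrative only.

```python
# Synthetic 1-D illustration (invented data): disparity by searching along a
# rectified epipolar line for the minimum sum-of-squared-differences patch.
import numpy as np

def match_disparity(left_row, right_row, x, patch=5, max_disparity=20):
    """Return the disparity d minimizing SSD between left[x : x+patch]
    and right[x-d : x-d+patch]."""
    template = left_row[x : x + patch]
    errors = [np.sum((template - right_row[x - d : x - d + patch]) ** 2)
              for d in range(max_disparity + 1)]
    return int(np.argmin(errors))

rng = np.random.default_rng(0)
left = rng.random(100)
right = np.roll(left, -7)   # a feature at x in the left row sits at x - 7 in the right
print(match_disparity(left, right, x=50))   # -> 7
```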
3/6: Perception of motion. Optic flow;
segmentation and regularization issues; efficient algorithms; robust
algorithms; the spatio-temporal energy model; computing the focus of
expansion and time-to-contact; motion-selective neurons in cortical
areas MT and MST.
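Time-to-contact, mentioned above, follows from image expansion alone: tau = theta / (d theta / dt), where theta is the object's image size, with no need for absolute distance or speed. A toy numerical sketch (all numbers invented for illustration):

```python
# All numbers invented for illustration. tau = theta / (d theta / dt):
# time-to-contact from two image-size measurements, no depth sensor needed.

def time_to_contact(size_now, size_prev, dt):
    """Estimate tau (seconds) from the relative expansion of the image."""
    expansion_rate = (size_now - size_prev) / dt
    return size_now / expansion_rate

# An object at distance Z approaching at speed v; image size scales as 1/Z.
v, dt = 5.0, 0.1            # 5 m/s closing speed, 0.1 s between frames
z_prev = 10.0
z_now = z_prev - v * dt
size_prev, size_now = 1.0 / z_prev, 1.0 / z_now
tau = time_to_contact(size_now, size_prev, dt)
# The backward-difference estimate works out to exactly z_prev / v = 2.0 s,
# close to the true remaining time z_now / v = 1.9 s.
```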
3/13: Spring recess
3/20: Color perception. Color-sensitive photoreceptors (cones);
visible wavelengths and light absorption; the Color Constancy problem:
how can we build stable percepts of color despite variations in
illumination, shadows, etc.?
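One of the simplest computational answers to the Color Constancy problem is the classic "gray world" algorithm: assume the scene's average reflectance is gray, attribute any deviation of the image's mean color from gray to the illuminant, and divide it out. A hedged sketch on synthetic data (all values invented):

```python
# "Gray world" color constancy on synthetic data (all values invented).
import numpy as np

def gray_world(image):
    """image: H x W x 3 RGB array. Divide out the estimated illuminant."""
    mean_rgb = image.reshape(-1, 3).mean(axis=0)   # per-channel means
    gray = mean_rgb.mean()                         # overall gray level
    return image * (gray / mean_rgb)               # rescale each channel

# A scene of random reflectances viewed under a reddish illuminant:
rng = np.random.default_rng(1)
scene = rng.random((32, 32, 3))
illuminant = np.array([1.5, 1.0, 0.7])
corrected = gray_world(scene * illuminant)
# After correction the three channel means are equal again (the "gray" assumption).
```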
3/27: Visual illusions. What illusions can teach us about
the brain; examples of illusions; which of the subsystems studied so
far does each illusion tell us about? What computational explanations
can we find for many of these illusions?
4/3: Visual attention. Several kinds of attention;
image-driven (bottom-up) and volitional (top-down) attentional
control; overt (involving eye movements) and covert (shifting attention
with your eyes fixed) modes of attention; attentional modulation of
early visual processing; how can understanding attention contribute to
computer vision systems? Biological models of attention; change
blindness; attention and awareness; engineering applications of
attention: image compression, target detection, evaluation of advertising, and more.
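In the spirit of the bottom-up saliency models covered here (e.g. the Itti-Koch architecture, drastically simplified to a single intensity channel for this demo), saliency can be sketched as a center-surround difference between an image and a coarse, downsampled version of itself. The details below are illustrative choices, not the published model:

```python
# Drastically simplified, single-channel sketch of center-surround saliency
# (illustrative only; published models use multiple scales and feature channels).
import numpy as np

def center_surround_saliency(image, k=4):
    """Saliency as |center - surround|: the surround is a k-times-downsampled
    block average, upsampled back to image size."""
    h, w = image.shape
    blocks = image[: h - h % k, : w - w % k].reshape(h // k, k, w // k, k)
    coarse = blocks.mean(axis=(1, 3))              # the "surround" scale
    surround = np.kron(coarse, np.ones((k, k)))    # nearest-neighbor upsample
    return np.abs(image[: surround.shape[0], : surround.shape[1]] - surround)

# A single bright dot on a uniform field "pops out" as the saliency maximum:
img = np.zeros((32, 32))
img[10, 20] = 1.0
sal = center_surround_saliency(img)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(sal), sal.shape))
print(peak)   # -> (10, 20)
```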
4/10: Shape perception and scene analysis. Shape-selective
neurons in cortical area IT; coding: one neuron per object
("grandmother cell") or population codes and distributed
representations? Biologically-inspired algorithms for shape
perception; the "gist" of a scene: how can we extract it in 100 ms or
less? Visual memory: how much do we remember of what we have seen? The
world as an outside memory and our eyes as a lookup tool; change
blindness.
4/17: Object recognition. The basic issues: translation and
rotation invariance; neural models that do it; 3D viewpoint invariance
(data and models); Classical computer vision approaches: template
matching and matched filters; wavelet transforms; correlation;
etc. Examples: face
recognition. More examples of biologically-inspired object
recognition systems that work remarkably well [looking for local
features in certain configurations (Perona et al., Caltech); using
support-vector machines to build trainable classifiers (Poggio et al.,
MIT); using wavelet transforms and dynamic link matching (von der
Malsburg et al., USC)].
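The classical baseline named above, template matching, can be sketched as normalized cross-correlation slid over every image position. This raw form has none of the invariances the lecture discusses; the code and data below are a toy illustration only:

```python
# Toy illustration (invented data): template matching by normalized
# cross-correlation, exhaustively slid over a search image.
import numpy as np

def best_match(image, template):
    """Return the (row, col) whose window best correlates with the template."""
    th, tw = template.shape
    t = template - template.mean()
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r : r + th, c : c + tw]
            w = w - w.mean()
            denom = np.sqrt(np.sum(w * w) * np.sum(t * t))
            score = np.sum(w * t) / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

rng = np.random.default_rng(2)
image = rng.random((40, 40))
template = image[12:20, 25:33].copy()   # an 8 x 8 crop of the image itself
print(best_match(image, template))      # -> (12, 25)
```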
4/24: Computer graphics, virtual reality and robotics.
Exploiting the limitations of the human visual system when generating
computer animations; linking vision systems to robots; visuo-motor
interaction; real-time and parallel implementations; distributed intelligence; towards conscious machines.
After completing this course, students will have a broad understanding
of the major challenges in biological and machine vision. Most importantly,
they will be familiar with the main concepts, theories, experimental
techniques, and findings of state-of-the-art visual neuroscience. This
will allow them to readily understand new developments in neuroscience
and apply these results to the design of
innovative algorithms for computer vision.