1.2. Previous work
An extensive body of literature has been accumulated in the computer vision community regarding the
study of motion. Most of this literature focuses on the structure-from-motion problem,² which involves
the computation of camera, object, and/or environmental parameters based on relative motion between these entities. Often, assumptions are made in visual motion research that prohibit the use of the proposed techniques in applications such as robotic visual servoing. A large number of previously proposed systems exhibit one or more of the following characteristics:
• Systems which avoid the visual detection issue altogether. They assume that their methods are applied on an image after the presence of moving objects has been identified and measured.
• Systems which are applied to artificially trivial conditions that do not occur in natural settings.
In our work, we have tried to avoid these assumptions, specifically focusing on addressing the goal of visually detecting objects of interest for robotic visual servoing.
The majority of existing attempts at detecting moving objects have employed either optical flow or frame-differencing techniques. Optical flow methods are interesting, since they naturally encompass ego-motion of the camera (although some optical flow methods have the offsetting disadvantage of actually requiring ego-motion). For instance, Jain, Nelson, and Thompson and Pon⁶ have compiled a collection of optical flow-based motion detection algorithms which detect a moving object as an inconsistency in some constraint on the optical flow field. Some of these optical flow-based algorithms use a constraint that is based on the orientation of motion vectors away from a focus of expansion (FOE). However, algorithms using the FOE constraint are not reliable when the distance between a moving object and the FOE is small. Another common optical flow constraint is the assumed relation between optical flow gradients and corresponding depth disparities (typically computed with a stereo vision system). Instead of using a stereo vision system in our research, we have restricted ourselves to monocular systems that can acquire visual information with relatively unsophisticated off-the-shelf sensor devices.
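As an illustration of the FOE constraint described above, the following minimal sketch flags flow vectors whose direction deviates from the radial direction pointing away from the FOE. It is not the algorithm of any of the cited authors. The function name `foe_inconsistent`, the angular threshold `max_angle_deg`, and the cutoff `min_radius` are hypothetical choices made only for this example. Points closer to the FOE than `min_radius` are left undecided, which mirrors the unreliability noted above when a moving object lies near the FOE.

```python
import math


def foe_inconsistent(points, flows, foe, max_angle_deg=20.0, min_radius=5.0):
    """Label flow vectors against a focus-of-expansion (FOE) constraint.

    Each label is True (flow does not point away from the FOE, so the
    point may belong to a moving object), False (flow is consistent with
    the constraint), or None (no decision: the point is too close to the
    FOE, or its flow is zero). Both thresholds are illustrative
    assumptions, not values taken from the literature.
    """
    labels = []
    for (x, y), (u, v) in zip(points, flows):
        # Radial direction from the FOE to the image point.
        rx, ry = x - foe[0], y - foe[1]
        radius = math.hypot(rx, ry)
        speed = math.hypot(u, v)
        if radius < min_radius or speed == 0.0:
            labels.append(None)
            continue
        # Angle between the flow vector and the radial direction.
        cos_angle = (rx * u + ry * v) / (radius * speed)
        cos_angle = max(-1.0, min(1.0, cos_angle))
        angle = math.degrees(math.acos(cos_angle))
        labels.append(angle > max_angle_deg)
    return labels


# Hypothetical image points and flow vectors, with the FOE at (100, 100).
points = [(110.0, 100.0), (100.0, 150.0), (60.0, 100.0), (101.0, 100.0)]
flows = [(2.0, 0.0), (0.0, 3.0), (0.0, 2.0), (1.0, 0.0)]
labels = foe_inconsistent(points, flows, foe=(100.0, 100.0))
print(labels)  # [False, False, True, None]
```

The first two vectors point directly away from the FOE and are accepted. The third is perpendicular to its radial direction and is flagged. The fourth lies one pixel from the FOE, where the radial direction is too poorly defined to support a decision.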
In contrast to detection based on optical flow, our framework shares many characteristics with other frame-differencing techniques. An example of these is the system developed by Anderson et al. They detect motion through the use of the Gaussian/Laplacian pyramid, which Burt and others have used in a variety of computer vision systems. The pyramid is applied to the difference between the current input frame and the previous input frame. This has the disadvantage of signaling only the appearing and disappearing edges of a moving object. Moreover, the difference responds to large objects in much the same way as it does to fast ones. Neither of these situations occurs in our approach. The Anderson method⁷ uses another Gaussian pyramid to facilitate subsampling down to a level where motion segmentation can be performed by a general-purpose computer. However, subsampling is restricted to the logarithmic levels
1.3. Structure of the paper
This paper incrementally presents each one of the