Temporal spatio-velocity transform and its applications
Abstract
Object tracking is important in various applications such as video surveillance
systems, video annotation systems, and human interaction classification systems.
Occlusion and noise are the most significant problems in object tracking. To
overcome these problems, I introduce the temporal spatio-velocity (TSV) transform,
which extracts pixel velocities from image sequences. The TSV transform appends the
velocity axes to the image sequences and separates occluding objects based on their
velocities. The TSV transform is derived from the Hough transform over windowed
spatio-temporal images. I present the methodology of the transform and its
implementation in an iterative computational form. The intensity at each pixel of the TSV
image measures the likelihood that a pixel at that position is moving with the corresponding
instantaneous velocity. Binarization of the TSV image extracts blobs based on
the similarity of velocity and position. The TSV transform provides an efficient way to
remove noise by focusing on stable velocities and constructs noise-free blobs. In this
dissertation, I present three applications of the TSV transform: (i) a human interaction
recognition system, (ii) an object tracking system for occluding environments, and (iii) a
soccer player tracking system. The human interaction recognition
system uses side-view image sequences and tracks persons walking on sidewalks. It then
recognizes interactions between two persons, such as “two persons meet from
different directions” and “one person follows another person”. The system correctly
tracks persons and recognizes the interactions between them. The object tracking system
in occluding environments tracks moving objects behind static obstacles, such as trees
and fences. Although the static obstacles divide moving objects into several pieces both
temporally and spatially, the system correctly tracks the objects. The soccer player
tracking system tracks soccer players and referees in ordinary TV broadcast
images. Although the soccer players make complex movements and the camera moves
frequently, the system correctly tracks the players and referees.
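
The iterative form and the binarization step described above can be illustrated with a minimal sketch. It assumes one binary image row per frame, integer candidate velocities in pixels per frame, and an exponential temporal window. The names `tsv_update` and `tsv_blobs`, the `decay` parameter, and the threshold value are hypothetical choices made for illustration; they are not taken from the dissertation and may differ from its formulation.

```python
def tsv_update(prev, frame, velocities, decay=0.7):
    """One iterative step of a TSV-like accumulator (illustrative assumption).

    prev:       list of rows, one per candidate velocity, each of len(frame)
    frame:      current binary image row (1.0 = moving pixel, 0.0 = background)
    velocities: integer candidate velocities, in pixels per frame
    decay:      weight of past evidence (assumed exponential window)
    """
    width = len(frame)
    updated = []
    for row, v in zip(prev, velocities):
        new_row = []
        for x in range(width):
            # Evidence at x for velocity v was located at x - v one frame ago.
            src = x - v
            carried = row[src] if 0 <= src < width else 0.0
            new_row.append(decay * carried + (1.0 - decay) * frame[x])
        updated.append(new_row)
    return updated


def tsv_blobs(acc, threshold=0.5):
    """Binarize the accumulator: keep cells with stable velocity evidence."""
    return [[value >= threshold for value in row] for row in acc]


# Synthetic input: a single point moving right at 2 pixels per frame.
velocities = [-2, -1, 0, 1, 2]
width = 32
acc = [[0.0] * width for _ in velocities]
for n in range(12):
    frame = [0.0] * width
    frame[3 + 2 * n] = 1.0
    acc = tsv_update(acc, frame, velocities)

mask = tsv_blobs(acc)
hits = [(velocities[i], x)
        for i, row in enumerate(mask)
        for x, on in enumerate(row) if on]
print(hits)  # (velocity, position) cells that survive the threshold
```

In this sketch, only the cell whose velocity matches the motion accumulates evidence across frames; cells for other velocities receive a single decayed contribution each and fall below the threshold, which is the sense in which thresholding favors stable velocities.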