Segmentation and Tracking of Multiple Video Objects

with A. Colombari and V. Murino
Department of Computer Science, University of Verona, Verona, Italy

Overview

This work describes a technique that produces a content-based representation of a video shot composed of a (still) background mosaic and one or more moving foreground objects. Segmentation of moving objects is based on ego-motion compensation and on background modelling using tools from robust statistics. Region matching is carried out by an algorithm that operates on the Mahalanobis distance between region descriptors in two consecutive frames and uses singular value decomposition to compute a set of correspondences satisfying both the principle of proximity and the principle of exclusion. The sequence is represented as a layered graph, and specific techniques are introduced to cope with crossing and occlusion.

The scheme shows all the steps of our system. The input is a video shot; the output consists of shape descriptors for each moving object.
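The region-matching step described above can be illustrated with a minimal sketch in the spirit of SVD-based correspondence methods such as that of Scott and Longuet-Higgins. It is not taken from the paper: the function name `match_regions`, the Gaussian weighting with scale `sigma`, the single shared covariance matrix `cov`, and the toy descriptors are all assumptions made for illustration. Proximity is encoded by weights that decay with the Mahalanobis distance, and exclusion is enforced by keeping only pairs that dominate both their row and their column.

```python
import numpy as np

def match_regions(desc_a, desc_b, cov, sigma=1.0):
    """Hypothetical helper: one-to-one matches between two descriptor sets.

    desc_a: (m, d) descriptors of regions in frame t
    desc_b: (n, d) descriptors of regions in frame t+1
    cov:    (d, d) covariance used for the Mahalanobis distance (assumed shared)
    sigma:  assumed scale of the Gaussian proximity weighting
    """
    a = np.asarray(desc_a, dtype=float)
    b = np.asarray(desc_b, dtype=float)
    cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))

    # Squared Mahalanobis distance between every pair of regions: shape (m, n).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)

    # Proximity matrix: large entries for similar regions.
    g = np.exp(-d2 / (2.0 * sigma ** 2))

    # Replace the singular values of G with ones (P = U E V^T).
    u, _, vt = np.linalg.svd(g, full_matrices=False)
    p = u @ vt

    # Exclusion: accept (i, j) only if P[i, j] is the largest entry
    # in both row i and column j.
    row_best = p.argmax(axis=1)
    col_best = p.argmax(axis=0)
    return [(i, int(j)) for i, j in enumerate(row_best) if col_best[j] == i]

# Toy example: three regions whose descriptors are slightly perturbed
# and listed in a different order in the second frame.
frame_t = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
frame_t1 = [[10.2, 0.1], [0.1, -0.1], [5.1, 4.9]]
matches = match_regions(frame_t, frame_t1, cov=np.eye(2), sigma=2.0)
print(matches)
```

Each returned pair `(i, j)` links region `i` in the first frame to region `j` in the second. A region with no mutually dominant partner is left unmatched, which is one plausible way to flag objects that appear, disappear, or are occluded.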

Results

Sequence name	Original sequence	Mosaic of background	Foreground sequence	Encoded/decoded sequence
Granguardia
Pedone
Stefan
Synthetic

Sequence name	Motion compensation	Stroboscopic-like summary	Objects removal	Background substitution/modification
Granguardia
Pedone
Stefan
Synthetic

Reference paper

A. Colombari, A. Fusiello, and V. Murino. Segmentation and tracking of multiple video objects. Pattern Recognition, 40(4):1307-1317, April 2007. (PDF)

Previous work

Global Mosaic and Motion Segmentation

Mosaicing and layered representation of a video shot