Person Tracking

Multi-Person Tracking

Person tracking is one of the main research topics of the CVHCI group. Many of the other topics investigated, such as face identification, body pose analysis, or focus-of-attention analysis, depend on first detecting and localizing persons. The algorithms developed are designed to run fully automatically: they initialize tracks for one or several persons, adapt person models, and handle inter-person dependencies such as occlusions.

For this, multiple types of features are considered and integrated, where possible using probabilistic, real-time-capable techniques such as particle filters, variations thereof, or other recursive Bayesian estimation methods. Across all person tracking subtopics and applications, special attention is paid to the real-time performance of the algorithms and to their usability in realistic, relatively unconstrained scenarios. As a result, most of the tracking systems developed can be integrated with other components and readily demonstrated in fully functional live systems.
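To make the idea of recursive Bayesian estimation with a particle filter concrete, here is a minimal sketch of one predict/update/resample cycle for a single person's 1-D position. It is an illustration only, not the CVHCI implementation: the function names, the random-walk motion model, the Gaussian measurement model, and all parameter values are assumptions chosen for brevity.

```python
import math
import random


def particle_filter_step(particles, weights, measurement,
                         motion_std=0.5, meas_std=1.0, rng=random):
    """One cycle of a bootstrap particle filter over 1-D positions.

    All models and parameters here are illustrative assumptions.
    """
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Update: reweight each particle by a Gaussian measurement likelihood.
    new_weights = [
        w * math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
        for p, w in zip(predicted, weights)
    ]
    total = sum(new_weights)
    if total == 0.0:
        # Every particle is far from the measurement: fall back to uniform.
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]

    # Resample: draw particles in proportion to their weights (multinomial),
    # then reset the weights to uniform.
    resampled = rng.choices(predicted, weights=new_weights, k=len(predicted))
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, uniform


def estimate(particles, weights):
    """Weighted mean of the particle set, used as the position estimate."""
    return sum(p * w for p, w in zip(particles, weights))


# Usage: track a target drifting from 5.0 to 5.6, starting from a
# uniform prior over the interval [0, 10].
rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
weights = [1.0 / 500] * 500
for z in [5.0, 5.2, 5.4, 5.6]:
    particles, weights = particle_filter_step(particles, weights, z, rng=rng)
est = estimate(particles, weights)
print(est)
```

A real tracker would use a multi-dimensional state (position, velocity, size) and likelihoods computed from image or audio features, but the three-step cycle stays the same.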

Three main areas of research are currently being pursued in the field of person tracking at the CVHCI group:

  • The simultaneous tracking of multiple persons in smart environments, using multiple sensors (cameras and microphones). Research on fully integrated and automatic tracking systems for smart environments was started in the course of the EU-funded CHIL project, in a friendly competition among consortium partners, and remains an active topic today. Tracking algorithms developed at CVHCI were evaluated in the international CLEAR evaluation workshops, and ranked first and second for the CLEAR 2007 tasks of audio-visual and visual tracking, respectively. Further, integrated systems performing audio-visual tracking and simultaneous run-on audio-visual identification of multiple persons using faces and voices in unconstrained setups have also been developed.
  • The tracking of multiple users from a robot's perspective, using binocular camera setups. The bulk of the research here is driven by the humanoid research project SFB 588. To allow for natural human-robot interaction, a stereo-camera tracking system was developed that detects and tracks multiple users in real time under varying appearances while using few computational resources. The system has been successfully tested and applied in a wide range of practical HCI applications.
  • The tracking of single or multiple persons through a network of distributed cameras covering wide areas. One of the directions pursued here is the simultaneous tracking of several users in a closed-set office building scenario, using simple high-level features. These features include, for example, hair color, presence of glasses, clothing color, and gender. Bayesian methods that allow for robust tracking and graceful degradation when faced with severe sensor or feature extraction failures are investigated. A relatively new direction is the inter-camera tracking of single individuals in crowded environments, such as train stations or airports. The focus here is on the determination of adequate, flexible features, the automatic, unsupervised creation of multi-view person models, and the accurate re-identification of persons under extremely challenging conditions.
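As a rough illustration of how simple high-level features can drive Bayesian identity estimation in a closed-set scenario, the sketch below updates a posterior over known identities from observed features and skips any feature whose extractor failed. It is not the group's method: it assumes conditionally independent features (a naive Bayes model), and the function name, the identities, and all probability values are hypothetical.

```python
def update_identity_posterior(prior, observations, likelihoods, floor=1e-3):
    """Return P(identity | observed features) under a naive Bayes assumption.

    prior:        dict mapping identity -> prior probability
    observations: dict mapping feature -> observed value, or None if the
                  feature extractor failed for this frame
    likelihoods:  dict mapping feature -> identity -> value -> P(value | identity)
    floor:        minimum likelihood, so one unexpected value cannot
                  eliminate an identity outright
    """
    posterior = dict(prior)
    for feature, value in observations.items():
        if value is None:
            # Failed extraction: this feature carries no evidence, so the
            # posterior degrades gracefully towards the prior.
            continue
        for person in posterior:
            p = likelihoods[feature][person].get(value, floor)
            posterior[person] *= max(p, floor)
    total = sum(posterior.values())
    return {person: p / total for person, p in posterior.items()}


# Hypothetical closed set of two persons and two features.
prior = {"alice": 0.5, "bob": 0.5}
likelihoods = {
    "hair_color": {
        "alice": {"dark": 0.8, "blond": 0.2},
        "bob": {"dark": 0.3, "blond": 0.7},
    },
    "glasses": {
        "alice": {True: 0.9, False: 0.1},
        "bob": {True: 0.2, False: 0.8},
    },
}

full = update_identity_posterior(
    prior, {"hair_color": "dark", "glasses": True}, likelihoods)
partial = update_identity_posterior(
    prior, {"hair_color": "dark", "glasses": None}, likelihoods)
nothing = update_identity_posterior(
    prior, {"hair_color": None, "glasses": None}, likelihoods)
print(full, partial, nothing)
```

With both features observed the posterior is sharply peaked; with one feature missing it is less certain; with none available it falls back to the prior, which is the kind of graceful degradation described above.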

Apart from these main research areas, where much of the focus is on detection and tracking itself, tracking algorithms of varying sophistication are developed and integrated as components in other applications. These include, for example, the tracking of faces in video sequences for more effective feature extraction and identification, the tracking of head pose and facial features using Active Appearance Models, and the tracking of heads and upper bodies in the moving images from PTZ cameras.

Selected Recent Publications:

  • Keni Bernardin and Rainer Stiefelhagen, Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics, EURASIP Journal on Image and Video Processing, Special Issue on Video Tracking in Complex Scenes for Surveillance Applications, Volume 2008, Article ID 246309, May 2008.
  • Rainer Stiefelhagen, Keni Bernardin, Rachel Bowers, Travis Rose, Martial Michel and John Garofolo, The CLEAR 2007 Evaluation, Proceedings of the International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, MD, USA, May 8-11, 2007, Springer Lecture Notes in Computer Science, No. 4625, pp. 3-34.
  • Keni Bernardin, Rainer Stiefelhagen and Alex Waibel, Probabilistic Integration of Sparse Audio-Visual Cues for Identity Tracking, ACM Multimedia 2008, October 27-31, 2008, Vancouver, BC, Canada.
  • Kai Nickel and Rainer Stiefelhagen, Dynamic Integration of Generalized Cues for Person Tracking, 10th European Conference on Computer Vision - ECCV'08, October 12-18, Marseille, France.
  • Florian van de Camp, Keni Bernardin and Rainer Stiefelhagen, Person Tracking in Camera Networks Using Graph-Based Bayesian Inference, Third ACM / IEEE International Conference on Distributed Smart Cameras (ICDSC), Como, Italy, August 2009.