Recognizing Objects and Actions in Images and Video

Speaker: Jitendra Malik
UC Berkeley

Monday, February 10, 2003
11:00 am to 12:00 pm
AP&M Room 4301

ABSTRACT

The central problem in computer vision is that of "understanding" images and video. I will talk about recent progress at UC Berkeley on two principal components: recognizing objects and recognizing actions.

The object recognition problem is that of finding instances of object classes in images or video sequences: faces, giraffes, the digit 5, chairs, etc. This has to be accomplished while allowing for intra-class variation, as well as changes in illumination and viewpoint. Belongie, Malik and Puzicha (2001) introduced a relational descriptor for shapes represented as point sets, the "shape context". This enables one to compute similarity measures between shapes which, together with similarity measures for texture and color, can be used to drive object recognition. I will show further steps toward a complete theory of object recognition based on shape contexts. These include (1) algorithmic speedups for finding likely matches at a computational complexity sublinear in the number of models, (2) dealing with scene clutter, and (3) adaptive measures of shape distance for discriminative categorization. I will show results on a variety of 2D and 3D recognition problems.
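
As a rough illustration of the shape context idea, the following sketch builds, for each point of a 2D point set, a log-polar histogram of where the remaining points lie relative to it, and compares two such histograms with a chi-square cost. This is a minimal sketch under stated assumptions and not the speaker's implementation: the function names, the bin counts, and the radial limits are illustrative choices, and the input is assumed to contain at least two distinct points.

```python
import math


def shape_contexts(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return one log-polar histogram (flat list of n_r * n_theta counts) per point.

    Bin counts and radii are illustrative defaults, not values taken from the talk.
    """
    n = len(points)
    dists = [[math.dist(p, q) for q in points] for p in points]
    # Dividing by the mean pairwise distance makes the descriptor scale-invariant.
    mean_dist = sum(map(sum, dists)) / (n * (n - 1))
    # Upper edges of the radial bins, spaced uniformly in log space.
    log_in, log_out = math.log(r_inner), math.log(r_outer)
    edges = [math.exp(log_in + (log_out - log_in) * k / (n_r - 1)) for k in range(n_r)]

    histograms = []
    for i, (xi, yi) in enumerate(points):
        hist = [0] * (n_r * n_theta)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            r = dists[i][j] / mean_dist
            r_bin = sum(1 for e in edges if r > e)
            if r_bin >= n_r:
                continue  # point lies beyond the outermost radial bin
            theta = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
            hist[r_bin * n_theta + t_bin] += 1
        histograms.append(hist)
    return histograms


def chi2_cost(g, h):
    """Chi-square distance between two histograms after normalizing each to sum 1."""
    sg, sh = sum(g) or 1, sum(h) or 1
    cost = 0.0
    for a, b in zip(g, h):
        a, b = a / sg, b / sh
        if a + b > 0:
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost


# Usage: eight points sampled from the outline of a square.
square = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 0), (4, 2), (2, 4), (0, 2)]
contexts = shape_contexts(square)
corner_vs_midpoint = chi2_cost(contexts[0], contexts[4])
```

Because each histogram records only relative positions, translating the point set leaves the descriptors unchanged. A full matcher would additionally solve a point-to-point assignment over these pairwise costs, which is omitted here.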

The action recognition problem is that of finding instances of actions in video sequences: run, jump, kick, etc. This has to be accomplished while allowing for variation in the person performing the action, clothing, illumination and viewpoint. We have developed two approaches to the recognition of actions. In low-resolution ("far field") data, the approach is based on collecting low-resolution optical flow measurements over a spatiotemporal volume for each moving figure, constructing a robust descriptor from this volume, and then matching these descriptors to stored sequences. We show generalization over person, clothing and illumination, while pose variations are dealt with in a multiple-view framework. In high-resolution ("near field") data, the approach is based on extracting stick figures in each frame and relying on joint-level human body tracking to provide a complete intermediate representation that is robust to lighting and clothing as well as pose.
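
The far-field pipeline described above (flow measurements over a spatiotemporal volume, a descriptor, matching against stored sequences) might be sketched as follows. The abstract does not say how the descriptor is built, so this sketch makes its own assumptions: optical flow is taken as already computed, each flow vector is split into four half-wave rectified channels, and matching uses normalized correlation with a nearest-neighbor rule. All function names and the data layout are hypothetical.

```python
import math


def motion_descriptor(flow_volume):
    """Flatten a spatiotemporal volume of flow vectors into one feature vector.

    flow_volume: list of frames; each frame is a list of rows of (fx, fy) tuples.
    Each vector is split into four non-negative channels (assumed design choice):
    rightward, leftward, and the positive and negative vertical components.
    """
    features = []
    for frame in flow_volume:
        for row in frame:
            for fx, fy in row:
                features.extend((max(fx, 0.0), max(-fx, 0.0),
                                 max(fy, 0.0), max(-fy, 0.0)))
    return features


def normalized_correlation(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(query_volume, stored):
    """Return the label of the stored volume whose descriptor best matches the query.

    stored: dict mapping an action label to a flow volume of the same shape.
    """
    query = motion_descriptor(query_volume)
    return max(
        stored,
        key=lambda label: normalized_correlation(query, motion_descriptor(stored[label])),
    )


def constant_flow(fx, fy, frames=3, height=2, width=2):
    """Toy volume in which every pixel of every frame has the same flow vector."""
    return [[[(fx, fy)] * width for _ in range(height)] for _ in range(frames)]


# Usage: three stored toy "actions" and a noisy rightward query.
stored = {
    "move_right": constant_flow(1.0, 0.0),
    "move_left": constant_flow(-1.0, 0.0),
    "move_vertical": constant_flow(0.0, 1.0),
}
label = classify(constant_flow(0.8, 0.1), stored)
```

Keeping the channels non-negative means opposite motions do not cancel when the channels are later smoothed or summed, which is one plausible route to a descriptor that tolerates noisy flow. Spatial blurring and temporal alignment of sequences are left out of this sketch.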

This talk is based on joint work; please visit http://http.cs.berkeley.edu/projects/vision/vision_group.html for pointers to publications.

9500 Gilman Drive, La Jolla, CA 92093-0404
Official web page of the University of California, San Diego
Copyright © 2003 Regents of the University of California. All rights reserved.