Speaker: Trevor Darrell
Massachusetts Institute of Technology, CSAIL
Wednesday, October 11, 2006
11:00 am - 12:00 pm
EBU3b 1202
ABSTRACT
Devices should be perceptive and respond directly to their human users and their environment.
In this talk I'll present new computer vision algorithms for fast recognition, indexing, and tracking
that make this possible, enabling multimodal interfaces which respond to users' conversational gesture
and body language, robots which recognize common object categories, and mobile devices which can search
using visual cues of specific objects of interest. As time permits, I'll describe recent advances
in real-time human pose tracking for multimodal interfaces, including new methods which
exploit fast computation of approximate likelihood with a pose-sensitive image embedding.
I'll also present our linear-time approximate correspondence kernel, the Pyramid Match,
and its use for image indexing, object recognition, and discovery of object categories.
Throughout the talk, I'll show interface examples including grounded multimodal conversation
as well as mobile image-based information retrieval applications based on these techniques.
BIO
Trevor Darrell is an Associate Professor
of Electrical Engineering and Computer Science at M.I.T. He leads
the Vision Interface Group at the Computer Science and Artificial Intelligence Laboratory.
His interests include computer vision, interactive graphics, and machine learning. Prior
to joining the faculty of MIT, he worked as a Member of the Research Staff at Interval Research
in Palo Alto, CA, researching vision-based interface algorithms for consumer applications.
He received his Ph.D. and S.M. from the MIT Media Lab in 1996 and 1991, respectively, and a B.S.E. while
working at the GRASP Robotics Laboratory at the University of Pennsylvania in 1988.