The basic skills of perception, planning, and language understanding are critical for robots to perform tasks in human environments. For example, an industrial robot needs to detect the objects to be manipulated, plan its motions, and communicate with the human operator. A self-driving robot needs to detect objects on the road, plan where to drive, and also communicate with the passenger.

In our lab, instead of manually "programming" our robots, we take a machine learning approach where we use a variety of data and learning methods to train our robots. Our robots learn from watching (3D) images on the Internet, from observing people via RGB-D cameras, from observing users playing video games, and from humans giving feedback to the robot.

Here are a few example videos showing our robots performing tasks using our learning algorithms.


Large-scale Knowledge-Engine for Robots. It learns concepts by searching the Internet. It can interpret natural language text, images, and videos. It watches humans and learns from interacting with them.

Robobarista: Generalizing Manipulation in 3D Pointclouds.

Using deep learning, the robot learns to transfer manipulation trajectories to novel objects, utilizing a large collection of crowd-sourced demonstrations.
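The transfer idea can be sketched as nearest-neighbor retrieval in an embedding space: embed the novel object part, then reuse the demonstrated trajectory of the most similar part. The `embed` function below is a hypothetical placeholder (simple shape statistics), not the actual learned deep embedding.

```python
import numpy as np

def embed(point_cloud):
    # Hypothetical stand-in for a learned deep embedding: summarize a
    # point cloud (N x 3 array) by its per-axis mean and spread.
    return np.concatenate([point_cloud.mean(axis=0), point_cloud.std(axis=0)])

def transfer_trajectory(novel_part, demonstrations):
    """Return the demonstrated trajectory whose object part is closest
    to the novel part in embedding space."""
    query = embed(novel_part)
    best = min(demonstrations,
               key=lambda d: np.linalg.norm(embed(d["part"]) - query))
    return best["trajectory"]
```

A real system would learn the embedding jointly from point clouds, language, and trajectories; the retrieval step stays the same.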

Brain4Cars: Machine Learning for Smart Car Cabins.

Cabin Sensing for Safe and Personalized Driving. Our algorithms anticipate the driver's maneuvers seconds before they occur.

Tell Me Dave: Grounding Language to Manipulation.

Grounding language to actions for a given environment taking into account ambiguity in the language and variations in the environment. Learn from users playing video games!

PlanIt: Learning User Preferences.

The robot learns context-driven user preferences over motion plans via sub-optimal feedback in a co-active learning setting. Learn from online rating feedback and from interactive feedback on the robot!
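The co-active idea can be sketched as a perceptron-style update: when the user slightly improves the robot's proposed trajectory, the preference weights move toward the improved trajectory's features, even though the user's correction need not be optimal. The linear feature scoring below is illustrative, not the actual PlanIt feature set.

```python
import numpy as np

def score(w, features):
    # Linear preference score over trajectory features.
    return float(np.dot(w, features))

def coactive_update(w, proposed_feats, improved_feats, lr=1.0):
    """Co-active learning step: the user's (possibly sub-optimal)
    improvement is preferred over the robot's proposal, so shift the
    weights toward the improved trajectory's features."""
    return w + lr * (improved_feats - proposed_feats)
```

After the update, the improved trajectory scores at least as well as the original proposal, so repeated interaction drives the planner toward the user's preferences.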

Detecting/Anticipating Human Activities from RGB-D videos.

Anticipate the activities a human will do next (and how!) to enable robots to plan ahead for reactive responses. Learn by watching people!
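A minimal way to anticipate the next activity is a first-order transition model over observed activity labels; the sequences below are illustrative, and the lab's actual models reason over much richer spatio-temporal structure than this sketch.

```python
from collections import Counter, defaultdict

def fit_transitions(activity_sequences):
    # Count how often each activity follows another across observed sequences.
    counts = defaultdict(Counter)
    for seq in activity_sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts

def anticipate(counts, current_activity):
    """Predict the most likely next activity given the current one."""
    following = counts[current_activity]
    return following.most_common(1)[0][0] if following else None
```

Anticipating the likely next activity lets the robot start a reactive response (e.g. moving to assist) before the activity actually begins.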

Robots Hallucinating Humans.

Given only 3D images from the Internet, robots figure out how humans use their environments. Learn by hallucinating how humans would have done it.

Manipulation: Deep Grasping, Placing and Preparing Salad.

Using deep learning methods, our algorithms learn the features to grasp novel objects from raw image data. Also see: robot manipulation for arranging rooms, and haptic manipulation.
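Grasp detection can be sketched as scoring candidate grasp rectangles with a learned feature function and picking the argmax; the tiny network below uses random placeholder weights, not actual learned deep features.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 16))   # placeholder "learned" weights
W2 = rng.standard_normal(8)

def grasp_score(patch_features):
    # Two-layer scoring network: ReLU hidden layer, linear output.
    hidden = np.maximum(0.0, W1 @ patch_features)
    return float(W2 @ hidden)

def best_grasp(candidates):
    """Pick the candidate grasp rectangle with the highest score.
    Each candidate is (rectangle_id, feature_vector)."""
    return max(candidates, key=lambda c: grasp_score(c[1]))[0]
```

In a full system the features come from a deep network over the raw image and depth channels, and the candidate rectangles are enumerated over positions, sizes, and orientations.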

3D Scene Understanding.
Segment and detect objects (and their attributes) in a 3D scene by reasoning about their shape, appearance, and geometric properties, as well as through physics-based reasoning. ROS/PCL code + dataset available. Learn through physics!