Scene Understanding for Personal Robots
We consider the problem of high-level scene understanding for personal robots. Thanks to the availability of Kinect sensors, our robots can now easily obtain colored 3D point clouds of their environment. We perform structured prediction to label these point clouds into 17 object categories. We use a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. The model is trained using a maximum-margin learning approach.
| Original Scene | Ground Truth Labels | Predicted Labels |
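The structured-prediction idea described above can be illustrated with a minimal sketch. It assumes a pairwise model in which each segment of the point cloud is a node, the score of a labeling is a sum of per-node terms (appearance and shape features) and per-edge terms (co-occurrence and geometric features), and prediction picks the highest-scoring labeling. All names here (`score_labeling`, `best_labeling`, the toy features and weights) are hypothetical and chosen for illustration; the weights are set by hand rather than learned with a max-margin method, and the exhaustive search stands in for whatever inference procedure the publications below actually use.

```python
from itertools import product

def score_labeling(labels, node_feats, edges, edge_feats, w_node, w_edge):
    """Linear score of one labeling: node terms plus pairwise edge terms."""
    total = 0.0
    # Node terms: label-specific weights dotted with each segment's features.
    for i, y in enumerate(labels):
        total += sum(w * f for w, f in zip(w_node[y], node_feats[i]))
    # Edge terms: weights indexed by the label pair, dotted with edge features.
    for (i, j) in edges:
        w = w_edge[(labels[i], labels[j])]
        total += sum(wk * fk for wk, fk in zip(w, edge_feats[(i, j)]))
    return total

def best_labeling(num_labels, node_feats, edges, edge_feats, w_node, w_edge):
    """Exhaustive argmax over all labelings; feasible only for toy graphs."""
    n = len(node_feats)
    return max(
        product(range(num_labels), repeat=n),
        key=lambda ys: score_labeling(
            ys, node_feats, edges, edge_feats, w_node, w_edge
        ),
    )

# Toy scene (assumed data): two segments, two labels (0 = "wall", 1 = "floor").
node_feats = [[1.0, 0.0], [0.0, 1.0]]           # e.g. [verticality, horizontality]
edges = [(0, 1)]                                 # the two segments are neighbors
edge_feats = {(0, 1): [1.0]}                     # e.g. "segment 0 is above segment 1"
w_node = {0: [1.0, -1.0], 1: [-1.0, 1.0]}        # hand-set, not learned
w_edge = {(0, 0): [0.0], (0, 1): [0.5], (1, 0): [-0.5], (1, 1): [0.0]}

prediction = best_labeling(2, node_feats, edges, edge_feats, w_node, w_edge)
print(prediction)
```

In this toy scene the edge weight rewards a "wall" segment sitting above a "floor" segment, which is the kind of contextual relation the model is meant to capture. A real system would replace the exhaustive search with an inference method that scales to 17 categories and many segments.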
Popular Press
New Scientist, ACM Technews, Newswise, Zee News, News Tonight, Azo Robotics, VoiCE, iNewsOne.
Videos
Data/Code
Download data and code.
Publications
- Contextually Guided Semantic Labeling and Search for 3D Point Clouds, Abhishek Anand, Hema S. Koppula, Thorsten Joachims, Ashutosh Saxena. In IJRR, 2012. [PDF]
- Semantic Labeling of 3D Point Clouds for Indoor Scenes, Hema S. Koppula, Abhishek Anand, Thorsten Joachims, Ashutosh Saxena. In NIPS, 2011. [PDF]
- 3D-Based Reasoning with Blocks, Support, and Stability. Zhaoyin Jia, Andy Gallagher, Ashutosh Saxena, Tsuhan Chen. In Computer Vision and Pattern Recognition (CVPR), 2013. [PDF coming soon]
- Labeling 3D scenes for Personal Assistant Robots, Hema Swetha Koppula, Abhishek Anand, Thorsten Joachims, Ashutosh Saxena. In RSS workshop on RGB-D Cameras, 2011. [PDF]
People
| Hema Koppula | hema at cs.cornell.edu (Corresponding Author) |
| Abhishek Anand | |
| Gaurab Basu | |
| Prof. Thorsten Joachims | tj at cs.cornell.edu |
| Prof. Ashutosh Saxena | asaxena at cs.cornell.edu |


