Scene Understanding for Personal Robots
We consider the problem of high-level scene understanding for personal robots. Thanks to the availability of Kinect sensors, a robot can now easily obtain colored 3D point clouds of its environment. We perform structured prediction to label these point clouds with 17 object categories. We use a graphical model that captures local visual appearance and shape cues as well as contextual relations such as object co-occurrence and geometric relationships between objects. The model is trained using a maximum-margin learning approach.
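The sketch below (not the authors' released code) illustrates the basic idea behind such a model: each segment of the point cloud contributes a node score from its appearance/shape features, each pair of related segments contributes an edge score from their geometric and co-occurrence features, and the predicted labeling maximizes the total linear score under the learned weights. The papers solve this inference exactly with a mixed-integer formulation; here a simple iterated-conditional-modes sweep stands in as an approximate substitute, and all function names, feature dimensions, and weight shapes are hypothetical.

```python
import numpy as np

def score(y, node_feats, edges, edge_feats, w_node, w_edge):
    """Total linear score of a labeling y over the segment graph."""
    s = sum(w_node[y[i]] @ node_feats[i] for i in range(len(y)))          # node terms
    s += sum(w_edge[y[i], y[j]] @ edge_feats[(i, j)] for (i, j) in edges)  # edge terms
    return s

def icm_labeling(node_feats, edges, edge_feats, w_node, w_edge, n_sweeps=10):
    """Approximate MAP labeling: repeatedly relabel each segment to its best class."""
    n, K = node_feats.shape[0], w_node.shape[0]
    y = np.argmax(node_feats @ w_node.T, axis=1)   # initialize using node scores only
    for _ in range(n_sweeps):
        changed = False
        for i in range(n):
            def try_label(k, i=i):
                y_cand = y.copy()
                y_cand[i] = k
                return score(y_cand, node_feats, edges, edge_feats, w_node, w_edge)
            best = max(range(K), key=try_label)
            if best != y[i]:
                y[i] = best
                changed = True
        if not changed:
            break
    return y

# Toy usage with random features and weights (17 classes, as in the project).
rng = np.random.default_rng(0)
K, d_n, d_e, n = 17, 8, 5, 6
node_feats = rng.normal(size=(n, d_n))
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
edge_feats = {e: rng.normal(size=d_e) for e in edges}
w_node = rng.normal(size=(K, d_n))
w_edge = rng.normal(size=(K, K, d_e))
print(icm_labeling(node_feats, edges, edge_feats, w_node, w_edge))
```

In the actual system the weights are not random but are learned with a max-margin objective so that the ground-truth labeling scores higher than all competing labelings on the training scenes.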
Figure: an example scene shown as the original point cloud, the ground-truth labels, and the predicted labels.
In the news: New Scientist, ACM Technews, Newswise, Zee News, News Tonight, Azo Robotics, VoiCE, iNewsOne.
Download data and code.
- Contextually Guided Semantic Labeling and Search for 3D Point Clouds. Abhishek Anand, Hema S. Koppula, Thorsten Joachims, Ashutosh Saxena. In IJRR, 2012. [PDF]
- Semantic Labeling of 3D Point Clouds for Indoor Scenes. Hema S. Koppula, Abhishek Anand, Thorsten Joachims, Ashutosh Saxena. In NIPS, 2011. [PDF]
- Labeling 3D Scenes for Personal Assistant Robots. Hema Swetha Koppula, Abhishek Anand, Thorsten Joachims, Ashutosh Saxena. In RSS Workshop on RGB-D Cameras, 2011. [PDF]
- 3D-Based Reasoning with Blocks, Support, and Stability. Zhaoyin Jia, Andy Gallagher, Ashutosh Saxena, Tsuhan Chen. In CVPR, 2013. [PDF]
- Hallucinated Humans as the Hidden Context for Labeling 3D Scenes. Yun Jiang, Ashutosh Saxena. In CVPR, 2013. [PDF, project page]
| Contact | Email |
| --- | --- |
| Hema Koppula (Corresponding Author) | hema at cs.cornell.edu |
| Prof. Thorsten Joachims | tj at cs.cornell.edu |
| Prof. Ashutosh Saxena | asaxena at cs.cornell.edu |