Cross-modal adaptation for RGB-D detection
Title | Cross-modal adaptation for RGB-D detection
Publication Type | Conference Proceedings |
Year of Publication | 2016 |
Authors | Hoffman, J., Gupta, S., Leong, J., Guadarrama, S., & Darrell, T.
Published in | IEEE International Conference on Robotics and Automation (ICRA) |
Page(s) | 5032-5039 |
Date Published | 05/2016 |
Publisher | IEEE |
ISBN Number | 978-1-4673-8026-3 |
Accession Number | 16055574 |
Keywords | Adaptation models, Detectors, Object Detection, Proposals, Robots, Training, Training data |
Abstract | In this paper, we propose a technique for adapting convolutional neural network (CNN) object detectors trained on RGB images to effectively leverage depth images at test time and thereby boost detection performance. Given labeled depth images for a handful of categories, we adapt an RGB object detector for a new category so that it can use depth images in addition to RGB images at test time to produce more accurate detections. Our approach is built on the observation that the lower layers of a CNN are largely task- and category-agnostic but domain-specific, while the higher layers are largely task- and category-specific but domain-agnostic. We operationalize this observation by proposing a mid-level fusion of RGB and depth CNNs. Experimental evaluation on the challenging NYUD2 dataset shows that our adaptation technique yields an average 21% relative improvement in detection performance over an RGB-only baseline, even when no depth training data is available for the category being evaluated. We believe the proposed technique will extend advances made in computer vision to RGB-D data, leading to performance improvements at little additional annotation effort.
URL | http://www.icsi.berkeley.edu/pubs/vision/rgbddetection16.pdf |
DOI | 10.1109/ICRA.2016.7487708 |
ICSI Research Group | Vision |
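The abstract describes a mid-level fusion architecture: modality-specific lower layers for RGB and depth feed into shared higher layers. The sketch below illustrates that wiring only; it is not the authors' implementation, and all layer sizes, weights, and function names are hypothetical stand-ins for real CNN layers.

```python
# Illustrative sketch of mid-level fusion (toy dense layers, not the paper's CNNs).
# Lower layers are modality-specific (domain-specific, per the abstract);
# the higher layers after fusion are shared across modalities.

def linear_relu(x, w, b):
    """Dense layer y_j = ReLU(sum_i x_i * w[j][i] + b[j])."""
    return [max(0.0, sum(xi * wij for xi, wij in zip(x, row)) + bj)
            for row, bj in zip(w, b)]

def lower_rgb(features):
    # Hypothetical modality-specific lower layers for the RGB stream.
    w = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]]
    b = [0.1, 0.0]
    return linear_relu(features, w, b)

def lower_depth(features):
    # Hypothetical modality-specific lower layers for the depth stream.
    w = [[-0.1, 0.6, 0.2], [0.2, -0.3, 0.5]]
    b = [0.0, 0.1]
    return linear_relu(features, w, b)

def detect(rgb, depth):
    # Mid-level fusion: concatenate the two streams' mid-level features,
    # then run shared higher layers to produce a detection score.
    fused = lower_rgb(rgb) + lower_depth(depth)
    w_high = [[0.4, 0.3, -0.2, 0.1]]
    b_high = [0.05]
    return linear_relu(fused, w_high, b_high)[0]

score = detect([0.2, 0.5, 0.1], [0.7, 0.3, 0.9])
```

Because the higher layers consume the concatenated features, the detection score depends on both modalities; at test time the depth stream adds information that an RGB-only detector would not see.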