Region-Based Convolutional Networks for Accurate Object Detection and Segmentation

Title: Region-Based Convolutional Networks for Accurate Object Detection and Segmentation
Publication Type: Journal Article
Year of Publication: 2016
Authors: Girshick, R., Donahue, J., Darrell, T., & Malik, J.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence
Date Published: 01/2016
Keywords: canonical PASCAL VOC Challenge datasets, convolutional codes, Convolutional Networks, Deep Learning, Detection, Detectors, Feature extraction, high-capacity convolutional networks, image coding, image segmentation, mAP, mean average precision, Object Detection, object recognition, object segmentation, Proposals, region-based convolutional networks, semantic segmentation, source code, source coding, Support vector machines, Training, transfer learning

Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50% relative to the previous best result on VOC 2012—achieving a mAP of 62.4%. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at

Abbreviated Authors: R. Girshick, J. Donahue, T. Darrell, and J. Malik

ICSI Publication Type: Article in journal or magazine