Ren et al 2015 - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This paper describes a two-step procedure for object detection:

  1. Using a region proposal network to propose regions that are likely to contain an object
  2. Classifying the objects within these regions

(1) is accomplished by first extracting a feature map from the image using a CNN. A small window (\(3 \times 3\) in the paper) is then slid across this feature map. Centered at each window position are \(k\) reference panes (called anchors in the paper) of various scales and aspect ratios; the paper uses 3 scales and 3 aspect ratios, so \(k = 9\). At each position, the RPN passes the window's features through an intermediate layer, and the result is passed to a classifier head and a coordinate head. For each of the \(k\) reference panes, the classifier head assigns an objectness score and the coordinate head outputs the location and dimensions of the region of interest, expressed as offsets relative to that pane.
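To make the reference panes concrete, here is a minimal NumPy sketch of how they could be laid out and how the coordinate head's outputs could be turned back into boxes. The function names `make_anchors` and `decode` are hypothetical, and placing each center at `(i + 0.5) * stride` is an assumption of this sketch. The scales, aspect ratios, and the offset parameterization \((t_x, t_y, t_w, t_h)\) follow the paper's description; this is not the authors' implementation.

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (feat_h * feat_w * k, 4) boxes as (x1, y1, x2, y2) in image pixels."""
    # k = len(scales) * len(ratios) pane shapes; each has area scale**2
    # and aspect ratio (height / width) equal to r.
    shapes = np.array([(s / np.sqrt(r), s * np.sqrt(r))
                       for s in scales for r in ratios])   # (k, 2) as (w, h)

    # Center of every sliding-window position, mapped to image coordinates.
    # Assumption: centers sit in the middle of each stride-sized cell.
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)                           # each (feat_h, feat_w)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)   # (H*W, 2)

    half = shapes / 2.0
    top_left = centers[:, None, :] - half[None, :, :]      # (H*W, k, 2)
    bottom_right = centers[:, None, :] + half[None, :, :]
    return np.concatenate([top_left, bottom_right], axis=2).reshape(-1, 4)

def decode(anchors, deltas):
    """Apply (t_x, t_y, t_w, t_h) offsets to anchors given as (x1, y1, x2, y2)."""
    wa = anchors[:, 2] - anchors[:, 0]
    ha = anchors[:, 3] - anchors[:, 1]
    xa = anchors[:, 0] + 0.5 * wa
    ya = anchors[:, 1] + 0.5 * ha
    # Center shifts are scaled by the pane size; sizes are predicted in log space.
    x = deltas[:, 0] * wa + xa
    y = deltas[:, 1] * ha + ya
    w = np.exp(deltas[:, 2]) * wa
    h = np.exp(deltas[:, 3]) * ha
    return np.stack([x - w / 2, y - h / 2, x + w / 2, y + h / 2], axis=1)
```

With the defaults, every feature-map position gets \(k = 9\) panes, so a classifier head would produce one objectness prediction per row of `make_anchors(...)` and the coordinate head one row of `deltas`. All-zero offsets leave the panes unchanged.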

In (2), the regions of interest which are likely to contain an object are fed through an RoI pooling layer, which converts each region into a fixed-size feature map, and then through an object classifier.
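A rough sketch of what RoI max pooling does to a single region is given below. The function name `roi_max_pool` is hypothetical, and the sketch assumes the region is already given in integer feature-map coordinates (inclusive); real implementations also handle the image-to-feature scaling and batches of regions.

```python
import math
import numpy as np

def roi_max_pool(features, roi, out_size=(2, 2)):
    """Max-pool one region of a (C, H, W) feature map down to (C, ph, pw).

    roi is (x1, y1, x2, y2) in feature-map cells, inclusive on both ends.
    """
    x1, y1, x2, y2 = roi
    region = features[:, y1:y2 + 1, x1:x2 + 1]
    c, h, w = region.shape
    ph, pw = out_size
    out = np.empty((c, ph, pw), dtype=features.dtype)
    for i in range(ph):
        # Bin boundaries: floor for the start, ceil for the end, so bins
        # cover the whole region and are never empty.
        ys, ye = (i * h) // ph, math.ceil((i + 1) * h / ph)
        for j in range(pw):
            xs, xe = (j * w) // pw, math.ceil((j + 1) * w / pw)
            out[:, i, j] = region[:, ys:ye, xs:xe].max(axis=(1, 2))
    return out
```

Whatever the size of the input region, the output always has shape `(C, ph, pw)`, which is what lets regions of different sizes share the same downstream classifier layers.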

This paper is an improvement over girshick15_fast_r_cnn (see Region of Interest Pooling), which has no RPN component and instead relies on an external region proposal method.

Bibliography