Cascade R-CNN: Delving into High Quality Object Detection

CVPR 2018

The paper addresses the problem of learning high-quality object detectors, whose
outputs contain few close false positives.

This paper defines the quality of an input hypothesis as its IoU with the ground truth, and the
quality of a detector as the IoU threshold u used to train it.
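
To make the two definitions concrete, here is a minimal sketch of IoU and of the threshold-based
labeling it implies. It assumes axis-aligned boxes stored as (x1, y1, x2, y2); the function names
`iou` and `is_positive` are made up for illustration and do not come from the paper's code.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_positive(hypothesis, ground_truth, u=0.5):
    """A hypothesis is a positive training sample when its IoU reaches u."""
    return iou(hypothesis, ground_truth) >= u
```

With this definition, the same hypothesis can be a positive for a detector trained with u = 0.5
and a negative for one trained with u = 0.7.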

Why does a low IoU threshold result in low quality?

An object detector trained with a low IoU threshold accepts loosely localized hypotheses as
positives, so it usually produces noisy detections with many close false positives.

Why can't we simply increase the IoU threshold u during training?

  • Overfitting during training, because the number of positive samples vanishes exponentially
    as the threshold increases.
  • Mismatch between the IoUs for which the detector is optimal and those of the input hypotheses.
    In other words, a detector trained with a high threshold is poorly suited to the input
    hypothesis distribution, which tends to contain mostly low-quality hypotheses.
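
The first point can be illustrated with a small counting sketch. The IoU values below are invented
purely for illustration (they are not measurements from the paper), and `count_positives` is a
hypothetical helper name.

```python
def count_positives(ious, u):
    """Number of hypotheses that count as positives at IoU threshold u."""
    return sum(1 for v in ious if v >= u)


# Hypothetical IoUs between proposals and their matched ground truth.
# Illustrative numbers only, skewed toward low quality as proposals tend to be.
proposal_ious = [0.35, 0.42, 0.51, 0.55, 0.58, 0.63, 0.66, 0.72, 0.81, 0.90]

# Positives left for training at each candidate threshold.
counts = {u: count_positives(proposal_ious, u) for u in (0.5, 0.6, 0.7, 0.8)}
```

In this toy set the number of positives drops from 8 at u = 0.5 to 2 at u = 0.8, which is the
kind of shrinking training set that makes a high-threshold detector prone to overfitting.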

Why is a detector trained with a high threshold poorly suited to an input hypothesis
distribution that tends to contain mostly low-quality hypotheses?

cascade1.png

As we can see from figures (c) and (d), the detector performs best when the input IoU is close to
the IoU threshold it was trained with.

I guess that a detector is most sensitive when the input IoU is near the value it was optimized for.

Architecture

cascade2.png

cascade3.png

Cascade R-CNN is trained stage by stage, with each stage taking the output of the previous one as
its input. This shifts the input hypothesis distribution toward higher IoU step by step, so that
it matches the higher-quality detector trained with a higher threshold.
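
The resampling idea can be sketched in a few lines. This is only a toy under stated assumptions:
`toy_regress` is a made-up stand-in for a learned box regressor (it moves a box halfway toward
the ground truth), `run_cascade` is a hypothetical name, and the thresholds 0.5, 0.6, 0.7 are the
values I recall from the paper, so treat them as an assumption here.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def toy_regress(box, gt, step=0.5):
    """Stand-in for a learned regressor: move each coordinate partway to gt."""
    return tuple(c + step * (g - c) for c, g in zip(box, gt))


def run_cascade(proposals, gt, thresholds=(0.5, 0.6, 0.7)):
    """Return (threshold, positives, total) per stage and the final boxes."""
    boxes = list(proposals)
    history = []
    for u in thresholds:
        # Label the current inputs with this stage's own threshold.
        ious = [iou(b, gt) for b in boxes]
        positives = sum(1 for v in ious if v >= u)
        history.append((u, positives, len(boxes)))
        # The refined boxes become the input hypotheses of the next stage.
        boxes = [toy_regress(b, gt) for b in boxes]
    return history, boxes


gt_box = (10, 10, 50, 50)
proposals = [(0, 0, 40, 40), (20, 20, 60, 60), (12, 8, 52, 48)]
history, refined = run_cascade(proposals, gt_box)
```

In this toy run only one of the three proposals is a positive at the first stage, but after one
round of refinement all three pass the stricter threshold of the second stage. That is the effect
the cascade relies on: later stages see better inputs, so raising the threshold does not starve
them of positives.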