Cascade R-CNN: Delving into High Quality Object Detection
CVPR 2018
- Introduction
- Architecture
Introduction
The paper studies the problem of learning high-quality object detectors, that is, detectors whose
outputs contain few close false positives.
It defines the quality of an input hypothesis as its IoU with the ground truth, and the
quality of a detector as the IoU threshold used to train it.
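To make the two definitions concrete, here is a minimal sketch of computing IoU and labeling a hypothesis as positive against a threshold. The `(x1, y1, x2, y2)` box format and the names `iou`, `is_positive`, and `u` are assumptions for illustration, not taken from the paper's code.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_positive(hypothesis, ground_truth, u=0.5):
    """A hypothesis counts as a positive training sample when its IoU >= u."""
    return iou(hypothesis, ground_truth) >= u
```

Under these definitions, the "quality" of a hypothesis is the value returned by `iou`, and the "quality" of a detector is the `u` used when labeling its training samples.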
Why does a low IoU threshold result in low quality?
An object detector trained with a low IoU threshold usually produces noisy detections and suffers
from many close false positives.
Why can we not simply increase the IoU threshold during training?
- Overfitting during training, due to exponentially vanishing positive samples
- Mismatch between the IoUs for which the detector is optimal and those of the input hypotheses.
In other words, a detector trained with a high threshold is poorly suited to the input hypothesis
distribution, which tends to contain mostly low-quality hypotheses.
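The first point can be illustrated with a small sketch that counts how many hypotheses remain positive as the training threshold rises. The IoU values below are made up for illustration and are not measurements from the paper; `count_positives` is a hypothetical helper.

```python
def count_positives(hypothesis_ious, u):
    """Number of hypotheses that qualify as positives at IoU threshold u."""
    return sum(1 for v in hypothesis_ious if v >= u)


# Hypothetical IoUs of ten hypotheses with their matched ground truth.
# Most of the mass sits at low IoU, as the notes describe.
hypothesis_ious = [0.32, 0.41, 0.48, 0.52, 0.55, 0.58, 0.63, 0.66, 0.74, 0.86]

for u in (0.5, 0.6, 0.7, 0.8):
    print(f"u={u}: {count_positives(hypothesis_ious, u)} positives")
```

With this toy data the positive set shrinks quickly as `u` grows, which is the situation that makes training at a high threshold prone to overfitting.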
Why is a detector trained with a high threshold poorly suited to an input hypothesis
distribution that tends to contain mostly low-quality hypotheses?

As we can see from figures (c) and (d), the detector performs better when the input IoU and
the IoU threshold are similar.
My guess is that the detector is most sensitive when the input IoU is close to the threshold it was optimized for.
Architecture


Cascade R-CNN is trained stage by stage, with each stage shifting the input hypothesis distribution
a little, so that the distribution gradually matches a higher-quality detector trained with a higher threshold.
