Kim, S., Choi, J., Kim, T., & Kim, C. (2019). Self-training and adversarial background regularization for unsupervised domain adaptive one-stage object detection. In Proceedings of the IEEE International Conference on Computer Vision (pp. 6092-6101).

Introduction

The paper introduces a weak self-training (WST) method and adversarial background score regularization (BSR) for domain adaptive one-stage object detection.

image.png


Approach

Problem Setting:

  1. We assume that source data $(x_s, y_s)$ is drawn from the source domain $X_s$, and target data $(x_t, y_t)$ is drawn from the target domain $X_t$. Here, $x$ is an image and $y=(b,c)$ is the corresponding label, where $b$ is the coordinates of the bounding box and $c$ is the class to which the object belongs. We denote the distribution of domain $X$ as $P(X)$, and $P(X_s) \neq P(X_t)$. <br />False Positive (**FP**): the number of negative samples predicted as positive.<br />  False Negative (**FN**): the number of positive samples predicted as negative.

Weak Self-Training (WST):

Reducing False Negatives:
image.png
The paper adopts SSD as the base detector. As shown above, the original loss consists of three parts: two confidence terms, which are the multi-class classification loss for objects and the binary classification loss for background, and a localization loss, which in SSD is the regression of the box offsets.
Negative examples in the negative set of the original loss have a large potential of being foreground (objects missed by the pseudo-labels, i.e., false negatives). Thus, the proposed method instead chooses the examples that have the lowest confidence loss among the negative examples. The method does not update the network for bounding box regression, since pseudo-labels usually have inaccurate bounding box information. The modified loss function:
image.png
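The negative-selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function name, the `num_neg` parameter, and the normalization by the number of pseudo-positives are all hypothetical choices made for the example.

```python
from math import log

def wst_confidence_loss(pos_probs, neg_bg_probs, num_neg):
    """Confidence-only loss for one target image (illustrative sketch).

    pos_probs:    probability each pseudo-positive box assigns to its pseudo-class
    neg_bg_probs: background probability of every remaining (negative) box
    num_neg:      how many negatives to keep (hypothetical parameter)
    """
    pos_loss = [-log(p) for p in pos_probs]
    neg_loss = [-log(p) for p in neg_bg_probs]
    # Keep the negatives with the LOWEST loss: boxes the detector is already
    # confident are background. High-loss negatives may be missed objects.
    easy_neg = sorted(neg_loss)[:num_neg]
    # No localization term: pseudo-label boxes are too inaccurate to regress to.
    n = max(len(pos_probs), 1)  # assumed normalization by number of positives
    return (sum(pos_loss) + sum(easy_neg)) / n, easy_neg

loss, kept = wst_confidence_loss([0.9], [0.99, 0.5, 0.95, 0.2], num_neg=2)
```

In this toy call, the two negatives with background probability 0.5 and 0.2 are left out of the loss, because their high loss suggests they could be unlabeled foreground.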
Reducing False Positives: Supporting RoIs are the examples whose IoU with the final detection is larger than some threshold. SRRS (Supporting Region-based Reliable Score):
image.png
The paper uses SSD, whose output is the full set of detections, where each element is a single detection and n is the total number of detections (e.g., n = 8732 for SSD300). The output after NMS is used as the set of candidate boxes for pseudo-labels. In short, the paper proposes SRRS to score each candidate positive pseudo-label instead of simply using a single confidence score. The authors argue that all the supporting boxes tend toward the final detection result. For each final detection, the quantity in the formula above is computed for every supporting RoI with respect to that box and then averaged, which yields a score (SRRS). This score is then compared with a threshold to obtain the final pseudo-label.
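A minimal sketch of such a supporting-region score is given below. The exact formula did not survive in these notes, so the sketch assumes the per-RoI term is the IoU with the final detection multiplied by the RoI's confidence for the detected class. The function names, the box format `(x1, y1, x2, y2)`, and the default `iou_thresh=0.5` are assumptions for illustration only.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def srrs(final_box, final_cls, boxes, cls_probs, iou_thresh=0.5):
    """Average IoU-weighted class confidence over supporting RoIs (assumed form)."""
    terms = []
    for box, probs in zip(boxes, cls_probs):
        overlap = iou(box, final_box)
        if overlap > iou_thresh:  # this box is a supporting RoI
            terms.append(overlap * probs[final_cls])
    return sum(terms) / len(terms) if terms else 0.0

final_box = (0, 0, 10, 10)
boxes = [(0, 0, 10, 10), (0, 0, 10, 8), (20, 20, 30, 30)]
cls_probs = [[0.1, 0.9], [0.5, 0.5], [0.2, 0.8]]  # index 1 is the detected class
score = srrs(final_box, 1, boxes, cls_probs)
```

A detection backed by many well-overlapping, confident boxes scores high. An isolated high-confidence box with weak support scores lower, which is what makes the score useful for filtering false positives.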

Adversarial Background Score Regularization (BSR):

Backgrounds of the source domain and target domain share fewer common features than foregrounds do. Motivated by [3], the paper proposes background score regularization (BSR) in an adversarial way. BSR extracts discriminative features for target backgrounds.
image.png
The loss above is a binary cross-entropy. When we minimize the loss, the predicted value approaches t. Conversely, when we maximize the loss, the predicted value moves toward 0 or 1.
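This behavior of a binary cross-entropy with a soft target can be checked numerically. The snippet below is a small sketch. The value `t = 0.5` is chosen only for illustration and is not taken from the paper, and `bce` is a hypothetical helper name.

```python
from math import log

def bce(p, t):
    """Binary cross-entropy between a predicted probability p and a soft target t."""
    return -t * log(p) - (1.0 - t) * log(1.0 - p)

t = 0.5  # illustrative target value, not taken from the paper
grid = [i / 100 for i in range(1, 100)]  # p in 0.01 .. 0.99

# Setting d(bce)/dp = -t/p + (1-t)/(1-p) to zero gives p = t, so the
# minimizer on the grid should be t and the maximizer an endpoint.
p_min = min(grid, key=lambda p: bce(p, t))
p_max = max(grid, key=lambda p: bce(p, t))
```

Minimizing therefore pulls the prediction to `t`, while maximizing drives it to an extreme near 0 or 1.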
The BSR adversarial rule for backgrounds is as follows:
image.png
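For source inputs, both networks are trained by minimizing the fully supervised loss. For target inputs, the classifier C is optimized so that its output approaches t by minimizing the loss. In contrast, the feature extractor F is optimized so that this output moves toward 0 or 1. F thus increases the error in order to deceive C's unknown-class decision, and in doing so learns discriminative features.
The final adversarial background score loss is shown below: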
image.png
For the implementation, the paper enables the adversarial training using a gradient reversal layer (GRL) right after relu4_3 of SSD300. Figure 4 depicts the process of the adversarial learning.
image.png
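The effect of a gradient reversal layer can be illustrated with a toy scalar model and hand-written gradients. This is a sketch under assumptions, not the SSD300 setup: the feature extractor is a single weight `w_f`, the classifier a single weight `w_c`, and `grl_step`, `lam`, and `lr` are hypothetical names and values.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def grl_step(w_f, w_c, x, t, lam=1.0, lr=0.1):
    """One SGD step on a toy model: feature f = w_f * x, score p = sigmoid(w_c * f)."""
    f = w_f * x                      # feature extractor F
    p = sigmoid(w_c * f)             # classifier C (the GRL is identity in the forward pass)
    dz = p - t                       # d(BCE)/d(logit) for a soft target t
    grad_w_c = dz * f                # C receives the ordinary gradient
    grad_f = dz * w_c                # gradient arriving at the GRL from C
    grad_f_reversed = -lam * grad_f  # GRL flips the sign in the backward pass
    grad_w_f = grad_f_reversed * x
    # Both weights take a plain descent step; the sign flip makes F ascend the loss.
    return w_f - lr * grad_w_f, w_c - lr * grad_w_c, p

new_w_f, new_w_c, p = grl_step(w_f=1.0, w_c=1.0, x=1.0, t=0.5)
```

With a single backward pass, C moves its output toward `t` while F moves the same output away from `t`, so the min/max game needs no alternating optimization.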