Basics
Different Segmentations
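Reference: Zhihu
The differences and connections among Semantic Segmentation, Object Detection, and Instance Segmentation are shown in the figure below. Semantic segmentation classifies individual pixels, so it is the more low-level task.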
Metrics
IoU
To assess performance, we rely on the standard Jaccard Index, commonly known as the PASCAL VOC intersection-over-union metric IoU = TP ⁄ (TP+FP+FN) [1], where TP, FP, and FN are the numbers of true positive, false positive, and false negative pixels, respectively, determined over the whole test set.
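The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the official PASCAL VOC evaluation code: the function name `iou_per_class` and the nested-list label maps are assumptions made for the example, and TP, FP, and FN are accumulated over all images before dividing, as the definition requires.

```python
def iou_per_class(pred_maps, gt_maps, num_classes):
    """Per-class IoU = TP / (TP + FP + FN), accumulated over the whole set.

    pred_maps, gt_maps: lists of 2D label maps (lists of rows of class ids).
    Hypothetical helper for illustration only.
    """
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for pred, gt in zip(pred_maps, gt_maps):
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                if p == g:
                    tp[g] += 1      # pixel correctly assigned to class g
                else:
                    fp[p] += 1      # class p was predicted but is wrong here
                    fn[g] += 1      # class g was missed here
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # A class that never appears in prediction or ground truth has no IoU.
        ious.append(tp[c] / denom if denom else float("nan"))
    return ious


if __name__ == "__main__":
    pred = [[[0, 0], [1, 1]]]   # one 2x2 predicted label map
    gt = [[[0, 1], [1, 1]]]     # matching ground truth
    print(iou_per_class(pred, gt, num_classes=2))
```

Averaging the returned list over classes gives the mean IoU that is usually reported.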
AP
To assess instance-level performance, we compute the average precision on the region level (AP [2]) for each class and average it across a range of overlap thresholds to avoid a bias towards a specific value.
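A rough sketch of this averaging for one class follows. Several things here are assumptions rather than the procedure of [2]: each prediction is assumed to be already paired with its best-overlapping ground-truth region as a `(score, gt_id, iou)` tuple, AP is computed as a plain non-interpolated area under the precision-recall curve, and the default threshold range 0.5 to 0.95 in steps of 0.05 is only a commonly used choice.

```python
def average_precision(preds, num_gt, iou_threshold):
    """Non-interpolated AP for one class at one overlap threshold.

    preds: list of (score, gt_id, iou); gt_id is None for unmatched regions.
    num_gt: number of ground-truth regions of this class.
    """
    if num_gt == 0:
        return 0.0
    matched = set()
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    # Visit predictions from most to least confident.
    for score, gt_id, iou in sorted(preds, key=lambda p: -p[0]):
        if gt_id is not None and iou >= iou_threshold and gt_id not in matched:
            matched.add(gt_id)   # each ground-truth region counts only once
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


def ap_over_thresholds(preds, num_gt, thresholds=None):
    """Average AP across overlap thresholds (range is an assumed default)."""
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(average_precision(preds, num_gt, t) for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    preds = [(0.9, 0, 0.8), (0.8, 1, 0.6), (0.7, None, 0.0)]
    print(ap_over_thresholds(preds, num_gt=2, thresholds=[0.5, 0.7]))
```

Averaging over several thresholds rewards regions that overlap tightly with the ground truth instead of only those that barely clear a single cutoff.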
CNN FCN
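Reference: Zhihu
The referenced article explains the difference between a CNN and an FCN, as shown below.
Structurally, an FCN replaces the final FC (fully connected) layers of a CNN with convolutional layers, which enables end-to-end, pixel-level classification.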
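The FC-to-convolution replacement can be illustrated with a toy single-channel example. This is a sketch under simplifying assumptions (one input channel, one output unit, stride 1, no padding), and the helper names `fc_layer` and `conv_from_fc` are made up for the example. The point is that the same weights, reshaped into a kernel, give the FC output on a patch of the original size and a whole map of outputs on a larger image.

```python
def fc_layer(patch, weights, bias):
    """Fully connected unit: flatten the k x k patch and take a dot product."""
    flat = [v for row in patch for v in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias


def conv_from_fc(image, weights, bias, k):
    """Reuse the FC weights as a k x k convolution kernel (stride 1, no padding)."""
    kernel = [weights[i * k:(i + 1) * k] for i in range(k)]
    height, width = len(image), len(image[0])
    out = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            s = bias
            for dy in range(k):
                for dx in range(k):
                    s += kernel[dy][dx] * image[y + dy][x + dx]
            row.append(s)
        out.append(row)
    return out


if __name__ == "__main__":
    weights, bias = [1, 0, 0, 1], 0.5
    patch = [[1, 2], [3, 4]]
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(fc_layer(patch, weights, bias))          # one score for the patch
    print(conv_from_fc(patch, weights, bias, 2))   # same score as a 1x1 map
    print(conv_from_fc(image, weights, bias, 2))   # a 2x2 map of scores
```

Because the convolutional form accepts inputs of any size, the network produces a spatial grid of class scores instead of a single label, which is what makes dense pixel-level prediction possible.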
Weakly Supervised Semantic Segmentation
Mainly based on this GitHub repository
Bounding-Box
Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation
Network
Contribution
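- To suppress the background and strengthen the foreground, a BCM (box-driven class-wise masking) model filters the feature map of each class using its Bounding-Box annotations, so the result resembles the annotations used in supervised learning.
- To further correct wrong labels inside a box, the mean filling rate of each class is computed (averaged over multiple iterations) to obtain FRc (the mean filling rate of class c). A threshold is then set from it to obtain a refined confidence-score threshold for each class.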
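The filling-rate idea in the second point can be sketched as follows. This is one possible reading, not the paper's implementation: the filling rate is taken to be the fraction of a box covered by the current foreground mask, and the per-class score threshold is taken to be the score that keeps the top FRc fraction of pixels in a box. All function names are hypothetical.

```python
def filling_rate(mask, box):
    """Fraction of pixels inside the box that the mask marks as foreground.

    mask: 2D list of 0/1 values; box: (x0, y0, x1, y1), end-exclusive.
    """
    x0, y0, x1, y1 = box
    area = (x1 - x0) * (y1 - y0)
    filled = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return filled / area


def mean_filling_rate(samples):
    """Average filling rate per class over (class_id, mask, box) samples."""
    sums, counts = {}, {}
    for class_id, mask, box in samples:
        sums[class_id] = sums.get(class_id, 0.0) + filling_rate(mask, box)
        counts[class_id] = counts.get(class_id, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def score_threshold(scores_in_box, fr_c):
    """Score that keeps roughly the top fr_c fraction of pixels (assumed rule)."""
    ranked = sorted(scores_in_box, reverse=True)
    keep = max(1, round(fr_c * len(ranked)))
    return ranked[keep - 1]


if __name__ == "__main__":
    samples = [
        (1, [[1, 1], [1, 0]], (0, 0, 2, 2)),
        (1, [[0, 0], [0, 1]], (0, 0, 2, 2)),
    ]
    fr = mean_filling_rate(samples)
    print(fr)
    print(score_threshold([0.9, 0.1, 0.8, 0.3], fr[1]))
```

Pixels in a box whose score falls below the returned threshold would be treated as unreliable under this reading.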
Results
Learning to Segment Every Thing
Network
Contribution
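This method trains a model that "can segment all classes" from a specific kind of input: only some classes have precise annotations, while all other classes have only Bounding-Box annotations. The setting is called "partially supervised". It uses a novel transfer learning approach: a learned weight transfer function is trained to learn how to segment the object inside a Bounding-Box.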
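A toy sketch of the weight transfer idea: the mask-prediction weights of a class are not learned directly but predicted from that class's detection weights, so a class with only box annotations still gets mask weights. The single linear map `theta`, the function names, and the tiny dimensions are all assumptions for illustration; the actual transfer function in the paper may be a more complex learned function.

```python
def transfer(w_det, theta):
    """Predict mask weights from detection weights: w_seg = theta @ w_det."""
    return [sum(t * w for t, w in zip(row, w_det)) for row in theta]


def mask_logits(features, w_seg):
    """Apply w_seg as a 1x1 convolution over an H x W grid of feature vectors."""
    return [[sum(f * w for f, w in zip(vec, w_seg)) for vec in row]
            for row in features]


if __name__ == "__main__":
    # theta would be learned on classes that have mask annotations (assumed values).
    theta = [[1, 0], [0, 2]]
    # Detection weights of a class that only has Bounding-Box annotations.
    w_det_box_only = [0.5, -1]
    w_seg = transfer(w_det_box_only, theta)
    features = [[[2, 0], [0, 1]]]   # 1 x 2 grid of 2-dimensional features
    print(w_seg)
    print(mask_logits(features, w_seg))
```

Since `theta` is shared across classes, what is learned from the precisely annotated classes carries over to the box-only classes.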
Results
One-Shot
Image/Video Label
Strongly Supervised Semantic Segmentation
Convolutional Networks
Fully Convolutional Networks for Semantic Segmentation
Network
The figure below is from an earlier version of the paper
The figure below is from the newer version
Contribution
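FCN mainly replaces the fully connected layers of a CNN with convolutional layers, which enables end-to-end, pixel-to-pixel learning. A finer detail is the skip architecture, which fuses features from different layers (FCN-32s, FCN-16s, FCN-8s) and reduces the loss of detail caused by upsampling.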
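The fusion step of the skip architecture can be sketched on toy score maps: a coarse score map from a deep layer is upsampled by 2x and added element-wise to a finer score map from a shallower layer. Nearest-neighbour upsampling is used here purely for simplicity and is a swapped-in stand-in for the upsampling layers of the actual network; the function names are hypothetical.

```python
def upsample2x(score):
    """Nearest-neighbour 2x upsampling of a 2D score map (simplifying assumption)."""
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]   # repeat each column twice
        out.append(wide)
        out.append(list(wide))                      # repeat each row twice
    return out


def fuse(coarse, fine):
    """Upsample the coarse map to the fine resolution and sum the two."""
    up = upsample2x(coarse)
    return [[a + b for a, b in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


if __name__ == "__main__":
    coarse = [[1, 2]]                       # deep layer: coarse but semantic
    fine = [[0.5, 0.5, 0.5, 0.5],
            [1, 1, 1, 1]]                   # shallow layer: finer spatial detail
    print(fuse(coarse, fine))
```

Applying this fusion once corresponds to the idea behind FCN-16s, and applying it again with an even finer layer corresponds to FCN-8s.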
Results
Time
IoU
Deep FCN
U-Net: Convolutional Networks for Biomedical Image Segmentation
Network
Contribution
Results
TODO
- Deep FCN
- U-Net