Monthly Summary

  • Tested in four settings:
    • FineTune with 100% of the labels; encoder and decoder both learnable.
    • LinearProbe with 100% of the labels; encoder frozen, decoder learnable.
    • FineTune with 10% of the labels; encoder and decoder both learnable.
    • FineTune with 1% of the labels; encoder and decoder both learnable.
  • Patch Level MoCo (PLMoCo) tends to lower mIoU, though not consistently, and pre-training for more epochs generally has a negative impact.
  • Patch Level Encoder Decoder MoCo (PLEDMoCo) tends to raise mIoU, though not consistently, and pre-training for more epochs generally has a negative impact.
  • In most cases, pre-training with the decoder results in faster convergence and better performance.
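
Every result in this report is an mIoU. For reference, the metric is usually computed from a pixel-level confusion matrix, as in the minimal pure-Python sketch below. The void label `IGNORE_INDEX = 255` and the helper names `confusion_matrix` and `mean_iou` are assumptions for illustration; this is not the evaluation code used for these runs.

```python
from typing import List, Sequence

# Assumption: pixels with this label are void and excluded from scoring.
IGNORE_INDEX = 255


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count (target, pred) pairs over flattened label maps, skipping void pixels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == IGNORE_INDEX:
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes, last pixel is void.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
print(f"mIoU = {mean_iou(confusion_matrix(pred, target, num_classes=3)):.4f}")
```

In practice the confusion matrix is accumulated over the whole validation set before the per-class IoUs are averaged, rather than averaging per-image scores.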

Test Report & Summary

Model: Pre-trained ResNet50 + 2 Dilated Convolution Layers

Frozen Backbone

VOC-Segmentation

| Method | LinearProbe | FineTune |
| --- | :---: | :---: |
| RandInit | 13.39% | 18.06% |
| Supervised-IN | 70.06% | 73.28% |
| BYOL | 69.54% | 71.87% |
| PixPro (IN-100ep) | 64.93% | 73.26% |
| MoCo v1 | 63.01% | 72.40% |
| MoCo v1 + PLMoCo (seg-20ep) | 62.07% | 72.33% |
| MoCo v1 + PLMoCo (seg-50ep) | 61.49% | 72.13% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 64.00% | 73.09% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 63.75% | 72.62% |
| MoCo v2 | 69.83% | 73.61% |
| MoCo v2 + PLMoCo (seg-20ep) | 67.72% | 73.99% |
| MoCo v2 + PLMoCo (seg-50ep) | 65.85% | 74.62% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 69.11% | 74.57% |
| MoCo v2 + PLEDMoCo (seg-50ep) | 67.68% | 74.16% |
| PLMoCo (seg-200ep) | - | - |
| PLEDMoCo (seg-200ep) | 25.50% | 36.34% |
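
To make the "not consistently" qualifier in the summary concrete, the sketch below computes the change in mIoU (in percentage points) of each seg-20ep variant relative to its base MoCo checkpoint. The numbers are copied from the VOC-Segmentation results in this report; the helper name `delta_vs_baseline` is hypothetical.

```python
# mIoU (%) on VOC-Segmentation, copied from the report: (LinearProbe, FineTune)
voc = {
    "MoCo v1": (63.01, 72.40),
    "MoCo v1 + PLMoCo (seg-20ep)": (62.07, 72.33),
    "MoCo v1 + PLEDMoCo (seg-20ep)": (64.00, 73.09),
    "MoCo v2": (69.83, 73.61),
    "MoCo v2 + PLMoCo (seg-20ep)": (67.72, 73.99),
    "MoCo v2 + PLEDMoCo (seg-20ep)": (69.11, 74.57),
}


def delta_vs_baseline(results, variant):
    """Return the (LinearProbe, FineTune) change in points vs. the base checkpoint."""
    baseline = variant.split(" + ")[0]  # e.g. "MoCo v1 + PLMoCo (...)" -> "MoCo v1"
    return tuple(round(v - b, 2) for v, b in zip(results[variant], results[baseline]))


for name in voc:
    if " + " in name:
        lp, ft = delta_vs_baseline(voc, name)
        print(f"{name}: LinearProbe {lp:+.2f}, FineTune {ft:+.2f}")
```

On these rows, PLMoCo lowers LinearProbe mIoU for both base checkpoints but slightly raises FineTune mIoU on MoCo v2, while PLEDMoCo raises FineTune mIoU for both and LinearProbe mIoU only on MoCo v1.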

Cityscapes

| Method | LinearProbe | FineTune |
| --- | :---: | :---: |
| RandInit | - | 60.11% |
| Supervised-IN | 61.91% | 75.64% |
| BYOL | - | 75.01% |
| PixPro (IN-100ep) | - | - |
| MoCo v1 | 55.32% | 76.18% |
| MoCo v1 + PLMoCo (seg-20ep) | 57.31% | 76.12% |
| MoCo v1 + PLMoCo (seg-50ep) | 57.58% | 76.15% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 59.02% | 76.07% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 58.01% | - |
| MoCo v2 | 63.87% | 76.52% |
| MoCo v2 + PLMoCo (seg-20ep) | 61.99% | 76.53% |
| MoCo v2 + PLMoCo (seg-50ep) | 61.00% | 76.75% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 62.91% | - |
| MoCo v2 + PLEDMoCo (seg-50ep) | 61.20% | - |
| PLMoCo (seg-200ep) | - | 33.59% |
| PLEDMoCo (seg-200ep) | - | - |

Reduced Labeling
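
The report does not state how the 1% and 10% label subsets were drawn. As one possible approach, the sketch below picks a fixed, seeded random subset of training image IDs so that every method is fine-tuned on the same labeled images. The function name `sample_labeled_subset`, the seed, and the image IDs are all hypothetical.

```python
import random
from typing import List, Sequence


def sample_labeled_subset(image_ids: Sequence[str], fraction: float,
                          seed: int = 0) -> List[str]:
    """Pick a reproducible subset of training images whose labels are kept."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(image_ids) * fraction))  # keep at least one image
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return sorted(rng.sample(list(image_ids), k))


# Hypothetical image IDs, only for illustration.
train_ids = [f"img_{i:04d}" for i in range(1000)]
subset_10 = sample_labeled_subset(train_ids, 0.10)
subset_1 = sample_labeled_subset(train_ids, 0.01)
print(len(subset_10), len(subset_1))
```

Saving the selected IDs to a split file, rather than re-sampling per run, keeps the comparison across pre-training methods on identical data.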

VOC-Segmentation

| Method | VOC-1% | VOC-10% | VOC-100% |
| --- | :---: | :---: | :---: |
| RandInit | 7.94% | 9.89% | 18.06% |
| Supervised-IN | 39.15% | 57.98% | 73.28% |
| BYOL | - | 60.98% | 71.87% |
| PixPro (IN-100ep) | - | 57.73% | 73.26% |
| MoCo v1 | 31.02% | 53.58% | 72.40% |
| MoCo v1 + PLMoCo (seg-20ep) | 33.91% | 51.18% | 72.33% |
| MoCo v1 + PLMoCo (seg-50ep) | 33.79% | 53.35% | 72.13% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 36.63% | 55.58% | 73.09% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 36.61% | 54.62% | 72.62% |
| MoCo v2 | 37.64% | 55.88% | 73.61% |
| MoCo v2 + PLMoCo (seg-20ep) | 39.15% | 57.81% | 73.99% |
| MoCo v2 + PLMoCo (seg-50ep) | 38.46% | 56.54% | 74.62% |
| MoCo v2 + PLEDMoCo (seg-20ep) | - | 58.56% | 74.57% |
| MoCo v2 + PLEDMoCo (seg-50ep) | - | 58.01% | 74.16% |
| PLMoCo (seg-200ep) | - | 17.01% | - |
| PLEDMoCo (seg-200ep) | - | 18.49% | 36.34% |

Cityscapes

| Method | Cityscapes-10% | Cityscapes-100% |
| --- | :---: | :---: |
| RandInit | 30.75% | 60.11% |
| Supervised-IN | 65.57% | 75.64% |
| BYOL | 66.01% | 75.01% |
| PixPro (IN-100ep) | - | - |
| MoCo v1 | 64.71% | 76.18% |
| MoCo v1 + PLMoCo (seg-20ep) | 64.83% | 76.12% |
| MoCo v1 + PLMoCo (seg-50ep) | 64.78% | 76.15% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 64.80% | 76.07% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 64.67% | - |
| MoCo v2 | 66.53% | 76.52% |
| MoCo v2 + PLMoCo (seg-20ep) | 66.31% | 76.53% |
| MoCo v2 + PLMoCo (seg-50ep) | 64.96% | 76.75% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 66.34% | - |
| MoCo v2 + PLEDMoCo (seg-50ep) | 66.43% | - |
| PLMoCo (seg-200ep) | 43.10% | 33.59% |
| PLEDMoCo (seg-200ep) | 44.60% | - |