Monthly Summary
- Tested in four settings:
  - FineTune with 100% of labels: encoder and decoder learnable.
  - LinearProbe with 100% of labels: encoder frozen, decoder learnable.
  - FineTune with 10% of labels: encoder and decoder learnable.
  - FineTune with 1% of labels: encoder and decoder learnable.
- Patch-level MoCo (PLMoCo) tends to lower mIoU, though not consistently, and pre-training for more epochs always has a negative impact.
- Patch-level encoder-decoder MoCo (PLEDMoCo) tends to raise mIoU, though not consistently, and pre-training for more epochs always has a negative impact.
- Pre-training together with the decoder in most cases gives faster convergence and better performance.
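
The four settings listed above can be written down as a small configuration table. The following is a minimal sketch only: the names `EvalSetting`, `trainable_parts`, and `sample_labeled_subset` are hypothetical and not taken from the experiment code, and the way the labeled subset is drawn (uniform sampling with a fixed seed) is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalSetting:
    """One evaluation setting (hypothetical container, for illustration)."""
    name: str
    label_fraction: float       # fraction of labeled training images used
    train_encoder: bool         # False means the encoder is frozen
    train_decoder: bool = True  # the decoder is learnable in every setting


SETTINGS = [
    EvalSetting("FineTune-100%", 1.00, train_encoder=True),
    EvalSetting("LinearProbe-100%", 1.00, train_encoder=False),
    EvalSetting("FineTune-10%", 0.10, train_encoder=True),
    EvalSetting("FineTune-1%", 0.01, train_encoder=True),
]


def trainable_parts(setting):
    """Return the names of the model parts that receive gradient updates."""
    parts = []
    if setting.train_encoder:
        parts.append("encoder")
    if setting.train_decoder:
        parts.append("decoder")
    return parts


def sample_labeled_subset(image_ids, fraction, seed=0):
    """Draw a reproducible subset of labeled images (assumed sampling scheme)."""
    ids = sorted(image_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(random.Random(seed).sample(ids, k))


for s in SETTINGS:
    subset = sample_labeled_subset(range(1000), s.label_fraction)
    print(s.name, trainable_parts(s), len(subset))
```

In an actual training script, the `train_encoder` flag would typically decide whether the encoder parameters are passed to the optimizer.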
Test Report & Summary
Model: Pre-trained ResNet50 + 2 Dilated Convolution Layers
Frozen Backbone
VOC-Segmentation
| | LinearProbe | FineTune |
|---|:---:|:---:|
| RandInit | 13.39% | 18.06% |
| Supervised-IN | 70.06% | 73.28% |
| BYOL | 69.54% | 71.87% |
| PixPro (IN-100ep) | 64.93% | 73.26% |
| MoCo v1 | 63.01% | 72.40% |
| MoCo v1 + PLMoCo (seg-20ep) | 62.07% | 72.33% |
| MoCo v1 + PLMoCo (seg-50ep) | 61.49% | 72.13% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 64.00% | 73.09% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 63.75% | 72.62% |
| MoCo v2 | 69.83% | 73.61% |
| MoCo v2 + PLMoCo (seg-20ep) | 67.72% | 73.99% |
| MoCo v2 + PLMoCo (seg-50ep) | 65.85% | 74.62% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 69.11% | 74.57% |
| MoCo v2 + PLEDMoCo (seg-50ep) | 67.68% | 74.16% |
| PLMoCo (seg-200ep) | - | - |
| PLEDMoCo (seg-200ep) | 25.50% | 36.34% |
Cityscapes
| | LinearProbe | FineTune |
|---|---|---|
| RandInit | - | 60.11% |
| Supervised-IN | 61.91% | 75.64% |
| BYOL | - | 75.01% |
| PixPro (IN-100ep) | - | - |
| MoCo v1 | 55.32% | 76.18% |
| MoCo v1 + PLMoCo (seg-20ep) | 57.31% | 76.12% |
| MoCo v1 + PLMoCo (seg-50ep) | 57.58% | 76.15% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 59.02% | 76.07% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 58.01% | - |
| MoCo v2 | 63.87% | 76.52% |
| MoCo v2 + PLMoCo (seg-20ep) | 61.99% | 76.53% |
| MoCo v2 + PLMoCo (seg-50ep) | 61.00% | 76.75% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 62.91% | - |
| MoCo v2 + PLEDMoCo (seg-50ep) | 61.20% | - |
| PLMoCo (seg-200ep) | - | 33.59% |
| PLEDMoCo (seg-200ep) | - | - |
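
The summary refers to performance in mIoU, so the percentages above are presumably mean intersection-over-union scores. As a reminder of what that metric measures, here is a minimal sketch of the standard definition on flattened label arrays. The function name `mean_iou` and the use of `255` as the "ignore" label are assumptions for illustration; the evaluation code behind these tables may differ in details such as how absent classes are handled.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred and target are flat sequences of integer class ids of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed convention, not confirmed by the report).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # The pixel lies in both the predicted and the true region of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # The pixel adds to the union of the predicted class and of the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels, two classes, the last pixel is ignored.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 255], num_classes=2)
print(f"{100 * score:.2f}%")
```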
Reduced Labeling
VOC-Segmentation
| | VOC-1% | VOC-10% | VOC-100% |
|---|---|---|---|
| RandInit | 7.94% | 9.89% | 18.06% |
| Supervised-IN | 39.15% | 57.98% | 73.28% |
| BYOL | - | 60.98% | 71.87% |
| PixPro (IN-100ep) | - | 57.73% | 73.26% |
| MoCo v1 | 31.02% | 53.58% | 72.40% |
| MoCo v1 + PLMoCo (seg-20ep) | 33.91% | 51.18% | 72.33% |
| MoCo v1 + PLMoCo (seg-50ep) | 33.79% | 53.35% | 72.13% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 36.63% | 55.58% | 73.09% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 36.61% | 54.62% | 72.62% |
| MoCo v2 | 37.64% | 55.88% | 73.61% |
| MoCo v2 + PLMoCo (seg-20ep) | 39.15% | 57.81% | 73.99% |
| MoCo v2 + PLMoCo (seg-50ep) | 38.46% | 56.54% | 74.62% |
| MoCo v2 + PLEDMoCo (seg-20ep) | - | 58.56% | 74.57% |
| MoCo v2 + PLEDMoCo (seg-50ep) | - | 58.01% | 74.16% |
| PLMoCo (seg-200ep) | - | 17.01% | - |
| PLEDMoCo (seg-200ep) | - | 18.49% | 36.34% |
Cityscapes
| | Cityscapes-10% | Cityscapes-100% |
|---|---|---|
| RandInit | 30.75% | 60.11% |
| Supervised-IN | 65.57% | 75.64% |
| BYOL | 66.01% | 75.01% |
| PixPro (IN-100ep) | - | - |
| MoCo v1 | 64.71% | 76.18% |
| MoCo v1 + PLMoCo (seg-20ep) | 64.83% | 76.12% |
| MoCo v1 + PLMoCo (seg-50ep) | 64.78% | 76.15% |
| MoCo v1 + PLEDMoCo (seg-20ep) | 64.80% | 76.07% |
| MoCo v1 + PLEDMoCo (seg-50ep) | 64.67% | - |
| MoCo v2 | 66.53% | 76.52% |
| MoCo v2 + PLMoCo (seg-20ep) | 66.31% | 76.53% |
| MoCo v2 + PLMoCo (seg-50ep) | 64.96% | 76.75% |
| MoCo v2 + PLEDMoCo (seg-20ep) | 66.34% | - |
| MoCo v2 + PLEDMoCo (seg-50ep) | 66.43% | - |
| PLMoCo (seg-200ep) | 43.10% | 33.59% |
| PLEDMoCo (seg-200ep) | 44.60% | - |
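
To check the summary claims against a table, it can help to compute each variant's difference from its baseline. The sketch below does this for the MoCo v2 rows of the Cityscapes reduced-labeling table above. The numbers are copied from that table and missing entries (`-`) are represented as `None`. The helper name `deltas_vs_baseline` is hypothetical.

```python
# (Cityscapes-10%, Cityscapes-100%) mIoU in percent, copied from the table above.
CITYSCAPES_MOCO_V2 = {
    "MoCo v2": (66.53, 76.52),
    "MoCo v2 + PLMoCo (seg-20ep)": (66.31, 76.53),
    "MoCo v2 + PLMoCo (seg-50ep)": (64.96, 76.75),
    "MoCo v2 + PLEDMoCo (seg-20ep)": (66.34, None),
    "MoCo v2 + PLEDMoCo (seg-50ep)": (66.43, None),
}


def deltas_vs_baseline(results, baseline):
    """Return per-column differences (percentage points) from the baseline row."""
    base = results[baseline]
    out = {}
    for name, values in results.items():
        if name == baseline:
            continue
        out[name] = tuple(
            None if v is None or b is None else round(v - b, 2)
            for v, b in zip(values, base)
        )
    return out


for name, delta in deltas_vs_baseline(CITYSCAPES_MOCO_V2, "MoCo v2").items():
    cells = ["n/a" if d is None else f"{d:+.2f}" for d in delta]
    print(f"{name}: 10% labels {cells[0]}, 100% labels {cells[1]}")
```

The same helper can be applied to the other tables by swapping in their rows.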
