Key Points
- Patch-level rotation with 16 patches and 4 rotation directions (0°, 90°, 180°, 270°) failed early in training: the loss stopped decreasing and accuracy stayed comparable to random guessing. A likely cause is the low image resolution (224 instead of 512).
- Patch-level rotation with 1 patch and 2 rotation directions (0°, 180°) outperformed MoCo pretraining on the VOCaug dataset (a small dataset).
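
For reference, the patch-level rotation pretext task described above can be sketched as follows. This is a minimal NumPy illustration, not the code used in the experiments: the function name `rotate_patches`, its parameters, and the assumption that "16 patches" means a 4 x 4 grid of equal square patches are all hypothetical. Each patch is rotated independently, and the per-patch rotation index is the classification target.

```python
import numpy as np


def rotate_patches(image, grid, num_rotations, rng):
    """Rotate each patch of a square image independently (hypothetical helper).

    image: array of shape (H, W) or (H, W, C) with H == W and H divisible by grid.
    grid: patches per side (assumed: grid=4 gives 16 patches, grid=1 the whole image).
    num_rotations: 4 for (0, 90, 180, 270) degrees, 2 for (0, 180) degrees.
    Returns the rotated image and a (grid, grid) array of rotation labels.
    """
    h, w = image.shape[:2]
    if h != w or h % grid != 0:
        raise ValueError("expected a square image whose side is divisible by grid")
    if num_rotations not in (1, 2, 4):
        raise ValueError("num_rotations must be 1, 2, or 4")

    size = h // grid
    step = 4 // num_rotations  # quarter-turns per label: 1 for 4 classes, 2 for 2 classes
    labels = rng.integers(0, num_rotations, size=(grid, grid))

    out = image.copy()
    for i in range(grid):
        for j in range(grid):
            rows = slice(i * size, (i + 1) * size)
            cols = slice(j * size, (j + 1) * size)
            # Square patches keep their shape under any multiple of 90 degrees.
            out[rows, cols] = np.rot90(
                image[rows, cols], k=int(labels[i, j]) * step, axes=(0, 1)
            )
    return out, labels


# The two configurations from the notes, on a dummy 224 x 224 RGB image.
rng = np.random.default_rng(0)
dummy = rng.random((224, 224, 3))
rotated_16, labels_16 = rotate_patches(dummy, grid=4, num_rotations=4, rng=rng)
rotated_1, labels_1 = rotate_patches(dummy, grid=1, num_rotations=2, rng=rng)
```

With uniformly sampled labels, chance accuracy is 1 / num_rotations per patch, which gives a baseline for checking whether a run is stuck at random guessing. In this sketch a 224 x 224 input with a 4 x 4 grid leaves 56 x 56 pixels per patch, versus 128 x 128 at 512 x 512.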
To-dos
- Fix the patch-level rotation loss plateau.
- Encoder-decoder as base encoder for MoCo.
- Encoder-decoder as base encoder and patch-level images as instances for MoCo.
- Review related work from NIPS and ICLR.
