Key Points

Basic strategy: self-supervised learning to reduce reliance on human labeling
Downstream tasks: dense prediction (segmentation, depth estimation, etc.)
Dataset: Pascal VOC 2012
Baselines: MoCo and SimCLR with semantic segmentation head (a decoder)
Main effort: design contrastive pretext tasks for segmentation
Possible direction: auto-encoder and energy-based methods (why or why not)
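
For reference, the contrastive objective behind the MoCo and SimCLR baselines can be sketched as an InfoNCE-style loss: a query feature is pulled toward its positive key and pushed away from negative keys. The snippet below is a minimal, framework-free sketch and not the baselines' actual implementation; the function names and the temperature value of 0.07 are assumptions chosen for illustration. For dense prediction, the same loss could in principle be applied to per-pixel or per-region feature vectors instead of one vector per image.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: -log softmax of the positive logit.

    `temperature` is an assumed illustrative value, not a tuned setting.
    """
    q = l2_normalize(query)
    # Logit 0 is the positive pair; the remaining logits are negatives.
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


# Toy 2-D features: the loss is small when the positive matches the query
# and large when a negative matches it instead.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_mismatched = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Designing a pretext task for segmentation then largely comes down to deciding what counts as a positive or negative pair at the pixel or region level.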

To-do’s
[x] Intro to MMSegmentation toolbox
[ ] Convert MoCo & SimCLR model checkpoints to MMSegmentation
[ ] Related work about auto-encoder style self-supervised learning
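
The checkpoint conversion item mostly amounts to renaming state-dict keys so that a segmentation backbone can load them. The sketch below assumes the key layout used by the official MoCo release (query-encoder weights under `module.encoder_q.`, with the projection head under `module.encoder_q.fc.`); SimCLR checkpoints and the exact key names the target backbone expects may differ and should be checked against the actual files. The function name `convert_moco_keys` is hypothetical, and plain strings stand in for tensors so the example runs without PyTorch.

```python
def convert_moco_keys(state_dict,
                      prefix="module.encoder_q.",
                      head_prefix="module.encoder_q.fc."):
    """Keep query-encoder backbone weights and strip the wrapper prefix.

    Both prefixes are assumptions based on the official MoCo checkpoint layout.
    """
    converted = {}
    for key, value in state_dict.items():
        if not key.startswith(prefix):
            continue  # drop the key encoder, the queue, and other buffers
        if key.startswith(head_prefix):
            continue  # drop the projection head; the decoder replaces it
        converted[key[len(prefix):]] = value
    return converted


# Toy state dict; in practice this would come from the loaded checkpoint file.
toy_state_dict = {
    "module.encoder_q.conv1.weight": "w0",
    "module.encoder_q.layer1.0.conv1.weight": "w1",
    "module.encoder_q.fc.0.weight": "head",
    "module.encoder_k.conv1.weight": "k0",
    "module.queue": "queue",
}
converted = convert_moco_keys(toy_state_dict)
```

The converted dictionary would then be saved and referenced as the pretrained backbone in the segmentation config.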

Self-supervised Learning for Dense Prediction