Recognizing objects at vastly different scales is a fundamental challenge in computer vision.

Feature pyramids built upon image pyramids (featurized image pyramids) are used with traditional hand-engineered computer vision features to achieve scale invariance. The feature pyramid idea is inspired by these scale-invariant image pyramids.

  • These pyramids are scale-invariant: a change in an object's scale is offset by shifting its level in the pyramid.
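As a rough illustration of the scale-invariance point above, the sketch below picks the pyramid level at which an object appears closest to a fixed canonical size. The function name `best_pyramid_level`, the canonical size of 32 pixels, and the halving of resolution per level are assumptions chosen for illustration, not values taken from these notes.

```python
import math


def best_pyramid_level(object_size: float, canonical_size: float = 32.0,
                       num_levels: int = 5) -> int:
    """Return the pyramid level where the object appears closest to canonical_size.

    Level 0 is the original image. Each higher level halves the resolution
    (an assumption of this sketch), so an object of size s in the original
    image appears with size s / 2**k on level k.
    """
    level = round(math.log2(object_size / canonical_size))
    # Clamp to the levels that actually exist in the pyramid.
    return max(0, min(num_levels - 1, level))


for size in (32, 64, 128, 256):
    k = best_pyramid_level(size)
    # The apparent size on the chosen level stays constant while the level shifts.
    print(size, k, size / 2 ** k)
```

Doubling the object size moves it up exactly one level, so a detector tuned to one apparent size can handle all of them.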


01 Modified Featurized Image Pyramid

Problems with applying a naive featurized image pyramid

Computation: inference time increases considerably
Memory: training deep networks end-to-end on an image pyramid is infeasible

Running object detection directly on an image pyramid requires more computation and memory than is practical.
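A back-of-the-envelope sketch of why this gets expensive: if the network is run once per pyramid scale, the total cost is roughly the sum of the per-scale costs. The helper `relative_pyramid_cost` and the example scales are hypothetical, and the sketch assumes that compute and activation memory grow in proportion to pixel count.

```python
def relative_pyramid_cost(scales):
    """Cost of running a ConvNet on every pyramid scale, relative to one pass at scale 1.0.

    Assumes cost is proportional to the number of pixels, which for a 2D image
    grows with the square of the resize factor.
    """
    return sum(s * s for s in scales)


# Hypothetical multi-scale setting: half, original, and double resolution.
print(relative_pyramid_cost([0.5, 1.0, 2.0]))  # 5.25
```

Under these assumptions, three scales already cost more than five single-scale passes, and the largest scale dominates.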

The nature of Convolution

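A deep ConvNet computes a feature hierarchy layer by layer, and with subsampling layers the feature hierarchy has an inherent multi-scale, pyramidal shape. However, there are large semantic gaps between layers: high-resolution maps from shallow layers carry weak semantic features, while low-resolution maps from deep layers carry strong ones.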
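The pyramidal shape of the hierarchy can be seen by listing the spatial size of each stage's output. This is a minimal sketch: `feature_map_sizes` is a hypothetical helper, the default strides (4, 8, 16, 32) follow the usual ResNet stage layout, and rounding up is an assumption about padding.

```python
import math


def feature_map_sizes(height, width, strides=(4, 8, 16, 32)):
    """Spatial size of each stage's output for a given input size.

    Each stride is the total downsampling factor of a stage relative to the
    input image. Sizes are rounded up, as with 'same'-style padding
    (an assumption of this sketch).
    """
    return [(math.ceil(height / s), math.ceil(width / s)) for s in strides]


for stride, (h, w) in zip((4, 8, 16, 32), feature_map_sizes(224, 224)):
    print(f"stride {stride}: {h} x {w}")
```

Each stage halves the resolution of the previous one, so the stack of feature maps already forms a pyramid without resizing the input image.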

Proposal: Feature Pyramid Network (FPN)

Naturally leverage the pyramidal shape of a ConvNet’s feature hierarchy while creating a feature pyramid that has strong semantics at all scales

New architecture: combines semantically strong, low-resolution features with semantically weak, high-resolution features via a top-down pathway and lateral connections

Predictions are made independently at all levels
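A single top-down merge step might look like the sketch below: the coarser map is upsampled by a factor of 2 with nearest-neighbour repetition and added element by element to the finer map arriving over the lateral connection. To stay short it works on single-channel maps stored as nested lists and leaves out the convolutions applied around the merge; the function names are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Upsample a 2D map (list of rows) by 2 using nearest-neighbour repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def merge_top_down(coarse, lateral):
    """Add the upsampled coarse map to the finer lateral map, element by element."""
    up = upsample_nearest_2x(coarse)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly twice the size of the coarse map")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


coarse = [[10, 20]]                # 1 x 2 map from the deeper, semantically stronger level
lateral = [[1, 2, 3, 4],
           [5, 6, 7, 8]]           # 2 x 4 map from the shallower level
print(merge_top_down(coarse, lateral))
# [[11, 12, 23, 24], [15, 16, 27, 28]]
```

Every position in the finer map receives the value of the coarse cell that covers it, which is how strong semantics reach the high-resolution levels.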

Architecture - breakdown (YOLOv4)

Common object detector


FPN

  • Goal: feature integration (combining semantically strong and semantically weak features)
  • FPN is a general-purpose architecture: it takes a single-scale image as input and generates proportionally sized feature maps at multiple levels in a fully convolutional fashion.
  • FPN is independent of the backbone and can be applied to:
    • Region Proposal Network
    • Object detector
    • Instance Segmentation
  • FPN is constructed with
    • Bottom-up pathway
    • Top-down pathway
    • Lateral connections
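How the three components fit together can be followed at the level of tensor shapes. The sketch below only tracks (channels, height, width) tuples. It assumes ResNet-50-style bottom-up outputs for a 224 x 224 input and a common output width of 256 channels; the helper `fpn_output_shapes` is hypothetical.

```python
def fpn_output_shapes(backbone_shapes, out_channels=256):
    """Track (channels, height, width) through a sketch of FPN construction.

    backbone_shapes: bottom-up feature shapes, ordered from fine to coarse.
    """
    # Lateral connections: bring every level to the same channel count.
    laterals = [(out_channels, h, w) for (_, h, w) in backbone_shapes]

    # Top-down pathway: start at the coarsest level and walk towards the finest.
    outputs = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        c, h, w = outputs[0]
        upsampled = (c, 2 * h, 2 * w)
        if upsampled != lateral:
            raise ValueError(f"cannot merge {upsampled} with {lateral}")
        outputs.insert(0, lateral)  # an element-wise sum keeps the lateral's shape
    return outputs


# Assumed bottom-up outputs of a ResNet-50 for a 224 x 224 input.
c_shapes = [(256, 56, 56), (512, 28, 28), (1024, 14, 14), (2048, 7, 7)]
print(fpn_output_shapes(c_shapes))
# [(256, 56, 56), (256, 28, 28), (256, 14, 14), (256, 7, 7)]
```

The output pyramid keeps the spatial sizes of the bottom-up pathway but has the same number of channels at every level, which is what lets one prediction head be reused across levels.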

Implementation detail


Implementation detail on ResNet



1x1 convolutions are used to adjust the number of channels.
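To make the role of the 1x1 convolution concrete, the sketch below applies one to a tiny feature map stored as nested lists. It is a plain-Python illustration without bias or a deep learning library; `conv1x1` and the example weights are hypothetical.

```python
def conv1x1(fmap, weights):
    """Apply a 1x1 convolution without bias.

    fmap:    nested lists indexed as [C_in][H][W]
    weights: nested lists indexed as [C_out][C_in]
    Each output pixel is a weighted sum of the input channels at the same
    location, so height and width are unchanged and only the channel count
    changes.
    """
    c_in = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [[[sum(w_row[c] * fmap[c][y][x] for c in range(c_in))
              for x in range(width)]
             for y in range(height)]
            for w_row in weights]


# Three input channels on a 1 x 2 map, reduced to two output channels.
x = [[[1, 2]], [[3, 4]], [[5, 6]]]
w = [[1, 0, 0],   # output channel 0 copies input channel 0
     [0, 1, 1]]   # output channel 1 sums input channels 1 and 2
print(conv1x1(x, w))  # [[[1, 2]], [[8, 10]]]
```

Because the kernel covers a single pixel, the operation mixes channels without touching spatial layout, which is what lets feature maps with different channel counts be added together afterwards.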

References

https://www.bilibili.com/video/BV1dh411U7D9?from=search&seid=12203824160903509476&spm_id_from=333.337.0.0

https://docs.google.com/presentation/d/1pJ-szvh6ir71uqsJH3Ippq4HR2kObHjDyCyEADT9OLI/edit#slide=id.ga644b49656_0_8

https://www.youtube.com/watch?v=mwMopcSRx1U&t=115s
