key elements

problems

code

The disadvantages of DROID-SLAM

  • poor performance on KITTI
  • extremely high GPU memory consumption, which limits the number of video frames that can be processed
  • Is the runtime efficient enough?
    Why does this happen?
    1.1 Generalization performance problem
    1.2 Feature points near or far away relative to the baseline; monocular ≈ stereo
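
To make point 1.2 concrete, here is a minimal sketch of how depth uncertainty grows with distance for a rectified stereo pair. It uses the standard relation Z = f·b/d and its first-order error |ΔZ| ≈ Z²/(f·b)·|Δd|. The function name and the focal length, baseline, and noise values are illustrative assumptions, not figures from these notes.

```python
def stereo_depth_error(depth_m, focal_px, baseline_m, disparity_noise_px=1.0):
    """First-order depth uncertainty for a rectified stereo pair.

    From Z = f * b / d it follows that |dZ| ~= Z**2 / (f * b) * |dd|.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_noise_px


# Illustrative values only (assumed, roughly in the range of a driving rig).
FOCAL_PX = 720.0
BASELINE_M = 0.54

for depth in (5.0, 20.0, 80.0):
    err = stereo_depth_error(depth, FOCAL_PX, BASELINE_M)
    print(f"depth {depth:5.1f} m -> ~{err:6.2f} m error per pixel of disparity noise")
```

Because the error grows with the square of the depth, points that are far away relative to the baseline carry little metric information, and the stereo setup then behaves much like a monocular one.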

    How can we solve it?

    Literature review

  1. RAFT: optical flow estimation.
  2. DeepV2D alternates between updating depth and updating camera poses, instead of performing bundle adjustment.
  3. BA-Net has a bundle adjustment layer, but it is not dense.
  4. Multi-view optimization of local feature geometry: builds a neural network into the SfM pipeline to improve keypoint localization accuracy.
  5. gradSLAM ("Dense SLAM Meets Automatic Differentiation"): formulates SLAM as differentiable computation graphs so that errors can be backpropagated to the sensor input. Differentiable, but with no trainable parameters.
  6. DeepFactors: the most complete deep SLAM system; performs joint optimization of pose and depth and is capable of loop closure.
  7. CodeSLAM
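
The difference between alternating updates (item 2) and joint optimization can be sketched with a toy 1D problem. This is only an analogy, not DeepV2D's learned depth and motion modules: each measurement is assumed to be `landmark[j] - camera[i]`, and the function name and data are hypothetical.

```python
def alternate_depth_and_pose(measurements, num_iters=200):
    """Toy 1D alternation where measurements[i][j] = landmark[j] - camera[i].

    Camera 0 is held at 0 to remove the gauge freedom.
    """
    num_cams = len(measurements)
    num_pts = len(measurements[0])
    cams = [0.0] * num_cams
    pts = [0.0] * num_pts
    for _ in range(num_iters):
        # "Depth" step: hold cameras fixed, solve each landmark in closed form.
        for j in range(num_pts):
            pts[j] = sum(measurements[i][j] + cams[i] for i in range(num_cams)) / num_cams
        # "Pose" step: hold landmarks fixed, solve each free camera in closed form.
        for i in range(1, num_cams):
            cams[i] = sum(pts[j] - measurements[i][j] for j in range(num_pts)) / num_pts
    return cams, pts


# Noise-free synthetic data, for illustration only.
true_cams = [0.0, 1.0, 2.5]
true_pts = [4.0, 6.0, 9.0, 12.0]
meas = [[p - c for p in true_pts] for c in true_cams]

est_cams, est_pts = alternate_depth_and_pose(meas)
print([round(c, 6) for c in est_cams])
print([round(p, 6) for p in est_pts])
```

Each half-step is cheap, but the error only shrinks by a constant factor per sweep. A joint solver over all variables, as in bundle adjustment, would solve this linear toy problem in a single Gauss-Newton step.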

some possible working directions

network architecture

  • transformer

    optimality

  • GPU memory reduction

  • runtime optimization
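
As a back-of-the-envelope illustration of why memory limits the number of frames, the sketch below estimates the size of dense all-pairs correlation volumes. It assumes one RAFT-style volume per frame-graph edge, computed at 1/8 resolution and stored as float32, and it ignores the correlation pyramid and all other buffers. The function name, image size, and edge counts are hypothetical.

```python
def correlation_volume_bytes(height, width, num_edges, stride=8, bytes_per_value=4):
    """Estimated storage for all-pairs correlation volumes (assumed layout).

    Each edge stores one value per pair of downsampled pixels:
    (h * w) ** 2 entries, where h = height // stride and w = width // stride.
    """
    h, w = height // stride, width // stride
    return (h * w) ** 2 * num_edges * bytes_per_value


# Illustrative numbers only.
for edges in (1, 50, 200):
    gib = correlation_volume_bytes(384, 512, edges) / 2 ** 30
    print(f"{edges:4d} edges -> {gib:6.2f} GiB")
```

Under these assumptions the cost is linear in the number of edges but quartic in the downsampled image side length, so both the image resolution and the graph size are candidates for reduction.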

    semantic SLAM

  • semantic SLAM

    multiple sensors

  • multi-sensor fusion, e.g. IMU, LiDAR, GNSS

  • support a variety of camera models, such as fisheye
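
To illustrate why a fisheye camera needs its own projection model, the sketch below compares the image radius of a ray at angle θ from the optical axis under the pinhole model (r = f·tan θ) and the equidistant fisheye model (r = f·θ). Equidistant is only one of several fisheye models; the function name and focal length are assumptions made for this example.

```python
import math


def project_radius(theta_rad, focal_px, model="pinhole"):
    """Distance from the principal point for a ray at angle theta from the axis."""
    if model == "pinhole":
        if theta_rad >= math.pi / 2:
            raise ValueError("pinhole model cannot image rays at 90 degrees or more")
        return focal_px * math.tan(theta_rad)
    if model == "equidistant":
        return focal_px * theta_rad
    raise ValueError(f"unknown camera model: {model}")


FOCAL_PX = 300.0  # illustrative value

for deg in (10, 45, 80):
    theta = math.radians(deg)
    r_pin = project_radius(theta, FOCAL_PX, "pinhole")
    r_fish = project_radius(theta, FOCAL_PX, "equidistant")
    print(f"{deg:3d} deg: pinhole {r_pin:8.1f} px, equidistant {r_fish:6.1f} px")

# Beyond 90 degrees only the fisheye model still yields a finite radius.
print(project_radius(math.radians(100), FOCAL_PX, "equidistant"))
```

The pinhole radius diverges as the angle approaches 90 degrees, so a pipeline that hard-codes pinhole projection in its reprojection step cannot handle wide-angle lenses without changes.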

    robustness

  • generalization problem

  • multi-map system