

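  • Sandbox: a network protocol stack that can hold multiple Endpoints; it can be implemented with a Namespace, a Jail, and so on
  • Endpoint: connects a Sandbox to a Network
  • Network: a set of Endpoints that can communicate with each other directly; it can be implemented with a Bridge, a VLAN, and so on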
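
The relationship between the three objects can be sketched in a few lines of Go. This is a toy model for illustration only: the struct fields and the `Connect` and `CanTalk` helpers are hypothetical and are not libnetwork types or APIs.

```go
package main

import "fmt"

// Network is a set of endpoints that can reach each other directly.
type Network struct {
	Name      string
	Endpoints []*Endpoint
}

// Endpoint joins exactly one Sandbox to exactly one Network.
type Endpoint struct {
	Name    string
	Network *Network
	Sandbox *Sandbox
}

// Sandbox stands for an isolated network stack (for example a Linux
// network namespace); it may hold endpoints from several networks.
type Sandbox struct {
	ID        string
	Endpoints []*Endpoint
}

// Connect creates an endpoint on n and attaches it to sb.
func Connect(sb *Sandbox, n *Network, name string) *Endpoint {
	ep := &Endpoint{Name: name, Network: n, Sandbox: sb}
	n.Endpoints = append(n.Endpoints, ep)
	sb.Endpoints = append(sb.Endpoints, ep)
	return ep
}

// CanTalk reports whether two sandboxes share at least one network.
func CanTalk(a, b *Sandbox) bool {
	for _, ea := range a.Endpoints {
		for _, eb := range b.Endpoints {
			if ea.Network == eb.Network {
				return true
			}
		}
	}
	return false
}

func main() {
	front, back := &Network{Name: "front"}, &Network{Name: "back"}
	web, db, cache := &Sandbox{ID: "web"}, &Sandbox{ID: "db"}, &Sandbox{ID: "cache"}
	Connect(web, front, "web-front")
	Connect(web, back, "web-back")
	Connect(db, back, "db-back")
	Connect(cache, front, "cache-front")
	fmt.Println(CanTalk(web, db), CanTalk(db, cache))
}
```

Here `web` has two Endpoints, one per Network, so it can reach both `db` and `cache`, while `db` and `cache` share no Network.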

Docker Architecture


Network Controller


Initialize Network Controllers

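The Docker Daemon manages the available NetworkController. When the daemon starts, it creates the NetworkController available on the current operating system. Taking daemon_unix.go as an example, the controller is created first, and the default networks for the three modes none, host, and bridge are then initialized on it.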

    func (daemon *Daemon) initNetworkController(config *config.Config, activeSandboxes map[string]interface{}) (libnetwork.NetworkController, error) {
        netOptions, err := daemon.networkOptions(config, daemon.PluginStore, activeSandboxes)
        if err != nil {
            return nil, err
        }

        controller, err := libnetwork.New(netOptions...)
        if err != nil {
            return nil, fmt.Errorf("error obtaining controller instance: %v", err)
        }

        if len(activeSandboxes) > 0 {
            logrus.Info("There are old running containers, the network config will not take affect")
            return controller, nil
        }

        // Initialize default network on "null"
        if n, _ := controller.NetworkByName("none"); n == nil {
            if _, err := controller.NewNetwork("null", "none", "", libnetwork.NetworkOptionPersist(true)); err != nil {
                return nil, fmt.Errorf("Error creating default \"null\" network: %v", err)
            }
        }

        // Initialize default network on "host"
        if n, _ := controller.NetworkByName("host"); n == nil {
            if _, err := controller.NewNetwork("host", "host", "", libnetwork.NetworkOptionPersist(true)); err != nil {
                return nil, fmt.Errorf("Error creating default \"host\" network: %v", err)
            }
        }

        // Clear stale bridge network
        if n, err := controller.NetworkByName("bridge"); err == nil {
            if err = n.Delete(); err != nil {
                return nil, fmt.Errorf("could not delete the default bridge network: %v", err)
            }
            if len(config.NetworkConfig.DefaultAddressPools.Value()) > 0 && !daemon.configStore.LiveRestoreEnabled {
                removeDefaultBridgeInterface()
            }
        }

        if !config.DisableBridge {
            // Initialize default driver "bridge"
            if err := initBridgeDriver(controller, config); err != nil {
                return nil, err
            }
        } else {
            removeDefaultBridgeInterface()
        }

        // Set HostGatewayIP to the default bridge's IP if it is empty
        if daemon.configStore.HostGatewayIP == nil && controller != nil {
            if n, err := controller.NetworkByName("bridge"); err == nil {
                v4Info, v6Info := n.Info().IpamInfo()
                var gateway net.IP
                if len(v4Info) > 0 {
                    gateway = v4Info[0].Gateway.IP
                } else if len(v6Info) > 0 {
                    gateway = v6Info[0].Gateway.IP
                }
                daemon.configStore.HostGatewayIP = gateway
            }
        }
        return controller, nil
    }

NetworkController Implementation

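controller is libnetwork's implementation of NetworkController. As can be seen, controller uses a driver table to distinguish between network types, uses the drivers to create Networks and Endpoints, and adds Endpoints to a Sandbox or removes them from it.
A Container finds its Sandbox through the SandboxID and SandboxKey. A Sandbox can use the containerID to determine whether it belongs to a given Container.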
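
The driver-table idea can be illustrated with a small sketch. Everything in it is hypothetical (`toyController`, the `driver` interface, `EnsureNetwork`); it only mirrors the pattern of looking up a driver by network type and creating a network when it does not exist yet, and it is not the libnetwork implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// driver is a much-reduced, hypothetical stand-in for a network driver.
type driver interface {
	CreateNetwork(name string) error
}

// noopDriver accepts every request without touching the system.
type noopDriver struct{}

func (noopDriver) CreateNetwork(string) error { return nil }

var errNoDriver = errors.New("no driver registered for network type")

// toyController keeps a driver table keyed by network type and a table
// of created networks keyed by name.
type toyController struct {
	drivers  map[string]driver
	networks map[string]string // network name -> network type
}

// NewNetwork finds the driver for networkType and lets it create the network.
func (c *toyController) NewNetwork(networkType, name string) error {
	d, ok := c.drivers[networkType]
	if !ok {
		return fmt.Errorf("%w: %s", errNoDriver, networkType)
	}
	if err := d.CreateNetwork(name); err != nil {
		return err
	}
	c.networks[name] = networkType
	return nil
}

// EnsureNetwork creates the network only if it is not known yet, the same
// check-then-create shape used for the default "none" and "host" networks.
func (c *toyController) EnsureNetwork(networkType, name string) error {
	if _, ok := c.networks[name]; ok {
		return nil
	}
	return c.NewNetwork(networkType, name)
}

func main() {
	c := &toyController{
		drivers:  map[string]driver{"null": noopDriver{}, "host": noopDriver{}},
		networks: map[string]string{},
	}
	fmt.Println(c.EnsureNetwork("null", "none"))
	fmt.Println(c.EnsureNetwork("host", "host"))
	fmt.Println(c.EnsureNetwork("overlay", "ov0")) // no such driver in the table
}
```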

OS Layer Sandbox


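The Sandbox interface shown here does not list every function, only enough of them to show the boundary of its capabilities. The rest of this section uses the Namespace-based Sandbox implementation as the example.
From the diagram above it is not hard to see that features such as routes and interfaces are provided by netlink. Note that netlink configuration made inside a Namespace takes effect only inside that Namespace. The key method for obtaining a handle to a Namespace, from which a netlink handle can then be created, is as follows: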

    func GetFromPath(path string) (NsHandle, error) {
        fd, err := syscall.Open(path, syscall.O_RDONLY, 0)
        if err != nil {
            return -1, err
        }
        return NsHandle(fd), nil
    }
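
On Linux, the network namespace of a process is exposed as the file /proc/<pid>/ns/net, and a path of that form is the kind of argument GetFromPath expects. The sketch below builds such a path and opens it with the standard library only. `netnsPath` and `openNetns` are hypothetical helper names, and the open is expected to fail on systems without procfs.

```go
package main

import (
	"fmt"
	"os"
)

// netnsPath returns the procfs path of a process's network namespace.
func netnsPath(pid int) string {
	return fmt.Sprintf("/proc/%d/ns/net", pid)
}

// openNetns opens the namespace file read-only, as GetFromPath does, but
// returns an *os.File so that the caller can close the descriptor.
func openNetns(path string) (*os.File, error) {
	return os.Open(path)
}

func main() {
	path := netnsPath(os.Getpid())
	f, err := openNetns(path)
	if err != nil {
		// Expected on non-Linux systems or where /proc is not mounted.
		fmt.Println("cannot open namespace:", err)
		return
	}
	defer f.Close()
	fmt.Printf("opened %s as fd %d\n", path, f.Fd())
}
```

The returned descriptor only identifies the namespace; nothing is configured until a netlink socket is opened against it.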

With the returned NsHandle, a concrete SocketHandle can be created as follows:

    func NewHandleAt(ns netns.NsHandle, nlFamilies ...int) (*Handle, error) {
        return newHandle(ns, netns.None(), nlFamilies...)
    }

    // NewHandleAtFrom works as NewHandle but allows client to specify the
    // new and the origin netns Handle.
    func NewHandleAtFrom(newNs, curNs netns.NsHandle) (*Handle, error) {
        return newHandle(newNs, curNs)
    }

    func newHandle(newNs, curNs netns.NsHandle, nlFamilies ...int) (*Handle, error) {
        h := &Handle{sockets: map[int]*nl.SocketHandle{}}
        fams := nl.SupportedNlFamilies
        if len(nlFamilies) != 0 {
            fams = nlFamilies
        }
        for _, f := range fams {
            s, err := nl.GetNetlinkSocketAt(newNs, curNs, f)
            if err != nil {
                return nil, err
            }
            h.sockets[f] = &nl.SocketHandle{Socket: s}
        }
        return h, nil
    }

Add Interface


Bridge Network


Create Network

Look up an existing Link instance in the system by the BridgeName given in the configuration; if BridgeName is empty, the default bridge docker0 is used.

    func newInterface(nlh *netlink.Handle, config *networkConfiguration) (*bridgeInterface, error) {
        var err error
        i := &bridgeInterface{nlh: nlh}

        // Initialize the bridge name to the default if unspecified.
        if config.BridgeName == "" {
            config.BridgeName = DefaultBridgeName
        }

        // Attempt to find an existing bridge named with the specified name.
        i.Link, err = nlh.LinkByName(config.BridgeName)
        if err != nil {
            logrus.Debugf("Did not find any interface with name %s: %v", config.BridgeName, err)
        } else if _, ok := i.Link.(*netlink.Bridge); !ok {
            return nil, fmt.Errorf("existing interface %s is not a bridge", i.Link.Attrs().Name)
        }
        return i, nil
    }

Create the bridgeNetwork instance and store it in networks:

    // Create and set network handler in driver
    network := &bridgeNetwork{
        id:         config.ID,
        endpoints:  make(map[string]*bridgeEndpoint),
        config:     config,
        portMapper: portmapper.New(d.config.UserlandProxyPath),
        bridge:     bridgeIface,
        driver:     d,
    }
    d.Lock()
    d.networks[config.ID] = network
    d.Unlock()

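If the bridgeInterface obtained above does not contain a valid bridge device, the device-creation step and the sysctl step are added to the setup queue; if the device already exists and the network is the default bridge docker0, only the sysctl step is added to the setup queue.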

    bridgeAlreadyExists := bridgeIface.exists()
    if !bridgeAlreadyExists {
        bridgeSetup.queueStep(setupDevice)
        bridgeSetup.queueStep(setupDefaultSysctl)
    }

    // For the default bridge, set expected sysctls
    if config.DefaultBridge {
        bridgeSetup.queueStep(setupDefaultSysctl)
    }


    for _, step := range []struct {
        Condition bool
        Fn        setupStep
    }{
        // Enable IPv6 on the bridge if required. We do this even for a
        // previously existing bridge, as it may be here from a previous
        // installation where IPv6 wasn't supported yet and needs to be
        // assigned an IPv6 link-local address.
        {config.EnableIPv6, setupBridgeIPv6},

        // We ensure that the bridge has the expected IPv4 and IPv6 addresses in
        // the case of a previously existing device.
        {bridgeAlreadyExists && !config.InhibitIPv4, setupVerifyAndReconcile},

        // Enable IPv6 Forwarding
        {enableIPv6Forwarding, setupIPv6Forwarding},

        // Setup Loopback Addresses Routing
        {!d.config.EnableUserlandProxy, setupLoopbackAddressesRouting},

        // Setup IPTables.
        {d.config.EnableIPTables, network.setupIPTables},

        // We want to track firewalld configuration so that
        // if it is started/reloaded, the rules can be applied correctly
        {d.config.EnableIPTables, network.setupFirewalld},

        // Setup DefaultGatewayIPv4
        {config.DefaultGatewayIPv4 != nil, setupGatewayIPv4},

        // Setup DefaultGatewayIPv6
        {config.DefaultGatewayIPv6 != nil, setupGatewayIPv6},

        // Add inter-network communication rules.
        {d.config.EnableIPTables, setupNetworkIsolationRules},

        // Configure bridge networking filtering if ICC is off and IP tables are enabled
        {!config.EnableICC && d.config.EnableIPTables, setupBridgeNetFiltering},
    } {
        if step.Condition {
            bridgeSetup.queueStep(step.Fn)
        }
    }


    bridgeSetup.queueStep(setupDeviceUp)
    return bridgeSetup.apply()
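
The listings above queue steps conditionally and run them all at the end. A minimal sketch of such a queue follows, assuming that apply runs the steps in order and stops at the first error. The `toyConfig` and `toyBridge` types and the two example steps are hypothetical; only the names `setupStep`, `queueStep` and `apply` come from the listings.

```go
package main

import (
	"errors"
	"fmt"
)

type toyConfig struct{ BridgeName string }

type toyBridge struct{ created, up bool }

// setupStep is one unit of bridge configuration.
type setupStep func(*toyConfig, *toyBridge) error

type bridgeSetup struct {
	config *toyConfig
	bridge *toyBridge
	steps  []setupStep
}

// queueStep appends a step; nothing runs until apply is called.
func (s *bridgeSetup) queueStep(step setupStep) {
	s.steps = append(s.steps, step)
}

// apply runs the queued steps in order and stops at the first error.
func (s *bridgeSetup) apply() error {
	for _, fn := range s.steps {
		if err := fn(s.config, s.bridge); err != nil {
			return err
		}
	}
	return nil
}

func createDevice(_ *toyConfig, b *toyBridge) error {
	b.created = true
	return nil
}

func deviceUp(_ *toyConfig, b *toyBridge) error {
	if !b.created {
		return errors.New("device does not exist")
	}
	b.up = true
	return nil
}

func main() {
	s := &bridgeSetup{config: &toyConfig{BridgeName: "docker0"}, bridge: &toyBridge{}}
	// Table-driven queuing: a step is queued only when its condition holds.
	for _, step := range []struct {
		Condition bool
		Fn        setupStep
	}{
		{true, createDevice},
		{false, func(*toyConfig, *toyBridge) error { return errors.New("disabled step ran") }},
	} {
		if step.Condition {
			s.queueStep(step.Fn)
		}
	}
	s.queueStep(deviceUp)
	fmt.Println(s.apply(), s.bridge.up)
}
```

Separating queuing from execution keeps the ordering explicit: bringing the device up is queued last, so it runs only after every configuration step has succeeded.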

Setup Device

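Create a netlink.Bridge struct whose LinkAttrs uses the BridgeName from the configuration, then create the bridge device through netlink. If a MAC address needs to be set, a random MAC address is generated.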

    func setupDevice(config *networkConfiguration, i *bridgeInterface) error {
        var setMac bool

        // We only attempt to create the bridge when the requested device name is
        // the default one.
        if config.BridgeName != DefaultBridgeName && config.DefaultBridge {
            return NonDefaultBridgeExistError(config.BridgeName)
        }

        // Set the bridgeInterface netlink.Bridge.
        i.Link = &netlink.Bridge{
            LinkAttrs: netlink.LinkAttrs{
                Name: config.BridgeName,
            },
        }

        // Only set the bridge's MAC address if the kernel version is > 3.3, as it
        // was not supported before that.
        kv, err := kernel.GetKernelVersion()
        if err != nil {
            logrus.Errorf("Failed to check kernel versions: %v. Will not assign a MAC address to the bridge interface", err)
        } else {
            setMac = kv.Kernel > 3 || (kv.Kernel == 3 && kv.Major >= 3)
        }

        if setMac {
            hwAddr := netutils.GenerateRandomMAC()
            i.Link.Attrs().HardwareAddr = hwAddr
            logrus.Debugf("Setting bridge mac address to %s", hwAddr)
        }

        if err = i.nlh.LinkAdd(i.Link); err != nil {
            logrus.Debugf("Failed to create bridge %s via netlink. Trying ioctl", config.BridgeName)
            return ioctlCreateBridge(config.BridgeName, setMac)
        }
        return err
    }

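Creating and configuring the Bridge device is ultimately done through the netlink interface; setupDevice falls back to ioctl only when the netlink call fails.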
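
Two details of setupDevice are easy to try in isolation: the kernel version check, which as written accepts 3.3 itself as well as later versions, and the generation of a random MAC address. The sketch below uses only the standard library. `supportsBridgeMAC` and `randomMAC` are hypothetical names, and `randomMAC` is one common way to build such an address, not a copy of netutils.GenerateRandomMAC.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net"
)

// supportsBridgeMAC mirrors the version check in setupDevice:
// it is true for kernel 3.3 and anything newer.
func supportsBridgeMAC(kernel, major int) bool {
	return kernel > 3 || (kernel == 3 && major >= 3)
}

// randomMAC returns a random, locally administered, unicast MAC address.
func randomMAC() (net.HardwareAddr, error) {
	mac := make(net.HardwareAddr, 6)
	if _, err := rand.Read(mac); err != nil {
		return nil, err
	}
	mac[0] |= 0x02  // set the locally administered bit
	mac[0] &^= 0x01 // clear the multicast bit so the address is unicast
	return mac, nil
}

func main() {
	fmt.Println(supportsBridgeMAC(3, 2), supportsBridgeMAC(3, 3), supportsBridgeMAC(5, 15))
	mac, err := randomMAC()
	if err != nil {
		fmt.Println("random source unavailable:", err)
		return
	}
	fmt.Println(mac)
}
```

Setting the locally administered bit keeps a generated address from colliding with vendor-assigned addresses on real hardware.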

Networking Configuration

  • System Control
    • /proc/sys/net/ipv6/conf/BridgeName/accept_ra -> 0: do not accept Router Advertisements
    • /proc/sys/net/ipv4/conf/BridgeName/route_localnet -> 1: allow external traffic to be redirected to loopback; must be used together with iptables


  • IPTables
    • filter
      • DOCKER-ISOLATION-STAGE-1 -i BridgeInterface ! -d Network -j DROP
      • DOCKER-ISOLATION-STAGE-1 -o BridgeInterface ! -s Network -j DROP
    • nat
      • DOCKER -t nat -i BridgeInterface -j RETURN
    • filter
      • FORWARD -i BridgeInterface ! -o BridgeInterface -j ACCEPT
    • HOST IP != nil
      • nat
        • POSTROUTING -t nat -s BridgeSubnet ! -o BridgeInterface -j SNAT --to-source HOSTIP
        • POSTROUTING -t nat -m addrtype --src-type LOCAL -o BridgeInterface -j SNAT --to-source HOSTIP
    • HOST IP == nil
      • nat
        • POSTROUTING -t nat -s BridgeSubnet ! -o BridgeInterface -j MASQUERADE
        • POSTROUTING -t nat -m addrtype --src-type LOCAL -o BridgeInterface -j MASQUERADE
    • Inter Container Communication Enabled
      • filter
        • FORWARD -i BridgeInterface -o BridgeInterface -j ACCEPT
    • Inter Container Communication Disabled
      • filter
        • FORWARD -i BridgeInterface -o BridgeInterface -j DROP
    • nat
      • PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
      • OUTPUT -m addrtype --dst-type LOCAL -j DOCKER
    • filter
      • FORWARD -o BridgeInterface -j DOCKER
      • FORWARD -o BridgeInterface -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    • filter

Create Endpoint

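There is one global default Bridge device, docker0. Each Container has its own independent network protocol stack, and the container's network is connected to the Bridge device through a veth pair.
Different Containers on the same node can reach each other's IP addresses directly once ARP has resolved the peer's MAC address over the bridge. Traffic leaving the Node goes through the default gateway device docker0 and is then redirected to eth0 by iptables.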