ENV
1. Install Cilium with Native Routing Mode
kubectl apply -f https://raw.githubusercontent.com/BurlyLuo/train/main/Cilium/Cilium1.11_Native_Routing.yaml
root@bpf1:~# uname -a
Linux bpf1 5.11.0-051100-generic #202102142330 SMP Sun Feb 14 23:33:21 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
root@bpf1:~#
root@bpf1:~# kubectl get nodes -o wide
NAME   STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION          CONTAINER-RUNTIME
bpf1   Ready    control-plane,master   5h54m   v1.23.2   192.168.2.61   <none>        Ubuntu 20.04.3 LTS   5.11.0-051100-generic   docker://20.10.12
bpf2   Ready    <none>                 5h53m   v1.23.2   192.168.2.62   <none>        Ubuntu 20.04.3 LTS   5.11.0-051100-generic   docker://20.10.12
bpf3   Ready    <none>                 5h52m   v1.23.2   192.168.2.63   <none>        Ubuntu 20.04.3 LTS   5.11.0-051100-generic   docker://20.10.12
root@bpf1:~#
root@bpf1:~# kubectl -nkube-system exec -it cilium-zlmgg -- cilium status
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
KVStore:                Ok   Disabled
Kubernetes:             Ok   1.23 (v1.23.2) [linux/amd64]
Kubernetes APIs:        ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:   Strict   [ens33 192.168.2.61 (Direct Routing)]
Host firewall:          Disabled
Cilium:                 Ok   1.11.0 (v1.11.0-27e0848)
NodeMonitor:            Listening for events on 128 CPUs with 64x4096 of shared memory
Cilium health daemon:   Ok
IPAM:                   IPv4: 4/254 allocated from 10.0.2.0/24,
BandwidthManager:       Disabled
Host Routing:           BPF
Masquerading:           BPF   [ens33]   10.0.0.0/16 [IPv4: Enabled, IPv6: Disabled]
Controller Status:      30/30 healthy
Proxy Status:           OK, ip 10.0.2.203, 0 redirects active on ports 10000-20000
Hubble:                 Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 19.25   Metrics: Disabled
Encryption:             Disabled
Cluster health:         3/3 reachable   (2022-01-24T09:36:49Z)
root@bpf1:~#
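
The status output above reports the pod CIDR of bpf1 (`10.0.2.0/24`) and, next to `Masquerading`, the range `10.0.0.0/16`. As a rough illustration of what native routing implies, the Python sketch below models the per-node route lookup and the masquerade decision. The `/24` ranges for bpf2 and bpf3 are assumptions inferred from the pod IPs used in this walkthrough, and the names `POD_CIDRS`, `next_hop` and `needs_masquerade` are hypothetical. It is a toy model, not Cilium's implementation.

```python
import ipaddress

# Node IPs come from `kubectl get nodes -o wide`.
# bpf1's pod CIDR comes from the IPAM line of `cilium status`;
# the /24 ranges for bpf2 and bpf3 are ASSUMED from the pod IPs.
POD_CIDRS = {
    ipaddress.ip_network("10.0.2.0/24"): "192.168.2.61",  # bpf1
    ipaddress.ip_network("10.0.0.0/24"): "192.168.2.62",  # bpf2 (assumed)
    ipaddress.ip_network("10.0.1.0/24"): "192.168.2.63",  # bpf3 (assumed)
}

# Range shown on the Masquerading line of `cilium status`.
NATIVE_ROUTING_CIDR = ipaddress.ip_network("10.0.0.0/16")


def next_hop(dst):
    """Return the node IP whose pod CIDR contains dst, or None."""
    addr = ipaddress.ip_address(dst)
    for cidr, node_ip in POD_CIDRS.items():
        if addr in cidr:
            return node_ip
    return None


def needs_masquerade(dst):
    """Destinations outside the native-routing range would be masqueraded."""
    return ipaddress.ip_address(dst) not in NATIVE_ROUTING_CIDR


if __name__ == "__main__":
    print(next_hop("10.0.1.89"))          # pod on bpf3
    print(needs_masquerade("10.0.1.89"))  # pod-to-pod traffic
    print(needs_masquerade("8.8.8.8"))    # traffic leaving the cluster
```

Under these assumptions, a packet for `10.0.1.89` is forwarded unencapsulated to `192.168.2.63` with its pod source IP intact, which matches the captures later in this walkthrough.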
  • 1. Pod-to-Pod on different nodes
```properties
Environment:
root@bpf1:~# kubectl get pod -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
cni-89wbx   1/1     Running   0          89m   10.0.1.89    bpf3   <none>           <none>
cni-fmx5d   1/1     Running   0          89m   10.0.0.226   bpf2   <none>           <none>
cni-l666c   1/1     Running   0          89m   10.0.2.23    bpf1   <none>           <none>
root@bpf1:~#
[1. IP & MAC address information]
root@bpf1:~# kubectl -nkube-system exec -it cilium-zlmgg -- cilium bpf endpoint list
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO
10.0.2.23:0      id=2436  flags=0x0000 ifindex=23  mac=26:FF:70:C4:3B:0F nodemac=DA:2A:CB:56:69:65
root@bpf1:~#
root@bpf1:~# kubectl -nkube-system exec -it cilium-8fcbm cilium bpf endpoint list
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO
10.0.0.226:0     id=437   flags=0x0000 ifindex=23  mac=A6:D5:1C:9F:3C:00 nodemac=AE:CC:C6:D8:DD:35
root@bpf1:~#
root@bpf1:~# kubectl -nkube-system exec -it cilium-6hdtv cilium bpf endpoint list
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO
10.0.1.89:0      id=2433  flags=0x0000 ifindex=21  mac=4A:77:6F:E0:8E:C6 nodemac=BA:6F:18:B6:EB:5E
root@bpf1:~#

We ping the Pod on bpf3 from the Pod on bpf1:
root@bpf1:~# kubectl get pod -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
cni-89wbx   1/1     Running   0          89m   10.0.1.89    bpf3   <none>           <none>

cni-l666c   1/1     Running   0          89m   10.0.2.23    bpf1   <none>           <none>

So:
[2. ping test:]
root@bpf1:~# kubectl exec -it cni-l666c -- ping -c 1 10.0.1.89
PING 10.0.1.89 (10.0.1.89): 56 data bytes
64 bytes from 10.0.1.89: seq=0 ttl=62 time=1.199 ms

--- 10.0.1.89 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.199/1.199/1.199 ms
root@bpf1:~#
######################################################################################################
Since Cilium with VXLAN mode was covered earlier, we go straight to analyzing the cilium monitor output:
[1. Log on node bpf1:]

With the experience from the VXLAN analysis, the logic here should be fairly easy to follow:
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=26:ff:70:c4:3b:0f DstMAC=da:2a:cb:56:69:65 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=13993 Flags=DF FragOffset=0 TTL=64 Protocol=ICMPv4 Checksum=60560 SrcIP=10.0.2.23 DstIP=10.0.1.89 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoRequest Checksum=32567 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2436 from-endpoint: 98 bytes (98 captured), state new, , identity 31020->unknown, orig-ip 0.0.0.0
CPU 05: MARK 0x0 FROM 2436 DEBUG: Conntrack lookup 1/2: src=10.0.2.23:10240 dst=10.0.1.89:0
CPU 05: MARK 0x0 FROM 2436 DEBUG: Conntrack lookup 2/2: nexthdr=1 flags=1          # 1. This connection-tracking step is done for the Pod's endpoint on node bpf1; it was covered earlier and is not repeated here.
CPU 05: MARK 0x0 FROM 2436 DEBUG: CT verdict: New, revnat=0
CPU 05: MARK 0x0 FROM 2436 DEBUG: Successfully mapped addr=10.0.1.89 to identity=31020
CPU 05: MARK 0x0 FROM 2436 DEBUG: Conntrack create: proxy-port=0 revnat=0 src-identity=31020 lb=0.0.0.0
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=00:0c:29:67:92:63 DstMAC=00:0c:29:dd:24:3a EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=13993 Flags=DF FragOffset=0 TTL=63 Protocol=ICMPv4 Checksum=60816 SrcIP=10.0.2.23 DstIP=10.0.1.89 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoRequest Checksum=32567 Id=10240 Seq=0}
# 2. Here the packet has reached the host network namespace. As described in the blog post
#    [https://cilium.io/blog/2021/05/11/cni-benchmark], the host namespace only performs a route lookup.
#    Seen this way, Cilium brings Pod-to-Pod communication across nodes close to communication
#    between ordinary processes on two nodes. This optimization is a real breakthrough: it solves
#    the "network plane sinking" problem mentioned earlier. We keep the convenience of Pods and
#    still meet the performance requirement, and the design can be combined with external network devices.
#    A few more words on that integration. In Cilium's model, a Cilium node is configured with only
#    a default route, advertises all of its own routes, and does not learn BGP routes from outside.
#    The advantage is a lighter route-learning and route-storage burden on the Cilium node, which
#    protects its performance. The drawback is that every route lookup depends on the top-level core
#    routers: traffic between nodes of one cluster that could be handled at the leaf layer is pushed
#    up to the spine layer, which lengthens the path.
#    Cilium's documentation says relatively little about integration with external network devices,
#    so Calico's network models, which describe many topologies, are a useful reference:
#    https://projectcalico.docs.tigera.io/reference/architecture/design/
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 780 to-network: 98 bytes (98 captured), state new, orig-ip 0.0.0.0
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=00:0c:29:dd:24:3a DstMAC=00:0c:29:67:92:63 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=58761 Flags= FragOffset=0 TTL=63 Protocol=ICMPv4 Checksum=32432 SrcIP=10.0.1.89 DstIP=10.0.2.23 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoReply Checksum=34615 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 780 from-network: 98 bytes (98 captured), state new, interface ens33, orig-ip 0.0.0.0
CPU 05: MARK 0x0 FROM 780 DEBUG: Successfully mapped addr=10.0.1.89 to identity=31020
CPU 05: MARK 0x0 FROM 780 DEBUG: Attempting local delivery for container id 2436 from seclabel 31020
CPU 05: MARK 0x0 FROM 2436 DEBUG: Conntrack lookup 1/2: src=10.0.1.89:0 dst=10.0.2.23:10240
CPU 05: MARK 0x0 FROM 2436 DEBUG: Conntrack lookup 2/2: nexthdr=1 flags=0
CPU 05: MARK 0x0 FROM 2436 DEBUG: CT entry found lifetime=16788223, revnat=0
CPU 05: MARK 0x0 FROM 2436 DEBUG: CT verdict: Reply, revnat=0
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=da:2a:cb:56:69:65 DstMAC=26:ff:70:c4:3b:0f EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=58761 Flags= FragOffset=0 TTL=62 Protocol=ICMPv4 Checksum=32688 SrcIP=10.0.1.89 DstIP=10.0.2.23 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoReply Checksum=34615 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2436 to-endpoint: 98 bytes (98 captured), state reply, interface lxc7a0358c608b2, , identity 31020->31020, orig-ip 10.0.1.89, to endpoint 2436
------------------------------------------------------------------------------
######################################################################################################
The above is the analysis of the egress direction.
```
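
The headers printed by cilium monitor allow a quick consistency check on the hop count. Each routed hop decrements the TTL, so the IPv4 header checksum changes with it (60560 at TTL 64, 60816 at TTL 63 for the request). The Python sketch below recomputes the standard one's-complement header checksum from the fields shown in the log. The function name `ipv4_header_checksum` is hypothetical, and the sketch assumes a 20-byte header with IHL=5 and TOS=0, as in these captures.

```python
import ipaddress
import struct

DF = 0x4000   # "Flags=DF" with FragOffset=0
ICMP = 1      # Protocol=ICMPv4


def ipv4_header_checksum(total_len, ident, flags_frag, ttl, proto, src, dst):
    """One's-complement checksum over a 20-byte IPv4 header (IHL=5, TOS=0)."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len, ident, flags_frag, ttl, proto,
        0,  # checksum field is zero while summing
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
    total = sum(struct.unpack("!10H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Echo request: Id=13993, Flags=DF, seen with TTL 64 and then TTL 63.
    for ttl in (64, 63):
        print("request", ttl,
              ipv4_header_checksum(84, 13993, DF, ttl, ICMP, "10.0.2.23", "10.0.1.89"))
    # Echo reply: Id=58761, no flags, seen with TTL 63 and then TTL 62.
    for ttl in (63, 62):
        print("reply", ttl,
              ipv4_header_checksum(84, 58761, 0, ttl, ICMP, "10.0.1.89", "10.0.2.23"))
```

The computed values agree with the `Checksum=` fields in the log (60560 and 60816 for the request, 32432 and 32688 for the reply). Each step differs by 256 because the TTL sits in the high byte of its 16-bit header word.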

![](https://cdn.nlark.com/yuque/0/2022/png/5357487/1643020660722-167a3148-a032-404a-8b1a-3e95663a1c76.png#clientId=ubcc60103-4087-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=490&id=u65bd3acf&margin=%5Bobject%20Object%5D&originHeight=375&originWidth=882&originalType=url&ratio=1&rotation=0&showTitle=false&status=done&style=none&taskId=ue5b322d3-2b93-4ef8-8572-37e8fb5a380&title=&width=1153)<br />![](https://cdn.nlark.com/yuque/0/2022/png/5357487/1643024522570-325d02fb-9381-485f-a44f-41202b53a051.png#clientId=u0ce5d02f-7e54-4&crop=0&crop=0&crop=1&crop=1&from=paste&id=u8fc33515&margin=%5Bobject%20Object%5D&originHeight=569&originWidth=484&originalType=url&ratio=1&rotation=0&showTitle=false&status=done&style=none&taskId=uba7043d2-e394-4bdf-86dc-2ab44e404c7&title=)
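
The `Conntrack lookup`, `CT verdict: New`, `Conntrack create`, `CT entry found` and `CT verdict: Reply` lines in the monitor output follow a simple pattern: the echo request creates an entry, and the echo reply matches it with source and destination swapped. The Python sketch below is a toy model of that pattern only. The class and method names are hypothetical, and it makes no claim about how Cilium's BPF conntrack map is laid out. It reuses the tuples printed in the log, where the ICMP Id (10240) appears in the port position.

```python
from typing import NamedTuple


class FlowTuple(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int

    def reverse(self):
        """Swap both endpoints, as a reply packet would carry them."""
        return FlowTuple(self.dst, self.dport, self.src, self.sport)


class ToyConntrack:
    def __init__(self):
        self.entries = set()

    def verdict(self, pkt):
        """Classify a packet and create an entry when the flow is new."""
        if pkt in self.entries:
            return "Established"
        if pkt.reverse() in self.entries:
            return "Reply"
        self.entries.add(pkt)
        return "New"


if __name__ == "__main__":
    ct = ToyConntrack()
    # Tuples as printed by "Conntrack lookup 1/2" on bpf1.
    request = FlowTuple("10.0.2.23", 10240, "10.0.1.89", 0)
    reply = FlowTuple("10.0.1.89", 0, "10.0.2.23", 10240)
    print(ct.verdict(request))  # first packet of the flow
    print(ct.verdict(reply))    # matches the reversed entry
```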
```properties
Next, we analyze the logic on the receiving side (node bpf3):
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=00:0c:29:67:92:63 DstMAC=00:0c:29:dd:24:3a EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=13993 Flags=DF FragOffset=0 TTL=63 Protocol=ICMPv4 Checksum=60816 SrcIP=10.0.2.23 DstIP=10.0.1.89 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoRequest Checksum=32567 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2077 from-network: 98 bytes (98 captured), state new, interface ens33, orig-ip 0.0.0.0
CPU 05: MARK 0x0 FROM 2077 DEBUG: Successfully mapped addr=10.0.2.23 to identity=31020
CPU 05: MARK 0x0 FROM 2077 DEBUG: Attempting local delivery for container id 2433 from seclabel 31020          # 1. This step receives the ICMP Echo Request sent from node bpf1.
CPU 05: MARK 0x0 FROM 2433 DEBUG: Conntrack lookup 1/2: src=10.0.2.23:10240 dst=10.0.1.89:0
CPU 05: MARK 0x0 FROM 2433 DEBUG: Conntrack lookup 2/2: nexthdr=1 flags=0
CPU 05: MARK 0x0 FROM 2433 DEBUG: CT verdict: New, revnat=0
CPU 05: MARK 0x0 FROM 2433 DEBUG: Conntrack create: proxy-port=0 revnat=0 src-identity=31020 lb=0.0.0.0
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=ba:6f:18:b6:eb:5e DstMAC=4a:77:6f:e0:8e:c6 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=13993 Flags=DF FragOffset=0 TTL=62 Protocol=ICMPv4 Checksum=61072 SrcIP=10.0.2.23 DstIP=10.0.1.89 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoRequest Checksum=32567 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2433 to-endpoint: 98 bytes (98 captured), state new, interface lxc313092b52ea8, , identity 31020->31020, orig-ip 10.0.2.23, to endpoint 2433
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=4a:77:6f:e0:8e:c6 DstMAC=ba:6f:18:b6:eb:5e EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=58761 Flags= FragOffset=0 TTL=64 Protocol=ICMPv4 Checksum=32176 SrcIP=10.0.1.89 DstIP=10.0.2.23 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoReply Checksum=34615 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2433 from-endpoint: 98 bytes (98 captured), state new, , identity 31020->unknown, orig-ip 0.0.0.0
CPU 05: MARK 0x0 FROM 2433 DEBUG: Conntrack lookup 1/2: src=10.0.1.89:0 dst=10.0.2.23:10240     # 2. Src_IP and Dst_IP are swapped here; this is the ICMP Echo Reply.
CPU 05: MARK 0x0 FROM 2433 DEBUG: Conntrack lookup 2/2: nexthdr=1 flags=1
CPU 05: MARK 0x0 FROM 2433 DEBUG: CT entry found lifetime=16788112, revnat=0
CPU 05: MARK 0x0 FROM 2433 DEBUG: CT verdict: Reply, revnat=0
CPU 05: MARK 0x0 FROM 2433 DEBUG: Successfully mapped addr=10.0.2.23 to identity=31020
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..86..] SrcMAC=00:0c:29:dd:24:3a DstMAC=00:0c:29:67:92:63 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..64..] Version=4 IHL=5 TOS=0 Length=84 Id=58761 Flags= FragOffset=0 TTL=63 Protocol=ICMPv4 Checksum=32432 SrcIP=10.0.1.89 DstIP=10.0.2.23 Options=[] Padding=[]}
ICMPv4  {Contents=[..8..] Payload=[..56..] TypeCode=EchoReply Checksum=34615 Id=10240 Seq=0}
  Failed to decode layer: No decoder for layer type Payload
CPU 05: MARK 0x0 FROM 2077 to-network: 98 bytes (98 captured), state new, orig-ip 0.0.0.0
------------------------------------------------------------------------------
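The above is the process observed with cilium monitor -vv. Next, we use tcpdump to confirm it.
  • 2. tcpdump packet capture analysis

20221011-Pod-Pod(Different Node)-[Native Routing Mode] - Figure 1

For the tcpdump analysis, we rely on this figure to understand the datapath in Cilium native routing mode clearly: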
ENV:
root@bpf1:~# kubectl get pod -o wide 
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
cni-89wbx   1/1     Running   0          89m   10.0.1.89    bpf3   <none>           <none>
cni-fmx5d   1/1     Running   0          89m   10.0.0.226   bpf2   <none>           <none>
cni-l666c   1/1     Running   0          89m   10.0.2.23    bpf1   <none>           <none>
root@bpf1:~# 
[1. IP & MAC address information]
root@bpf1:~# kubectl -nkube-system exec -it cilium-zlmgg -- cilium bpf endpoint list 
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO                                                                        
10.0.2.23:0      id=2436  flags=0x0000 ifindex=23  mac=26:FF:70:C4:3B:0F nodemac=DA:2A:CB:56:69:65                                                                           
root@bpf1:~# 
root@bpf1:~# kubectl -nkube-system exec -it cilium-8fcbm cilium bpf endpoint list 
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO                                                                       
10.0.0.226:0     id=437   flags=0x0000 ifindex=23  mac=A6:D5:1C:9F:3C:00 nodemac=AE:CC:C6:D8:DD:35                                                                         
root@bpf1:~# 
root@bpf1:~# kubectl -nkube-system exec -it cilium-6hdtv cilium bpf endpoint list 
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
IP ADDRESS       LOCAL ENDPOINT INFO
10.0.1.89:0      id=2433  flags=0x0000 ifindex=21  mac=4A:77:6F:E0:8E:C6 nodemac=BA:6F:18:B6:EB:5E                                                                        
root@bpf1:~# 

We ping the Pod on bpf3 from the Pod on bpf1:
root@bpf1:~# kubectl get pod -o wide 
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
cni-89wbx   1/1     Running   0          89m   10.0.1.89    bpf3   <none>           <none>

cni-l666c   1/1     Running   0          89m   10.0.2.23    bpf1   <none>           <none>


[ping test]
root@bpf1:~# kubectl exec -it cni-l666c  -- ping -c 1  10.0.1.89
PING 10.0.1.89 (10.0.1.89): 56 data bytes
64 bytes from 10.0.1.89: seq=0 ttl=62 time=0.796 ms

--- 10.0.1.89 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.796/0.796/0.796 ms
root@bpf1:~# 
[1. Pod eth0 on node bpf1 and its corresponding lxc interface]
root@bpf1:~# kubectl exec -it cni-l666c -- tcpdump -pne -i eth0 
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:22:49.457412 26:ff:70:c4:3b:0f > da:2a:cb:56:69:65, ethertype IPv4 (0x0800), length 98: 10.0.2.23 > 10.0.1.89: ICMP echo request, id 22016, seq 0, length 64
12:22:49.458010 da:2a:cb:56:69:65 > 26:ff:70:c4:3b:0f, ethertype IPv4 (0x0800), length 98: 10.0.1.89 > 10.0.2.23: ICMP echo reply, id 22016, seq 0, length 64
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
root@bpf1:~# 

root@bpf1:~# kubectl exec -it cni-l666c -- ethtool -S eth0
NIC statistics:
     peer_ifindex: 23
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_drops: 0
     rx_queue_0_xdp_redirect: 0
     rx_queue_0_xdp_drops: 0
     rx_queue_0_xdp_tx: 0
     rx_queue_0_xdp_tx_errors: 0
     tx_queue_0_xdp_xmit: 0
     tx_queue_0_xdp_xmit_errors: 0

23: lxc7a0358c608b2@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether da:2a:cb:56:69:65 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::d82a:cbff:fe56:6965/64 scope link 
       valid_lft forever preferred_lft forever
root@bpf1:~# tcpdump -pne -i lxc7a0358c608b2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lxc7a0358c608b2, link-type EN10MB (Ethernet), capture size 262144 bytes
12:22:49.457418 26:ff:70:c4:3b:0f > da:2a:cb:56:69:65, ethertype IPv4 (0x0800), length 98: 10.0.2.23 > 10.0.1.89: ICMP echo request, id 22016, seq 0, length 64
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
root@bpf1:~# 


[2. Packet capture on bpf1 ens33]
root@bpf1:~# tcpdump -pne -i ens33 icmp 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
12:22:49.457521 00:0c:29:67:92:63 > 00:0c:29:dd:24:3a, ethertype IPv4 (0x0800), length 98: 10.0.2.23 > 10.0.1.89: ICMP echo request, id 22016, seq 0, length 64
12:22:49.458010 00:0c:29:dd:24:3a > 00:0c:29:67:92:63, ethertype IPv4 (0x0800), length 98: 10.0.1.89 > 10.0.2.23: ICMP echo reply, id 22016, seq 0, length 64
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel
root@bpf1:~# 



[3. Packet capture on bpf3 ens33]
root@bpf3:~# tcpdump -pne -i ens33 icmp 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
12:22:49.456888 00:0c:29:67:92:63 > 00:0c:29:dd:24:3a, ethertype IPv4 (0x0800), length 98: 10.0.2.23 > 10.0.1.89: ICMP echo request, id 22016, seq 0, length 64
12:22:49.457032 00:0c:29:dd:24:3a > 00:0c:29:67:92:63, ethertype IPv4 (0x0800), length 98: 10.0.1.89 > 10.0.2.23: ICMP echo reply, id 22016, seq 0, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
root@bpf3:~#

[4. Pod eth0 and lxc interface on bpf3]
root@bpf3:~# kubectl exec -it cni-89wbx -- ethtool -S eth0
NIC statistics:
     peer_ifindex: 21
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_drops: 0
     rx_queue_0_xdp_redirect: 0
     rx_queue_0_xdp_drops: 0
     rx_queue_0_xdp_tx: 0
     rx_queue_0_xdp_tx_errors: 0
     tx_queue_0_xdp_xmit: 0
     tx_queue_0_xdp_xmit_errors: 0
root@bpf3:~# ip a| grep 21 -A3
21: lxc313092b52ea8@if20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ba:6f:18:b6:eb:5e brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::b86f:18ff:feb6:eb5e/64 scope link 
       valid_lft forever preferred_lft forever
root@bpf3:~# tcpdump -pne -i lxc313092b52ea8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lxc313092b52ea8, link-type EN10MB (Ethernet), capture size 262144 bytes
12:22:49.456999 4a:77:6f:e0:8e:c6 > ba:6f:18:b6:eb:5e, ethertype IPv4 (0x0800), length 98: 10.0.1.89 > 10.0.2.23: ICMP echo reply, id 22016, seq 0, length 64
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
root@bpf3:~# 

root@bpf3:~# kubectl exec -it cni-89wbx -- tcpdump -pne -i eth0
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:22:49.456888 ba:6f:18:b6:eb:5e > 4a:77:6f:e0:8e:c6, ethertype IPv4 (0x0800), length 98: 10.0.2.23 > 10.0.1.89: ICMP echo request, id 22016, seq 0, length 64
12:22:49.456997 4a:77:6f:e0:8e:c6 > ba:6f:18:b6:eb:5e, ethertype IPv4 (0x0800), length 98: 10.0.1.89 > 10.0.2.23: ICMP echo reply, id 22016, seq 0, length 64
12:22:54.482426 4a:77:6f:e0:8e:c6 > ba:6f:18:b6:eb:5e, ethertype ARP (0x0806), length 42: Request who-has 10.0.1.72 tell 10.0.1.89, length 28
12:22:54.482479 ba:6f:18:b6:eb:5e > 4a:77:6f:e0:8e:c6, ethertype ARP (0x0806), length 42: Reply 10.0.1.72 is-at ba:6f:18:b6:eb:5e, length 28
^C
4 packets captured
4 packets received by filter
0 packets dropped by kernel
root@bpf3:~#