介绍

集群网络环境

网卡名称:ens192

  1. 10.103.22.183 master01
  2. 10.103.22.184 node02
  3. 10.103.22.185 node03

什么是Calico

Calico是一个面向容器、虚拟机和基于主机的工作负载的开源网络和网络安全解决方案。Calico支持广泛的平台,包括Kubernetes、OpenShift、Docker EE、OpenStack和裸机(bare metal)环境。Calico将灵活的网络功能与随处运行的安全策略执行相结合,提供了一个兼具原生Linux内核性能和真正云原生可伸缩性的解决方案。

为什么使用Calico

  1. 针对网络安全的最佳实践

Calico丰富的网络策略模型使得锁定通信变得很容易,只放行您想要的流量。您可以把Calico的安全增强看作是为每个工作负载配备了自己的个人防火墙;当您部署新服务或对应用进行扩缩容时,防火墙会实时动态地重新配置。

Calico的策略引擎可以在主机网络层和服务网格层(如果使用Istio & Envoy)执行相同的策略模型,保护您的基础设施免受被攻破的工作负载的影响。

  2. 性能

Calico使用Linux内核内置的高度优化的转发和访问控制功能,提供原生的Linux网络数据平面性能,通常不需要第一代SDN网络所带来的封装/解封装(encap/decap)开销。

  3. 扩容

Calico的核心设计原则采用了最佳实践的云原生设计模式,并结合了全球最大的互联网运营商所信任的基于标准的网络协议。其结果是一个具有非凡可伸缩性的解决方案,多年来一直在生产环境中大规模运行。Calico的开发测试周期包括定期对数千节点规模的集群进行测试。无论您运行的是10个节点、100个节点还是更大规模的集群,都能受益于针对最大规模Kubernetes集群所做的性能和可伸缩性改进。

  4. 互操作

Calico支持Kubernetes工作负载与非Kubernetes或遗留工作负载之间无缝且安全地通信。Kubernetes pod是您网络上的一等公民,能够与网络上的任何其他工作负载通信。此外,Calico可以无缝扩展,以保护Kubernetes之外既有的基于主机的工作负载(无论是在公共云、本地(on-prem)虚拟机还是裸机服务器上)。所有工作负载都受同一套网络策略模型约束,因此唯一被允许的流量就是您希望放行的流量。

  5. 和Linux有很多相似之处

Calico使用现有系统管理员已经熟悉的Linux术语。输入您喜欢的Linux网络命令,您将得到您期望的结果。在绝大多数部署中,离开应用程序的包是通过网络传输的包,没有封装、隧道或覆盖。系统和网络管理员用来获得可见性和分析网络问题的所有现有工具都像现在一样工作。

  6. 针对Kubernetes网络策略的支持

Calico的网络策略引擎在API的开发过程中形成了Kubernetes网络策略的原始参考实现。Calico的独特之处在于它实现了API定义的全部特性,为用户提供了API定义时所设想的所有功能和灵活性。对于需要更强大功能的用户,Calico支持一组扩展的网络策略功能,这些功能与Kubernetes API一起无缝地工作,为用户定义网络策略提供了更大的灵活性。

组件介绍

Calico主要组件

  • calico/node: 该agent作为Calico守护进程的一部分运行。它负责管理接口、路由,以及节点的状态报告和网络策略的执行。
  • BIRD: 一个BGP客户端,负责将Felix写入内核的路由广播给其他节点。
  • Etcd: 一个可选的分布式数据库存储。
  • Calico Controller: Calico策略控制器。

calico/node

calico/node是一个由两个容器组成的Pod

  1. 一个calico/node容器,运行两个守护进程:

a. Felix

b. BIRD BGP守护进程(可选)

  2. 一个calico-CNI插件,响应来自节点上kubelet的CNI请求。

Felix

Felix组件是Calico网络的核心。它运行在集群中的每个节点上,主要负责接口和路由的管理、状态报告以及网络策略的执行。

接口和路由的管理

Felix守护进程负责配置接口并在内核路由表中创建路由,以便在创建pod时为它们提供可路由的IP地址。Felix为每个pod创建虚拟网络接口,并从Calico IPAM中分配一个IP地址。接口名称一般以cali前缀开头,除非明确指定其他前缀。

状态报告

Felix对外暴露监控指标(metrics),可供Prometheus等监控工具采集,用于实例的状态报告。

策略执行

Felix负责网络策略的实施。Felix监视Pod上的标签,并与定义的网络策略对象进行比较,以决定是否允许或拒绝Pod的流量。Felix将有关接口及其IP地址和主机网络状态的信息写入etcd。
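
下面给出一个最小化的Kubernetes NetworkPolicy示例,用来说明Felix基于标签执行策略的方式:策略通过podSelector选中带有app: web标签的Pod,只放行来自带role: frontend标签的Pod的TCP 80端口流量(其中的标签和名称仅为示意,并非本文环境中已有的对象):

  1. apiVersion: networking.k8s.io/v1
  2. kind: NetworkPolicy
  3. metadata:
  4.   name: allow-frontend-to-web
  5.   namespace: default
  6. spec:
  7.   # 策略作用于带有 app: web 标签的Pod
  8.   podSelector:
  9.     matchLabels:
  10.       app: web
  11.   policyTypes:
  12.   - Ingress
  13.   ingress:
  14.   # 只允许带有 role: frontend 标签的Pod访问TCP 80端口
  15.   - from:
  16.     - podSelector:
  17.         matchLabels:
  18.           role: frontend
  19.     ports:
  20.     - protocol: TCP
  21.       port: 80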

Typha

Typha守护进程位于数据存储(例如Kubernetes API服务器)和众多Felix实例之间。Typha的主要目的是通过减少每个节点对数据存储的压力来提升集群规模。Felix、confd等服务连接到Typha而不是直接连接到数据存储,因为Typha会代表其所有客户端维护单个数据存储连接。它缓存数据存储状态并对重复事件去重,以便将其扇出给众多侦听器。
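
在官方的calico-typha.yaml清单中,calico-node(Felix)一般通过环境变量找到Typha对应的Service,大致片段如下(仅为示意,字段以所用版本的清单为准):

  1. # calico-config ConfigMap 中记录Typha的Service名称
  2. typha_service_name: "calico-typha"
  3. # calico-node DaemonSet 容器的环境变量,Felix据此连接Typha
  4. - name: FELIX_TYPHAK8SSERVICENAME
  5.   valueFrom:
  6.     configMapKeyRef:
  7.       name: calico-config
  8.       key: typha_service_name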

BIRD

BIRD是一个BGP守护进程,它将Felix编写的路由信息分发给集群节点上的其他BIRD代理。BIRD agent是和Calico守护进程的Pod一起安装的。这确保了流量是跨节点可路由的。默认情况下,Calico创建一个完整的网格拓扑。这意味着每个BIRD代理都需要连接到集群中的其他所有BIRD代理。

对于较大的部署,BIRD可以配置为路由反射器。路由反射器拓扑允许将BIRD设置为其他BIRD代理通信的集中点。它还减少了每个BGP代理打开连接的数量。

ETCD

Calico使用一个称为etcd的分布式数据存储,存储Calico资源配置和网络策略规则。Felix守护进程与etcd数据存储进行通信,用于发布每个节点的路由、节点和接口信息。

为了获得更高的可用性,应该为大型部署设置多节点etcd集群。在这个设置中,etcd确保在etcd集群中复制Calico配置,使它们始终处于最后已知的良好状态。

一个可选的部署模型是使用Kubernetes API服务器作为分布式数据存储,从而消除了构建和维护etcd数据存储的需要。
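
以本文采用的Kubernetes API数据存储为例,calico-node容器通过环境变量声明数据存储类型,大致如下(示意片段,以实际清单为准):

  1. # calico-node DaemonSet 中与数据存储相关的环境变量
  2. - name: DATASTORE_TYPE
  3.   value: "kubernetes"
  4. # 启动时等待数据存储就绪
  5. - name: WAIT_FOR_DATASTORE
  6.   value: "true"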

安装calico

50个节点以内

  1. curl https://docs.projectcalico.org/manifests/calico.yaml -O

大于50个节点

  1. curl https://docs.projectcalico.org/manifests/calico-typha.yaml -o calico.yaml

调整calico-typha副本数

  1. apiVersion: apps/v1beta1
  2. kind: Deployment
  3. metadata:
  4. name: calico-typha
  5. ...
  6. spec:
  7. ...
  8. replicas: <number of replicas>
  • 官网建议每200个节点增加一个calico-typha副本
  • 一般建议calico-typha副本数为3,以保证高可用(在线调整副本数的命令示例见下方)
  • calico-typha的副本数不得多于k8s节点的数量
  • calico-typha的副本数为0时,Felix不会启动
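
除了直接修改yaml中的replicas字段,也可以在部署完成后用下面的命令在线调整calico-typha的副本数(示例):

  1. kubectl -n kube-system scale deployment calico-typha --replicas=3
  2. # 确认副本数已生效
  3. kubectl -n kube-system get deployment calico-typha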

安装

执行

使用的是第二种安装方式(大于50个节点的)

  1. kubectl apply -f calico.yaml

修改配置

  1. apiVersion: apps/v1
  2. kind: Deployment
  3. metadata:
  4. name: calico-typha
  5. namespace: kube-system
  6. labels:
  7. k8s-app: calico-typha
  8. spec:
  9. replicas: 2
  10. revisionHistoryLimit: 2
  11. selector:
  12. matchLabels:
  13. k8s-app: calico-typha
  • calico-typha副本数为2
  1. #在calico-node DaemonSet的环境变量中配置calico分配的pod IP地址段(完整片段见下方示例)
  2. - name: CALICO_IPV4POOL_CIDR
  3. value: "192.110.0.0/16"
  • 配置pods的地址池
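
上述地址段对应calico-node DaemonSet中的环境变量,大致片段如下(示意,IPIP相关变量以实际清单为准):

  1. # calico-node DaemonSet 中与默认IP池相关的环境变量
  2. - name: CALICO_IPV4POOL_CIDR
  3.   value: "192.110.0.0/16"
  4. # 默认IP池的IPIP模式(Always/CrossSubnet/Never)
  5. - name: CALICO_IPV4POOL_IPIP
  6.   value: "Always"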

查询

  1. kubectl get pods -n kube-system -o wide |grep calico
  2. calico-kube-controllers-6d4bfc7c57-jpfwx 1/1 Running 2 50m 192.110.140.65 node02 <none> <none>
  3. calico-node-prnv8 1/1 Running 0 50m 10.103.22.183 master01 <none> <none>
  4. calico-node-rsg5h 1/1 Running 0 50m 10.103.22.185 node03 <none> <none>
  5. calico-node-tljfz 1/1 Running 0 50m 10.103.22.184 node02 <none> <none>
  6. calico-typha-9dfb6964-q6mtq 1/1 Running 0 50m 10.103.22.185 node03 <none> <none>
  7. calico-typha-9dfb6964-whndn 1/1 Running 0 50m 10.103.22.184 node02 <none> <none>
  • 运行正常

查看路由

  1. ip route
  2. default via 10.103.22.1 dev ens192 proto static metric 100
  3. 10.103.22.0/24 dev ens192 proto kernel scope link src 10.103.22.184 metric 100
  4. 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  5. blackhole 192.110.140.64/26 proto bird
  6. 192.110.140.65 dev cali9fe3f3b71b4 scope link
  7. 192.110.140.68 dev calie60333d71a4 scope link
  8. 192.110.140.69 dev calibb530faa265 scope link
  9. 192.110.186.192/26 via 10.103.22.185 dev tunl0 proto bird onlink
  10. 192.110.241.64/26 via 10.103.22.183 dev tunl0 proto bird onlink
  • calico已经完成路由的添加

安装calicoctl命令

Calico提供一个名为calicoctl的命令行实用程序,用于管理Calico配置。运行calicoctl实用程序的主机需要连接到Calico etcd数据存储。另外,可以将calicoctl配置为连接到Kubernetes API数据存储。

您可以在任何能够通过网络访问Calico数据存储的主机上,以二进制或容器的形式运行calicoctl。共有三种安装方式:

  • 在单一的主机上作为二进制进行安装
  • 在单一的主机上作为容器进行安装
  • 作为kubernetes pod进行安装

二进制安装calicoctl(目前使用的方式)

下载calicoctl 二进制文件

  1. $ curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.11.1/calicoctl

设置文件为可执行

  1. $ chmod +x calicoctl

把calicoctl移动到可搜索的路径

  1. $ mv calicoctl /usr/local/bin

配置calicoctl的配置文件

  1. $ cat /etc/calico/calicoctl.cfg
  2. apiVersion: projectcalico.org/v3
  3. kind: CalicoAPIConfig
  4. metadata:
  5. spec:
  6. datastoreType: "kubernetes"
  7. kubeconfig: "/root/.kube/config"
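
除了使用/etc/calico/calicoctl.cfg配置文件,也可以临时通过环境变量指定数据存储类型和kubeconfig路径,例如:

  1. DATASTORE_TYPE=kubernetes KUBECONFIG=/root/.kube/config calicoctl get nodes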

查询结果

  1. calicoctl node status
  2. Calico process is running.
  3. IPv4 BGP status
  4. +---------------+-------------------+-------+----------+-------------+
  5. | PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
  6. +---------------+-------------------+-------+----------+-------------+
  7. | 10.103.22.184 | node-to-node mesh | up | 05:58:02 | Established |
  8. | 10.103.22.185 | node-to-node mesh | up | 05:58:02 | Established |
  9. +---------------+-------------------+-------+----------+-------------+
  10. IPv6 BGP status
  11. No IPv6 peers found.
  • 已经显示BGP的状态

容器安装calicoctl

  1. $ docker pull calico/ctl:v3.11.1

calicoctl作为Kubernetes pod

使用与数据存储类型匹配的YAML将calicoctl容器部署到节点。

  • etcd
  1. $ kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calicoctl-etcd.yaml

  • Kubernetes API存储
  1. $ kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calicoctl.yaml
  • 可以使用kubectl命令显示如下:
  1. $ kubectl exec -ti -n kube-system calicoctl -- /calicoctl get profiles -o wide
  2. NAME TAGS
  3. kns.default kns.default
  4. kns.kube-system kns.kube-system
  • 建议设置个别名
  1. $ alias calicoctl="kubectl exec -i -n kube-system calicoctl /calicoctl -- "

[warning]使用calicoctl别名时,如果要从文件创建资源,需要将文件重定向到标准输入,例如:

  1. calicoctl create -f - < my_manifest.yaml

配置BGP路由反射器及对等体

BGP协议配置

缺省的节点到节点(node-to-node)BGP全互联网格,在集群节点数量很多时同步路由会十分损耗性能,所以必须修改。

调整默认的node-to-node的模式需要先创建一个default的BGP协议配置

默认BGP peering 状态

  1. calicoctl node status
  2. Calico process is running.
  3. IPv4 BGP status
  4. +---------------+-------------------+-------+------------+-------------+
  5. | PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
  6. +---------------+-------------------+-------+------------+-------------+
  7. | 10.103.22.183 | node-to-node mesh | up | 2020-10-24 | Established |
  8. | 10.103.22.184 | node-to-node mesh | up | 2020-12-10 | Established |
  9. | 10.103.22.185 | node-to-node mesh | up | 2020-10-24 | Established |
  10. +---------------+-------------------+-------+------------+-------------+
  11. IPv6 BGP status
  12. No IPv6 peers found.
  • 使用的是node-to-node全互联(mesh)模式
  • 全互联模式下需要建立的BGP对等连接数为n(n-1)/2(n为节点数)

创建default BGPConfiguration

vim calico-default.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: BGPConfiguration
  3. metadata:
  4. name: default
  5. spec:
  6. logSeverityScreen: Info
  7. nodeToNodeMeshEnabled: false
  8. asNumber: 63400
  • calicoctl apply -f calico-default.yaml(应用后可用下方命令确认配置是否生效)
  • 不再使用node-to-node全互联模式
  • logSeverityScreen:日志级别
  • nodeToNodeMeshEnabled:node-to-node全互联模式是否开启
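
应用之后,可以用下面的命令确认名为default的BGPConfiguration已经创建、node-to-node mesh已经关闭:

  1. calicoctl get bgpconfiguration default -o yaml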

配置node作为路由反射器

Calico节点可以配置为路由反射器。每个充当路由反射器的节点必须有一个集群ID,通常为一个未使用的IPv4地址。

配置master01为路由反射器

  1. #查看node节点
  2. calicoctl get node
  3. NAME
  4. master01
  5. node02
  6. node03
  7. #配置master01为路由反射器
  8. calicoctl patch node master01 -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
  • 配置一个节点作为路由反射器,集群ID 244.0.0.1

路由反射器添加label

  1. #为路由反射器添加label
  2. kubectl label node master01 route-reflector=true
  • 通常情况下,会给这个节点打上标签,标明它是路由反射器,以便通过BGPPeer资源的标签选择器选中它。

设置BGPPeer

vim calico-bgppeer.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: BGPPeer
  3. metadata:
  4. name: peer-with-route-reflectors
  5. spec:
  6. nodeSelector: all()
  7. peerSelector: route-reflector == 'true'
  • calicoctl apply -f calico-bgppeer.yaml
  • 使用标签选择器区分路由反射器节点和非路由反射器节点

查看BGP peering 状态

  • 您可以使用calicoctl查看特定节点当前的BGP连接状态,用于确认配置是否产生了预期的行为。
  1. #master01查看
  2. calicoctl node status
  3. Calico process is running.
  4. IPv4 BGP status
  5. +---------------+---------------+-------+----------+-------------+
  6. | PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
  7. +---------------+---------------+-------+----------+-------------+
  8. | 10.103.22.184 | node specific | up | 14:06:27 | Established |
  9. | 10.103.22.185 | node specific | up | 14:06:25 | Established |
  10. +---------------+---------------+-------+----------+-------------+
  11. IPv6 BGP status
  12. No IPv6 peers found.
  • master01已和node02、node03建立BGP对等关系
  • 由于master01是路由反射器,它会和所有节点建立BGP对等关系
  1. #node02查看
  2. calicoctl node status
  3. Calico process is running.
  4. IPv4 BGP status
  5. +---------------+---------------+-------+----------+-------------+
  6. | PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
  7. +---------------+---------------+-------+----------+-------------+
  8. | 10.103.22.183 | node specific | up | 14:06:27 | Established |
  9. +---------------+---------------+-------+----------+-------------+
  10. IPv6 BGP status
  11. No IPv6 peers found.
  • node02和master01已经建立对等关系
  • node节点只会和路由反射器建立对等关系
  • 这样就减少了BGP对等连接的数量
  • 需要建立的BGP对等连接数约为n-1(n为节点数)

配置全局的 BGP 对等体

  1. apiVersion: projectcalico.org/v3
  2. kind: BGPPeer
  3. metadata:
  4. name: my-global-peer
  5. spec:
  6. peerIP: 192.20.30.40
  7. asNumber: 64567
  • 上面的示例创建一个全局BGP对等体,它将每个Calico节点都配置为与AS 64567中的192.20.30.40建立对等关系。

配置每节点的 BGP peer

每个节点的BGP对等点应用于集群中的一个或多个节点。您可以通过精确地指定节点的名称或使用标签选择器来选择节点。

  1. apiVersion: projectcalico.org/v3
  2. kind: BGPPeer
  3. metadata:
  4. name: rack1-tor
  5. spec:
  6. peerIP: 192.20.30.40
  7. asNumber: 64567
  8. nodeSelector: rack == 'rack-1'
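
上面的nodeSelector要求节点带有rack标签。在使用Kubernetes API数据存储时,Calico节点通常直接使用Kubernetes节点的标签,因此可以按如下方式给节点打标签(rack=rack-1仅为示意):

  1. kubectl label node node02 rack=rack-1
  2. # 确认标签已添加
  3. kubectl get node node02 --show-labels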

调整global AS number

  • 改变默认的global AS number

默认情况下,所有Calico节点都使用自治系统号(AS number)64512,除非特殊指定。下面的命令把它改成64513。

  1. calicoctl patch bgpconfiguration default -p '{"spec": {"asNumber": "64513"}}'
  • 针对特定的节点改变AS number,如下所示
  1. calicoctl patch node node-1 -p '{"spec": {"bgp": {"asNumber": "64514"}}}'
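
修改之后,可以用下面的命令查看各节点当前使用的AS号(输出中包含ASN列):

  1. calicoctl get nodes -o wide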

宣告k8s集群子网和外部子网

添加BGPConfiguration配置

  1. serviceClusterIPs:
  2. - cidr: 192.110.0.0/16
  3. serviceExternalIPs:
  4. - cidr: 10.103.23.0/24
  • serviceClusterIPs:宣告k8s集群Service的ClusterIP网段(合并后的完整BGPConfiguration示例见下方)
  • serviceExternalIPs:宣告Service的External IP网段
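
上面只列出了需要新增的字段,与前文创建的default BGPConfiguration合并后的完整清单大致如下(网段取自上文,仅为示意,实际应与集群的Service网段保持一致):

  1. apiVersion: projectcalico.org/v3
  2. kind: BGPConfiguration
  3. metadata:
  4.   name: default
  5. spec:
  6.   logSeverityScreen: Info
  7.   nodeToNodeMeshEnabled: false
  8.   asNumber: 63400
  9.   # 向BGP对等体宣告的Service ClusterIP网段
  10.   serviceClusterIPs:
  11.   - cidr: 192.110.0.0/16
  12.   # 向BGP对等体宣告的Service External IP网段
  13.   serviceExternalIPs:
  14.   - cidr: 10.103.23.0/24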

注意:

serviceClusterIPs使用场景:与k8s集群同子网、但未加入集群的主机,需要直接通过Service IP(而不是通过ingress代理)访问pod服务

serviceExternalIPs使用场景:不在k8s集群子网内、也未加入集群的主机,需要直接通过Service IP(而不是通过ingress代理)访问pod服务

以上能实现的前提是:需要访问pod的机器上安装了BIRD服务,并与集群中运行的Calico节点建立BGP对等关系,以便学习到相应的路由。

IP IN IP模式

封装类型

您可以为每个IP池配置不同的封装方式,但不能在同一个IP池内混合多种封装类型。

  • Configure IP in IP encapsulation for only cross subnet traffic
  • Configure IP in IP encapsulation for all inter workload traffic

IPv4/6 地址支持

IP in IP和 VXLAN只支持IPv4地址。

IPIP实践

Calico提供仅在跨子网边界时才进行封装的选项。我们建议使用IP in IP的cross-subnet选项,把封装开销降到最低。

  • cross subnet 是指当节点在同一个子网时使用路由的方式,不同子网时使用IPIP封装的方式。

注意:切换封装模式会导致已建立的连接中断。
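
下面给出一个用calicoctl在线切换默认IP池ipipMode的示例(注意上述连接中断的风险):

  1. calicoctl patch ippool default-ipv4-ippool -p '{"spec": {"ipipMode": "CrossSubnet"}}'
  2. # 确认修改结果
  3. calicoctl get ippool default-ipv4-ippool -o yaml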

Always(calico默认使用方式)

配置IPPool

ipipMode设置Always

vim ippool-always.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: ippool-ipip-always
  5. spec:
  6. cidr: 192.168.0.0/16
  7. ipipMode: Always
  8. natOutgoing: true
  • 所有的流量包都用IPIP进行封装
  • calico官方默认使用的是ipip-always的方式
  • calicoctl apply -f calico-ippool-default.yaml

查看IPPool

  1. calicoctl get IPPool default-ipv4-ippool -o yaml
  2. apiVersion: projectcalico.org/v3
  3. kind: IPPool
  4. metadata:
  5. creationTimestamp: "2020-12-21T05:57:19Z"
  6. name: default-ipv4-ippool
  7. resourceVersion: "591742"
  8. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  9. spec:
  10. blockSize: 26
  11. cidr: 192.110.0.0/16
  12. ipipMode: Always
  13. natOutgoing: true
  14. nodeSelector: all()
  15. vxlanMode: Never

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-dr8xc 1/1 Running 1 39h 192.110.186.198 node03 <none> <none>
  4. web-fbns2 1/1 Running 1 39h 192.110.186.197 node03 <none> <none>
  5. web-ftlqt 1/1 Running 1 39h 192.110.140.68 node02 <none> <none>

进入web-ftlqt pod(节点在node02),长ping 192.110.186.197(节点在node03)

  1. kubectl exec -it web-ftlqt -- /bin/bash
  2. ping 192.110.186.197
  3. PING 192.110.186.197 (192.110.186.197): 48 data bytes
  4. 56 bytes from 192.110.186.197: icmp_seq=0 ttl=62 time=1.270 ms
  5. 56 bytes from 192.110.186.197: icmp_seq=1 ttl=62 time=0.537 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:c0:81:9f:46 brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 46:40:96:15:95:e7 brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 12:0c:d9:ad:73:ac brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 7: cali7dd0f951039@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  28. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  29. 8: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  30. link/ipip 0.0.0.0 brd 0.0.0.0
  31. inet 192.110.186.192/32 scope global tunl0
  32. valid_lft forever preferred_lft forever
  33. 9: cali7d765aed89d@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  34. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  • ens192 主机网卡
  • tunl0 calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.140.68
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  • 没有数据包,说明pod发送过来的包经过了IPIP封装,所以在node03节点的ens192网卡上看不到目的地址为192.110.140.68的内层数据包(封装后的外层数据包可以按下面的方式抓取)
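
如果想在ens192上看到封装后的外层数据包,可以直接按IPIP协议号(IP协议号4)抓包,例如:

  1. # 抓取node02与node03之间的IPIP封装流量(外层IP为节点地址)
  2. tcpdump -i ens192 -nn host 10.103.22.184 and ip proto 4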

在tunl0网卡上抓包

  1. #在tunl0网卡(calico ipip模式的隧道设备)上抓包
  2. tcpdump -i tunl0 -nn dst 192.110.140.68
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
  5. 11:30:40.303384 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 5633, length 56
  6. 11:30:41.304847 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 5889, length 56
  7. 11:30:42.305407 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 6145, length 56
  8. 11:30:43.306508 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 6401, length 56
  9. 11:30:44.310433 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 6657, length 56
  10. 11:30:45.309017 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 6913, length 56
  11. 11:30:46.310418 IP 192.110.186.197 > 192.110.140.68: ICMP echo reply, id 19712, seq 7169, length 56
  12. 7 packets captured
  13. 7 packets received by filter
  14. 0 packets dropped by kernel
  • 有数据包通过
  • 可以看出数据包通过tunl0设备到达node03的pod上
  • 数据包经过封装,到达node03的pod

CrossSubnet

配置IPPool

设置ipipMode为CrossSubnet

vim ippool-crosssubnet.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: ippool-ipip-crosssubnet
  5. spec:
  6. cidr: 192.168.0.0/16
  7. ipipMode: CrossSubnet
  8. natOutgoing: true

使用自带IPPool配置

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: default-ipv4-ippool
  5. spec:
  6. cidr: 192.110.0.0/16
  7. ipipMode: CrossSubnet
  8. natOutgoing: true
  9. nodeSelector: all()
  10. vxlanMode: Never
  • IP in IP封装可以选择性的执行
  • 节点在同一个子网时使用路由的方式,不同子网时使用IPIP封装的方式。
  • calicoctl apply -f calico-ippool-default.yaml

查看IPPool

  1. calicoctl get IPPool default-ipv4-ippool -o yaml
  2. apiVersion: projectcalico.org/v3
  3. kind: IPPool
  4. metadata:
  5. creationTimestamp: "2020-12-21T05:57:19Z"
  6. name: default-ipv4-ippool
  7. resourceVersion: "799024"
  8. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  9. spec:
  10. blockSize: 26
  11. cidr: 192.110.0.0/16
  12. ipipMode: CrossSubnet
  13. natOutgoing: true
  14. nodeSelector: all()
  15. vxlanMode: Never
  • ipipMode已经调整为CrossSubnet

查看路由(CrossSubnet没生效前)

  1. ip route
  2. default via 10.103.22.1 dev ens192 proto static metric 100
  3. 10.103.22.0/24 dev ens192 proto kernel scope link src 10.103.22.184 metric 100
  4. 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  5. blackhole 192.110.140.64/26 proto bird
  6. 192.110.140.65 dev cali9fe3f3b71b4 scope link
  7. 192.110.140.68 dev calie60333d71a4 scope link
  8. 192.110.140.69 dev calibb530faa265 scope link
  9. 192.110.186.192/26 via 10.103.22.185 dev tunl0 proto bird onlink
  10. 192.110.241.64/26 via 10.103.22.183 dev tunl0 proto bird onlink
  • 没有生效之前,pod网段的路由都是到tunl0
  • 还是使用ipip模式进行封装然后通过tunl0隧道设备进行传输数据包

查看路由(CrossSubnet生效)

  1. ip route
  2. default via 10.103.22.1 dev ens192 proto static metric 100
  3. 10.103.22.0/24 dev ens192 proto kernel scope link src 10.103.22.184 metric 100
  4. 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  5. blackhole 192.110.140.64/26 proto bird
  6. 192.110.140.65 dev cali9fe3f3b71b4 scope link
  7. 192.110.140.66 dev cali827b87b13d2 scope link
  8. 192.110.140.69 dev calibb530faa265 scope link
  9. 192.110.186.192/26 via 10.103.22.185 dev ens192 proto bird
  10. 192.110.241.64/26 via 10.103.22.183 dev ens192 proto bird
  • CrossSubnet 生效
  • pod网段路由都是到ens192网卡

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-2djb2 1/1 Running 0 11m 192.110.140.66 node02 <none> <none>
  4. web-6gv9d 1/1 Running 0 11m 192.110.186.194 node03 <none> <none>
  5. web-ccct6 1/1 Running 0 11m 192.110.186.193 node03 <none> <none>

进入web-2djb2 pod(节点在node02),长ping 192.110.186.194(节点在node03)

  1. kubectl exec -it web-2djb2 -- /bin/bash
  2. ping 192.110.186.194
  3. PING 192.110.186.194 (192.110.186.194): 48 data bytes
  4. 56 bytes from 192.110.186.194: icmp_seq=0 ttl=62 time=1.251 ms
  5. 56 bytes from 192.110.186.194: icmp_seq=1 ttl=62 time=0.500 ms
  6. 56 bytes from 192.110.186.194: icmp_seq=2 ttl=62 time=6.477 ms
  7. 56 bytes from 192.110.186.194: icmp_seq=3 ttl=62 time=0.763 ms
  8. 56 bytes from 192.110.186.194: icmp_seq=4 ttl=62 time=0.530 ms
  9. 56 bytes from 192.110.186.194: icmp_seq=5 ttl=62 time=0.470 ms
  10. 56 bytes from 192.110.186.194: icmp_seq=6 ttl=62 time=0.590 ms
  11. 56 bytes from 192.110.186.194: icmp_seq=7 ttl=62 time=0.528 ms
  12. 56 bytes from 192.110.186.194: icmp_seq=8 ttl=62 time=0.430 ms
  13. ^C--- 192.110.186.194 ping statistics ---
  14. 9 packets transmitted, 9 packets received, 0% packet loss
  15. round-trip min/avg/max/stddev = 0.430/1.282/6.477/1.852 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:c0:81:9f:46 brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 46:40:96:15:95:e7 brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 12:0c:d9:ad:73:ac brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 8: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  28. link/ipip 0.0.0.0 brd 0.0.0.0
  29. inet 192.110.186.192/32 scope global tunl0
  30. valid_lft forever preferred_lft forever
  31. 11: cali2b2abdd19e8@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  32. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  33. 12: cali64dc30e7c56@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  34. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  • ens192 主机网卡
  • tunl0 calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.140.66
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. 14:50:29.840816 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 0, length 56
  6. 14:50:30.841764 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 256, length 56
  7. 14:50:31.846405 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 512, length 56
  8. 14:50:32.847888 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 768, length 56
  9. 14:50:33.849056 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1024, length 56
  10. 14:50:34.850389 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1280, length 56
  11. 14:50:35.851610 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1536, length 56
  12. 14:50:36.852774 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1792, length 56
  13. 8 packets captured
  14. 8 packets received by filter
  15. 0 packets dropped by kernel
  • 有数据包通过
  • 数据包通过ens192网卡直接路由到node03的pod上
  • 数据包没有封装直接路由到node03的pod上

在tunl0网卡上抓包

  1. #在tunl0网卡(calico ipip模式的隧道设备)上抓包
  2. tcpdump -i tunl0 -nn dst 192.110.140.66
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
  • 没有数据包通过
  • 说明数据包直接路由传输
  • 没有进行封装

不同子网节点验证(目前没有环境,未验证)

实验结果

不同子网会使用IPIP模式进行数据包封装传输

Never(不使用ipip)

配置IPPool

ipipMode设置Never

vim ippool-never.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: ippool-ipip-never
  5. spec:
  6. cidr: 192.168.0.0/16
  7. ipipMode: Never
  8. natOutgoing: true

使用自带IPPool配置

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: default-ipv4-ippool
  5. spec:
  6. cidr: 192.110.0.0/16
  7. ipipMode: Never
  8. natOutgoing: true
  9. nodeSelector: all()
  10. vxlanMode: Never
  • 不使用IPIP模式
  • 必须所有节点都在一个子网内
  • calicoctl apply -f calico-ippool-default.yaml

查看IPPool

  1. calicoctl get IPPool default-ipv4-ippool -o yaml
  2. apiVersion: projectcalico.org/v3
  3. kind: IPPool
  4. metadata:
  5. creationTimestamp: "2020-12-21T05:57:19Z"
  6. name: default-ipv4-ippool
  7. resourceVersion: "804703"
  8. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  9. spec:
  10. blockSize: 26
  11. cidr: 192.110.0.0/16
  12. ipipMode: Never
  13. natOutgoing: true
  14. nodeSelector: all()
  15. vxlanMode: Never
  • ipipMode已经设置成Never

查看路由

  1. ip route
  2. default via 10.103.22.1 dev ens192 proto static metric 100
  3. 10.103.22.0/24 dev ens192 proto kernel scope link src 10.103.22.184 metric 100
  4. 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  5. blackhole 192.110.140.64/26 proto bird
  6. 192.110.140.65 dev cali9fe3f3b71b4 scope link
  7. 192.110.140.66 dev cali827b87b13d2 scope link
  8. 192.110.140.69 dev calibb530faa265 scope link
  9. 192.110.186.192/26 via 10.103.22.185 dev ens192 proto bird
  10. 192.110.241.64/26 via 10.103.22.183 dev ens192 proto bird
  • Never生效
  • pod网段路由都是到ens192网卡

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-2djb2 1/1 Running 0 11m 192.110.140.66 node02 <none> <none>
  4. web-6gv9d 1/1 Running 0 11m 192.110.186.194 node03 <none> <none>
  5. web-ccct6 1/1 Running 0 11m 192.110.186.193 node03 <none> <none>

进入web-2djb2 pod(节点在node02),长ping 192.110.186.194(节点在node03)

  1. kubectl exec -it web-2djb2 -- /bin/bash
  2. ping 192.110.186.194
  3. PING 192.110.186.194 (192.110.186.194): 48 data bytes
  4. 56 bytes from 192.110.186.194: icmp_seq=0 ttl=62 time=1.251 ms
  5. 56 bytes from 192.110.186.194: icmp_seq=1 ttl=62 time=0.500 ms
  6. 56 bytes from 192.110.186.194: icmp_seq=2 ttl=62 time=6.477 ms
  7. 56 bytes from 192.110.186.194: icmp_seq=3 ttl=62 time=0.763 ms
  8. 56 bytes from 192.110.186.194: icmp_seq=4 ttl=62 time=0.530 ms
  9. 56 bytes from 192.110.186.194: icmp_seq=5 ttl=62 time=0.470 ms
  10. 56 bytes from 192.110.186.194: icmp_seq=6 ttl=62 time=0.590 ms
  11. 56 bytes from 192.110.186.194: icmp_seq=7 ttl=62 time=0.528 ms
  12. 56 bytes from 192.110.186.194: icmp_seq=8 ttl=62 time=0.430 ms
  13. ^C--- 192.110.186.194 ping statistics ---
  14. 9 packets transmitted, 9 packets received, 0% packet loss
  15. round-trip min/avg/max/stddev = 0.430/1.282/6.477/1.852 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:c0:81:9f:46 brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 46:40:96:15:95:e7 brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 12:0c:d9:ad:73:ac brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 8: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  28. link/ipip 0.0.0.0 brd 0.0.0.0
  29. inet 192.110.186.192/32 scope global tunl0
  30. valid_lft forever preferred_lft forever
  31. 11: cali2b2abdd19e8@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  32. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  33. 12: cali64dc30e7c56@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  34. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  • ens192 主机网卡
  • tunl0 calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.140.66
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. 14:50:29.840816 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 0, length 56
  6. 14:50:30.841764 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 256, length 56
  7. 14:50:31.846405 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 512, length 56
  8. 14:50:32.847888 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 768, length 56
  9. 14:50:33.849056 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1024, length 56
  10. 14:50:34.850389 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1280, length 56
  11. 14:50:35.851610 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1536, length 56
  12. 14:50:36.852774 IP 192.110.186.194 > 192.110.140.66: ICMP echo reply, id 18688, seq 1792, length 56
  13. 8 packets captured
  14. 8 packets received by filter
  15. 0 packets dropped by kernel
  • 有数据包通过
  • 数据包通过ens192网卡直接路由到node03的pod上
  • 数据包没有封装直接路由到node03的pod上

在tunl0网卡上抓包

  1. #在tunl0网卡(calico ipip模式的隧道设备)上抓包
  2. tcpdump -i tunl0 -nn dst 192.110.140.66
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
  • 没有数据包通过
  • 说明数据包直接路由传输
  • 没有进行封装

VXLAN模式

封装类型

您可以为每个IP池配置不同的封装方式,但不能在同一个IP池内混合多种封装类型。

  • Configure VXLAN encapsulation for only cross subnet traffic
  • Configure VXLAN encapsulation for all inter workload traffic
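
如果是通过calico.yaml清单安装,默认IP池的VXLAN模式也可以通过calico-node的环境变量控制,大致如下(示意片段,取值与vxlanMode一致,具体以所用版本的清单为准):

  1. # 默认IP池的VXLAN模式(Always/CrossSubnet/Never)
  2. - name: CALICO_IPV4POOL_VXLAN
  3.   value: "CrossSubnet"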

CrossSubnet

配置IPPool

设置vxlanMode为CrossSubnet

vim ippool-vxlan-crosssubnet.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: ippool-vxlan-cross-subnet-1
  5. spec:
  6. cidr: 192.168.0.0/16
  7. vxlanMode: CrossSubnet
  8. natOutgoing: true

使用自带IPPool配置

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: default-ipv4-ippool
  5. spec:
  6. cidr: 192.110.0.0/16
  7. ipipMode: Never
  8. natOutgoing: true
  9. nodeSelector: all()
  10. vxlanMode: CrossSubnet
  • VXLAN封装可以选择性地执行
  • 节点在同一个子网时使用路由的方式,不同子网时使用VXLAN封装的方式。
  • calicoctl apply -f calico-vxlan-test.yaml

查看IPPool

  1. calicoctl get IPPool default-ipv4-ippool -o yaml
  2. apiVersion: projectcalico.org/v3
  3. kind: IPPool
  4. metadata:
  5. creationTimestamp: "2020-12-21T05:57:19Z"
  6. name: default-ipv4-ippool
  7. resourceVersion: "808677"
  8. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  9. spec:
  10. blockSize: 26
  11. cidr: 192.110.0.0/16
  12. ipipMode: Never
  13. natOutgoing: true
  14. nodeSelector: all()
  15. vxlanMode: CrossSubnet
  • vxlanMode已经调整为CrossSubnet

查看主机网卡

  1. ip addr
  2. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  3. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  4. inet 127.0.0.1/8 scope host lo
  5. valid_lft forever preferred_lft forever
  6. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  7. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  8. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  9. valid_lft forever preferred_lft forever
  10. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  11. link/ether 02:42:c0:81:9f:46 brd ff:ff:ff:ff:ff:ff
  12. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  13. valid_lft forever preferred_lft forever
  14. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  15. link/ether 46:40:96:15:95:e7 brd ff:ff:ff:ff:ff:ff
  16. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  17. link/ether 12:0c:d9:ad:73:ac brd ff:ff:ff:ff:ff:ff
  18. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  19. valid_lft forever preferred_lft forever
  20. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  21. valid_lft forever preferred_lft forever
  22. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  23. valid_lft forever preferred_lft forever
  24. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  25. valid_lft forever preferred_lft forever
  26. 6: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
  27. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  28. 8: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  29. link/ipip 0.0.0.0 brd 0.0.0.0
  30. 11: cali2b2abdd19e8@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  31. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  32. 12: cali64dc30e7c56@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
  33. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  34. 13: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
  35. link/ether 66:43:c9:45:ef:85 brd ff:ff:ff:ff:ff:ff
  36. inet 192.168.186.192/32 scope global vxlan.calico
  37. valid_lft forever preferred_lft forever
  • 已经生成vxlan.calico设备(隧道)

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-2djb2 1/1 Running 1 98m 192.110.140.67 node02 <none> <none>
  4. web-6gv9d 1/1 Running 1 98m 192.110.186.197 node03 <none> <none>
  5. web-ccct6 1/1 Running 1 98m 192.110.186.195 node03 <none> <none>

进入web-2djb2 pod(节点在node02),长ping 192.110.186.195(节点在node03)

  1. kubectl exec -it web-2djb2 -- /bin/bash
  2. ping 192.110.186.195
  3. PING 192.110.186.195 (192.110.186.195): 48 data bytes
  4. 56 bytes from 192.110.186.195: icmp_seq=0 ttl=62 time=1.827 ms
  5. 56 bytes from 192.110.186.195: icmp_seq=1 ttl=62 time=0.528 ms
  6. 56 bytes from 192.110.186.195: icmp_seq=2 ttl=62 time=0.703 ms
  7. 56 bytes from 192.110.186.195: icmp_seq=3 ttl=62 time=1.418 ms
  8. ^C--- 192.110.186.195 ping statistics ---
  9. 4 packets transmitted, 4 packets received, 0% packet loss
  10. round-trip min/avg/max/stddev = 0.528/1.119/1.827/0.527 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:41:42:1b:cf brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 0a:17:1c:2f:75:0f brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 0a:de:71:6e:3a:86 brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali2b2abdd19e8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 7: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  28. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  29. 8: cali64dc30e7c56@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  30. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  31. 9: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  32. link/ipip 0.0.0.0 brd 0.0.0.0
  33. 10: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
  34. link/ether 66:43:c9:45:ef:85 brd ff:ff:ff:ff:ff:ff
  35. inet 192.110.186.198/32 scope global vxlan.calico
  36. valid_lft forever preferred_lft forever
  • ens192 主机网卡
  • vxlan calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.140.67
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. 16:13:23.748235 IP 192.110.186.195 > 192.110.140.67: ICMP echo reply, id 24064, seq 0, length 56
  6. 16:13:24.747423 IP 192.110.186.195 > 192.110.140.67: ICMP echo reply, id 24064, seq 256, length 56
  7. 16:13:25.748290 IP 192.110.186.195 > 192.110.140.67: ICMP echo reply, id 24064, seq 512, length 56
  8. 16:13:26.754372 IP 192.110.186.195 > 192.110.140.67: ICMP echo reply, id 24064, seq 768, length 56
  9. ^C
  10. 4 packets captured
  11. 5 packets received by filter
  12. 0 packets dropped by kernel
  • 有数据包通过
  • 数据包通过ens192网卡直接路由到node03的pod上
  • 数据包没有封装直接路由到node03的pod上

在vxlan.calico网卡上抓包

  1. #在vxlan.calico网卡(calico vxlan模式的隧道设备)上抓包
  2. tcpdump -i vxlan.calico -nn dst 192.110.140.67
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on vxlan.calico, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. ^C
  6. 0 packets captured
  7. 0 packets received by filter
  8. 0 packets dropped by kernel
  • 没有数据包通过
  • 说明数据包直接路由传输
  • 没有进行封装

不同子网节点验证(目前没有环境,未验证)

实验结果

不同子网会使用VXLAN模式进行数据包封装传输

Always

配置IPPool

vxlanMode设置Always

vim ippool-always.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: default-ipv4-ippool
  5. spec:
  6. cidr: 192.110.0.0/16
  7. ipipMode: Never
  8. natOutgoing: true
  9. nodeSelector: all()
  10. vxlanMode: Always
  • vxlanMode: Always,所有的流量包都用VXLAN进行封装
  • calicoctl apply -f calico-vxlan-test.yaml

查看IPPool

  1. calicoctl get IPPool default-ipv4-ippool -o yaml
  2. apiVersion: projectcalico.org/v3
  3. kind: IPPool
  4. metadata:
  5. creationTimestamp: "2020-12-21T05:57:19Z"
  6. name: default-ipv4-ippool
  7. resourceVersion: "815620"
  8. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  9. spec:
  10. blockSize: 26
  11. cidr: 192.110.0.0/16
  12. ipipMode: Never
  13. natOutgoing: true
  14. nodeSelector: all()
  15. vxlanMode: Always
  • vxlanMode已经调整成Always

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-2djb2 1/1 Running 1 98m 192.110.140.67 node02 <none> <none>
  4. web-6gv9d 1/1 Running 1 98m 192.110.186.197 node03 <none> <none>
  5. web-ccct6 1/1 Running 1 98m 192.110.186.195 node03 <none> <none>

进入web-2djb2 pod(节点在node02),长ping 192.110.186.195(节点在node03)

  1. kubectl exec -it web-2djb2 -- /bin/bash
  2. ping 192.110.186.195
  3. PING 192.110.186.195 (192.110.186.195): 48 data bytes
  4. 56 bytes from 192.110.186.195: icmp_seq=0 ttl=62 time=0.996 ms
  5. 56 bytes from 192.110.186.195: icmp_seq=1 ttl=62 time=0.725 ms
  6. 56 bytes from 192.110.186.195: icmp_seq=2 ttl=62 time=2.307 ms
  7. 56 bytes from 192.110.186.195: icmp_seq=3 ttl=62 time=2.684 ms
  8. 56 bytes from 192.110.186.195: icmp_seq=4 ttl=62 time=0.621 ms
  9. ^C--- 192.110.186.195 ping statistics ---
  10. 5 packets transmitted, 5 packets received, 0% packet loss
  11. round-trip min/avg/max/stddev = 0.621/1.467/2.684/0.857 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:41:42:1b:cf brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 0a:17:1c:2f:75:0f brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 0a:de:71:6e:3a:86 brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali2b2abdd19e8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 7: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  28. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  29. 8: cali64dc30e7c56@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  30. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  31. 9: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  32. link/ipip 0.0.0.0 brd 0.0.0.0
  33. 10: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
  34. link/ether 66:43:c9:45:ef:85 brd ff:ff:ff:ff:ff:ff
  35. inet 192.110.186.198/32 scope global vxlan.calico
  36. valid_lft forever preferred_lft forever
  • ens192 主机网卡
  • vxlan calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.186.195
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. ^C
  6. 0 packets captured
  7. 0 packets received by filter
  8. 0 packets dropped by kernel
  • 没有数据包,说明pod发送过来的包经过了VXLAN封装,所以在node03节点的ens192网卡上看不到目的地址为192.110.186.195的内层数据包(封装后的外层数据包可以按下面的方式抓取)
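
同样地,封装后的外层VXLAN数据包可以在ens192上按VXLAN默认端口(UDP 4789)抓取,例如:

  1. # Calico的VXLAN默认使用UDP 4789端口
  2. tcpdump -i ens192 -nn udp port 4789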

在vxlan.calico网卡上抓包

  1. #在vxlan.calico网卡(calico vxlan模式的隧道设备)上抓包
  2. tcpdump -i vxlan.calico -nn dst 192.110.186.195
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on vxlan.calico, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. 16:32:58.895301 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 15104, length 56
  6. 16:32:59.896622 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 15360, length 56
  7. 16:33:00.897224 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 15616, length 56
  8. 16:33:01.899095 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 15872, length 56
  9. 16:33:02.900233 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 16128, length 56
  10. 16:33:03.901294 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 16384, length 56
  11. 16:33:04.902684 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 27392, seq 16640, length 56
  12. ^C
  13. 7 packets captured
  14. 7 packets received by filter
  15. 0 packets dropped by kernel
  • 有数据包通过
  • 可以看出数据包通过vxlan.calico设备到达node03的pod上
  • 数据包经过封装,到达node03的pod

Never(不使用vxlan)

配置IPPool

vxlanMode设置Never

vim ippool-never.yaml

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: ippool-vxlan
  5. spec:
  6. cidr: 192.168.0.0/16
  7. vxlanMode: Never
  8. natOutgoing: true

使用自带IPPool配置

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. name: default-ipv4-ippool
  5. spec:
  6. cidr: 192.110.0.0/16
  7. ipipMode: Never
  8. natOutgoing: true
  9. nodeSelector: all()
  10. vxlanMode: Never
  • 不使用vxlan模式
  • 必须所有节点都在一个子网内
  • calicoctl apply -f calico-vxlan-test.yaml

查看IPPool

  1. apiVersion: projectcalico.org/v3
  2. kind: IPPool
  3. metadata:
  4. creationTimestamp: "2020-12-21T05:57:19Z"
  5. name: default-ipv4-ippool
  6. resourceVersion: "816709"
  7. uid: 84960860-4a7c-4b40-b767-946c2baf981e
  8. spec:
  9. blockSize: 26
  10. cidr: 192.110.0.0/16
  11. ipipMode: Never
  12. natOutgoing: true
  13. nodeSelector: all()
  14. vxlanMode: Never
  • vxlanMode已经设置成Never

查看路由

  1. ip route
  2. default via 10.103.22.1 dev ens192 proto static metric 100
  3. 10.103.22.0/24 dev ens192 proto kernel scope link src 10.103.22.184 metric 100
  4. 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
  5. blackhole 192.110.140.64/26 proto bird
  6. 192.110.140.67 dev cali827b87b13d2 scope link
  7. 192.110.140.68 dev cali9fe3f3b71b4 scope link
  8. 192.110.140.69 dev calibb530faa265 scope link
  9. 192.110.186.192/26 via 10.103.22.185 dev ens192 proto bird
  10. 192.110.241.64/26 via 10.103.22.183 dev ens192 proto bird
  • Never生效
  • pod网段路由都是到ens192网卡

抓包验证

查看pods所在节点

  1. kubectl get pods -o wide
  2. NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
  3. web-2djb2 1/1 Running 1 124m 192.110.140.67 node02 <none> <none>
  4. web-6gv9d 1/1 Running 1 124m 192.110.186.197 node03 <none> <none>
  5. web-ccct6 1/1 Running 1 124m 192.110.186.195 node03 <none> <none>

进入web-2djb2 pod(节点在node02),长ping 192.110.186.195(节点在node03)

  1. kubectl exec -it web-2djb2 -- /bin/bash
  2. ping 192.110.186.195
  3. PING 192.110.186.195 (192.110.186.195): 48 data bytes
  4. 56 bytes from 192.110.186.195: icmp_seq=0 ttl=62 time=0.828 ms
  5. 56 bytes from 192.110.186.195: icmp_seq=1 ttl=62 time=0.464 ms
  6. 56 bytes from 192.110.186.195: icmp_seq=2 ttl=62 time=1.384 ms
  7. 56 bytes from 192.110.186.195: icmp_seq=3 ttl=62 time=0.444 ms
  8. 56 bytes from 192.110.186.195: icmp_seq=4 ttl=62 time=0.418 ms
  9. ^C--- 192.110.186.195 ping statistics ---
  10. 5 packets transmitted, 5 packets received, 0% packet loss
  11. round-trip min/avg/max/stddev = 0.418/0.708/1.384/0.370 ms

在node03节点上进行抓包
查看网卡信息

  1. 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  2. link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  3. inet 127.0.0.1/8 scope host lo
  4. valid_lft forever preferred_lft forever
  5. 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  6. link/ether 00:50:56:a0:ce:ff brd ff:ff:ff:ff:ff:ff
  7. inet 10.103.22.185/24 brd 10.103.22.255 scope global noprefixroute ens192
  8. valid_lft forever preferred_lft forever
  9. 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
  10. link/ether 02:42:41:42:1b:cf brd ff:ff:ff:ff:ff:ff
  11. inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
  12. valid_lft forever preferred_lft forever
  13. 4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
  14. link/ether 0a:17:1c:2f:75:0f brd ff:ff:ff:ff:ff:ff
  15. 5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
  16. link/ether 0a:de:71:6e:3a:86 brd ff:ff:ff:ff:ff:ff
  17. inet 192.100.182.252/32 brd 192.100.182.252 scope global kube-ipvs0
  18. valid_lft forever preferred_lft forever
  19. inet 192.100.227.87/32 brd 192.100.227.87 scope global kube-ipvs0
  20. valid_lft forever preferred_lft forever
  21. inet 192.100.0.10/32 brd 192.100.0.10 scope global kube-ipvs0
  22. valid_lft forever preferred_lft forever
  23. inet 192.100.0.1/32 brd 192.100.0.1 scope global kube-ipvs0
  24. valid_lft forever preferred_lft forever
  25. 6: cali2b2abdd19e8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  26. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
  27. 7: cali5ed4c62f10d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  28. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
  29. 8: cali64dc30e7c56@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
  30. link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
  31. 9: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
  32. link/ipip 0.0.0.0 brd 0.0.0.0
  33. 10: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
  34. link/ether 66:43:c9:45:ef:85 brd ff:ff:ff:ff:ff:ff
  35. inet 192.110.186.198/32 scope global vxlan.calico
  36. valid_lft forever preferred_lft forever
  • ens192 主机网卡
  • vxlan calico创建的隧道网卡
  • cali开头的设备为 calico为pod创建的网卡
  • kube-ipvs0 ipvs创建的网卡

先在ens192网卡上抓包

  1. #在ens192网卡(服务器网卡)上抓包
  2. tcpdump -i ens192 -nn dst 192.110.186.195
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. 16:41:35.376592 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 0, length 56
  6. 16:41:36.377526 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 256, length 56
  7. 16:41:37.378774 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 512, length 56
  8. 16:41:38.380282 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 768, length 56
  9. 16:41:39.380485 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 1024, length 56
  10. 16:41:40.382600 IP 192.110.140.67 > 192.110.186.195: ICMP echo request, id 30208, seq 1280, length 56
  11. ^C
  12. 6 packets captured
  13. 6 packets received by filter
  14. 0 packets dropped by kernel
  • 有数据包通过
  • 数据包通过ens192网卡直接路由到node03的pod上
  • 数据包没有封装直接路由到node03的pod上

在vxlan.calico网卡上抓包

  1. #在vxlan.calico网卡(calico vxlan模式的隧道设备)上抓包
  2. tcpdump -i vxlan.calico -nn dst 192.110.186.195
  3. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  4. listening on vxlan.calico, link-type EN10MB (Ethernet), capture size 262144 bytes
  5. ^C
  6. 0 packets captured
  7. 0 packets received by filter
  8. 0 packets dropped by kernel
  • 没有数据包通过
  • 说明数据包直接路由传输
  • 没有进行封装

calico 封装类型总结

IP IN IP

  1. crosssubnet 在跨子网的环境中使用
    1. 同子网使用路由模式
    2. 不同子网使用ipip封装
    3. 使用比较灵活,高效
  2. always 使用ipip封装
    1. 所有环境都使用ipip封装
    2. 对包进行封装会降低性能
  3. never 不使用ipip封装
    1. 不对包进行封装
    2. 使用路由模式
    3. 在同子网中使用,不能跨子网

VXLAN

  1. crosssubnet 在跨子网的环境中使用
    1. 同子网使用路由模式
    2. 不同子网使用vxlan封装
    3. 使用比较灵活,高效
  2. always 使用vxlan封装
    1. 所有环境都使用vxlan封装
    2. 对包进行封装会降低性能
  3. never 不使用vxlan封装
    1. 不对包进行封装
    2. 使用路由模式
    3. 在同子网中使用,不能跨子网

Never

  • ipip和vxlan都配置成never模式
  • calico使用路由模式
  • 在同子网中使用,不能跨子网
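
最后,可以用下面的命令快速查看当前各IP池生效的封装模式(输出中包含IPIPMODE和VXLANMODE等列,以实际版本为准):

  1. calicoctl get ippool -o wide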