Background
We run a GPU cluster and a CPU cluster, and we want Pod IPs in the two clusters to be directly reachable from each other. The GPU cluster spans two network segments. The previous approach was to configure route rules on the CPU cluster's gateway nodes, routing to the GPU machines on the same host network. Maintaining those routes by hand is cumbersome: because the GPU machines span two networks and the rules are concentrated on a subset of machines, the configuration is inconvenient to manage.
The goal is to interconnect the two cluster networks directly, with each cluster node automatically adding and updating its local route rules as the network segments recorded in the peer cluster's etcd change.
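For comparison, the manual approach that ClusterMesh replaces amounts to adding per-segment routes like the following on every gateway node; this is only a sketch, and the CIDRs and next-hop IPs are placeholders, not values from our clusters:

# Route one GPU PodCIDR segment via the GPU node that hosts it (hypothetical values).
$ ip route add 10.244.10.0/24 via 192.168.1.21
# Second network segment, different next hop (hypothetical values).
$ ip route add 10.244.11.0/24 via 192.168.2.35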
Configure cluster mesh
Cluster configuration items on the Cilium DaemonSet
Confirm that the cilium DaemonSet contains the following configuration (in 1.2.5 it lives on the DaemonSet; in 1.8/1.9 it is on the cilium-operator Deployment):
- name: CILIUM_CLUSTERMESH_CONFIG
  value: "/var/lib/cilium/clustermesh/"
- name: CILIUM_CLUSTER_NAME
  valueFrom:
    configMapKeyRef:
      key: cluster-name
      name: cilium-config
      optional: true
- name: CILIUM_CLUSTER_ID
  valueFrom:
    configMapKeyRef:
      key: cluster-id
      name: cilium-config
      optional: true
volumeMounts:
- name: clustermesh-secrets
  mountPath: /var/lib/cilium/clustermesh
  readOnly: true
volumes:
....
- name: clustermesh-secrets
  secret:
    defaultMode: 420
    optional: true
    secretName: cilium-clustermesh
- CILIUM_CLUSTERMESH_CONFIG: path where the clustermesh etcd configuration and certificates are mounted
- CILIUM_CLUSTER_ID: mesh ID of this cluster; every cluster in the mesh must have a unique ID
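A quick way to confirm these environment variables are in place (a sketch; it assumes the DaemonSet is named cilium):

$ kubectl -n kube-system get ds cilium -o yaml | grep -A 6 'CILIUM_CLUSTER'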
Note:
Do not change the cluster ID casually. After a change, existing workloads (Pods) become unreachable and can only be accessed again after they are restarted. Treat this with care.
If the DaemonSet does not have this configuration yet, add it and re-apply:
$ kubectl apply -n kube-system -f cilium-ds.yaml
The cilium ConfigMap needs to contain:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  ##### .....
  ##### add the following two settings
  cluster-name: "cluster<id>"   # name, must be unique across the mesh
  cluster-id: "<id>"            # id: 1 ~ 255, must be unique across the mesh
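After editing, a quick check that both keys are present (a sketch):

$ kubectl -n kube-system get cm cilium-config -o yaml | grep -E 'cluster-(name|id)'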
Configure the Secret files
Each cluster needs four files. Take two clusters, cluster1 and cluster2, meshing with each other as the example. The names cluster1 and cluster2 must match the cluster-name values configured above, otherwise the interconnection fails.
Cluster description file: it describes how to reach that cluster's etcd and the paths where its TLS files are mounted.
cluster1
endpoints:
- https://172.xx.xx.xx:2379   # use the actual etcd IP
ca-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client-ca.crt'
key-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client.key'
cert-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client.crt'
cluster1.etcd-client.key, cluster1.etcd-client.crt, cluster1.etcd-client-ca.crt: the three key/certificate files for connecting to cluster1's etcd; together with the cluster1 description file above they make up the four files (a sketch of one way to obtain them follows the note below).
Note: the TLS files and the cluster etcd configuration file must follow this naming convention:
- configuration file: <cluster-name>
- TLS files: <cluster-name>.etcd-client.key, <cluster-name>.etcd-client.crt, <cluster-name>.etcd-client-ca.crt
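If the peer cluster's etcd is managed by cilium-etcd-operator, the three client files can usually be exported from its cilium-etcd-secrets secret. This is only a sketch; the secret name and data keys are assumptions and may differ in your etcd setup:

# Run against cluster1 to export its etcd client files (assumed secret/keys).
$ kubectl -n kube-system get secret cilium-etcd-secrets -o jsonpath='{.data.etcd-client-ca\.crt}' | base64 -d > cluster1.etcd-client-ca.crt
$ kubectl -n kube-system get secret cilium-etcd-secrets -o jsonpath='{.data.etcd-client\.crt}' | base64 -d > cluster1.etcd-client.crt
$ kubectl -n kube-system get secret cilium-etcd-secrets -o jsonpath='{.data.etcd-client\.key}' | base64 -d > cluster1.etcd-client.key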
Add the files to cluster2
$ kubectl create secret generic cilium-clustermesh -n kube-system \
    --from-file=./cluster1 \
    --from-file=./cluster1.etcd-client-ca.crt \
    --from-file=./cluster1.etcd-client.key \
    --from-file=./cluster1.etcd-client.crt
Note: cluster1's files are configured into cluster2, and likewise cluster2's files are configured into cluster1 (see the mirrored command below). Only after both sides are configured can the clusters access each other.
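For completeness, a sketch of the mirrored operation run against cluster1, assuming the cluster2 files have been prepared the same way:

$ kubectl create secret generic cilium-clustermesh -n kube-system \
    --from-file=./cluster2 \
    --from-file=./cluster2.etcd-client-ca.crt \
    --from-file=./cluster2.etcd-client.key \
    --from-file=./cluster2.etcd-client.crt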
After the secret is mounted it looks like this:
(cluster2 cilium container xx) $ ls /var/lib/cilium/clustermesh/
cluster1
cluster1.etcd-client-ca.crt
cluster1.etcd-client.crt
cluster1.etcd-client.key

(cluster2 node1) $ cat /var/lib/cilium/clustermesh/cluster1
endpoints:
- https://172.xx.xx.xx:2379   # use the actual etcd IP
ca-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client-ca.crt'
key-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client.key'
cert-file: '/var/lib/cilium/clustermesh/cluster1.etcd-client.crt'
Apply the clustermesh configuration
After the secret is submitted, the files under /var/lib/cilium/clustermesh/ in the cilium pods are updated, but the configuration only takes effect after the cilium pods are restarted: delete the currently running pods one by one and let them be recreated. Restarting the cilium-agent pods this way gives a rolling-update effect.
To recreate all of the pods at once:
$ kubectl delete pod -n kube-system -l k8s-app=cilium
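To watch the agents come back up (a sketch):

$ kubectl -n kube-system get pod -l k8s-app=cilium -w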
Test
Verify clustermesh syncing
Check cluster status:
(cluster1 node1) $ cilium status
KVStore:        Ok   etcd: ...
Kubernetes:     Ok   1.17+ (v1.17.6-3) [linux/amd64]
...
ClusterMesh:    2/2 clusters ready, 0 global-services
More verbose:
(cluster1 node1) $ cilium status --verbose
KVStore:        Ok   etcd: ...
Kubernetes:     Ok   1.17+ (v1.17.6-3) [linux/amd64]
...
ClusterMesh:    1/1 clusters ready, 0 global-services
   cluster2: ready, xx nodes, xx identities, 0 services, 0 failures (last: never)
   └  etcd: 1/1 connected, ...
List all nodes of all clusters in the mesh:
(cluster1 node1) $ cilium node list
Name             IPv4 Address   Endpoint CIDR    IPv6 Address   Endpoint CIDR
cluster1/node1   10.xx.xx.xx    10.xx.xx.xx/24
cluster1/node2   10.xx.xx.xx    10.xx.xx.xx/24
...
cluster2/node1   10.xx.xx.xx    10.xx.xx.xx/24
cluster2/node2   10.xx.xx.xx    10.xx.xx.xx/24
...
Cilium packet tracing:
(cluster1 node1) $ cilium monitor
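Finally, a sketch of a cross-cluster connectivity check; the pod name and IP below are placeholders, so pick any pod in cluster1 and the IP of a pod in cluster2:

(cluster2 node1) $ kubectl get pod -o wide                          # note the IP of a cluster2 pod
(cluster1 node1) $ kubectl exec -it <pod-in-cluster1> -- ping <cluster2-pod-ip>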
References
kubernetes multi-cluster
https://arthurchiao.art/blog/cilium-clustermesh/
