1. Installing a Highly Available Kubernetes Cluster with kubeadm

0. Network Segment Notes

Three network segments are involved when installing the cluster:
Host network: the network of the servers that Kubernetes is installed on.
Pod network: the Kubernetes Pod network, i.e. the addresses the containers get.
Service network: the Kubernetes Service network; Services are used for communication between containers inside the cluster.

The Service network is usually set to 10.96.0.0/12.
The Pod network is usually set to 10.244.0.0/12 or 172.16.0.0/12.
The host network might be something like 192.168.0.0/24.

Note that these three segments must not overlap in any way.
For example, if the host IPs are 10.105.0.x,
then the Service network cannot be 10.96.0.0/12, because the usable IPs of 10.96.0.0/12 are:
10.96.0.1 ~ 10.111.255.255
10.105.x.x falls inside that range, which is an overlap, so the Service network has to be changed.
It can be changed to 192.168.0.0/16 (note: if the Service network starts with 192.168, the prefix length should be /16 rather than /12, because 192.168.0.0/12 actually starts at 192.160.0.0, not 192.168.0.0).
By the same logic, none of the other segments may overlap either. You can work the ranges out with http://tools.jb51.net/aideddesign/ip_net_calc/:

So the general recommendation is simply to make the first octets differ. For example, if your hosts start with 192, your Service network can be 10.96.0.0/12.
If your hosts start with 10, change the Service network to 192.168.0.0/16.
If your hosts start with 172, change the Pod network to 192.168.0.0/16.

Pair the three segments so that one uses a 10.x range, one a 172.x range, and one a 192.x range; when the first octets differ, overlap is impossible and the calculation step can be skipped. A quick programmatic check is shown below.
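If you prefer to verify the choice programmatically rather than with the online calculator, a minimal sketch like the following can be run on any node that has python3 installed (the three CIDRs below are just this guide's example values; substitute your own host, Service, and Pod networks):

python3 -c '
import ipaddress
# host network, Service network, Pod network (example values)
nets = [ipaddress.ip_network(n) for n in ("192.168.68.0/24", "10.96.0.0/12", "172.16.0.0/12")]
for i in range(len(nets)):
    for j in range(i + 1, len(nets)):
        print(nets[i], nets[j], "OVERLAP" if nets[i].overlaps(nets[j]) else "ok")
'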

1.1 Kubeadm Basic Environment Configuration

1. Configure the apt sources

Run on all nodes:
vim /etc/apt/sources.list

  1. deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
  2. deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
  3. deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
  4. deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
  5. deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
  6. deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
  7. deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
  8. deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
  9. deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
  10. deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse

Reference: Ubuntu mirror, Alibaba Cloud open-source mirror site (aliyun.com)

  1. # Source package mirrors are commented out by default to speed up apt update; uncomment them if needed
  2. deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
  3. deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
  4. deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
  5. deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
  6. deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
  7. deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
  8. deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
  9. deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
  10. # Pre-release sources; not recommended
  11. # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
  12. # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse

https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/
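After switching to either set of mirrors, refresh the package index so later installs come from the new sources:

apt update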

2. Edit the hosts file

Run on all nodes:
vim /etc/hosts

  1. 192.168.68.20 k8s-master01
  2. 192.168.68.21 k8s-master02
  3. 192.168.68.18 k8s-master-lb # If this is not a highly available cluster, this IP is Master01's IP
  4. 192.168.68.22 k8s-node01
  5. 192.168.68.23 k8s-node02
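A quick sanity check from any node to confirm the names resolve to the addresses above:

for h in k8s-master01 k8s-master02 k8s-master-lb k8s-node01 k8s-node02; do getent hosts $h; done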

3. Disable the swap partition

Run on all nodes:
vim /etc/fstab

  1. Temporarily disable: swapoff -a
  2. Permanently disable:
  3. vim /etc/fstab, comment out the swap entry, then reboot the system
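After the change, confirm that no swap is active:

  swapon --show   # should print nothing
  free -h         # the Swap line should read 0B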

4. Time synchronization

Run on all nodes:

  1. apt install ntpdate -y
  2. rm -rf /etc/localtime
  3. ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
  4. ntpdate cn.ntp.org.cn
  5. # Add the following line via crontab -e
  6. */5 * * * * /usr/sbin/ntpdate time2.aliyun.com
  7. # To also sync at boot, add the next line to /etc/rc.local (vim /etc/rc.local)
  8. ntpdate cn.ntp.org.cn
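Verify that the clock and timezone are now correct on every node:

  date
  timedatectl | grep "Time zone"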

5. Configure limits

Run on all nodes:
ulimit -SHn 65535
vim /etc/security/limits.conf
# Append the following at the end of the file

  1. * soft nofile 65536
  2. * hard nofile 131072
  3. * soft nproc 655350
  4. * hard nproc 655350
  5. * soft memlock unlimited
  6. * hard memlock unlimited
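The limits.conf settings take effect for new login sessions; after logging in again, verify with:

  ulimit -n    # max open files (nofile)
  ulimit -u    # max user processes (nproc)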

6. Adjust kernel parameters

  1. cat >/etc/sysctl.conf <<EOF
  2. # Controls source route verification
  3. net.ipv4.conf.default.rp_filter = 1
  4. net.ipv4.ip_nonlocal_bind = 1
  5. net.ipv4.ip_forward = 1
  6. # Do not accept source routing
  7. net.ipv4.conf.default.accept_source_route = 0
  8. # Controls the System Request debugging functionality of the kernel
  9. kernel.sysrq = 0
  10. # Controls whether core dumps will append the PID to the core filename.
  11. # Useful for debugging multi-threaded applications.
  12. kernel.core_uses_pid = 1
  13. # Controls the use of TCP syncookies
  14. net.ipv4.tcp_syncookies = 1
  15. # Disable netfilter on bridges.
  16. net.bridge.bridge-nf-call-ip6tables = 0
  17. net.bridge.bridge-nf-call-iptables = 0
  18. net.bridge.bridge-nf-call-arptables = 0
  19. # Controls the default maximum size of a message queue
  20. kernel.msgmnb = 65536
  21. # # Controls the maximum size of a message, in bytes
  22. kernel.msgmax = 65536
  23. # Controls the maximum shared segment size, in bytes
  24. kernel.shmmax = 68719476736
  25. # # Controls the maximum number of shared memory segments, in pages
  26. kernel.shmall = 4294967296
  27. # TCP kernel paramater
  28. net.ipv4.tcp_mem = 786432 1048576 1572864
  29. net.ipv4.tcp_rmem = 4096 87380 4194304
  30. net.ipv4.tcp_wmem = 4096 16384 4194304
  31. net.ipv4.tcp_window_scaling = 1
  32. net.ipv4.tcp_sack = 1
  33. # socket buffer
  34. net.core.wmem_default = 8388608
  35. net.core.rmem_default = 8388608
  36. net.core.rmem_max = 16777216
  37. net.core.wmem_max = 16777216
  38. net.core.netdev_max_backlog = 262144
  39. net.core.somaxconn = 20480
  40. net.core.optmem_max = 81920
  41. # TCP conn
  42. net.ipv4.tcp_max_syn_backlog = 262144
  43. net.ipv4.tcp_syn_retries = 3
  44. net.ipv4.tcp_retries1 = 3
  45. net.ipv4.tcp_retries2 = 15
  46. # tcp conn reuse
  47. net.ipv4.tcp_timestamps = 0
  48. net.ipv4.tcp_tw_reuse = 0
  49. net.ipv4.tcp_tw_recycle = 0
  50. net.ipv4.tcp_fin_timeout = 1
  51. net.ipv4.tcp_max_tw_buckets = 20000
  52. net.ipv4.tcp_max_orphans = 3276800
  53. net.ipv4.tcp_synack_retries = 1
  54. net.ipv4.tcp_syncookies = 1
  55. # keepalive conn
  56. net.ipv4.tcp_keepalive_time = 300
  57. net.ipv4.tcp_keepalive_intvl = 30
  58. net.ipv4.tcp_keepalive_probes = 3
  59. net.ipv4.ip_local_port_range = 10001 65000
  60. # swap
  61. vm.overcommit_memory = 0
  62. vm.swappiness = 10
  63. #net.ipv4.conf.eth1.rp_filter = 0
  64. #net.ipv4.conf.lo.arp_ignore = 1
  65. #net.ipv4.conf.lo.arp_announce = 2
  66. #net.ipv4.conf.all.arp_ignore = 1
  67. #net.ipv4.conf.all.arp_announce = 2
  68. EOF
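Apply the new settings immediately (they are also re-read at boot):

  sysctl -p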

7. Install commonly used software and ipvsadm

Install common tools on all nodes:
apt install wget jq psmisc vim net-tools telnet lvm2 git -y
Install ipvsadm on all nodes:

  1. apt install ipvsadm ipset conntrack -y
  2. modprobe -- ip_vs
  3. modprobe -- ip_vs_rr
  4. modprobe -- ip_vs_wrr
  5. modprobe -- ip_vs_sh
  6. modprobe -- nf_conntrack
  7. vim /etc/modules-load.d/ipvs.conf
  8. # Add the following
  9. ip_vs
  10. ip_vs_lc
  11. ip_vs_wlc
  12. ip_vs_rr
  13. ip_vs_wrr
  14. ip_vs_lblc
  15. ip_vs_lblcr
  16. ip_vs_dh
  17. ip_vs_sh
  18. ip_vs_fo
  19. ip_vs_nq
  20. ip_vs_sed
  21. ip_vs_ftp
  22. ip_vs_sh
  23. nf_conntrack
  24. ip_tables
  25. ip_set
  26. xt_set
  27. ipt_set
  28. ipt_rpfilter
  29. ipt_REJECT
  30. ipip

Then run systemctl enable --now systemd-modules-load.service.
Enable the kernel parameters required by Kubernetes; configure these on all nodes:

  1. cat <<EOF > /etc/sysctl.d/k8s.conf
  2. net.ipv4.ip_forward = 1
  3. net.bridge.bridge-nf-call-iptables = 1
  4. net.bridge.bridge-nf-call-ip6tables = 1
  5. fs.may_detach_mounts = 1
  6. net.ipv4.conf.all.route_localnet = 1
  7. vm.overcommit_memory=1
  8. vm.panic_on_oom=0
  9. fs.inotify.max_user_watches=89100
  10. fs.file-max=52706963
  11. fs.nr_open=52706963
  12. net.netfilter.nf_conntrack_max=2310720
  13. net.ipv4.tcp_keepalive_time = 600
  14. net.ipv4.tcp_keepalive_probes = 3
  15. net.ipv4.tcp_keepalive_intvl =15
  16. net.ipv4.tcp_max_tw_buckets = 36000
  17. net.ipv4.tcp_tw_reuse = 1
  18. net.ipv4.tcp_max_orphans = 327680
  19. net.ipv4.tcp_orphan_retries = 3
  20. net.ipv4.tcp_syncookies = 1
  21. net.ipv4.tcp_max_syn_backlog = 16384
  22. net.ipv4.ip_conntrack_max = 65536
  23. net.ipv4.tcp_max_syn_backlog = 16384
  24. net.ipv4.tcp_timestamps = 0
  25. net.core.somaxconn = 16384
  26. EOF

sysctl --system
After configuring the kernel parameters on all nodes, reboot the servers to make sure the modules are still loaded after a restart:
reboot
Check:
lsmod | grep --color=auto -e ip_vs -e nf_conntrack

8. Passwordless SSH from Master01 to the other nodes

Configuration files and certificates are generated on Master01 during the installation, and cluster management is also done from Master01. On Alibaba Cloud or AWS a separate kubectl host is required. Configure the keys as follows:
ssh-keygen -t rsa -f /root/.ssh/id_rsa -C "192.168.68.20@k8s-master01" -N ""
for i in k8s-master01 k8s-master02 k8s-node01 k8s-node02;do ssh-copy-id -i /root/.ssh/id_rsa.pub $i;done
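Verify that passwordless login now works to every node:

for i in k8s-master01 k8s-master02 k8s-node01 k8s-node02; do ssh $i hostname; done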

1.2 Installing the Basic Components

1. Install Docker on all nodes

Docker 20.10 is not supported by this setup, so install 19.03.
Add the GPG key of the official Docker repository to the system:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Add the Docker repository to the APT sources:
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
Update the package index so it includes the newly added Docker repository:
apt update
Make sure the installation will come from the Docker repository rather than the default Ubuntu repository:
apt-cache policy docker-ce
Install Docker:
apt install docker-ce=5:19.03.15~3-0~ubuntu-focal docker-ce-cli=5:19.03.15~3-0~ubuntu-focal -y
Check that Docker is running:
systemctl status docker

  1. mkdir /etc/docker
  2. cat > /etc/docker/daemon.json <<EOF
  3. {
  4. "exec-opts": ["native.cgroupdriver=systemd"],
  5. "registry-mirrors": ["http://hub-mirror.c.163.com,"https://registry.docker-cn.com","https://docker.mirrors.ustc.edu.cn""]
  6. }
  7. EOF

Enable Docker at boot and start it:
systemctl daemon-reload && systemctl enable --now docker
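Confirm that Docker is using the systemd cgroup driver, which must match the --cgroup-driver=systemd setting given to the kubelet later:

docker info | grep -i "cgroup driver"    # expected output: Cgroup Driver: systemd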

2. Install the Kubernetes components on all nodes

apt install kubeadm kubelet kubectl -y
If the packages cannot be downloaded, configure the Kubernetes apt source first (see step 3 below).
The default pause image is pulled from gcr.io, which may be unreachable from mainland China, so configure the kubelet to use Alibaba Cloud's pause image:

  1. # Inspect the kubelet configuration
  2. systemctl status kubelet
  3. cd /etc/systemd/system/kubelet.service.d
  4. # Add Environment="KUBELET_POD_INFRA_CONTAINER= --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.4.1" as shown below
  5. cat 10-kubeadm.conf
  6. # Note: This dropin only works with kubeadm and kubelet v1.11+
  7. [Service]
  8. Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
  9. Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
  10. Environment="KUBELET_CGROUP_ARGS=--system-reserved=memory=10Gi --kube-reserved=memory=400Mi --eviction-hard=imagefs.available<15%,memory.available<300Mi,nodefs.available<10%,nodefs.inodesFree<5% --cgroup-driver=systemd"
  11. Environment="KUBELET_POD_INFRA_CONTAINER= --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1"
  12. # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
  13. EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
  14. # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
  15. # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
  16. EnvironmentFile=-/etc/sysconfig/kubelet
  17. ExecStart=
  18. ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS
  19. # Restart the kubelet service
  20. systemctl daemon-reload
  21. systemctl restart kubelet

3. Configure the Kubernetes apt source on all nodes

  1. apt-get update && apt-get install -y apt-transport-https
  2. curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
  3. vim /etc/apt/sources.list.d/kubernetes.list
  4. deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
  5. apt-get update
  6. apt-get install -y kubelet=1.20.15-00 kubeadm=1.20.15-00 kubectl=1.20.15-00
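Optionally pin the packages so a routine apt upgrade cannot move the cluster to a different version (a common precaution; skip it if you manage versions another way):

apt-mark hold kubelet kubeadm kubectl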

1.3 Installing the High Availability Components

(Note: if this is not a highly available cluster, HAProxy and keepalived do not need to be installed.)
On public clouds, use the provider's own load balancer instead of HAProxy and keepalived, e.g. Alibaba Cloud SLB or Tencent Cloud ELB, because most public clouds do not support keepalived. In addition, on Alibaba Cloud the kubectl client cannot run on a master node; Tencent Cloud is recommended because Alibaba Cloud's SLB has a loopback problem (servers behind the SLB cannot reach the SLB address themselves), which Tencent Cloud has fixed.
Install HAProxy and keepalived via apt on all master nodes:
apt install keepalived haproxy -y
Configure HAProxy on all master nodes (see the HAProxy documentation for details; the HAProxy configuration is identical on every master node):

  1. vim /etc/haproxy/haproxy.cfg
  2. global
  3. maxconn 2000
  4. ulimit-n 16384
  5. log 127.0.0.1 local0 err
  6. stats timeout 30s
  7. defaults
  8. log global
  9. mode http
  10. option httplog
  11. timeout connect 5000
  12. timeout client 50000
  13. timeout server 50000
  14. timeout http-request 15s
  15. timeout http-keep-alive 15s
  16. frontend monitor-in
  17. bind *:33305
  18. mode http
  19. option httplog
  20. monitor-uri /monitor
  21. frontend k8s-master
  22. bind 0.0.0.0:16443
  23. bind 127.0.0.1:16443
  24. mode tcp
  25. option tcplog
  26. tcp-request inspect-delay 5s
  27. default_backend k8s-master
  28. backend k8s-master
  29. mode tcp
  30. option tcplog
  31. option tcp-check
  32. balance roundrobin
  33. default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  34. server k8s-master01 192.168.68.20:6443 check
  35. server k8s-master02 192.168.68.21:6443 check

Configure keepalived on all master nodes. The configuration is different on each node, so pay attention to the differences.
# vim /etc/keepalived/keepalived.conf ; mind each node's IP address and network interface (the interface parameter).
Configuration on the M01 node:

  1. vim /etc/keepalived/keepalived.conf
  2. ! Configuration File for keepalived
  3. global_defs {
  4. router_id LVS_DEVEL
  5. script_user root
  6. enable_script_security
  7. }
  8. vrrp_script chk_apiserver {
  9. script "/etc/keepalived/check_apiserver.sh"
  10. interval 5
  11. weight -5
  12. fall 2
  13. rise 1
  14. }
  15. vrrp_instance VI_1 {
  16. state MASTER
  17. interface ens33
  18. mcast_src_ip 192.168.68.20
  19. virtual_router_id 51
  20. priority 101
  21. advert_int 2
  22. authentication {
  23. auth_type PASS
  24. auth_pass K8SHA_KA_AUTH
  25. }
  26. virtual_ipaddress {
  27. 192.168.68.18
  28. }
  29. track_script {
  30. chk_apiserver
  31. }
  32. }

Configuration on the M02 node:

  1. ! Configuration File for keepalived
  2. global_defs {
  3. router_id LVS_DEVEL
  4. script_user root
  5. enable_script_security
  6. }
  7. vrrp_script chk_apiserver {
  8. script "/etc/keepalived/check_apiserver.sh"
  9. interval 5
  10. weight -5
  11. fall 2
  12. rise 1
  13. }
  14. vrrp_instance VI_1 {
  15. state BACKUP
  16. interface ens33
  17. mcast_src_ip 192.168.68.21
  18. virtual_router_id 51
  19. priority 100
  20. advert_int 2
  21. authentication {
  22. auth_type PASS
  23. auth_pass K8SHA_KA_AUTH
  24. }
  25. virtual_ipaddress {
  26. 192.168.68.18
  27. }
  28. track_script {
  29. chk_apiserver
  30. }
  31. }

Configure the keepalived health-check script on all master nodes:

  1. vim /etc/keepalived/check_apiserver.sh
  2. #!/bin/bash
  3. err=0
  4. for k in $(seq 1 3)
  5. do
  6. check_code=$(pgrep haproxy)
  7. if [[ $check_code == "" ]]; then
  8. err=$(expr $err + 1)
  9. sleep 1
  10. continue
  11. else
  12. err=0
  13. break
  14. fi
  15. done
  16. if [[ $err != "0" ]]; then
  17. echo "systemctl stop keepalived"
  18. /usr/bin/systemctl stop keepalived
  19. exit 1
  20. else
  21. exit 0
  22. fi
  23. chmod +x /etc/keepalived/check_apiserver.sh

Start HAProxy and keepalived:

  1. systemctl daemon-reload
  2. systemctl enable --now haproxy
  3. systemctl enable --now keepalived
  4. systemctl restart keepalived haproxy

Important: if keepalived and HAProxy are installed, test that keepalived is working correctly.
Test the VIP:

  1. root:kubelet.service.d/ # ping 192.168.68.18 -c 4 [11:34:19]
  2. PING 192.168.68.18 (192.168.68.18) 56(84) bytes of data.
  3. 64 bytes from 192.168.68.18: icmp_seq=1 ttl=64 time=0.032 ms
  4. 64 bytes from 192.168.68.18: icmp_seq=2 ttl=64 time=0.043 ms
  5. 64 bytes from 192.168.68.18: icmp_seq=3 ttl=64 time=0.033 ms
  6. 64 bytes from 192.168.68.18: icmp_seq=4 ttl=64 time=0.038 ms
  7. --- 192.168.68.18 ping statistics ---
  8. 4 packets transmitted, 4 received, 0% packet loss, time 3075ms
  9. rtt min/avg/max/mdev = 0.032/0.036/0.043/0.004 ms
  1. root:~/ # telnet 192.168.68.18 16443 [11:44:03]
  2. Trying 192.168.68.18...
  3. Connected to 192.168.68.18.
  4. Escape character is '^]'.
  5. Connection closed by foreign host.

1.4 Cluster Initialization

1. The following operations are performed only on the master01 node

Create the kubeadm-config.yaml file on the M01 node as follows.
Master01: (# Note: if this is not a highly available cluster, change 192.168.68.18:16443 to M01's address and change 16443 to the apiserver port, which defaults to 6443.
Also make sure kubernetesVersion matches the kubeadm version on your servers: kubeadm version.)
Note: in the file below, the host network, podSubnet, and serviceSubnet must not overlap.
vim kubeadm-config.yaml

  1. apiVersion: kubeadm.k8s.io/v1beta2
  2. bootstrapTokens:
  3. - groups:
  4. - system:bootstrappers:kubeadm:default-node-token
  5. token: 7t2weq.bjbawausm0jaxury
  6. ttl: 24h0m0s
  7. usages:
  8. - signing
  9. - authentication
  10. kind: InitConfiguration
  11. localAPIEndpoint:
  12. advertiseAddress: 192.168.68.20
  13. bindPort: 6443
  14. nodeRegistration:
  15. criSocket: /var/run/dockershim.sock
  16. name: k8s-master01
  17. taints:
  18. - effect: NoSchedule
  19. key: node-role.kubernetes.io/master
  20. ---
  21. apiServer:
  22. certSANs:
  23. - 192.168.68.18
  24. timeoutForControlPlane: 4m0s
  25. apiVersion: kubeadm.k8s.io/v1beta2
  26. certificatesDir: /etc/kubernetes/pki
  27. clusterName: kubernetes
  28. controlPlaneEndpoint: 192.168.68.18:16443
  29. controllerManager: {}
  30. dns:
  31. type: CoreDNS
  32. etcd:
  33. local:
  34. dataDir: /var/lib/etcd
  35. imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
  36. kind: ClusterConfiguration
  37. kubernetesVersion: v1.20.15
  38. networking:
  39. dnsDomain: cluster.local
  40. podSubnet: 172.16.0.0/12
  41. serviceSubnet: 10.96.0.0/16
  42. scheduler: {}

Migrate the kubeadm configuration to the current format:
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
Copy the new.yaml file to the other master nodes:
for i in k8s-master02; do scp new.yaml $i:/root/; done
Pull the images in advance on all master nodes to save time during initialization (the other nodes do not need any configuration changes, not even the IP address):
kubeadm config images pull --config /root/new.yaml
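To preview which images will be pulled before actually pulling them:

kubeadm config images list --config /root/new.yaml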

2. Enable kubelet at boot on all nodes

systemctl enable --now kubelet (if it fails to start, don't worry; it will start once initialization has succeeded)

3. Initialize the Master01 node

Initialize the M01 node. Initialization generates the certificates and configuration files under /etc/kubernetes; afterwards the other master nodes simply join M01:
kubeadm init --config /root/new.yaml --upload-certs
If the initialization fails, reset and then initialize again:
kubeadm reset -f ; ipvsadm --clear ; rm -rf ~/.kube
After a successful initialization a token is printed, which the other nodes use to join, so record the token (and certificate key) from the output:

  1. Your Kubernetes control-plane has initialized successfully!
  2. To start using your cluster, you need to run the following as a regular user:
  3. mkdir -p $HOME/.kube
  4. sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  5. sudo chown $(id -u):$(id -g) $HOME/.kube/config
  6. Alternatively, if you are the root user, you can run:
  7. export KUBECONFIG=/etc/kubernetes/admin.conf
  8. You should now deploy a pod network to the cluster.
  9. Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  10. https://kubernetes.io/docs/concepts/cluster-administration/addons/
  11. You can now join any number of the control-plane node running the following command on each as root:
  12. kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
  13. --discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673 \
  14. --control-plane --certificate-key 8da0fbf16c11810bb4930846896d21d633af936839c3941890d3c5cf85b98316
  15. Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
  16. As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
  17. "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
  18. Then you can join any number of worker nodes by running the following on each as root:
  19. kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
  20. --discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673

4. Configure environment variables on the Master01 node for accessing the Kubernetes cluster:

  1. cat <<EOF >> /root/.bashrc
  2. export KUBECONFIG=/etc/kubernetes/admin.conf
  3. EOF
  4. cat <<EOF >> /root/.zshrc
  5. export KUBECONFIG=/etc/kubernetes/admin.conf
  6. EOF

Check the node status: kubectl get node

With this installation method, all system components run as containers in the kube-system namespace. Check the Pod status with: kubectl get po -n kube-system

5. Configure kubectl command completion

  1. For the current shell only:
  2. apt install -y bash-completion
  3. source /usr/share/bash-completion/bash_completion
  4. source <(kubectl completion bash)
  5. To make it permanent:
  6. edit ~/.bashrc
  7. echo "source <(kubectl completion bash)" >>~/.bashrc
  8. vim ~/.bashrc
  9. add the following line
  10. source <(kubectl completion bash)
  11. save and exit, then run
  12. source ~/.bashrc

1. kubectl autocompletion

BASH
  1. apt install -y bash-completion
  2. source <(kubectl completion bash) # set up autocompletion for the current bash shell; the bash-completion package must be installed first
  3. echo "source <(kubectl completion bash)" >> ~/.bashrc # permanently add autocompletion to your bash shell
  4. source ~/.bashrc

You can also define a shorthand alias for kubectl that works with completion as well:

  1. alias k=kubectl
  2. complete -F __start_kubectl k

ZSH
  1. apt install -y bash-completion
  2. source <(kubectl completion zsh) # set up autocompletion for the current zsh shell
  3. echo "[[ $commands[kubectl] ]] && source <(kubectl completion zsh)" >> ~/.zshrc # permanently add autocompletion to your zsh shell
  4. source ~/.zshrc

1.5 Master and Node Nodes

Note: the following steps are only needed if the token produced by the kubeadm init command above has expired; if it has not expired, skip them.
Generate a new token after expiry:
kubeadm token create --print-join-command
For additional masters, also generate a new --certificate-key:
kubeadm init phase upload-certs --upload-certs
If the token has not expired, just run the join command directly.
Join the other masters to the cluster; run on master02:

  1. kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
  2. --discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673 \
  3. --control-plane --certificate-key 8da0fbf16c11810bb4930846896d21d633af936839c3941890d3c5cf85b98316

Check the current status: kubectl get node
Configuring the Node (worker) nodes
Node nodes mainly run the business applications. In production it is not recommended to run anything other than system components on the master nodes; in test environments the masters may also run Pods to save resources.

  1. kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
  2. --discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673

After all nodes have joined, check the cluster status:
kubectl get node
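Optionally, give the worker nodes a role label so the ROLES column in kubectl get node is populated (purely cosmetic; the label key below is a common convention, not something kubeadm requires):

kubectl label node k8s-node01 node-role.kubernetes.io/worker=
kubectl label node k8s-node02 node-role.kubernetes.io/worker=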

Note: at this point the cluster network is not functional yet (the nodes stay NotReady); the Calico network component must be installed.

1.6 Installing the Calico Network Component

calico-etcd.yaml

  1. ---
  2. # Source: calico/templates/calico-etcd-secrets.yaml
  3. # The following contains k8s Secrets for use with a TLS enabled etcd cluster.
  4. # For information on populating Secrets, see http://kubernetes.io/docs/user-guide/secrets/
  5. apiVersion: v1
  6. kind: Secret
  7. type: Opaque
  8. metadata:
  9. name: calico-etcd-secrets
  10. namespace: kube-system
  11. data:
  12. # Populate the following with etcd TLS configuration if desired, but leave blank if
  13. # not using TLS for etcd.
  14. # The keys below should be uncommented and the values populated with the base64
  15. # encoded contents of each file that would be associated with the TLS data.
  16. # Example command for encoding a file contents: cat <file> | base64 -w 0
  17. # etcd-key: null
  18. # etcd-cert: null
  19. # etcd-ca: null
  20. ---
  21. # Source: calico/templates/calico-config.yaml
  22. # This ConfigMap is used to configure a self-hosted Calico installation.
  23. kind: ConfigMap
  24. apiVersion: v1
  25. metadata:
  26. name: calico-config
  27. namespace: kube-system
  28. data:
  29. # Configure this with the location of your etcd cluster.
  30. etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"
  31. # If you're using TLS enabled etcd uncomment the following.
  32. # You must also populate the Secret below with these files.
  33. etcd_ca: "" # "/calico-secrets/etcd-ca"
  34. etcd_cert: "" # "/calico-secrets/etcd-cert"
  35. etcd_key: "" # "/calico-secrets/etcd-key"
  36. # Typha is disabled.
  37. typha_service_name: "none"
  38. # Configure the backend to use.
  39. calico_backend: "bird"
  40. # Configure the MTU to use for workload interfaces and tunnels.
  41. # By default, MTU is auto-detected, and explicitly setting this field should not be required.
  42. # You can override auto-detection by providing a non-zero value.
  43. veth_mtu: "0"
  44. # The CNI network configuration to install on each node. The special
  45. # values in this config will be automatically populated.
  46. cni_network_config: |-
  47. {
  48. "name": "k8s-pod-network",
  49. "cniVersion": "0.3.1",
  50. "plugins": [
  51. {
  52. "type": "calico",
  53. "log_level": "info",
  54. "log_file_path": "/var/log/calico/cni/cni.log",
  55. "etcd_endpoints": "__ETCD_ENDPOINTS__",
  56. "etcd_key_file": "__ETCD_KEY_FILE__",
  57. "etcd_cert_file": "__ETCD_CERT_FILE__",
  58. "etcd_ca_cert_file": "__ETCD_CA_CERT_FILE__",
  59. "mtu": __CNI_MTU__,
  60. "ipam": {
  61. "type": "calico-ipam"
  62. },
  63. "policy": {
  64. "type": "k8s"
  65. },
  66. "kubernetes": {
  67. "kubeconfig": "__KUBECONFIG_FILEPATH__"
  68. }
  69. },
  70. {
  71. "type": "portmap",
  72. "snat": true,
  73. "capabilities": {"portMappings": true}
  74. },
  75. {
  76. "type": "bandwidth",
  77. "capabilities": {"bandwidth": true}
  78. }
  79. ]
  80. }
  81. ---
  82. # Source: calico/templates/calico-kube-controllers-rbac.yaml
  83. # Include a clusterrole for the kube-controllers component,
  84. # and bind it to the calico-kube-controllers serviceaccount.
  85. kind: ClusterRole
  86. apiVersion: rbac.authorization.k8s.io/v1
  87. metadata:
  88. name: calico-kube-controllers
  89. rules:
  90. # Pods are monitored for changing labels.
  91. # The node controller monitors Kubernetes nodes.
  92. # Namespace and serviceaccount labels are used for policy.
  93. - apiGroups: [""]
  94. resources:
  95. - pods
  96. - nodes
  97. - namespaces
  98. - serviceaccounts
  99. verbs:
  100. - watch
  101. - list
  102. - get
  103. # Watch for changes to Kubernetes NetworkPolicies.
  104. - apiGroups: ["networking.k8s.io"]
  105. resources:
  106. - networkpolicies
  107. verbs:
  108. - watch
  109. - list
  110. ---
  111. kind: ClusterRoleBinding
  112. apiVersion: rbac.authorization.k8s.io/v1
  113. metadata:
  114. name: calico-kube-controllers
  115. roleRef:
  116. apiGroup: rbac.authorization.k8s.io
  117. kind: ClusterRole
  118. name: calico-kube-controllers
  119. subjects:
  120. - kind: ServiceAccount
  121. name: calico-kube-controllers
  122. namespace: kube-system
  123. ---
  124. ---
  125. # Source: calico/templates/calico-node-rbac.yaml
  126. # Include a clusterrole for the calico-node DaemonSet,
  127. # and bind it to the calico-node serviceaccount.
  128. kind: ClusterRole
  129. apiVersion: rbac.authorization.k8s.io/v1
  130. metadata:
  131. name: calico-node
  132. rules:
  133. # The CNI plugin needs to get pods, nodes, and namespaces.
  134. - apiGroups: [""]
  135. resources:
  136. - pods
  137. - nodes
  138. - namespaces
  139. verbs:
  140. - get
  141. # EndpointSlices are used for Service-based network policy rule
  142. # enforcement.
  143. - apiGroups: ["discovery.k8s.io"]
  144. resources:
  145. - endpointslices
  146. verbs:
  147. - watch
  148. - list
  149. - apiGroups: [""]
  150. resources:
  151. - endpoints
  152. - services
  153. verbs:
  154. # Used to discover service IPs for advertisement.
  155. - watch
  156. - list
  157. # Pod CIDR auto-detection on kubeadm needs access to config maps.
  158. - apiGroups: [""]
  159. resources:
  160. - configmaps
  161. verbs:
  162. - get
  163. - apiGroups: [""]
  164. resources:
  165. - nodes/status
  166. verbs:
  167. # Needed for clearing NodeNetworkUnavailable flag.
  168. - patch
  169. ---
  170. apiVersion: rbac.authorization.k8s.io/v1
  171. kind: ClusterRoleBinding
  172. metadata:
  173. name: calico-node
  174. roleRef:
  175. apiGroup: rbac.authorization.k8s.io
  176. kind: ClusterRole
  177. name: calico-node
  178. subjects:
  179. - kind: ServiceAccount
  180. name: calico-node
  181. namespace: kube-system
  182. ---
  183. # Source: calico/templates/calico-node.yaml
  184. # This manifest installs the calico-node container, as well
  185. # as the CNI plugins and network config on
  186. # each master and worker node in a Kubernetes cluster.
  187. kind: DaemonSet
  188. apiVersion: apps/v1
  189. metadata:
  190. name: calico-node
  191. namespace: kube-system
  192. labels:
  193. k8s-app: calico-node
  194. spec:
  195. selector:
  196. matchLabels:
  197. k8s-app: calico-node
  198. updateStrategy:
  199. type: RollingUpdate
  200. rollingUpdate:
  201. maxUnavailable: 1
  202. template:
  203. metadata:
  204. labels:
  205. k8s-app: calico-node
  206. spec:
  207. nodeSelector:
  208. kubernetes.io/os: linux
  209. hostNetwork: true
  210. tolerations:
  211. # Make sure calico-node gets scheduled on all nodes.
  212. - effect: NoSchedule
  213. operator: Exists
  214. # Mark the pod as a critical add-on for rescheduling.
  215. - key: CriticalAddonsOnly
  216. operator: Exists
  217. - effect: NoExecute
  218. operator: Exists
  219. serviceAccountName: calico-node
  220. # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
  221. # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
  222. terminationGracePeriodSeconds: 0
  223. priorityClassName: system-node-critical
  224. initContainers:
  225. # This container installs the CNI binaries
  226. # and CNI network config file on each node.
  227. - name: install-cni
  228. image: registry.cn-beijing.aliyuncs.com/dotbalo/cni:v3.22.0
  229. command: ["/opt/cni/bin/install"]
  230. envFrom:
  231. - configMapRef:
  232. # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
  233. name: kubernetes-services-endpoint
  234. optional: true
  235. env:
  236. # Name of the CNI config file to create.
  237. - name: CNI_CONF_NAME
  238. value: "10-calico.conflist"
  239. # The CNI network config to install on each node.
  240. - name: CNI_NETWORK_CONFIG
  241. valueFrom:
  242. configMapKeyRef:
  243. name: calico-config
  244. key: cni_network_config
  245. # The location of the etcd cluster.
  246. - name: ETCD_ENDPOINTS
  247. valueFrom:
  248. configMapKeyRef:
  249. name: calico-config
  250. key: etcd_endpoints
  251. # CNI MTU Config variable
  252. - name: CNI_MTU
  253. valueFrom:
  254. configMapKeyRef:
  255. name: calico-config
  256. key: veth_mtu
  257. # Prevents the container from sleeping forever.
  258. - name: SLEEP
  259. value: "false"
  260. volumeMounts:
  261. - mountPath: /host/opt/cni/bin
  262. name: cni-bin-dir
  263. - mountPath: /host/etc/cni/net.d
  264. name: cni-net-dir
  265. - mountPath: /calico-secrets
  266. name: etcd-certs
  267. securityContext:
  268. privileged: true
  269. # Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
  270. # to communicate with Felix over the Policy Sync API.
  271. - name: flexvol-driver
  272. image: registry.cn-beijing.aliyuncs.com/dotbalo/pod2daemon-flexvol:v3.22.0
  273. volumeMounts:
  274. - name: flexvol-driver-host
  275. mountPath: /host/driver
  276. securityContext:
  277. privileged: true
  278. containers:
  279. # Runs calico-node container on each Kubernetes node. This
  280. # container programs network policy and routes on each
  281. # host.
  282. - name: calico-node
  283. image: registry.cn-beijing.aliyuncs.com/dotbalo/node:v3.22.0
  284. envFrom:
  285. - configMapRef:
  286. # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
  287. name: kubernetes-services-endpoint
  288. optional: true
  289. env:
  290. # The location of the etcd cluster.
  291. - name: ETCD_ENDPOINTS
  292. valueFrom:
  293. configMapKeyRef:
  294. name: calico-config
  295. key: etcd_endpoints
  296. # Location of the CA certificate for etcd.
  297. - name: ETCD_CA_CERT_FILE
  298. valueFrom:
  299. configMapKeyRef:
  300. name: calico-config
  301. key: etcd_ca
  302. # Location of the client key for etcd.
  303. - name: ETCD_KEY_FILE
  304. valueFrom:
  305. configMapKeyRef:
  306. name: calico-config
  307. key: etcd_key
  308. # Location of the client certificate for etcd.
  309. - name: ETCD_CERT_FILE
  310. valueFrom:
  311. configMapKeyRef:
  312. name: calico-config
  313. key: etcd_cert
  314. # Set noderef for node controller.
  315. - name: CALICO_K8S_NODE_REF
  316. valueFrom:
  317. fieldRef:
  318. fieldPath: spec.nodeName
  319. # Choose the backend to use.
  320. - name: CALICO_NETWORKING_BACKEND
  321. valueFrom:
  322. configMapKeyRef:
  323. name: calico-config
  324. key: calico_backend
  325. # Cluster type to identify the deployment type
  326. - name: CLUSTER_TYPE
  327. value: "k8s,bgp"
  328. # Auto-detect the BGP IP address.
  329. - name: IP
  330. value: "autodetect"
  331. # Enable IPIP
  332. - name: CALICO_IPV4POOL_IPIP
  333. value: "Always"
  334. # Enable or Disable VXLAN on the default IP pool.
  335. - name: CALICO_IPV4POOL_VXLAN
  336. value: "Never"
  337. # Set MTU for tunnel device used if ipip is enabled
  338. - name: FELIX_IPINIPMTU
  339. valueFrom:
  340. configMapKeyRef:
  341. name: calico-config
  342. key: veth_mtu
  343. # Set MTU for the VXLAN tunnel device.
  344. - name: FELIX_VXLANMTU
  345. valueFrom:
  346. configMapKeyRef:
  347. name: calico-config
  348. key: veth_mtu
  349. # Set MTU for the Wireguard tunnel device.
  350. - name: FELIX_WIREGUARDMTU
  351. valueFrom:
  352. configMapKeyRef:
  353. name: calico-config
  354. key: veth_mtu
  355. # The default IPv4 pool to create on startup if none exists. Pod IPs will be
  356. # chosen from this range. Changing this value after installation will have
  357. # no effect. This should fall within `--cluster-cidr`.
  358. - name: CALICO_IPV4POOL_CIDR
  359. value: "POD_CIDR"
  360. # Disable file logging so `kubectl logs` works.
  361. - name: CALICO_DISABLE_FILE_LOGGING
  362. value: "true"
  363. # Set Felix endpoint to host default action to ACCEPT.
  364. - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
  365. value: "ACCEPT"
  366. # Disable IPv6 on Kubernetes.
  367. - name: FELIX_IPV6SUPPORT
  368. value: "false"
  369. - name: FELIX_HEALTHENABLED
  370. value: "true"
  371. securityContext:
  372. privileged: true
  373. resources:
  374. requests:
  375. cpu: 250m
  376. lifecycle:
  377. preStop:
  378. exec:
  379. command:
  380. - /bin/calico-node
  381. - -shutdown
  382. livenessProbe:
  383. exec:
  384. command:
  385. - /bin/calico-node
  386. - -felix-live
  387. - -bird-live
  388. periodSeconds: 10
  389. initialDelaySeconds: 10
  390. failureThreshold: 6
  391. timeoutSeconds: 10
  392. readinessProbe:
  393. exec:
  394. command:
  395. - /bin/calico-node
  396. - -felix-ready
  397. - -bird-ready
  398. periodSeconds: 10
  399. timeoutSeconds: 10
  400. volumeMounts:
  401. # For maintaining CNI plugin API credentials.
  402. - mountPath: /host/etc/cni/net.d
  403. name: cni-net-dir
  404. readOnly: false
  405. - mountPath: /lib/modules
  406. name: lib-modules
  407. readOnly: true
  408. - mountPath: /run/xtables.lock
  409. name: xtables-lock
  410. readOnly: false
  411. - mountPath: /var/run/calico
  412. name: var-run-calico
  413. readOnly: false
  414. - mountPath: /var/lib/calico
  415. name: var-lib-calico
  416. readOnly: false
  417. - mountPath: /calico-secrets
  418. name: etcd-certs
  419. - name: policysync
  420. mountPath: /var/run/nodeagent
  421. # For eBPF mode, we need to be able to mount the BPF filesystem at /sys/fs/bpf so we mount in the
  422. # parent directory.
  423. - name: sysfs
  424. mountPath: /sys/fs/
  425. # Bidirectional means that, if we mount the BPF filesystem at /sys/fs/bpf it will propagate to the host.
  426. # If the host is known to mount that filesystem already then Bidirectional can be omitted.
  427. mountPropagation: Bidirectional
  428. - name: cni-log-dir
  429. mountPath: /var/log/calico/cni
  430. readOnly: true
  431. volumes:
  432. # Used by calico-node.
  433. - name: lib-modules
  434. hostPath:
  435. path: /lib/modules
  436. - name: var-run-calico
  437. hostPath:
  438. path: /var/run/calico
  439. - name: var-lib-calico
  440. hostPath:
  441. path: /var/lib/calico
  442. - name: xtables-lock
  443. hostPath:
  444. path: /run/xtables.lock
  445. type: FileOrCreate
  446. - name: sysfs
  447. hostPath:
  448. path: /sys/fs/
  449. type: DirectoryOrCreate
  450. # Used to install CNI.
  451. - name: cni-bin-dir
  452. hostPath:
  453. path: /opt/cni/bin
  454. - name: cni-net-dir
  455. hostPath:
  456. path: /etc/cni/net.d
  457. # Used to access CNI logs.
  458. - name: cni-log-dir
  459. hostPath:
  460. path: /var/log/calico/cni
  461. # Mount in the etcd TLS secrets with mode 400.
  462. # See https://kubernetes.io/docs/concepts/configuration/secret/
  463. - name: etcd-certs
  464. secret:
  465. secretName: calico-etcd-secrets
  466. defaultMode: 0400
  467. # Used to create per-pod Unix Domain Sockets
  468. - name: policysync
  469. hostPath:
  470. type: DirectoryOrCreate
  471. path: /var/run/nodeagent
  472. # Used to install Flex Volume Driver
  473. - name: flexvol-driver-host
  474. hostPath:
  475. type: DirectoryOrCreate
  476. path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
  477. ---
  478. apiVersion: v1
  479. kind: ServiceAccount
  480. metadata:
  481. name: calico-node
  482. namespace: kube-system
  483. ---
  484. # Source: calico/templates/calico-kube-controllers.yaml
  485. # See https://github.com/projectcalico/kube-controllers
  486. apiVersion: apps/v1
  487. kind: Deployment
  488. metadata:
  489. name: calico-kube-controllers
  490. namespace: kube-system
  491. labels:
  492. k8s-app: calico-kube-controllers
  493. spec:
  494. # The controllers can only have a single active instance.
  495. replicas: 1
  496. selector:
  497. matchLabels:
  498. k8s-app: calico-kube-controllers
  499. strategy:
  500. type: Recreate
  501. template:
  502. metadata:
  503. name: calico-kube-controllers
  504. namespace: kube-system
  505. labels:
  506. k8s-app: calico-kube-controllers
  507. spec:
  508. nodeSelector:
  509. kubernetes.io/os: linux
  510. tolerations:
  511. # Mark the pod as a critical add-on for rescheduling.
  512. - key: CriticalAddonsOnly
  513. operator: Exists
  514. - key: node-role.kubernetes.io/master
  515. effect: NoSchedule
  516. serviceAccountName: calico-kube-controllers
  517. priorityClassName: system-cluster-critical
  518. # The controllers must run in the host network namespace so that
  519. # it isn't governed by policy that would prevent it from working.
  520. hostNetwork: true
  521. containers:
  522. - name: calico-kube-controllers
  523. image: registry.cn-beijing.aliyuncs.com/dotbalo/kube-controllers:v3.22.0
  524. env:
  525. # The location of the etcd cluster.
  526. - name: ETCD_ENDPOINTS
  527. valueFrom:
  528. configMapKeyRef:
  529. name: calico-config
  530. key: etcd_endpoints
  531. # Location of the CA certificate for etcd.
  532. - name: ETCD_CA_CERT_FILE
  533. valueFrom:
  534. configMapKeyRef:
  535. name: calico-config
  536. key: etcd_ca
  537. # Location of the client key for etcd.
  538. - name: ETCD_KEY_FILE
  539. valueFrom:
  540. configMapKeyRef:
  541. name: calico-config
  542. key: etcd_key
  543. # Location of the client certificate for etcd.
  544. - name: ETCD_CERT_FILE
  545. valueFrom:
  546. configMapKeyRef:
  547. name: calico-config
  548. key: etcd_cert
  549. # Choose which controllers to run.
  550. - name: ENABLED_CONTROLLERS
  551. value: policy,namespace,serviceaccount,workloadendpoint,node
  552. volumeMounts:
  553. # Mount in the etcd TLS secrets.
  554. - mountPath: /calico-secrets
  555. name: etcd-certs
  556. livenessProbe:
  557. exec:
  558. command:
  559. - /usr/bin/check-status
  560. - -l
  561. periodSeconds: 10
  562. initialDelaySeconds: 10
  563. failureThreshold: 6
  564. timeoutSeconds: 10
  565. readinessProbe:
  566. exec:
  567. command:
  568. - /usr/bin/check-status
  569. - -r
  570. periodSeconds: 10
  571. volumes:
  572. # Mount in the etcd TLS secrets with mode 400.
  573. # See https://kubernetes.io/docs/concepts/configuration/secret/
  574. - name: etcd-certs
  575. secret:
  576. secretName: calico-etcd-secrets
  577. defaultMode: 0440
  578. ---
  579. apiVersion: v1
  580. kind: ServiceAccount
  581. metadata:
  582. name: calico-kube-controllers
  583. namespace: kube-system
  584. ---
  585. # This manifest creates a Pod Disruption Budget for Controller to allow K8s Cluster Autoscaler to evict
  586. apiVersion: policy/v1beta1
  587. kind: PodDisruptionBudget
  588. metadata:
  589. name: calico-kube-controllers
  590. namespace: kube-system
  591. labels:
  592. k8s-app: calico-kube-controllers
  593. spec:
  594. maxUnavailable: 1
  595. selector:
  596. matchLabels:
  597. k8s-app: calico-kube-controllers
  598. ---
  599. # Source: calico/templates/calico-typha.yaml
  600. ---
  601. # Source: calico/templates/configure-canal.yaml
  602. ---
  603. # Source: calico/templates/kdd-crds.yaml

Modify the following places in calico-etcd.yaml:
sed -i 's#etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"#etcd_endpoints: "https://192.168.68.20:2379,https://192.168.68.21:2379"#g' calico-etcd.yaml
ETCD_CA=`cat /etc/kubernetes/pki/etcd/ca.crt | base64 | tr -d '\n'`
ETCD_CERT=`cat /etc/kubernetes/pki/etcd/server.crt | base64 | tr -d '\n'`
ETCD_KEY=`cat /etc/kubernetes/pki/etcd/server.key | base64 | tr -d '\n'`
sed -i "s@# etcd-key: null@etcd-key: ${ETCD_KEY}@g; s@# etcd-cert: null@etcd-cert: ${ETCD_CERT}@g; s@# etcd-ca: null@etcd-ca: ${ETCD_CA}@g" calico-etcd.yaml
sed -i 's#etcd_ca: ""#etcd_ca: "/calico-secrets/etcd-ca"#g; s#etcd_cert: ""#etcd_cert: "/calico-secrets/etcd-cert"#g; s#etcd_key: "" #etcd_key: "/calico-secrets/etcd-key" #g' calico-etcd.yaml
POD_SUBNET=`cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep cluster-cidr= | awk -F= '{print $NF}'`
sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@g; s@# value: "172.16.0.0/12"@ value: '"${POD_SUBNET}"'@g' calico-etcd.yaml
kubectl apply -f calico-etcd.yaml
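Watch the Calico pods come up and confirm the nodes transition to Ready:

kubectl get po -n kube-system -l k8s-app=calico-node -o wide
kubectl get node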
An error occurred when deploying Calico with the official calico.yaml downloaded from https://docs.projectcalico.org/manifests/calico.yaml (which stores its data in the Kubernetes API datastore).
Error message:
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
Solution: the warning says it all. calico.yaml uses policy/v1beta1 PodDisruptionBudget, which is deprecated since Kubernetes 1.21 and removed in 1.25; change it to policy/v1 PodDisruptionBudget.

Install Calico with the Kubernetes API datastore, 50 nodes or less
1. Download the Calico networking manifest for the Kubernetes API datastore.
curl https://docs.projectcalico.org/manifests/calico.yaml -O
2. If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are required: Calico will auto-detect the CIDR based on the running configuration. For other platforms, make sure you uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.

  1. - name: CALICO_IPV4POOL_CIDR
  2. value: "172.16.0.0/12"
  3. Uncomment these lines in calico.yaml
  4. and set value to the same value as podSubnet

3. Customize the manifest as needed.
4. Apply the manifest with the following command.
kubectl apply -f calico.yaml

Install Calico with the Kubernetes API datastore, more than 50 nodes

  1. Download the Calico networking manifest for the Kubernetes API datastore.
  2. $ curl https://docs.projectcalico.org/manifests/calico-typha.yaml -o calico.yaml
  3. If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are required: Calico will auto-detect the CIDR based on the running configuration. For other platforms, make sure you uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.
  4. Modify the replica count in the Deployment named calico-typha to the desired number:
  5. apiVersion: apps/v1beta1 kind: Deployment metadata: name: calico-typha ... spec: ... replicas:
  6. We recommend at least one replica for every 200 nodes, and no more than 20 replicas. In production, we recommend a minimum of three replicas to reduce the impact of rolling upgrades and failures. The number of replicas should always be less than the number of nodes, otherwise rolling upgrades will stall. In addition, Typha only helps with scale if there are fewer Typha instances than there are nodes.
  7. Warning: if you set typha_service_name and the Typha deployment replica count to 0, Felix will not start.
  8. Customize the manifest if desired.
  9. Apply the manifest.
  10. $ kubectl apply -f calico.yaml

Install Calico with the etcd datastore
Note: the etcd datastore is not recommended for new installations. However, it may be chosen if you are running Calico as the network plugin for both OpenStack and Kubernetes.

  1. Download the Calico networking manifest for etcd.
  2. $ curl https://docs.projectcalico.org/manifests/calico-etcd.yaml -o calico.yaml
  3. If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are required: Calico will auto-detect the CIDR based on the running configuration. For other platforms, make sure you uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.
  4. In the ConfigMap named calico-config, set the value of etcd_endpoints to the IP address and port of your etcd server.
  5. Tip: you can specify more than one etcd_endpoint using commas as delimiters.
  6. Customize the manifest if desired.
  7. Apply the manifest with the following command.
  8. $ kubectl apply -f calico.yaml

1.7 Deploying Metrics Server and the Dashboard

1. Deploying Metrics Server

Download the image on every node and retag it:
docker pull bitnami/metrics-server:0.5.2
docker tag bitnami/metrics-server:0.5.2 k8s.gcr.io/metrics-server/metrics-server:v0.5.2
vim components0.5.2.yaml

  1. $ cat components0.5.2.yaml
  2. apiVersion: v1
  3. kind: ServiceAccount
  4. metadata:
  5. labels:
  6. k8s-app: metrics-server
  7. name: metrics-server
  8. namespace: kube-system
  9. ---
  10. apiVersion: rbac.authorization.k8s.io/v1
  11. kind: ClusterRole
  12. metadata:
  13. labels:
  14. k8s-app: metrics-server
  15. rbac.authorization.k8s.io/aggregate-to-admin: "true"
  16. rbac.authorization.k8s.io/aggregate-to-edit: "true"
  17. rbac.authorization.k8s.io/aggregate-to-view: "true"
  18. name: system:aggregated-metrics-reader
  19. rules:
  20. - apiGroups:
  21. - metrics.k8s.io
  22. resources:
  23. - pods
  24. - nodes
  25. verbs:
  26. - get
  27. - list
  28. - watch
  29. ---
  30. apiVersion: rbac.authorization.k8s.io/v1
  31. kind: ClusterRole
  32. metadata:
  33. labels:
  34. k8s-app: metrics-server
  35. name: system:metrics-server
  36. rules:
  37. - apiGroups:
  38. - ""
  39. resources:
  40. - pods
  41. - nodes
  42. - nodes/stats
  43. - namespaces
  44. - configmaps
  45. verbs:
  46. - get
  47. - list
  48. - watch
  49. ---
  50. apiVersion: rbac.authorization.k8s.io/v1
  51. kind: RoleBinding
  52. metadata:
  53. labels:
  54. k8s-app: metrics-server
  55. name: metrics-server-auth-reader
  56. namespace: kube-system
  57. roleRef:
  58. apiGroup: rbac.authorization.k8s.io
  59. kind: Role
  60. name: extension-apiserver-authentication-reader
  61. subjects:
  62. - kind: ServiceAccount
  63. name: metrics-server
  64. namespace: kube-system
  65. ---
  66. apiVersion: rbac.authorization.k8s.io/v1
  67. kind: ClusterRoleBinding
  68. metadata:
  69. labels:
  70. k8s-app: metrics-server
  71. name: metrics-server:system:auth-delegator
  72. roleRef:
  73. apiGroup: rbac.authorization.k8s.io
  74. kind: ClusterRole
  75. name: system:auth-delegator
  76. subjects:
  77. - kind: ServiceAccount
  78. name: metrics-server
  79. namespace: kube-system
  80. ---
  81. apiVersion: rbac.authorization.k8s.io/v1
  82. kind: ClusterRoleBinding
  83. metadata:
  84. labels:
  85. k8s-app: metrics-server
  86. name: system:metrics-server
  87. roleRef:
  88. apiGroup: rbac.authorization.k8s.io
  89. kind: ClusterRole
  90. name: system:metrics-server
  91. subjects:
  92. - kind: ServiceAccount
  93. name: metrics-server
  94. namespace: kube-system
  95. ---
  96. apiVersion: v1
  97. kind: Service
  98. metadata:
  99. labels:
  100. k8s-app: metrics-server
  101. name: metrics-server
  102. namespace: kube-system
  103. spec:
  104. ports:
  105. - name: https
  106. port: 443
  107. protocol: TCP
  108. targetPort: https
  109. selector:
  110. k8s-app: metrics-server
  111. ---
  112. apiVersion: apps/v1
  113. kind: Deployment
  114. metadata:
  115. labels:
  116. k8s-app: metrics-server
  117. name: metrics-server
  118. namespace: kube-system
  119. spec:
  120. selector:
  121. matchLabels:
  122. k8s-app: metrics-server
  123. strategy:
  124. rollingUpdate:
  125. maxUnavailable: 0
  126. template:
  127. metadata:
  128. labels:
  129. k8s-app: metrics-server
  130. spec:
  131. containers:
  132. - args:
  133. - --cert-dir=/tmp
  134. - --secure-port=4443
  135. - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  136. - --kubelet-use-node-status-port
  137. - --metric-resolution=15s
  138. - --kubelet-insecure-tls
  139. image: k8s.gcr.io/metrics-server/metrics-server:v0.5.2
  140. imagePullPolicy: IfNotPresent
  141. livenessProbe:
  142. failureThreshold: 3
  143. httpGet:
  144. path: /livez
  145. port: https
  146. scheme: HTTPS
  147. periodSeconds: 10
  148. name: metrics-server
  149. ports:
  150. - containerPort: 4443
  151. name: https
  152. protocol: TCP
  153. readinessProbe:
  154. failureThreshold: 3
  155. httpGet:
  156. path: /readyz
  157. port: https
  158. scheme: HTTPS
  159. initialDelaySeconds: 20
  160. periodSeconds: 10
  161. resources:
  162. requests:
  163. cpu: 100m
  164. memory: 200Mi
  165. securityContext:
  166. readOnlyRootFilesystem: true
  167. runAsNonRoot: true
  168. runAsUser: 1000
  169. volumeMounts:
  170. - mountPath: /tmp
  171. name: tmp-dir
  172. nodeSelector:
  173. kubernetes.io/os: linux
  174. priorityClassName: system-cluster-critical
  175. serviceAccountName: metrics-server
  176. volumes:
  177. - emptyDir: {}
  178. name: tmp-dir
  179. ---
  180. apiVersion: apiregistration.k8s.io/v1
  181. kind: APIService
  182. metadata:
  183. labels:
  184. k8s-app: metrics-server
  185. name: v1beta1.metrics.k8s.io
  186. spec:
  187. group: metrics.k8s.io
  188. groupPriorityMinimum: 100
  189. insecureSkipTLSVerify: true
  190. service:
  191. name: metrics-server
  192. namespace: kube-system
  193. version: v1beta1
  194. versionPriority: 100

kubectl apply -f components0.5.2.yaml

If you hit problems on 1.23.5, use the following approach.
Project page: https://github.com/kubernetes-sigs/metrics-server
Copy front-proxy-ca.crt from the Master01 node to all Node nodes:
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-node01:/etc/kubernetes/pki/front-proxy-ca.crt
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-node(copy to the remaining nodes in the same way):/etc/kubernetes/pki/front-proxy-ca.crt
Add the certificate flag:
vim components.yaml
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt # change to front-proxy-ca.crt for kubeadm
Install metrics server:
kubectl create -f components.yaml
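After a minute or two, verify that the metrics API is serving data:

kubectl top node
kubectl top pod -n kube-system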

2. Deploying the Dashboard

Project page: https://github.com/kubernetes/dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml
(or download the manifest and apply it locally: kubectl apply -f recommended.yaml)
Create an administrator user: vim admin.yaml

  1. apiVersion: v1
  2. kind: ServiceAccount
  3. metadata:
  4. name: admin-user
  5. namespace: kube-system
  6. ---
  7. apiVersion: rbac.authorization.k8s.io/v1
  8. kind: ClusterRoleBinding
  9. metadata:
  10. name: admin-user
  11. annotations:
  12. rbac.authorization.kubernetes.io/autoupdate: "true"
  13. roleRef:
  14. apiGroup: rbac.authorization.k8s.io
  15. kind: ClusterRole
  16. name: cluster-admin
  17. subjects:
  18. - kind: ServiceAccount
  19. name: admin-user
  20. namespace: kube-system

kubectl apply -f admin.yaml -n kube-system
Startup error:

  1. Events:
  2. Type Reason Age From Message
  3. ---- ------ ---- ---- -------
  4. Normal Scheduled 4m52s default-scheduler Successfully assigned kube-system/metrics-server-d9c898cdf-kpc5k to k8s-node03
  5. Normal SandboxChanged 4m32s (x2 over 4m34s) kubelet Pod sandbox changed, it will be killed and re-created.
  6. Normal Pulling 3m34s (x3 over 4m49s) kubelet Pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2"
  7. Warning Failed 3m19s (x3 over 4m34s) kubelet Failed to pull image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  8. Warning Failed 3m19s (x3 over 4m34s) kubelet Error: ErrImagePull
  9. Normal BackOff 2m56s (x7 over 4m32s) kubelet Back-off pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2"
  10. Warning Failed 2m56s (x7 over 4m32s) kubelet Error: ImagePullBackOff

Solution:
The image has to be pulled manually and retagged (on every node).
Image location: bitnami/metrics-server - Docker Image | Docker Hub

  1. docker pull bitnami/metrics-server:0.6.1
  2. docker tag bitnami/metrics-server:0.6.1 k8s.gcr.io/metrics-server/metrics-server:v0.6.1

3. Logging in to the Dashboard

Change the Dashboard Service type to NodePort:
kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
Change ClusterIP to NodePort (skip this step if it is already NodePort).
Check the port number:
kubectl get svc kubernetes-dashboard -n kubernetes-dashboard
Using that port, the Dashboard can be reached via the IP of any host running kube-proxy plus the port:
Access the Dashboard at https://10.103.236.201:18282 (replace 18282 with your own port) and choose token as the login method.
Get the token value:
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')

4. Some required configuration changes

Switch kube-proxy to ipvs mode. The ipvs setting was left out when the cluster was initialized, so change it manually.
Run on the master01 node:
kubectl edit cm kube-proxy -n kube-system
mode: ipvs

Roll the kube-proxy Pods so they pick up the change:
kubectl patch daemonset kube-proxy -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}" -n kube-system
Verify the kube-proxy mode:
curl 127.0.0.1:10249/proxyMode
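The curl should return ipvs. You can also inspect the IPVS rules that kube-proxy generates:

ipvsadm -ln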

1.8 Other Dashboard Options

Kuboard: a multi-cluster management UI for Kubernetes
Website: https://kuboard.cn/

Uninstall

  • Run the uninstall of Kuboard v3:

kubectl delete -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml

Then, on the master nodes and on any node labeled k8s.kuboard.cn/role=etcd, run:
rm -rf /usr/share/kuboard