1. Installing a highly available Kubernetes cluster with kubeadm
0. Network segments
Three network ranges are involved when installing the cluster:
Host network: the network the Kubernetes servers themselves sit on.
Pod network: the range Pod IPs are allocated from (the "container" IPs).
Service network: the range Service IPs are allocated from; Services are used for communication between containers in the cluster.
The Service network is commonly set to 10.96.0.0/12.
The Pod network is commonly set to 10.244.0.0/12 or 172.16.0.0/12.
The host network might be something like 192.168.0.0/24.
These three ranges must not overlap in any way.
For example, if the host IPs are 10.105.0.x,
then the Service network cannot be 10.96.0.0/12, because the usable IPs of 10.96.0.0/12 are:
10.96.0.1 ~ 10.111.255.255
10.105.x.x falls inside that range, so the networks overlap and the Service network must be changed,
for example to 192.168.0.0/16 (note: if the Service network starts with 192.168, use a /16 rather than a /12 prefix, because the /12 network containing 192.168.0.0 actually starts at 192.160.0.0, not 192.168.0.0).
The same rule applies to the other ranges: none of them may overlap. You can verify with an IP calculator such as http://tools.jb51.net/aideddesign/ip_net_calc/.
The simplest rule of thumb is to make the first octet differ. If your hosts start with 192, the Service network can be 10.96.0.0/12.
If your hosts start with 10, change the Service network to 192.168.0.0/16.
If your hosts start with 172, change the Pod network to 192.168.0.0/16.
Combining one 10.x range, one 172.x range and one 192.x range in this way rules out overlaps by construction and saves you the calculation.
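To double-check two ranges for overlap, here is a minimal sketch using Python's standard ipaddress module (assuming python3 is installed; the two example networks are just the ones discussed above):
# Prints True when the networks overlap, i.e. the Service CIDR must be changed
python3 -c 'import ipaddress; print(ipaddress.ip_network("10.96.0.0/12").overlaps(ipaddress.ip_network("10.105.0.0/24")))'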
1.1 Kubeadm basic environment setup
1. Configure the apt sources
On all nodes: vim /etc/apt/sources.list
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
Reference: the Aliyun Ubuntu mirror page (mirrors.aliyun.com).
Alternatively, use the Tsinghua TUNA mirror:
# In the TUNA template the source (deb-src) entries are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# Pre-release (proposed) sources; enabling them is not recommended
# deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/
2. Configure the hosts file
On all nodes: vim /etc/hosts
192.168.68.20 k8s-master01
192.168.68.21 k8s-master02
192.168.68.18 k8s-master-lb # For a non-HA cluster, use Master01's IP here
192.168.68.22 k8s-node01
192.168.68.23 k8s-node02
3. Disable the swap partition
On all nodes:
Temporarily: swapoff -a
Permanently:
vim /etc/fstab, comment out the swap entry, then reboot the system
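One way to do both steps at once; the sed pattern is an assumption about how the swap line looks in your /etc/fstab, so check the file afterwards:
swapoff -a
# Comment out any uncommented fstab entry that mounts swap
sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /etc/fstab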
4. Time synchronization
On all nodes
apt install ntpdate -y
rm -rf /etc/localtime
ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
ntpdate cn.ntp.org.cn
# Add to crontab -e
*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com
# Also sync at boot: vim /etc/rc.local
ntpdate cn.ntp.org.cn
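On Ubuntu 20.04 /etc/rc.local typically does not exist by default; if you rely on it, it needs a shebang and the executable bit (a sketch under that assumption):
cat > /etc/rc.local <<'EOF'
#!/bin/bash
/usr/sbin/ntpdate cn.ntp.org.cn
EOF
chmod +x /etc/rc.local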
5. Configure limits
On all nodes: ulimit -SHn 65535
vim /etc/security/limits.conf
# Append the following at the end
* soft nofile 65536
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
6. Kernel parameters
cat >/etc/sysctl.conf <<EOF
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
# Controls the default maximum size of a message queue
kernel.msgmnb = 65536
# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
# TCP kernel parameters
net.ipv4.tcp_mem = 786432 1048576 1572864
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
# socket buffer
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 20480
net.core.optmem_max = 81920
# TCP conn
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 15
# tcp conn reuse
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_max_tw_buckets = 20000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syncookies = 1
# keepalive conn
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.ip_local_port_range = 10001 65000
# swap
vm.overcommit_memory = 0
vm.swappiness = 10
#net.ipv4.conf.eth1.rp_filter = 0
#net.ipv4.conf.lo.arp_ignore = 1
#net.ipv4.conf.lo.arp_announce = 2
#net.ipv4.conf.all.arp_ignore = 1
#net.ipv4.conf.all.arp_announce = 2
EOF
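The net.bridge.* keys above only exist once the br_netfilter module is loaded, so load it before applying the file (assuming the module is available in your kernel):
modprobe br_netfilter
sysctl -p /etc/sysctl.conf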
Install common tools: apt install wget jq psmisc vim net-tools telnet lvm2 git -y
Install ipvsadm on all nodes:
apt install ipvsadm ipset conntrack -y
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
vim /etc/modules-load.d/ipvs.conf
# Add the following
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
Then run systemctl enable --now systemd-modules-load.service
Enable the kernel parameters required by Kubernetes; configure on all nodes:
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
net.ipv4.conf.all.route_localnet = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system
After configuring the kernel on all nodes, reboot the servers to make sure the modules are still loaded after restart
reboot
Check:
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
8. Passwordless SSH from Master01 to the other nodes
The configuration files and certificates generated during installation are all created on Master01, and the cluster is also managed from Master01. On Alibaba Cloud or AWS a dedicated kubectl machine is needed. Key setup:
ssh-keygen -t rsa -f /root/.ssh/id_rsa -C "192.168.68.20@k8s-master01" -N ""
for i in k8s-master01 k8s-master02 k8s-node01 k8s-node02;do ssh-copy-id -i /root/.ssh/id_rsa.pub $i;done
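A quick check that key-based login works to every node (assumes the hostnames resolve via the /etc/hosts entries configured earlier):
for i in k8s-master01 k8s-master02 k8s-node01 k8s-node02; do ssh $i hostname; done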
1.2 Installing the base components
1. Install Docker on all nodes
Kubernetes in this setup does not support Docker 20.10; install 19.03 instead.
Add the GPG key of the official Docker repository to the system:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Add the Docker repository to the apt sources:
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
Update the package index so the newly added Docker repository is picked up.
apt update
Make sure the install will come from the Docker repository rather than the default Ubuntu repository:
apt-cache policy docker-ce
Install Docker:
apt install docker-ce=5:19.03.15~3-0~ubuntu-focal docker-ce-cli=5:19.03.15~3-0~ubuntu-focal -y
Check that Docker is running, then configure the daemon:
systemctl status docker
mkdir /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["http://hub-mirror.c.163.com,"https://registry.docker-cn.com","https://docker.mirrors.ustc.edu.cn""]
}
EOF
Enable Docker on boot: systemctl daemon-reload && systemctl enable --now docker
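If Docker was already running before daemon.json was written, restart it so the new settings take effect, then confirm the cgroup driver (standard checks):
systemctl restart docker
docker info | grep -i 'cgroup driver'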
2. Install the Kubernetes components on all nodes
apt install kubeadm kubelet kubectl -y
If the packages cannot be downloaded, configure the Kubernetes apt source first (see step 3 below).
The default pause image is hosted on gcr.io, which may be unreachable from mainland China, so configure kubelet to use Aliyun's pause image:
# Check the kubelet configuration
systemctl status kubelet
cd /etc/systemd/system/kubelet.service.d
# Add Environment="KUBELET_POD_INFRA_CONTAINER= --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.4.1" as shown below
cat 10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
Environment="KUBELET_CGROUP_ARGS=--system-reserved=memory=10Gi --kube-reserved=memory=400Mi --eviction-hard=imagefs.available<15%,memory.available<300Mi,nodefs.available<10%,nodefs.inodesFree<5% --cgroup-driver=systemd"
Environment="KUBELET_POD_INFRA_CONTAINER= --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS
# Restart the kubelet service
systemctl daemon-reload
systemctl restart kubelet
3. Configure the Kubernetes apt source on all nodes
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
vim /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
apt-get update
apt-get install -y kubelet=1.20.15-00 kubeadm=1.20.15-00 kubectl=1.20.15-00
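Optionally pin the packages so a later apt upgrade does not move the cluster to a newer version behind your back (a common precaution, not part of the original steps):
apt-mark hold kubelet kubeadm kubectl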
1.3 Installing the high-availability components
(Note: if this is not a high-availability cluster, HAProxy and KeepAlived do not need to be installed.)
On public clouds, use the provider's load balancer instead of HAProxy and KeepAlived, e.g. Alibaba Cloud SLB or Tencent Cloud ELB, because most public clouds do not support keepalived. Also, on Alibaba Cloud the kubectl client must not sit on a master node: SLB has a loopback problem, meaning servers behind the SLB cannot reach the SLB themselves. Tencent Cloud has fixed this issue, so it is the recommended option here.
Install HAProxy and KeepAlived on all Master nodes via apt: apt install keepalived haproxy -y
Configure HAProxy on all Master nodes (see the HAProxy documentation for details; the HAProxy configuration is identical on every Master node):
vim /etc/haproxy/haproxy.cfg
global
maxconn 2000
ulimit-n 16384
log 127.0.0.1 local0 err
stats timeout 30s
defaults
log global
mode http
option httplog
timeout connect 5000
timeout client 50000
timeout server 50000
timeout http-request 15s
timeout http-keep-alive 15s
frontend monitor-in
bind *:33305
mode http
option httplog
monitor-uri /monitor
frontend k8s-master
bind 0.0.0.0:16443
bind 127.0.0.1:16443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-master
backend k8s-master
mode tcp
option tcplog
option tcp-check
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server k8s-master01 192.168.68.20:6443 check
server k8s-master02 192.168.68.21:6443 check
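Before starting HAProxy, the configuration can be syntax-checked; haproxy -c only validates the file and does not start the service:
haproxy -c -f /etc/haproxy/haproxy.cfg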
Configure KeepAlived on all Master nodes. The configuration differs per node, so keep them apart.
# vim /etc/keepalived/keepalived.conf ; note each node's IP and network interface (the interface parameter)
Master01 configuration:
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
script_user root
enable_script_security
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state MASTER
interface ens33
mcast_src_ip 192.168.68.20
virtual_router_id 51
priority 101
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.68.18
}
track_script {
chk_apiserver
}
}
Master02 configuration:
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
script_user root
enable_script_security
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
mcast_src_ip 192.168.68.21
virtual_router_id 51
priority 100
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.68.18
}
track_script {
chk_apiserver
}
}
Configure the KeepAlived health-check script on all master nodes:
vim /etc/keepalived/check_apiserver.sh
#!/bin/bash
err=0
for k in $(seq 1 3)
do
check_code=$(pgrep haproxy)
if [[ $check_code == "" ]]; then
err=$(expr $err + 1)
sleep 1
continue
else
err=0
break
fi
done
if [[ $err != "0" ]]; then
echo "systemctl stop keepalived"
/usr/bin/systemctl stop keepalived
exit 1
else
exit 0
fi
chmod +x /etc/keepalived/check_apiserver.sh
Start HAProxy and KeepAlived
systemctl daemon-reload
systemctl enable --now haproxy
systemctl enable --now keepalived
systemctl restart keepalived haproxy
Important: if KeepAlived and HAProxy are installed, verify that KeepAlived is working correctly.
Test the VIP:
root:kubelet.service.d/ # ping 192.168.68.18 -c 4 [11:34:19]
PING 192.168.68.18 (192.168.68.18) 56(84) bytes of data.
64 bytes from 192.168.68.18: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 192.168.68.18: icmp_seq=2 ttl=64 time=0.043 ms
64 bytes from 192.168.68.18: icmp_seq=3 ttl=64 time=0.033 ms
64 bytes from 192.168.68.18: icmp_seq=4 ttl=64 time=0.038 ms
--- 192.168.68.18 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3075ms
rtt min/avg/max/mdev = 0.032/0.036/0.043/0.004 ms
root:~/ # telnet 192.168.68.18 16443 [11:44:03]
Trying 192.168.68.18...
Connected to 192.168.68.18.
Escape character is '^]'.
Connection closed by foreign host.
1.4. Cluster initialization
1. The following steps are performed only on the master01 node
Create the kubeadm-config.yaml file on Master01 as follows:
Master01: (# Note: for a non-HA cluster, replace 192.168.68.18:16443 with Master01's address and replace 16443 with the apiserver port, which defaults to 6443.
Also make sure kubernetesVersion matches the kubeadm version on your servers: kubeadm version)
Note: in the file below, the host network, podSubnet and serviceSubnet ranges must not overlap.
vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: 7t2weq.bjbawausm0jaxury
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.68.20
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: k8s-master01
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
certSANs:
- 192.168.68.18
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.68.18:16443
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.15
networking:
dnsDomain: cluster.local
podSubnet: 172.16.0.0/12
serviceSubnet: 10.96.0.0/16
scheduler: {}
Migrate the config to the current kubeadm schema: kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
Copy new.yaml to the other master nodes: for i in k8s-master02; do scp new.yaml $i:/root/; done
Pre-pull the images on all Master nodes to save time during initialization (the other nodes need no configuration changes, not even the IP address): kubeadm config images pull --config /root/new.yaml
2. Enable kubelet on boot on all nodes
systemctl enable --now kubelet (if it fails to start, ignore it; it will start once initialization succeeds)
3. Initialize the Master01 node
Initialize Master01; this generates the certificates and configuration files under /etc/kubernetes, after which the other Master nodes can join Master01: kubeadm init --config /root/new.yaml --upload-certs
If initialization fails, reset and initialize again: kubeadm reset -f ; ipvsadm --clear ; rm -rf ~/.kube
Initialization produces a Token that other nodes use to join, so record the token (the join command) printed on success:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
--discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673 \
--control-plane --certificate-key 8da0fbf16c11810bb4930846896d21d633af936839c3941890d3c5cf85b98316
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
--discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673
4. Configure environment variables on Master01 for accessing the Kubernetes cluster:
cat <<EOF >> /root/.bashrc
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
cat <<EOF >> /root/.zshrc
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
Check node status: kubectl get node
With this kubeadm-based installation, all system components run as containers in the kube-system namespace. Check their Pod status with: kubectl get po -n kube-system
5. Configure kubectl command completion
For the current session only:
apt install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
To make it permanent:
edit ~/.bashrc,
or run echo "source <(kubectl completion bash)" >> ~/.bashrc
vim ~/.bashrc
Add the following line
source <(kubectl completion bash)
Save and exit, then run
source ~/.bashrc
1. kubectl autocompletion
BASH
apt install -y bash-completion
source <(kubectl completion bash) # set up autocompletion for the current bash shell; bash-completion must be installed first
echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocompletion to your bash shell permanently
source ~/.bashrc
You can also define a short alias for kubectl that works with completion:
alias k=kubectl
complete -F __start_kubectl k
ZSH
apt install -y bash-completion
source <(kubectl completion zsh) # set up autocompletion for the current zsh shell
echo "[[ $commands[kubectl] ]] && source <(kubectl completion zsh)" >> ~/.zshrc # add autocompletion to your zsh shell permanently
source ~/.zshrc
1.5. Master and Node nodes
Note: the following steps are only needed if the Token produced by the init command above has expired; skip them otherwise.
Generate a new token after the old one expires:
kubeadm token create --print-join-command
For master nodes, also generate a new --certificate-key:
kubeadm init phase upload-certs --upload-certs
If the Token has not expired, just run the join command directly.
Join the other masters to the cluster; run on master02:
kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
--discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673 \
--control-plane --certificate-key 8da0fbf16c11810bb4930846896d21d633af936839c3941890d3c5cf85b98316
Check the current state: kubectl get node
Node configuration
Node nodes mainly run business workloads. In production it is not recommended to schedule anything other than system components on the Master nodes; in a test environment you may allow Pods on the masters to save resources.
kubeadm join 192.168.68.18:16443 --token 7t2weq.bjbawausm0jaxury \
--discovery-token-ca-cert-hash sha256:d9c52c2db865df54fd9db22e911ffb7adf12d1244103005fc1885933c6d27673
After all nodes have joined, check the cluster state
kubectl get node
1.6. Installing the Calico network plugin
calico-etcd.yaml
---
# Source: calico/templates/calico-etcd-secrets.yaml
# The following contains k8s Secrets for use with a TLS enabled etcd cluster.
# For information on populating Secrets, see http://kubernetes.io/docs/user-guide/secrets/
apiVersion: v1
kind: Secret
type: Opaque
metadata:
name: calico-etcd-secrets
namespace: kube-system
data:
# Populate the following with etcd TLS configuration if desired, but leave blank if
# not using TLS for etcd.
# The keys below should be uncommented and the values populated with the base64
# encoded contents of each file that would be associated with the TLS data.
# Example command for encoding a file contents: cat <file> | base64 -w 0
# etcd-key: null
# etcd-cert: null
# etcd-ca: null
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
name: calico-config
namespace: kube-system
data:
# Configure this with the location of your etcd cluster.
etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"
# If you're using TLS enabled etcd uncomment the following.
# You must also populate the Secret below with these files.
etcd_ca: "" # "/calico-secrets/etcd-ca"
etcd_cert: "" # "/calico-secrets/etcd-cert"
etcd_key: "" # "/calico-secrets/etcd-key"
# Typha is disabled.
typha_service_name: "none"
# Configure the backend to use.
calico_backend: "bird"
# Configure the MTU to use for workload interfaces and tunnels.
# By default, MTU is auto-detected, and explicitly setting this field should not be required.
# You can override auto-detection by providing a non-zero value.
veth_mtu: "0"
# The CNI network configuration to install on each node. The special
# values in this config will be automatically populated.
cni_network_config: |-
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"log_file_path": "/var/log/calico/cni/cni.log",
"etcd_endpoints": "__ETCD_ENDPOINTS__",
"etcd_key_file": "__ETCD_KEY_FILE__",
"etcd_cert_file": "__ETCD_CERT_FILE__",
"etcd_ca_cert_file": "__ETCD_CA_CERT_FILE__",
"mtu": __CNI_MTU__,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "__KUBECONFIG_FILEPATH__"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
},
{
"type": "bandwidth",
"capabilities": {"bandwidth": true}
}
]
}
---
# Source: calico/templates/calico-kube-controllers-rbac.yaml
# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
rules:
# Pods are monitored for changing labels.
# The node controller monitors Kubernetes nodes.
# Namespace and serviceaccount labels are used for policy.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
- serviceaccounts
verbs:
- watch
- list
- get
# Watch for changes to Kubernetes NetworkPolicies.
- apiGroups: ["networking.k8s.io"]
resources:
- networkpolicies
verbs:
- watch
- list
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-kube-controllers
subjects:
- kind: ServiceAccount
name: calico-kube-controllers
namespace: kube-system
---
---
# Source: calico/templates/calico-node-rbac.yaml
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-node
rules:
# The CNI plugin needs to get pods, nodes, and namespaces.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
verbs:
- get
# EndpointSlices are used for Service-based network policy rule
# enforcement.
- apiGroups: ["discovery.k8s.io"]
resources:
- endpointslices
verbs:
- watch
- list
- apiGroups: [""]
resources:
- endpoints
- services
verbs:
# Used to discover service IPs for advertisement.
- watch
- list
# Pod CIDR auto-detection on kubeadm needs access to config maps.
- apiGroups: [""]
resources:
- configmaps
verbs:
- get
- apiGroups: [""]
resources:
- nodes/status
verbs:
# Needed for clearing NodeNetworkUnavailable flag.
- patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: calico-node
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-node
subjects:
- kind: ServiceAccount
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
spec:
nodeSelector:
kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
priorityClassName: system-node-critical
initContainers:
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: registry.cn-beijing.aliyuncs.com/dotbalo/cni:v3.22.0
command: ["/opt/cni/bin/install"]
envFrom:
- configMapRef:
# Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
name: kubernetes-services-endpoint
optional: true
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
- mountPath: /calico-secrets
name: etcd-certs
securityContext:
privileged: true
# Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
# to communicate with Felix over the Policy Sync API.
- name: flexvol-driver
image: registry.cn-beijing.aliyuncs.com/dotbalo/pod2daemon-flexvol:v3.22.0
volumeMounts:
- name: flexvol-driver-host
mountPath: /host/driver
securityContext:
privileged: true
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: registry.cn-beijing.aliyuncs.com/dotbalo/node:v3.22.0
envFrom:
- configMapRef:
# Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
name: kubernetes-services-endpoint
optional: true
env:
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# Location of the CA certificate for etcd.
- name: ETCD_CA_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_ca
# Location of the client key for etcd.
- name: ETCD_KEY_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_key
# Location of the client certificate for etcd.
- name: ETCD_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_cert
# Set noderef for node controller.
- name: CALICO_K8S_NODE_REF
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Enable or Disable VXLAN on the default IP pool.
- name: CALICO_IPV4POOL_VXLAN
value: "Never"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Set MTU for the VXLAN tunnel device.
- name: FELIX_VXLANMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Set MTU for the Wireguard tunnel device.
- name: FELIX_WIREGUARDMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
value: "POD_CIDR"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
lifecycle:
preStop:
exec:
command:
- /bin/calico-node
- -shutdown
livenessProbe:
exec:
command:
- /bin/calico-node
- -felix-live
- -bird-live
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
timeoutSeconds: 10
readinessProbe:
exec:
command:
- /bin/calico-node
- -felix-ready
- -bird-ready
periodSeconds: 10
timeoutSeconds: 10
volumeMounts:
# For maintaining CNI plugin API credentials.
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
readOnly: false
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
- mountPath: /calico-secrets
name: etcd-certs
- name: policysync
mountPath: /var/run/nodeagent
# For eBPF mode, we need to be able to mount the BPF filesystem at /sys/fs/bpf so we mount in the
# parent directory.
- name: sysfs
mountPath: /sys/fs/
# Bidirectional means that, if we mount the BPF filesystem at /sys/fs/bpf it will propagate to the host.
# If the host is known to mount that filesystem already then Bidirectional can be omitted.
mountPropagation: Bidirectional
- name: cni-log-dir
mountPath: /var/log/calico/cni
readOnly: true
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
- name: sysfs
hostPath:
path: /sys/fs/
type: DirectoryOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Used to access CNI logs.
- name: cni-log-dir
hostPath:
path: /var/log/calico/cni
# Mount in the etcd TLS secrets with mode 400.
# See https://kubernetes.io/docs/concepts/configuration/secret/
- name: etcd-certs
secret:
secretName: calico-etcd-secrets
defaultMode: 0400
# Used to create per-pod Unix Domain Sockets
- name: policysync
hostPath:
type: DirectoryOrCreate
path: /var/run/nodeagent
# Used to install Flex Volume Driver
- name: flexvol-driver-host
hostPath:
type: DirectoryOrCreate
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
# The controllers can only have a single active instance.
replicas: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
strategy:
type: Recreate
template:
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
nodeSelector:
kubernetes.io/os: linux
tolerations:
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: calico-kube-controllers
priorityClassName: system-cluster-critical
# The controllers must run in the host network namespace so that
# it isn't governed by policy that would prevent it from working.
hostNetwork: true
containers:
- name: calico-kube-controllers
image: registry.cn-beijing.aliyuncs.com/dotbalo/kube-controllers:v3.22.0
env:
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# Location of the CA certificate for etcd.
- name: ETCD_CA_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_ca
# Location of the client key for etcd.
- name: ETCD_KEY_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_key
# Location of the client certificate for etcd.
- name: ETCD_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_cert
# Choose which controllers to run.
- name: ENABLED_CONTROLLERS
value: policy,namespace,serviceaccount,workloadendpoint,node
volumeMounts:
# Mount in the etcd TLS secrets.
- mountPath: /calico-secrets
name: etcd-certs
livenessProbe:
exec:
command:
- /usr/bin/check-status
- -l
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
timeoutSeconds: 10
readinessProbe:
exec:
command:
- /usr/bin/check-status
- -r
periodSeconds: 10
volumes:
# Mount in the etcd TLS secrets with mode 400.
# See https://kubernetes.io/docs/concepts/configuration/secret/
- name: etcd-certs
secret:
secretName: calico-etcd-secrets
defaultMode: 0440
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-kube-controllers
namespace: kube-system
---
# This manifest creates a Pod Disruption Budget for Controller to allow K8s Cluster Autoscaler to evict
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
---
# Source: calico/templates/calico-typha.yaml
---
# Source: calico/templates/configure-canal.yaml
---
# Source: calico/templates/kdd-crds.yaml
Modify the following places in calico-etcd.yaml:
sed -i 's#etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"#etcd_endpoints: "https://192.168.68.20:2379,https://192.168.68.21:2379"#g' calico-etcd.yaml
ETCD_CA=`cat /etc/kubernetes/pki/etcd/ca.crt | base64 | tr -d '\n'`
ETCD_CERT=`cat /etc/kubernetes/pki/etcd/server.crt | base64 | tr -d '\n'`
ETCD_KEY=`cat /etc/kubernetes/pki/etcd/server.key | base64 | tr -d '\n'`
sed -i "s@# etcd-key: null@etcd-key: ${ETCD_KEY}@g; s@# etcd-cert: null@etcd-cert: ${ETCD_CERT}@g; s@# etcd-ca: null@etcd-ca: ${ETCD_CA}@g" calico-etcd.yaml
sed -i 's#etcd_ca: ""#etcd_ca: "/calico-secrets/etcd-ca"#g; s#etcd_cert: ""#etcd_cert: "/calico-secrets/etcd-cert"#g; s#etcd_key: "" #etcd_key: "/calico-secrets/etcd-key" #g' calico-etcd.yaml
POD_SUBNET=`cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep cluster-cidr= | awk -F= '{print $NF}'`
# The manifest above uses the literal placeholder POD_CIDR for CALICO_IPV4POOL_CIDR, so substitute it with the pod subnet:
sed -i "s#POD_CIDR#${POD_SUBNET}#g" calico-etcd.yaml
kubectl apply -f calico-etcd.yaml
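After applying, verify that the calico Pods come up and that the nodes eventually report Ready (standard kubectl checks):
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
kubectl get node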
An error when deploying Calico from the official calico.yaml (the variant that stores data in the Kubernetes API datastore), downloaded from https://docs.projectcalico.org/manifests/calico.yaml
The warning printed was:
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
Fix: the message is self-explanatory. calico.yaml uses policy/v1beta1 PodDisruptionBudget, which is deprecated in Kubernetes 1.21+ and removed in 1.25+; it should be replaced by
policy/v1 PodDisruptionBudget
Installing Calico with the Kubernetes API datastore, 50 nodes or fewer
1. Download the Calico networking manifest for the Kubernetes API datastore: curl https://docs.projectcalico.org/manifests/calico.yaml -O
2. If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are needed; Calico auto-detects the CIDR from the running configuration. On other platforms, make sure to uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.
- name: CALICO_IPV4POOL_CIDR
value: "172.16.0.0/12"
Uncomment this block in calico.yaml and set value to the same value as podSubnet, as sketched below.
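A possible sed for that, assuming the commented block looks like the default upstream manifest and your podSubnet is 172.16.0.0/12 (adjust both patterns if your copy differs):
sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@g; s@#   value: "192.168.0.0/16"@  value: "172.16.0.0/12"@g' calico.yaml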
3. Customize the manifest as needed.
4. Apply the manifest: kubectl apply -f calico.yaml
Installing Calico with the Kubernetes API datastore, more than 50 nodes
- Download the Calico networking manifest for the Kubernetes API datastore.
- $ curl https://docs.projectcalico.org/manifests/calico-typha.yaml -o calico.yaml
- If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are needed; Calico auto-detects the CIDR from the running configuration. On other platforms, make sure to uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.
- Modify the replica count in the Deployment named calico-typha to the desired number:
- apiVersion: apps/v1beta1
  kind: Deployment
  metadata:
    name: calico-typha
  ...
  spec:
    ...
    replicas:
- We recommend at least one replica per 200 nodes and no more than 20 replicas. In production, at least three replicas are recommended to reduce the impact of rolling upgrades and failures. The number of replicas should always be less than the number of nodes, otherwise rolling upgrades will stall. In addition, Typha only helps with scale if there are fewer Typha instances than nodes.
- Warning: if typha_service_name is set and the Typha Deployment replica count is 0, Felix will not start.
- Customize the manifest if needed.
- Apply the manifest.
- $ kubectl apply -f calico.yaml
Installing Calico with the etcd datastore
Note: the etcd datastore is not recommended for new installs. However, it is an option if you are running Calico as the network plugin for both OpenStack and Kubernetes.
- Download the Calico networking manifest for etcd.
- $ curl https://docs.projectcalico.org/manifests/calico-etcd.yaml -o calico.yaml
- If you are using the default pod CIDR 192.168.0.0/16, skip to the next step. If you are using a different pod CIDR with kubeadm, no changes are needed; Calico auto-detects the CIDR from the running configuration. On other platforms, make sure to uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as your chosen pod CIDR.
- In the ConfigMap named calico-config, set the value of etcd_endpoints to the IP address and port of your etcd server.
- Tip: you can specify more than one etcd_endpoint using commas as delimiters.
- Customize the manifest if needed.
- Apply the manifest:
- $ kubectl apply -f calico.yaml
1.7. Deploying Metrics and the Dashboard
1. Deploying metrics-server
On every node, pull the image and retag it
docker pull bitnami/metrics-server:0.5.2
docker tag bitnami/metrics-server:0.5.2 k8s.gcr.io/metrics-server/metrics-server:v0.5.2
vim components0.5.2.yaml
$ cat components0.5.2.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-view: "true"
name: system:aggregated-metrics-reader
rules:
- apiGroups:
- metrics.k8s.io
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- nodes/stats
- namespaces
- configmaps
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
ports:
- name: https
port: 443
protocol: TCP
targetPort: https
selector:
k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
selector:
matchLabels:
k8s-app: metrics-server
strategy:
rollingUpdate:
maxUnavailable: 0
template:
metadata:
labels:
k8s-app: metrics-server
spec:
containers:
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --kubelet-insecure-tls
image: k8s.gcr.io/metrics-server/metrics-server:v0.5.2
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /livez
port: https
scheme: HTTPS
periodSeconds: 10
name: metrics-server
ports:
- containerPort: 4443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /readyz
port: https
scheme: HTTPS
initialDelaySeconds: 20
periodSeconds: 10
resources:
requests:
cpu: 100m
memory: 200Mi
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- mountPath: /tmp
name: tmp-dir
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
volumes:
- emptyDir: {}
name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
labels:
k8s-app: metrics-server
name: v1beta1.metrics.k8s.io
spec:
group: metrics.k8s.io
groupPriorityMinimum: 100
insecureSkipTLSVerify: true
service:
name: metrics-server
namespace: kube-system
version: v1beta1
versionPriority: 100
kubectl apply -f components0.5.2.yaml
If this has problems on 1.23.5, use the approach below
Project page: https://github.com/kubernetes-sigs/metrics-server
Copy front-proxy-ca.crt from the Master01 node to all Node nodes: scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-node01:/etc/kubernetes/pki/front-proxy-ca.crt
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-node(copy to the remaining nodes yourself):/etc/kubernetes/pki/front-proxy-ca.crt
Add the certificate: vim components.yaml
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt # change to front-proxy-ca.crt for kubeadm
Install metrics-server: kubectl create -f components.yaml
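Once the metrics-server Pod is Running, resource metrics should become available after a minute or so (standard verification):
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl top node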
2. Deploying the Dashboard
Project page: https://github.com/kubernetes/dashboard ; kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml
kubectl apply -f recommended.yaml
Create an administrator user: vim admin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin-user
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: admin-user
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: admin-user
namespace: kube-system
kubectl apply -f admin.yaml -n kube-system
Startup error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m52s default-scheduler Successfully assigned kube-system/metrics-server-d9c898cdf-kpc5k to k8s-node03
Normal SandboxChanged 4m32s (x2 over 4m34s) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulling 3m34s (x3 over 4m49s) kubelet Pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2"
Warning Failed 3m19s (x3 over 4m34s) kubelet Failed to pull image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 3m19s (x3 over 4m34s) kubelet Error: ErrImagePull
Normal BackOff 2m56s (x7 over 4m32s) kubelet Back-off pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2"
Warning Failed 2m56s (x7 over 4m32s) kubelet Error: ImagePullBackOff
Fix:
Manually pull the image and retag it (on every node)
Image source: bitnami/metrics-server on Docker Hub
docker pull bitnami/metrics-server:0.6.1
docker tag bitnami/metrics-server:0.6.1 k8s.gcr.io/metrics-server/metrics-server:v0.6.1
3. Logging in to the dashboard
Change the dashboard Service type to NodePort:
kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
Change ClusterIP to NodePort (skip this step if it is already NodePort):
Find the port number:
kubectl get svc kubernetes-dashboard -n kubernetes-dashboard
With that port, the dashboard is reachable via the IP of any host running kube-proxy plus the port:
Access the Dashboard: https://10.103.236.201:18282 (replace 18282 with your own port) and choose Token as the login method
Get the token value: kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
4. Required configuration changes
Change kube-proxy to ipvs mode. The ipvs setting was commented out when the cluster was initialized, so change it by hand:
Run on the master01 node
kubectl edit cm kube-proxy -n kube-system
mode: ipvs
Roll the kube-proxy Pods: kubectl patch daemonset kube-proxy -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}" -n kube-system
Verify the kube-proxy mode
curl 127.0.0.1:10249/proxyMode
1.8 Other dashboard options
Kuboard - a multi-cluster management UI for Kubernetes
Site: https://kuboard.cn/
- Username: admin
  Password: Kuboard123
kubectl apply -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
Watch the rollout: watch kubectl get pods -n kuboard
Access Kuboard: open http://your-node-ip-address:30080 in a browser, e.g. 192.168.68.18:30080
- Enter the initial username and password and log in
- Username: admin
- Password: Kuboard123
Uninstalling
- To uninstall Kuboard v3:
- kubectl delete -f https://addons.kuboard.cn/kuboard/kuboard-v3.yaml
- Clean up leftover data
Run on the master nodes and on any node labelled k8s.kuboard.cn/role=etcd
rm -rf /usr/share/kuboard