Preface

As a Tencent Cloud user I started out on TKE 1.10, but for various reasons moved to a self-managed cluster: a highly available Kubernetes 1.16 (patched up to 1.16.15) built online with kubeadm + HAProxy + SLB + flannel, with IPVS enabled. External traffic goes through an SLB that binds Traefik's TCP ports 80 and 443 (a legacy arrangement — back then Tencent Cloud SLB could not host multiple certificates, which also meant SLB's log-delivery feature was unusable; SLB now supports multiple certificates, so plain HTTP/HTTPS listeners would work today). For production storage I skipped Tencent Cloud's CBS block storage and went straight to local disks plus NFS shares. A recent load test sent the I/O of an NFS-backed project through the roof — not recommended for production. This time the plan is Kubernetes 1.20 with Cilium for networking, replacing kube-proxy and getting a taste of eBPF observability via Hubble. I am also moving straight to containerd: the dockershim path genuinely wastes resources, so dropping it trims overhead and speeds up deployment. In short, a tour of the latest features:
image.png
image.png
Images from: https://blog.kelu.org/tech/2020/10/09/the-diff-between-docker-containerd-runc-docker-shim.html

Environment:

| Hostname | IP | OS | Kernel |
| --- | --- | --- | --- |
| sh-master-01 | 10.3.2.5 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |
| sh-master-02 | 10.3.2.13 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |
| sh-master-03 | 10.3.2.16 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |
| sh-work-01 | 10.3.2.2 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |
| sh-work-02 | 10.3.2.2 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |
| sh-work-03 | 10.3.2.4 | CentOS 8 | 4.18.0-240.15.1.el8_3.x86_64 |

Note: CentOS 8 was chosen out of laziness about upgrading kernels — CentOS 7's 3.10 kernel really is getting old. The trade-off is that there is no Kubernetes yum repo for CentOS 8, so the CentOS 7 repo has to be used.
VIP (SLB address): 10.3.2.12. There is no internal DNS requirement, so a classic internal load balancer is used. To keep the SLB's exposed port identical to the apiserver's local port, a HAProxy layer sits in between: HAProxy listens on 8443 and proxies to the local 6443, and the SLB maps its 6443 listener to backend port 8443.
image.png
image.png

1. System initialization

Note: since the machines live in a public cloud I took the lazy route — fully initialize one server, then build the rest by cloning it.

1. Change the hostname

```shell
hostnamectl set-hostname sh-master-01
cat /etc/hosts
```

image.png
This is just an example — the hosts entries were only written on the three master nodes; the work nodes never got them……

2. Disable the swap partition

```shell
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
```

3. Disable SELinux

```shell
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config
```

4. Disable the firewall

```shell
systemctl disable --now firewalld
chkconfig firewalld off
```

5. Raise open-file and related limits

```shell
cat > /etc/security/limits.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF
```

Of course, the better practice here is to create a new file under /etc/security/limits.d rather than overwriting the main limits.conf — that is the recommended approach.
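Following that recommendation, a drop-in sketch (the file name 90-kubernetes.conf is my choice — any name under limits.d works):

```shell
# Keep limits.conf untouched and put the overrides in a drop-in instead.
install -d /etc/security/limits.d
cat > /etc/security/limits.d/90-kubernetes.conf <<EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF
```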

6. yum update — install whichever tools you are used to; to each their own

```shell
yum update
yum -y install gcc bc gcc-c++ ncurses ncurses-devel cmake elfutils-libelf-devel openssl-devel flex* bison* autoconf automake zlib* libxml* libmcrypt* libtool-ltdl-devel* make pcre pcre-devel openssl jemalloc-devel tcl libtool vim unzip wget lrzsz bash-completion ipvsadm ipset jq sysstat conntrack conntrack-tools libseccomp socat curl git psmisc nfs-utils tree net-tools crontabs iftop nload strace bind-utils tcpdump htop telnet lsof
```

7. Enable IPVS (CentOS 8 ships kernel 4.18 by default; use this variant for kernels below 4.19)

```shell
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
br_netfilter
)
for kernel_module in ${module[@]}; do
  /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
```

For kernels 4.19 and above:

```shell
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
)
for kernel_module in ${module[@]}; do
  /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
```

Honestly, whether IPVS is enabled here probably makes little difference: the CNI is Cilium with Hubble and the datapath is eBPF, so neither iptables nor IPVS should be handling services. Configuring IPVS is just a habit left over from earlier deployments.
Load the modules:

```shell
systemctl daemon-reload
systemctl enable --now systemd-modules-load.service
```

Check that the IPVS modules are loaded:

```shell
# lsmod | grep ip_vs
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs                 172032  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          172032  6 xt_conntrack,nf_nat,xt_state,ipt_MASQUERADE,xt_CT,ip_vs
nf_defrag_ipv6         20480  4 nf_conntrack,xt_socket,xt_TPROXY,ip_vs
libcrc32c              16384  3 nf_conntrack,nf_nat,ip_vs
```

8. Tune kernel parameters (not necessarily optimal — take what suits you)

```shell
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_forward = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
# pass bridged traffic through iptables/ip6tables/arptables
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.netfilter.nf_conntrack_max = 2310720
fs.inotify.max_user_watches=89100
fs.may_detach_mounts = 1
fs.file-max = 52706963
fs.nr_open = 52706963
vm.overcommit_memory=1
vm.panic_on_oom=0
vm.swappiness = 0
EOF
sysctl --system
```

9. Install containerd

dnf vs. yum is one of the CentOS 8 changes — look up the details yourself; they are close enough……

```shell
dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox (pause) image by editing /etc/containerd/config.toml:
#   sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
# Restart containerd
systemctl daemon-reload
systemctl restart containerd
```

Two other tweaks to the config: enable SystemdCgroup, and add a private registry with its credentials (I point straight at Tencent Cloud's registry).
image.png
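For reference, the relevant parts of /etc/containerd/config.toml look roughly like this (the registry host and credentials are placeholders — substitute your own; key paths follow the containerd 1.4-era CRI plugin layout):

```toml
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  # use the systemd cgroup driver, matching the kubelet setting below
  SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
  # placeholder credentials for a private registry
  username = "your-user"
  password = "your-password"
```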

10. Configure the CRI client crictl

```shell
VERSION="v1.21.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

# Check that it works (and, while at it, that the private registry works)
crictl pull nginx:alpine
crictl rmi nginx:alpine
crictl images
```

11. Install kubeadm (there is no yum repo for CentOS 8, so use Aliyun's CentOS 7 repo)

```shell
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Remove old versions, if any are installed
yum remove kubeadm kubectl kubelet kubernetes-cni cri-tools socat
# List all installable versions
# yum list --showduplicates kubeadm --disableexcludes=kubernetes
# To install a specific version, either of these works:
# yum -y install kubeadm-1.20.5 kubectl-1.20.5 kubelet-1.20.5
# or
# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Install the latest stable version (1.20.5 at the time of writing)
yum install kubeadm
# Start on boot
systemctl enable kubelet.service
```

12. Adjust the kubelet configuration

Edit /etc/sysconfig/kubelet:

```shell
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock
```

13. journald tweaks: avoid collecting logs twice (wasting resources), raise systemd's default open-file limits, disable reverse DNS lookups in sshd, and trim the journal to 200 MB (adjust to taste)

```shell
sed -ri 's/^\$ModLoad imjournal/#&/' /etc/rsyslog.conf
sed -ri 's/^\$IMJournalStateFile/#&/' /etc/rsyslog.conf
sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config
journalctl --vacuum-size=200M
```
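Note that `journalctl --vacuum-size` is a one-off cleanup. To keep the journal capped persistently, a journald drop-in can be used — a sketch (my addition; restart systemd-journald afterwards on a live system):

```shell
# Persistently cap the journal at 200 MB (the one-shot vacuum does not persist).
mkdir -p /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/size.conf <<EOF
[Journal]
SystemMaxUse=200M
EOF
# systemctl restart systemd-journald   # apply on a live system
```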

2. Master node setup

1. Install HAProxy

```shell
yum install haproxy
```
```shell
cat <<EOF > /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes
    bind *:8443              # listen on 8443
    mode tcp
    default_backend kubernetes

#---------------------------------------------------------------------
# backend servers: requests to 10.3.2.12:6443 are forwarded to the
# three masters below, which is what gives us the load balancing
#---------------------------------------------------------------------
backend kubernetes
    balance roundrobin
    server master1 10.3.2.5:6443 check maxconn 2000
    server master2 10.3.2.13:6443 check maxconn 2000
    server master3 10.3.2.16:6443 check maxconn 2000
EOF
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
```

Then bind the port on the SLB:
image.png

2. Initialize the sh-master-01 node

1. Generate the config file

```shell
kubeadm config print init-defaults > config.yaml
```

The screenshot below is just an example…
image.png

2. Edit the kubeadm init file

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.3.2.5
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: sh-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - sh-master-01
  - sh-master-02
  - sh-master-03
  - sh-master.k8s.io
  - localhost
  - 127.0.0.1
  - 10.3.2.5
  - 10.3.2.13
  - 10.3.2.16
  - 10.3.2.12
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "10.3.2.12:6443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
networking:
  dnsDomain: xx.daemon
  serviceSubnet: 172.254.0.0/16
  podSubnet: 172.3.0.0/16
scheduler: {}
```

The modified fields are highlighted in the screenshot below.
image.png

3. Initialize the master-01 node with kubeadm (skipping kube-proxy)

```shell
kubeadm init --skip-phases=addon/kube-proxy --config=config.yaml
```

I skipped the success screenshot — these notes were written up afterwards and it wasn't saved. The success output includes:

```shell
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

Then join sh-master-02 and sh-master-03 as the output instructs: pack up ca.*, sa.*, front-proxy-ca.* and etcd/ca.* from /etc/kubernetes/pki on sh-master-01, distribute them into /etc/kubernetes/pki on sh-master-02 and sh-master-03, and run:

```shell
kubeadm join 10.3.2.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:eb0fe00b59fa27f82c62c91def14ba294f838cd0731c91d0d9c619fe781286b6 --control-plane
```

Then repeat the same kubeconfig steps as on sh-master-01:

```shell
mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
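If the `--discovery-token-ca-cert-hash` value ever gets lost, it can be recomputed — it is the SHA-256 digest of the cluster CA's DER-encoded public key. A minimal sketch, demonstrated here with a throwaway self-signed CA so it runs anywhere; on a real control-plane node point it at /etc/kubernetes/pki/ca.crt instead:

```shell
# Generate a throwaway CA certificate (stand-in for /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=demo-ca" -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt 2>/dev/null

# Hash of the DER-encoded public key -- the value kubeadm expects after "sha256:".
HASH=$(openssl x509 -pubkey -noout -in /tmp/demo-ca.crt \
  | openssl pkey -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 \
  | awk '{print $NF}')
echo "sha256:$HASH"
```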

3. Install Cilium and Hubble with Helm (Helm 3 assumed)

1. Download and install Helm

Note: thanks to network conditions the Helm download kept stalling, so I fetched the release tarball from GitHub directly.
image.png

```shell
tar zxvf helm-v3.5.3-linux-amd64.tar.gz
cp linux-amd64/helm /usr/bin/
```

2. Install Cilium and Hubble with Helm

In earlier releases Cilium and Hubble were installed separately; now Hubble ships with the Cilium chart, so one pass does it all:

```shell
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.9.5 \
  --namespace kube-system \
  --set nodeinit.enabled=true \
  --set externalIPs.enabled=true \
  --set nodePort.enabled=true \
  --set hostPort.enabled=true \
  --set pullPolicy=IfNotPresent \
  --set config.ipam=cluster-pool \
  --set hubble.enabled=true \
  --set hubble.listenAddress=":4244" \
  --set hubble.relay.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set prometheus.enabled=true \
  --set operatorPrometheus.enabled=true \
  --set hubble.ui.enabled=true \
  --set kubeProxyReplacement=strict \
  --set k8sServiceHost=10.3.2.12 \
  --set k8sServicePort=6443
```
A successful deployment looks like this:
image.png
image.png
Note: there is no kube-proxy. (The screenshots were taken after the worker nodes joined, hence six node-init and cilium pods.)

4. Worker node setup

Join sh-work-01, sh-work-02 and sh-work-03 to the cluster:

```shell
kubeadm join 10.3.2.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:eb0fe00b59fa27f82c62c91def14ba294f838cd0731c91d0d9c619fe781286b6
```

5. Verification on the master nodes

Run on any master node (master-01 by default):
image.png
Easy places to trip up:

1. SLB binding: binding a single backend and then running kubeadm init tends to fail when the SLB listener port equals the host port — a backend cannot reach itself through the load balancer (I don't fully understand the internals; it took several tries). In the end I bound all three masters and started HAProxy on each of them first.
2. Cilium depends on BPF, so first confirm the BPF filesystem is mounted (on my systems it was mounted by default):

```shell
[root@sh-master-01 manifests]# mount | grep bpf
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
```
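If it is not mounted, a sketch for mounting it now and persisting it across reboots (the fstab line is my addition — verify it matches your distribution's conventions):

```shell
# Mount the BPF filesystem if it is missing, then persist it in /etc/fstab.
mountpoint -q /sys/fs/bpf || mount -t bpf bpf /sys/fs/bpf || true
grep -q '/sys/fs/bpf' /etc/fstab \
  || echo 'bpffs /sys/fs/bpf bpf defaults 0 0' >> /etc/fstab
```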

3. Make sure the kubelet's cgroup setting matches containerd's — both use systemd here; remember to check:

```shell
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock
```

4. Enable PodCIDR allocation in kube-controller-manager: add --allocate-node-cidrs=true to the controller-manager configuration.
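On a kubeadm cluster the controller-manager runs as a static pod, so in practice this means editing its manifest — a sketch (the CIDR values are taken from the config.yaml above; kubelet picks up the change automatically):

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --cluster-cidr=172.3.0.0/16             # podSubnet from config.yaml
    - --service-cluster-ip-range=172.254.0.0/16
    # ...existing flags unchanged...
```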

6. Miscellaneous

1. Verify Hubble and the Hubble UI

image.png

```shell
kubectl edit svc hubble-ui -n kube-system
```

Change the service type to NodePort for a quick test; later it will be proxied by Traefik.
image.png
Then browse to any worker or master node's public IP plus the NodePort:
image.png

2. Run etcdctl outside the containers

Having to exec into a container every time etcdctl is needed gets tedious. With Docker the binary could be copied out with `docker cp`; I haven't found the containerd equivalent, so I simply downloaded etcdctl from the GitHub releases onto master-01:
image.png

```shell
tar zxvf etcd-v3.4.15-linux-amd64.tar.gz
cd etcd-v3.4.15-linux-amd64/
cp etcdctl /usr/local/bin/etcdctl

cat > /etc/profile.d/etcd.sh <<'EOF'
ETCD_CERT_DIR=/etc/kubernetes/pki/etcd/
ETCD_CA_FILE=ca.crt
ETCD_KEY_FILE=healthcheck-client.key
ETCD_CERT_FILE=healthcheck-client.crt
ETCD_EP=https://10.3.2.5:2379,https://10.3.2.13:2379,https://10.3.2.16:2379

alias etcd_v3="ETCDCTL_API=3 \
etcdctl \
--cert ${ETCD_CERT_DIR}/${ETCD_CERT_FILE} \
--key ${ETCD_CERT_DIR}/${ETCD_KEY_FILE} \
--cacert ${ETCD_CERT_DIR}/${ETCD_CA_FILE} \
--endpoints $ETCD_EP"
EOF
source /etc/profile.d/etcd.sh
```

Verify etcd:

```shell
etcd_v3 endpoint status --write-out=table
```
image.png

Summary

That's the base environment done. Since this write-up was done after the fact, some details may be under-explained; I'll add them as they come back to me.