1 The two HA topologies in the official K8s documentation
The official Kubernetes documentation divides HA topologies into two types: the stacked etcd topology and the external etcd topology. https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/ha-topology/
1.1 Stacked etcd topology
Each master node runs its own apiserver and etcd, and each etcd instance communicates only with the apiserver on the same node.
1.2 External etcd topology
The etcd cluster runs on separate hosts, and every etcd member communicates with the apiserver nodes.
The official documentation mainly settles the relationship between the apiserver and the etcd cluster in an HA setup: three master nodes eliminate the single point of failure. But the cluster's external access endpoint cannot simply expose all three apiservers, because when one goes down clients will not automatically switch to another node. The official documentation only says "use a load balancer to expose the apiserver to the worker nodes", and that is precisely the key problem to solve in a production environment.
Note: the load balancer here is not kube-proxy; this Load Balancer sits in front of the apiserver.
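As a concrete illustration (a sketch assuming the VIP 172.17.50.75 and the haproxy frontend port 16443 configured later in this article), every client, whether kubectl, a kubelet, or another node, should reach the apiserver only through the load-balanced endpoint; once the cluster and load balancer are up, an anonymous health probe through the VIP should return ok:
curl -k https://172.17.50.75:16443/healthz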
2 Deployment architecture
2.1 Architecture diagram
2.2 HA architecture in cluster mode
| Core component | HA mode | HA implementation |
|---|---|---|
| apiserver | cluster | haproxy + keepalived |
| controller-manager | active/standby | leader election |
| scheduler | active/standby | leader election |
| etcd | cluster | kubeadm |
- The load-balancing layer provides a VIP; traffic is directed to the node that currently holds the keepalived MASTER role.
- When that keepalived node fails, the VIP automatically floats to another available node.
- apiserver: made highly available by keepalived (VIP failover) together with haproxy, which balances traffic across the control-plane nodes.
- controller-manager: Kubernetes elects a leader internally (controlled by the --leader-elect flag, true by default); only one controller-manager instance is active in the cluster at any moment (see the checks after this list).
- scheduler: a leader is elected the same way (--leader-elect, true by default); only one scheduler instance is active at any moment.
- All three apiservers work at the same time, but only one controller-manager and one scheduler are active while the others stand by. My guess is that the apiserver mostly reads and writes the datastore, where consistency is guaranteed by the datastore itself; it is also the busiest component in Kubernetes, so running several instances helps spread the load. The controller-manager and scheduler, by contrast, execute control logic, and several "brains" working at once could cause conflicts.
- etcd: made highly available by the cluster that kubeadm creates automatically; deploy an odd number of nodes. A 3-node cluster tolerates the loss of at most one machine.
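To see leader election and etcd membership in practice, the following checks can be used (a sketch: in v1.17 the leader lock is recorded as an annotation on an Endpoints object in kube-system, and the etcd pod name etcd-master01 is an assumption based on a stacked topology and the host naming in the cluster plan below; with an external or binary etcd, run etcdctl on the etcd hosts with the corresponding certificate paths instead):
# show which node currently holds the controller-manager / scheduler lock
kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep control-plane.alpha.kubernetes.io/leader
kubectl -n kube-system get endpoints kube-scheduler -o yaml | grep control-plane.alpha.kubernetes.io/leader
# with a stacked (kubeadm static pod) etcd, list the members from inside one etcd pod
kubectl -n kube-system exec etcd-master01 -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member list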
2.3 Cluster plan
(The example cluster runs CentOS 7.6.)
| Hostname | CentOS version | IP | Docker version | Flannel version | Keepalived version | Host spec | Notes |
|---|---|---|---|---|---|---|---|
| VIP | 7.6.1810 | 172.17.50.1 | / | / | v1.3.5 | / | floats between the two haproxy-keepalived hosts |
| haproxy-keepalived01 | 7.6.1810 | 172.17.50.2 | / | / | v1.3.5 | 2C4G | haproxy, keepalived |
| haproxy-keepalived02 | 7.6.1810 | 172.17.50.3 | / | / | v1.3.5 | 2C4G | haproxy, keepalived |
| master01 | 7.6.1810 | 172.17.50.4 | 18.09.9 | v0.11.0 | / | 4C4G | control plane |
| master02 | 7.6.1810 | 172.17.50.5 | 18.09.9 | v0.11.0 | / | 4C4G | control plane |
| master03 | 7.6.1810 | 172.17.50.6 | 18.09.9 | v0.11.0 | / | 4C4G | control plane |
| work01 | 7.6.1810 | 172.17.50.7 | 18.09.9 | / | / | 4C4G | worker nodes |
| work02 | 7.6.1810 | 172.17.50.8 | 18.09.9 | / | / | 4C4G | worker nodes |
| work03 | 7.6.1810 | 172.17.50.9 | 18.09.9 | / | / | 4C4G | worker nodes |
2.4 Deploying haproxy
haproxy provides high availability, load balancing, and TCP/HTTP proxying, and supports tens of thousands of concurrent connections. https://github.com/haproxy/haproxy
haproxy can be installed directly on the host or run as a Docker container; this article uses the former.
Install haproxy
Reference: https://www.yuque.com/malu/qg0c26/ehyemg
Configure haproxy (the two LB machines can use an identical configuration; just replace the backend server addresses). Create the configuration file /etc/haproxy/haproxy.cfg; the important settings are marked with inline comments:
# Global configuration
global
    # logging
    log 127.0.0.1 local3 info
    # user and group
    user haproxy
    group haproxy
    # run as a daemon
    daemon
    # maximum number of connections
    maxconn 4000
    nbproc 1
    pidfile /var/run/haproxy.pid

# Default settings
defaults
    mode http
    log global
    retries 3
    option httplog
    option dontlognull
    option httpclose
    option forwardfor
    timeout connect 5000
    timeout client 50000
    timeout server 50000

# Frontend configuration; the name after "frontend" is arbitrary
frontend pd-dev-k8s-kube-apiserver
    mode tcp
    option tcplog
    # requests to the port in the bind line are forwarded to the backend IPs and ports below
    bind *:16443
    default_backend pd-dev-kube-apiserver

# Backend configuration; the name after "backend" is arbitrary
backend pd-dev-kube-apiserver
    mode tcp
    option tcp-check
    option redispatch
    option abortonclose
    # Load balancing algorithms:
    #   source     - by source IP of the request
    #   static-rr  - by weight
    #   leastconn  - least connections first
    #   uri        - by request URI
    #   url_param  - by URL parameter
    #   rdp-cookie - pin and hash each request by cookie(name)
    #   hdr(name)  - pin each HTTP request by request header
    #   roundrobin - round robin
    balance roundrobin
    cookie SERVERID
    # pass the real client IP
    option forwardfor header X-Forwarded-For
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 172.17.50.51:6443 check on-marked-down shutdown-sessions
    server kube-apiserver-2 172.17.50.52:6443 check on-marked-down shutdown-sessions
    server kube-apiserver-3 172.17.50.53:6443 check on-marked-down shutdown-sessions

# Frontend for the dev ingress traffic
frontend pd-dev-k8s-front-web-8843
    mode tcp
    option tcplog
    bind *:8843
    default_backend pd-dev-k8s-ingress-nginx

backend pd-dev-k8s-ingress-nginx
    mode tcp
    option tcp-check
    option redispatch
    option abortonclose
    # load balancing method: round robin (see the algorithm list above)
    balance roundrobin
    cookie SERVERID
    option forwardfor header X-Forwarded-For
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 172.17.50.51:8843 check on-marked-down shutdown-sessions
    server kube-apiserver-2 172.17.50.52:8843 check on-marked-down shutdown-sessions
    server kube-apiserver-3 172.17.50.53:8843 check on-marked-down shutdown-sessions

# Frontend for the test cluster apiserver
frontend pd-test-k8s-kube-apiserver
    mode tcp
    option tcplog
    bind *:26443
    default_backend pd-test-kube-apiserver

backend pd-test-kube-apiserver
    mode tcp
    option tcp-check
    option redispatch
    option abortonclose
    balance roundrobin
    cookie SERVERID
    option forwardfor header X-Forwarded-For
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 172.17.50.57:6443 check on-marked-down shutdown-sessions
    server kube-apiserver-2 172.17.50.58:6443 check on-marked-down shutdown-sessions
    server kube-apiserver-3 172.17.50.59:6443 check on-marked-down shutdown-sessions

# Frontend for the test ingress traffic
frontend pd-test-k8s-front-web-8943
    mode tcp
    option tcplog
    bind *:8943
    default_backend pd-test-k8s-ingress-nginx

backend pd-test-k8s-ingress-nginx
    mode tcp
    option tcp-check
    option redispatch
    option abortonclose
    balance roundrobin
    cookie SERVERID
    option forwardfor header X-Forwarded-For
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 172.17.50.57:8943 check on-marked-down shutdown-sessions
    server kube-apiserver-2 172.17.50.58:8943 check on-marked-down shutdown-sessions
    server kube-apiserver-3 172.17.50.59:8943 check on-marked-down shutdown-sessions

# haproxy status page
listen admin_stats
    bind *:9188
    #mode http
    stats auth admin:admin123
    stats refresh 30s
    stats uri /haproxy-status
    stats realm welcome login\ Haproxy
    stats hide-version
    stats admin if TRUE
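Before starting the service, it is worth validating the configuration syntax with haproxy's check mode:
haproxy -c -f /etc/haproxy/haproxy.cfg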
Start haproxy and enable it at boot:
systemctl restart haproxy
systemctl enable haproxy
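A quick way to confirm haproxy is actually serving (the port list simply mirrors the frontends configured above):
ss -lntp | egrep '16443|8843|26443|8943|9188'
The status page defined in the listen admin_stats section is then reachable at http://<lb-ip>:9188/haproxy-status with the credentials admin:admin123 from the configuration.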
2.5 Deploying keepalived
keepalived is built on VRRP (Virtual Router Redundancy Protocol): one MASTER plus one or more BACKUP nodes. The MASTER holds the VIP and serves traffic while sending VRRP multicast advertisements; when the BACKUP nodes stop receiving those packets they assume the MASTER is down, elect the remaining node with the highest priority as the new MASTER, and that node takes over the VIP. keepalived is the key component that guarantees high availability here.
keepalived can be installed directly on the host or run as a Docker container; this article uses the former.
Install keepalived
Reference: https://www.yuque.com/malu/qg0c26/dnda1d
Configure keepalived (fill in the commented fields for your own environment; the configuration differs from machine to machine):
[root@keepalived01 ~]# more /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
# vrrp_strict
}
vrrp_script chk_haproxy {
    # "killall -0 haproxy" exits non-zero when no haproxy process exists, marking the check as failed
    script "killall -0 haproxy"
    interval 2
    weight 2
}
vrrp_instance haproxy01 {      # VRRP instance definition
    state MASTER               # initial state, MASTER or BACKUP, must be uppercase
    interface ens192           # interface that serves external traffic
    virtual_router_id 100      # virtual router ID (a number); all nodes in the same VRRP instance must use the same ID
    priority 100               # priority, higher wins; within one vrrp_instance the MASTER must be higher than the BACKUPs
    advert_int 1               # interval in seconds between VRRP advertisements from MASTER to BACKUP
    authentication {           # authentication type and password
        auth_type PASS         # PASS or AH
        auth_pass 1111         # password; MASTER and BACKUP in the same vrrp_instance must use the same password
    }
    virtual_ipaddress {        # virtual IP address(es); multiple allowed, one per line
        172.17.50.75
    }
    track_script {             # fail over based on the haproxy health check defined above
        chk_haproxy
    }
}
Start keepalived and enable it at boot:
systemctl restart keepalived
systemctl enable keepalived
Verify the availability of keepalived + haproxy:
(1) Run ip a s on each LB node to see where the VIP is bound.
(2) Stop haproxy on the node that currently holds the VIP: systemctl stop haproxy
(3) Run ip a s on each LB node again and confirm that the VIP has floated to the other node.
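Put together as commands (a sketch assuming the VIP 172.17.50.75 and interface ens192 from the keepalived configuration above):
# on each LB node: check whether the VIP is bound
ip a s ens192 | grep 172.17.50.75
# on the node holding the VIP: stop haproxy so the chk_haproxy check fails
systemctl stop haproxy
# on the other LB node: the VIP should appear within a few seconds
ip a s ens192 | grep 172.17.50.75
# restore the stopped node afterwards
systemctl start haproxy
Note that with weight 2 in chk_haproxy the MASTER's effective priority only drops by 2 when the check fails, so the BACKUP's priority must be configured within that margin (for example 99) for the VIP to move.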
3 Deploying the k8s cluster with kubekey
3.1 Create the configuration file
./kk create config --with-kubesphere # generate a configuration file that includes kubesphere
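Depending on the KubeKey release, the Kubernetes/KubeSphere versions and the output file name can also be specified explicitly (flags as documented in the KubeKey README; the versions here simply match the configuration below):
./kk create config --with-kubernetes v1.17.6 --with-kubesphere v3.0.0 -f config-example.yaml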
Fill in the configuration file (adjust the details to match your machines; set controlPlaneEndpoint.address to the VIP address):
# config-example.yaml
apiVersion: kubekey.kubesphere.io/v1alpha1
kind: Cluster
metadata:
name: config-sample
spec:
hosts:
- {name: node1, address: 192.168.0.81, internalAddress: 192.168.0.81, password: EulerOS@2.5}
- {name: node2, address: 192.168.0.82, internalAddress: 192.168.0.82, password: EulerOS@2.5}
- {name: node3, address: 192.168.0.83, internalAddress: 192.168.0.83, password: EulerOS@2.5}
- {name: node4, address: 192.168.0.84, internalAddress: 192.168.0.84, password: EulerOS@2.5}
- {name: node5, address: 192.168.0.85, internalAddress: 192.168.0.85, password: EulerOS@2.5}
- {name: node6, address: 192.168.0.86, internalAddress: 192.168.0.86, password: EulerOS@2.5}
roleGroups:
etcd:
- node[1:3]
master:
- node[1:3]
worker:
- node[4:6]
controlPlaneEndpoint:
domain: lb.kubesphere.local
address: 192.168.0.100 # vip
port: 6443
kubernetes:
version: v1.17.6
imageRepo: kubesphere
clusterName: cluster.local
network:
plugin: calico
kube_pods_cidr: 10.233.64.0/18
kube_service_cidr: 10.233.0.0/18
registry:
registryMirrors: []
insecureRegistries: []
storage:
defaultStorageClass: localVolume
localVolume:
storageClassName: local
---
apiVersion: v1
data:
ks-config.yaml: |
---
local_registry: ""
persistence:
storageClass: ""
etcd:
monitoring: true
endpointIps: 192.168.0.7,192.168.0.8,192.168.0.9
port: 2379
tlsEnable: true
common:
mysqlVolumeSize: 20Gi
minioVolumeSize: 20Gi
etcdVolumeSize: 20Gi
openldapVolumeSize: 2Gi
redisVolumSize: 2Gi
console:
enableMultiLogin: False # enable/disable multi login
port: 30880
monitoring:
prometheusReplicas: 1
prometheusMemoryRequest: 400Mi
prometheusVolumeSize: 20Gi
grafana:
enabled: false
notification:
enabled: false
logging:
enabled: false
elasticsearchMasterReplicas: 1
elasticsearchDataReplicas: 1
logsidecarReplicas: 2
elasticsearchMasterVolumeSize: 4Gi
elasticsearchDataVolumeSize: 20Gi
logMaxAge: 7
elkPrefix: logstash
containersLogMountedPath: ""
kibana:
enabled: false
events:
enabled: false
auditing:
enabled: false
openpitrix:
enabled: false
devops:
enabled: false
jenkinsMemoryLim: 2Gi
jenkinsMemoryReq: 1500Mi
jenkinsVolumeSize: 8Gi
jenkinsJavaOpts_Xms: 512m
jenkinsJavaOpts_Xmx: 512m
jenkinsJavaOpts_MaxRAM: 2g
sonarqube:
enabled: false
postgresqlVolumeSize: 8Gi
servicemesh:
enabled: false
notification:
enabled: false
alerting:
enabled: false
metrics_server:
enabled: false
weave_scope:
enabled: false
kind: ConfigMap
metadata:
name: ks-installer
namespace: kubesphere-system
labels:
version: v3.0.0
3.2 Run the deployment
./kk create cluster -f config-example.yaml
3.3 Log in to the kubesphere console
Wait for the deployment to finish, then log in to the kubesphere console.
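A few checks once kk finishes (the installer log command follows the KubeSphere documentation; the rest is generic kubectl):
# watch the ks-installer log until the final welcome banner with the console address appears
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
# all masters and workers should be Ready, reached through the VIP endpoint
kubectl get nodes -o wide
The console is exposed on the NodePort configured above (30880), so it can be opened at http://<any-node-ip>:30880.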
