• etcd overview
  • etcd cluster setup on CentOS 7
  • etcd cluster setup on Kubernetes

Before porting any service into Kubernetes, or writing your own Operator for it, you need a thorough understanding of the service itself: its principles, features, deployment, and day-to-day usage. Otherwise you are playing with fire: deployment fails outright, the cluster becomes impossible to operate later on, or the project stalls halfway and is abandoned.

There are three mechanisms for bootstrapping an etcd cluster:

  • Static
  • etcd dynamic discovery
  • DNS discovery

Static bootstrapping requires every member to know the addresses of all the other members up front. In many environments the members' IPs are not known in advance; in those cases the cluster can be bootstrapped with the help of a discovery service.
The official tool at http://play.etcd.io/install can generate etcd cluster configuration for you.
This article focuses on the static bootstrap method.
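For reference, the discovery flow (the second option above) works roughly as follows. This is only a sketch: it uses the public discovery service run by the etcd project, the token in the comment is illustrative, and the IPs match the static example used later in this article.

```shell
# Ask the public discovery service for a token sized for a 3-member cluster.
DISCOVERY=$(curl -s 'https://discovery.etcd.io/new?size=3')
echo ${DISCOVERY}   # e.g. https://discovery.etcd.io/6a28e078895c5ec737174db2419bb2f3

# Each member then starts with --discovery instead of --initial-cluster:
etcd --name etcd-1 \
    --initial-advertise-peer-urls http://192.168.255.120:2380 \
    --listen-peer-urls http://192.168.255.120:2380 \
    --listen-client-urls http://192.168.255.120:2379 \
    --advertise-client-urls http://192.168.255.120:2379 \
    --discovery ${DISCOVERY}
```

Once the expected number of members has registered against the token, the cluster bootstraps itself; afterwards the discovery URL is no longer used.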

Part 1: etcd cluster setup on CentOS 7


Because etcd is built on the Raft consensus protocol, an odd number of members is recommended: a cluster of 2n+1 nodes tolerates n failures while still keeping a majority (quorum), and an even-sized cluster tolerates no more failures than the next smaller odd size.
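The majority rule is easy to check numerically; this small sketch just evaluates floor(n/2)+1 for a few cluster sizes:

```shell
# Raft quorum: an n-member cluster needs a majority of n/2 + 1 (integer division).
quorum() { echo $(( $1 / 2 + 1 )); }

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum $n) tolerated_failures=$(( n - $(quorum $n) ))"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1   (4 nodes buy you nothing over 3)
# members=5 quorum=3 tolerated_failures=2
```

This is why 3 and 5 are the typical cluster sizes.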

1 Cluster planning

Host IP          Client port  Peer port  etcd node name
192.168.255.120  2379         2380       etcd-1
192.168.255.121  2379         2380       etcd-2
192.168.255.122  2379         2380       etcd-3

2 Download the release

Download the release tarball from the GitHub releases page, then install etcd on all three nodes.

$ wget https://github.com/etcd-io/etcd/releases/download/v3.4.13/etcd-v3.4.13-linux-amd64.tar.gz
$ tar -xvf etcd-v3.4.13-linux-amd64.tar.gz
$ mkdir /tmp/etcd
$ mv etcd-v3.4.13-linux-amd64/etcd /tmp/etcd/
$ mv etcd-v3.4.13-linux-amd64/etcdctl /tmp/etcd/
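A quick sanity check that the binaries actually run (the output will show whatever version you downloaded). Note that /tmp is cleared on reboot, so for anything beyond an experiment a location like /usr/local/bin is more conventional:

```shell
$ /tmp/etcd/etcd --version
$ /tmp/etcd/etcdctl version
```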

3 Start the services

Start etcd in cluster mode on each of the three nodes.

# Make sure the etcd process has write access to the data directory.
# Remove the directory when bootstrapping a brand-new cluster; keep it when restarting.
#
# Flags used below (a shell comment cannot follow the "\" line continuation,
# so the explanations live here instead):
#   --name                         etcd node name
#   --data-dir                     data storage directory
#   --listen-client-urls           address this node listens on for client traffic
#                                  (must be an IP; etcd does not resolve domain names here)
#   --advertise-client-urls        client address advertised to the rest of the cluster
#   --listen-peer-urls             address this node listens on for peer traffic
#                                  (must be an IP, same restriction as above)
#   --initial-advertise-peer-urls  peer address advertised to the other nodes
#   --initial-cluster              the full initial cluster membership
#   --initial-cluster-token        unique token identifying this cluster
#   --initial-cluster-state        "new" when bootstrapping a fresh cluster

# Start the first node
$ /tmp/etcd/etcd --name etcd-1 \
    --data-dir /tmp/etcd/data \
    --listen-client-urls http://192.168.255.120:2379 \
    --advertise-client-urls http://192.168.255.120:2379 \
    --listen-peer-urls http://192.168.255.120:2380 \
    --initial-advertise-peer-urls http://192.168.255.120:2380 \
    --initial-cluster etcd-1=http://192.168.255.120:2380,etcd-2=http://192.168.255.121:2380,etcd-3=http://192.168.255.122:2380 \
    --initial-cluster-token tkn \
    --initial-cluster-state new

# Start the second node
$ /tmp/etcd/etcd --name etcd-2 \
    --data-dir /tmp/etcd/data \
    --listen-client-urls http://192.168.255.121:2379 \
    --advertise-client-urls http://192.168.255.121:2379 \
    --listen-peer-urls http://192.168.255.121:2380 \
    --initial-advertise-peer-urls http://192.168.255.121:2380 \
    --initial-cluster etcd-1=http://192.168.255.120:2380,etcd-2=http://192.168.255.121:2380,etcd-3=http://192.168.255.122:2380 \
    --initial-cluster-token tkn \
    --initial-cluster-state new

# Start the third node
$ /tmp/etcd/etcd --name etcd-3 \
    --data-dir /tmp/etcd/data \
    --listen-client-urls http://192.168.255.122:2379 \
    --advertise-client-urls http://192.168.255.122:2379 \
    --listen-peer-urls http://192.168.255.122:2380 \
    --initial-advertise-peer-urls http://192.168.255.122:2380 \
    --initial-cluster etcd-1=http://192.168.255.120:2380,etcd-2=http://192.168.255.121:2380,etcd-3=http://192.168.255.122:2380 \
    --initial-cluster-token tkn \
    --initial-cluster-state new
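Running etcd in a foreground shell is fine for experiments, but for anything longer-lived you would normally wrap it in a systemd unit so it survives logout and restarts on failure. A minimal sketch for the first node, using the same paths and flags as above (adjust the IPs and --name per node):

```ini
# /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
After=network-online.target

[Service]
Type=notify
ExecStart=/tmp/etcd/etcd --name etcd-1 \
    --data-dir /tmp/etcd/data \
    --listen-client-urls http://192.168.255.120:2379 \
    --advertise-client-urls http://192.168.255.120:2379 \
    --listen-peer-urls http://192.168.255.120:2380 \
    --initial-advertise-peer-urls http://192.168.255.120:2380 \
    --initial-cluster etcd-1=http://192.168.255.120:2380,etcd-2=http://192.168.255.121:2380,etcd-3=http://192.168.255.122:2380 \
    --initial-cluster-token tkn \
    --initial-cluster-state new
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl daemon-reload && systemctl enable --now etcd`. Type=notify works because etcd signals readiness to systemd once it has joined the cluster.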

4 Check cluster status

Once all three nodes are up, use the etcdctl command to check the cluster state:

$ ETCDCTL_API=3 /tmp/etcd/etcdctl \
  --endpoints 192.168.255.120:2379,192.168.255.121:2379,192.168.255.122:2379 \
  endpoint health
 192.168.255.120:2379 is healthy: successfully committed proposal: took = 14.22105ms
 192.168.255.121:2379 is healthy: successfully committed proposal: took = 13.058173ms
 192.168.255.122:2379 is healthy: successfully committed proposal: took = 16.497453ms

$ ETCDCTL_API=3 /tmp/etcd/etcdctl \
  --endpoints 192.168.255.120:2379,192.168.255.121:2379,192.168.255.122:2379 \
  endpoint status --write-out=table
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.255.120:2379 | 7339c4e5e833c029 |  3.4.13 |   20 kB |      true |      false |        43 |          9 |                  9 |        |
| 192.168.255.121:2379 | 729934363faa4a24 |  3.4.13 |   20 kB |     false |      false |        43 |          9 |                  9 |        |
| 192.168.255.122:2379 |  b548c2511513015 |  3.4.13 |   20 kB |     false |      false |        43 |          9 |                  9 |        |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
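Health checks aside, a quick write/read round-trip is the simplest end-to-end test. Assuming the cluster above is reachable (the key name is arbitrary):

```shell
$ ETCDCTL_API=3 /tmp/etcd/etcdctl \
  --endpoints 192.168.255.120:2379,192.168.255.121:2379,192.168.255.122:2379 \
  put /hello world
OK
$ ETCDCTL_API=3 /tmp/etcd/etcdctl \
  --endpoints 192.168.255.120:2379,192.168.255.121:2379,192.168.255.122:2379 \
  get /hello
/hello
world
```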

5 Cluster testing

All three etcd nodes are running, and the table shows that the current leader is 192.168.255.120. Now kill the etcd process listening on port 2379 of 192.168.255.120 and check the cluster status again:

$ ETCDCTL_API=3 /tmp/etcd/etcdctl   --endpoints 192.168.255.120:2379,192.168.255.121:2379,192.168.255.122:2379  endpoint status --write-out=table
{"level":"warn","ts":"2020-11-16T14:39:25.024+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"passthrough:///192.168.255.120:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: connection error: desc = \"transport: Error while dialing dial tcp [::1]:2379: connect: connection refused\""}
Failed to get the status of endpoint localhost:2379 (context deadline exceeded)
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.255.121:2379 | 729934363faa4a24 |  3.4.13 |   20 kB |      true |      false |        44 |         10 |                 10 |        |
| 192.168.255.122:2379 |  b548c2511513015 |  3.4.13 |   20 kB |     false |      false |        44 |         10 |                 10 |        |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Only two nodes are left, yet the cluster still works: with 2 out of 3 members healthy, more than half of the cluster is up, so quorum holds and a new leader has been elected. If we killed one more node, quorum would be lost and the cluster would become unavailable for writes.

Part 2: etcd cluster setup on Kubernetes

1 Approach

We can run the etcd cluster with a StatefulSet controller. For the manifests, we can start from the ones shipped in the Kubernetes source tree, under the directory test/e2e/testing-manifests/statefulset/etcd.

2 Write the manifests

Three files are needed:
pdb.yaml : a PodDisruptionBudget that protects etcd's availability during voluntary disruptions;
service.yaml : the headless Service through which the StatefulSet exposes its pods;
statefulset.yaml : the StatefulSet itself, which is the heart of the setup;

# vim pdb.yaml

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: etcd-pdb
  labels:
    pdb: etcd
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: etcd

# vim service.yaml

apiVersion: v1
kind: Service
metadata:
  name: etcd
  labels:
    app: etcd
spec:
  ports:
  - port: 2380
    name: etcd-server
  - port: 2379
    name: etcd-client
  clusterIP: None
  selector:
    app: etcd
  publishNotReadyAddresses: true
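Because the Service is headless (clusterIP: None), each pod gets a stable DNS name of the form <pod>.<service>.<namespace>.svc.cluster.local, which is exactly what the StatefulSet's bootstrap script relies on. You can verify the records from inside the cluster once the pods exist; this sketch assumes the default namespace and uses busybox purely as a throwaway DNS-debugging image:

```shell
$ kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- \
    nslookup etcd-0.etcd.default.svc.cluster.local
```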

The statefulset.yaml file is the most important one, but the upstream version has several bugs. As mentioned earlier when discussing the startup flags, --listen-peer-urls and --listen-client-urls do not accept domain names, so the FQDN form http://${HOSTNAME}.${SET_NAME} used upstream cannot be bound there; these two flags must use IP addresses instead. Getting the Pod IP is easy: inject a POD_IP environment variable through the Kubernetes Downward API, then set both flags to http://${POD_IP}:PORT.

# vim statefulset.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: etcd
  name: etcd
spec:
  replicas: 3
  selector:
    matchLabels:
      app: etcd
  serviceName: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
        - name: etcd
          image: cnych/etcd:v3.4.13
          imagePullPolicy: IfNotPresent
          ports:
          - containerPort: 2380
            name: peer
            protocol: TCP
          - containerPort: 2379
            name: client
            protocol: TCP
          env:
          - name: INITIAL_CLUSTER_SIZE
            value: "3"
          - name: MY_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: SET_NAME
            value: "etcd"
          command:
            - /bin/sh
            - -ec
            - |
              HOSTNAME=$(hostname)

              ETCDCTL_API=3

              eps() {
                  EPS=""
                  for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do
                      EPS="${EPS}${EPS:+,}http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379"
                  done
                  echo ${EPS}
              }

              member_hash() {
                  etcdctl member list | grep -w "$HOSTNAME" | awk '{ print $1}' | awk -F "," '{ print $1}'
              }

              initial_peers() {
                  PEERS=""
                  for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do
                    PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380"
                  done
                  echo ${PEERS}
              }

              # etcd-SET_ID
              SET_ID=${HOSTNAME##*-}

              # adding a new member to existing cluster (assuming all initial pods are available)
              if [ "${SET_ID}" -ge ${INITIAL_CLUSTER_SIZE} ]; then
                  # export ETCDCTL_ENDPOINTS=$(eps)
                  # member already added?

                  MEMBER_HASH=$(member_hash)
                  if [ -n "${MEMBER_HASH}" ]; then
                      # the member hash exists but for some reason etcd failed;
                      # since the data dir has not been created yet, we can remove
                      # the member and retrieve a new hash
                      echo "Remove member ${MEMBER_HASH}"
                      etcdctl --endpoints=$(eps) member remove ${MEMBER_HASH}
                  fi

                  echo "Adding new member"

                  echo "etcdctl --endpoints=$(eps) member add ${HOSTNAME} --peer-urls=http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380"
                  etcdctl --endpoints=$(eps) member add ${HOSTNAME} --peer-urls=http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380 | grep "^ETCD_" > /var/run/etcd/new_member_envs

                  if [ $? -ne 0 ]; then
                      echo "member add ${HOSTNAME} error."
                      rm -f /var/run/etcd/new_member_envs
                      exit 1
                  fi

                  echo "==> Loading env vars of existing cluster..."
                  sed -ie "s/^/export /" /var/run/etcd/new_member_envs
                  cat /var/run/etcd/new_member_envs
                  . /var/run/etcd/new_member_envs

                  echo "etcd --name ${HOSTNAME} --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} --listen-peer-urls http://${POD_IP}:2380 --listen-client-urls http://${POD_IP}:2379,http://127.0.0.1:2379 --advertise-client-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379 --data-dir /var/run/etcd/default.etcd --initial-cluster ${ETCD_INITIAL_CLUSTER} --initial-cluster-state ${ETCD_INITIAL_CLUSTER_STATE}"

                  exec etcd --listen-peer-urls http://${POD_IP}:2380 \
                      --listen-client-urls http://${POD_IP}:2379,http://127.0.0.1:2379 \
                      --advertise-client-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379 \
                      --data-dir /var/run/etcd/default.etcd
              fi

              for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do
                  while true; do
                      echo "Waiting for ${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local to come up"
                      ping -W 1 -c 1 ${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local > /dev/null && break
                      sleep 1s
                  done
              done

              echo "join member ${HOSTNAME}"
              # join member
              exec etcd --name ${HOSTNAME} \
                  --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2380 \
                  --listen-peer-urls http://${POD_IP}:2380 \
                  --listen-client-urls http://${POD_IP}:2379,http://127.0.0.1:2379 \
                  --advertise-client-urls http://${HOSTNAME}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379 \
                  --initial-cluster-token etcd-cluster-1 \
                  --data-dir /var/run/etcd/default.etcd \
                  --initial-cluster $(initial_peers) \
                  --initial-cluster-state new
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -ec
                  - |
                    HOSTNAME=$(hostname)

                    member_hash() {
                        etcdctl member list | grep -w "$HOSTNAME" | awk '{ print $1}' | awk -F "," '{ print $1}'
                    }

                    eps() {
                        EPS=""
                        for i in $(seq 0 $((${INITIAL_CLUSTER_SIZE} - 1))); do
                            EPS="${EPS}${EPS:+,}http://${SET_NAME}-${i}.${SET_NAME}.${MY_NAMESPACE}.svc.cluster.local:2379"
                        done
                        echo ${EPS}
                    }

                    export ETCDCTL_ENDPOINTS=$(eps)
                    SET_ID=${HOSTNAME##*-}

                    # Removing member from cluster
                    if [ "${SET_ID}" -ge ${INITIAL_CLUSTER_SIZE} ]; then
                        echo "Removing ${HOSTNAME} from etcd cluster"
                        etcdctl member remove $(member_hash)
                        if [ $? -eq 0 ]; then
                            # Remove everything otherwise the cluster will no longer scale-up
                            rm -rf /var/run/etcd/*
                        fi
                    fi
          volumeMounts:
          - mountPath: /var/run/etcd
            name: datadir
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          # upstream recommended max is 700M
          storage: 1Gi

3 Create the etcd cluster

Create the resource objects above in the Kubernetes cluster.

[root@master1 etcd]# ls
pdb.yaml  service.yaml  statefulset.yaml
[root@master1 etcd]# kubectl apply -f pdb.yaml
poddisruptionbudget.policy/etcd-pdb created
[root@master1 etcd]# kubectl apply -f service.yaml 
service/etcd created
[root@master1 etcd]# kubectl apply -f statefulset.yaml 
statefulset.apps/etcd created
[root@master1 etcd]#
[root@master1 etcd]# kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
etcd-0   1/1     Running   0          109s
etcd-1   1/1     Running   0          92s
etcd-2   1/1     Running   0          89s
[root@master1 etcd]#

4 Check cluster status

[root@master1 etcd]# kubectl exec -it etcd-0 -- /bin/sh
# etcdctl --endpoints etcd-0.etcd:2379,etcd-1.etcd:2379,etcd-2.etcd:2379 endpoint status --write-out=table
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| etcd-0.etcd:2379 | cf6e4d46327c6096 |  3.4.13 |   20 kB |     false |      false |         3 |          9 |                  9 |        |
| etcd-1.etcd:2379 | ef7b9396380fadf3 |  3.4.13 |   20 kB |     false |      false |         3 |          9 |                  9 |        |
| etcd-2.etcd:2379 | 59d3bdbcac74f6d9 |  3.4.13 |   20 kB |      true |      false |         3 |          9 |                  9 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
#

5 Cluster testing

5.1 Scaling up: adding nodes

[root@master1 etcd]# kubectl scale --replicas=5 statefulset etcd
[root@master1 etcd]# kubectl get pods -l app=etcd               
NAME     READY   STATUS    RESTARTS   AGE
etcd-0   1/1     Running   0          5m59s
etcd-1   1/1     Running   0          5m52s
etcd-2   1/1     Running   0          5m47s
etcd-3   1/1     Running   0          4m
etcd-4   1/1     Running   1          3m55s

Now check the cluster status again:

[root@master1 etcd]# kubectl exec -it etcd-0 -- /bin/sh
# etcdctl --endpoints etcd-0.etcd:2379,etcd-1.etcd:2379,etcd-2.etcd:2379,etcd-3.etcd:2379,etcd-4.etcd:2379 endpoint status --write-out=table
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| etcd-0.etcd:2379 | c799a6ef06bc8c14 |  3.4.13 |   20 kB |     false |      false |        16 |         13 |                 13 |        |
| etcd-1.etcd:2379 | 9869f0647883a00d |  3.4.13 |   20 kB |      true |      false |        16 |         13 |                 13 |        |
| etcd-2.etcd:2379 | 42c8b94265b9b79a |  3.4.13 |   20 kB |     false |      false |        16 |         13 |                 13 |        |
| etcd-3.etcd:2379 | 41eec5480dc0d9ec |  3.4.13 |   20 kB |     false |      false |        16 |         13 |                 13 |        |
| etcd-4.etcd:2379 | ebbc833cba01ecad |  3.4.13 |   20 kB |     false |      false |        16 |         13 |                 13 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

5.2 Scaling down: removing nodes

[root@master1 etcd]# kubectl scale --replicas=3 statefulset etcd
statefulset.apps/etcd scaled
[root@master1 etcd]# kubectl get pods -l app=etcd               
NAME     READY   STATUS    RESTARTS   AGE
etcd-0   1/1     Running   0          11m
etcd-1   1/1     Running   0          28s
etcd-2   1/1     Running   0          23s
[root@master1 etcd]# kubectl exec -it etcd-0 -- /bin/sh
# etcdctl --endpoints etcd-0.etcd:2379,etcd-1.etcd:2379,etcd-2.etcd:2379 endpoint status --write-out=table
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| etcd-0.etcd:2379 | 2e80f96756a54ca9 |  3.4.13 |   20 kB |      true |      false |       139 |         23 |                 23 |        |
| etcd-1.etcd:2379 | 7fd61f3f79d97779 |  3.4.13 |   20 kB |     false |      false |       139 |         23 |                 23 |        |
| etcd-2.etcd:2379 | b429c86e3cd4e077 |  3.4.13 |   20 kB |     false |      false |       139 |         23 |                 23 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
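Since each pod's data directory sits on its own PersistentVolumeClaim, data should also survive pod restarts. A quick check (the key name is purely illustrative; etcdctl inside the container talks to 127.0.0.1:2379, which the startup script listens on):

```shell
$ kubectl exec etcd-0 -- etcdctl put /test persistent
OK
$ kubectl delete pod etcd-1
pod "etcd-1" deleted
# wait for the StatefulSet to recreate etcd-1, then read the key back from it
$ kubectl exec etcd-1 -- etcdctl get /test
/test
persistent
```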