date: 2021-04-20
title: Integrating Ceph with Kubernetes # title
tags: storage, k8s persistence # tags
categories: k8s # category

This article walks through integrating Ceph with Kubernetes, covering volumes, PV/PVC, and StorageClass (SC).

Installing the Ceph and Kubernetes clusters themselves is out of scope; see my earlier posts or set them up however you prefer.

Reference: official documentation

Using Ceph with volumes

Reference: official documentation

Install the Ceph client tools on the k8s nodes

Install on every k8s node.

    $ yum -y install ceph-common
    # If the package cannot be found, configure the Ceph yum repository first
    $ cat > /etc/yum.repos.d/ceph.repo << EOF
    [ceph-norch]
    name=ceph-norch
    baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
    enabled=1
    gpgcheck=0
    [ceph-x86_64]
    name=ceph-x86_64
    baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
    enabled=1
    gpgcheck=0
    EOF
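
An optional sanity check (not part of the original walkthrough): confirm the client CLI is usable on each node before moving on. The version printed will depend on the repository you configured; ceph-common also provides the rbd CLI used later.

    # Should print the installed client version (Nautilus here, per the repo above)
    $ ceph --version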

Create the pool, RBD image, and auth user on the Ceph cluster

Run the following on any Ceph admin node.

    # Create the pool
    $ ceph osd pool create kubernetes 16 16
    # Initialize the pool
    $ rbd pool init kubernetes
    # Create an RBD image
    $ rbd create -p kubernetes --image-feature layering rbd.img --size 10G
    # Create an authorized user
    $ ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
    [client.kubernetes]
            key = AQCgD3xgde3HDRAA4iqESawxR8LgB3mAZ70fWQ==
    # Confirm the granted caps
    $ ceph auth get client.kubernetes
    exported keyring for client.kubernetes
    [client.kubernetes]
            key = AQCgD3xgde3HDRAA4iqESawxR8LgB3mAZ70fWQ==
            caps mgr = "profile rbd pool=kubernetes"
            caps mon = "profile rbd"
            caps osd = "profile rbd pool=kubernetes"
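
As a hedged aside, before wiring the image into Kubernetes you can double-check it on the Ceph side with the standard rbd commands (output will vary with your cluster):

    $ ceph osd pool ls | grep kubernetes
    $ rbd -p kubernetes ls
    $ rbd info kubernetes/rbd.img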

Create the Secret object

Perform the following on the k8s cluster.

    # Base64-encode the Ceph auth key created above
    $ echo 'AQCgD3xgde3HDRAA4iqESawxR8LgB3mAZ70fWQ==' | base64
    QVFDZ0QzeGdkZTNIRFJBQTRpcUVTYXd4UjhMZ0IzbUFaNzBmV1E9PQo=
    # Write the Secret manifest
    cat > ceph_secret.yml << EOF
    apiVersion: v1
    kind: Secret
    metadata:
      name: ceph-secret
    type: "kubernetes.io/rbd"
    data:
      key: QVFDZ0QzeGdkZTNIRFJBQTRpcUVTYXd4UjhMZ0IzbUFaNzBmV1E9PQo=
    EOF
    # The key above is the base64-encoded Ceph auth key
    # Create the Secret
    $ kubectl apply -f ceph_secret.yml
    secret/ceph-secret created
    # Confirm it was created
    $ kubectl get secret/ceph-secret -o yaml
    apiVersion: v1
    data:
      key: QVFDZ0QzeGdkZTNIRFJBQTRpcUVTYXd4UjhMZ0IzbUFaNzBmV1E9PQo=
    kind: Secret
    metadata:
    .............   # remaining output omitted
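
If you ever need to verify that the key stored in the Secret matches the Ceph keyring, it can be decoded straight from the cluster (a small convenience check, not part of the original walkthrough):

    $ kubectl get secret ceph-secret -o jsonpath='{.data.key}' | base64 -d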

Create a Deployment that uses the Ceph block device

The YAML below assumes you are already comfortable with Kubernetes manifests, so the fields are not explained one by one.

    $ cat > nginx.yaml << EOF
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-nginx
      labels:
        k8s.cn/layer: web
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s.cn/layer: web
      template:
        metadata:
          labels:
            k8s.cn/layer: web
        spec:
          containers:
          - image: nginx
            imagePullPolicy: IfNotPresent
            name: nginx
            ports:
            - containerPort: 80
              name: www
              protocol: TCP
            volumeMounts:
            - mountPath: /data
              name: ceph-demo
          volumes:
          - name: ceph-demo
            rbd:
              monitors:
              - 192.168.20.10:6789
              - 192.168.20.5:6789
              - 192.168.20.6:6789
              pool: kubernetes
              image: rbd.img
              fsType: ext4
              user: kubernetes
              secretRef:
                name: ceph-secret
    EOF
    # Create the Deployment
    $ kubectl apply -f nginx.yaml
    # Wait a moment, then confirm the pod is running
    $ kubectl get pods
    NAME                         READY   STATUS    RESTARTS   AGE
    web-nginx-6bf57f9cf7-dfvv4   1/1     Running   0          8s

Verify inside the container

    # Find the pod
    $ kubectl get pods
    NAME                         READY   STATUS    RESTARTS   AGE
    web-nginx-6bf57f9cf7-dfvv4   1/1     Running   0          13s
    # Open a shell in the pod
    $ kubectl exec -it pod/web-nginx-6bf57f9cf7-dfvv4 -- /bin/bash
    # Check the /data mount
    root@web-nginx-6bf57f9cf7-dfvv4:/# df -hT /data
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/rbd0      ext4  9.8G   37M  9.7G   1% /data

How the block device gets mounted

The in-container check above confirms the block device is mounted into the pod; now let's look at how it actually gets there.

    # Find the node the pod runs on
    $ kubectl get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE   IP              NODE          NOMINATED NODE   READINESS GATES
    web-nginx-6bf57f9cf7-dfvv4   1/1     Running   0          11m   10.100.15.202   centos-20-4   <none>           <none>
    # On the centos-20-4 host
    $ df -hT | grep rbd   # the RBD image is mounted on the host first, then mapped into the pod
    /dev/rbd0 ext4 9.8G 37M 9.7G 1% /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/kubernetes-image-rbd.img

Conclusion 1: after deleting the pod myself and letting the Deployment controller recreate it, the data in rbd.img was not lost. In other words, if the RBD image has no filesystem yet, it gets formatted with the filesystem specified in the manifest; if a filesystem already exists, it is simply mounted as-is.
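A minimal re-check sequence for this claim, assuming the pod name and mount path used above (the replacement pod will get a new name, shown here as a placeholder):

    # Write a marker file, delete the pod, then check the new pod still sees it
    $ kubectl exec -it web-nginx-6bf57f9cf7-dfvv4 -- sh -c 'echo hello > /data/marker.txt'
    $ kubectl delete pod web-nginx-6bf57f9cf7-dfvv4
    $ kubectl get pods                                   # wait for the replacement pod
    $ kubectl exec -it <new-pod-name> -- cat /data/marker.txt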

Conclusion 2: the Deployment manifest above does not support multiple replicas. Scaling it up fails because an RBD image can only be mapped on one host at a time. So if you want several pods to share one RBD image, the only option is to pin the pods to a fixed node. Below is a second Deployment I created, pinned to the centos-20-4 node via the nodeName field (everything except the object names is identical):

    $ cat nginx2.yaml
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-nginx2
      labels:
        k8s.cn/layer: web2
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s.cn/layer: web2
      template:
        metadata:
          labels:
            k8s.cn/layer: web2
        spec:
          nodeName: centos-20-4
          containers:
          - image: nginx
            imagePullPolicy: IfNotPresent
            name: nginx2
            ports:
            - containerPort: 80
              name: www
              protocol: TCP
            volumeMounts:
            - mountPath: /data
              name: ceph-demo2
          volumes:
          - name: ceph-demo2
            rbd:
              monitors:
              - 192.168.20.10:6789
              - 192.168.20.5:6789
              - 192.168.20.6:6789
              pool: kubernetes
              image: rbd.img
              fsType: ext4
              user: kubernetes
              secretRef:
                name: ceph-secret

Integrating Ceph with PV/PVC

Preparation

Following the "Using Ceph with volumes" section above, make sure the pool, auth user, and Secret already exist (if they do, just reuse them).

    # On a Ceph node, create another RBD image
    $ rbd create -p kubernetes --image-feature layering demo-1.img --size 10G
    # Confirm the image exists
    $ rbd -p kubernetes ls
    demo-1.img
    rbd.img

Create the PV

The fields in the YAML files that follow are not explained here; if anything is unclear, look it up elsewhere.

    # Write the manifest
    $ cat > pv.yaml << EOF
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: rbd-demo
    spec:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 10G
      rbd:
        monitors:
        - 192.168.20.10:6789
        - 192.168.20.5:6789
        - 192.168.20.6:6789
        pool: kubernetes
        image: demo-1.img
        fsType: ext4
        user: kubernetes
        secretRef:
          name: ceph-secret
      persistentVolumeReclaimPolicy: Retain
      storageClassName: rbd
    EOF
    # Create the PV
    $ kubectl apply -f pv.yaml
    persistentvolume/rbd-demo created
    # Confirm the PV exists
    $ kubectl get pv
    NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
    rbd-demo   10G        RWO            Retain           Available           rbd                     13s

Define a PVC that references the PV

    # Write the PVC manifest
    $ cat > pvc.yaml << EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc-demo
    spec:
      accessModes:
      - ReadWriteOnce        # must match the PV's access mode
      volumeName: rbd-demo   # name of the PV to bind
      resources:
        requests:
          storage: 10G
      storageClassName: rbd  # must match storageClassName in the PV manifest
    EOF
    # Create the PVC
    $ kubectl apply -f pvc.yaml
    # Confirm the PVC exists
    $ kubectl get pvc
    NAME       STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    pvc-demo   Bound    rbd-demo   10G        RWO            rbd            100s

Confirm the PVC and PV are Bound

    $ kubectl get pv,pvc
    NAME                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
    persistentvolume/rbd-demo   10G        RWO            Retain           Bound    default/pvc-demo   rbd                     3m43s
    NAME                             STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/pvc-demo   Bound    rbd-demo   10G        RWO            rbd            32s

Create a Deployment that mounts the PVC

    $ cat > nginx.yaml << EOF
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-nginx
      labels:
        k8s.cn/layer: web
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s.cn/layer: web
      template:
        metadata:
          labels:
            k8s.cn/layer: web
        spec:
          containers:
          - image: nginx
            imagePullPolicy: IfNotPresent
            name: nginx
            ports:
            - containerPort: 80
              name: www
              protocol: TCP
            volumeMounts:
            - mountPath: /data
              name: rbd             # must match volumes[].name below
          volumes:
          - name: rbd
            persistentVolumeClaim:
              claimName: pvc-demo   # name of the PVC to use
    EOF
    # Create the Deployment
    $ kubectl apply -f nginx.yaml
    deployment.apps/web-nginx created
    # Confirm the pod is running
    $ kubectl get pods
    NAME                         READY   STATUS    RESTARTS   AGE
    web-nginx-57b545d4d8-c9zk4   1/1     Running   0          4m

Verify the mount

    $ kubectl exec -it web-nginx-57b545d4d8-c9zk4 -- /bin/bash
    root@web-nginx-57b545d4d8-c9zk4:/# df -hT /data
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/rbd0      ext4  9.8G   37M  9.7G   1% /data
    # Find the node the pod runs on
    $ kubectl get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE     IP              NODE          NOMINATED NODE   READINESS GATES
    web-nginx-57b545d4d8-c9zk4   1/1     Running   0          6m27s   10.100.15.205   centos-20-4   <none>           <none>
    # Check the mount on the host
    $ df -hT | grep rbd
    /dev/rbd0 ext4 9.8G 37M 9.7G 1% /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/kubernetes-image-demo-1.img
    # The host-side mount directory and the pod's /data directory contain exactly the same files.

Integrating Ceph with StorageClass: architecture

Reference: official documentation

Since Kubernetes v1.13, Ceph block device images can be consumed through the ceph-csi driver. ceph-csi dynamically provisions RBD images to back Kubernetes volumes and maps those images as block devices on the worker nodes running pods that reference RBD-backed volumes (optionally mounting the filesystem contained in the image).

The diagram below sketches how Kubernetes works with the ceph-csi driver:

[Figure 1: Kubernetes and ceph-csi architecture]

In the diagram, once Kubernetes receives a user request, ceph-csi dynamically creates the PV and the backing block image in the Ceph cluster, and mounts it on the node where the pod is scheduled.

Deploy the ceph-csi driver

1. Query the Ceph cluster information (run on a Ceph admin node)

    $ ceph mon dump
    dumped monmap epoch 3
    epoch 3
    fsid d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a    # we need this cluster id
    last_changed 2021-04-14 17:58:46.874896
    created 2021-04-14 17:54:27.836955
    min_mon_release 14 (nautilus)
    # ...as well as the monitor listen addresses below
    0: [v2:192.168.20.10:3300/0,v1:192.168.20.10:6789/0] mon.centos-20-10
    1: [v2:192.168.20.5:3300/0,v1:192.168.20.5:6789/0] mon.centos-20-5
    2: [v2:192.168.20.6:3300/0,v1:192.168.20.6:6789/0] mon.centos-20-6
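
As a small aside, if you only need the cluster id itself, `ceph fsid` prints just that value:

    $ ceph fsid
    d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a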

2. Create the ceph-csi ConfigMap (run on the k8s cluster)

    # Write the ConfigMap manifest.
    # Replace clusterID with the fsid queried above, and the monitors list
    # with your own cluster's monitor addresses.
    $ cat <<EOF > csi-config-map.yaml
    ---
    apiVersion: v1
    kind: ConfigMap
    data:
      config.json: |-
        [
          {
            "clusterID": "d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a",
            "monitors": [
              "192.168.20.5:6789",
              "192.168.20.6:6789",
              "192.168.20.10:6789"
            ]
          }
        ]
    metadata:
      name: ceph-csi-config
    EOF
    # Create the ConfigMap
    $ kubectl apply -f csi-config-map.yaml

3. Define the credentials

    # On the Ceph cluster, look up the key of the kubernetes user.
    # If PVC creation fails later, you can try the admin user instead of kubernetes,
    # but normally the kubernetes user works fine.
    $ ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
    [client.kubernetes]
            key = AQDMi35gIuRmIxAApx47Id2rsMPF33R5r4jrwQ==
    # On k8s, write the Secret manifest
    $ cat <<EOF > csi-rbd-secret.yaml
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: csi-rbd-secret
      namespace: default
    stringData:
      userID: kubernetes
      userKey: AQCadX5ggqz0GRAAuQuf/Ks3B7aJoK5L3SqXDQ==
    EOF
    # Create the Secret
    $ kubectl apply -f csi-rbd-secret.yaml
    # ceph-csi also needs an extra ConfigMap describing the key management service (KMS) provider.
    # If no KMS is configured, put an empty configuration in csi-kms-config-map.yaml,
    # or see the examples at https://github.com/ceph/ceph-csi/tree/master/examples/kms
    # Here an empty ConfigMap is enough:
    $ cat <<EOF > csi-kms-config-map.yaml
    ---
    apiVersion: v1
    kind: ConfigMap
    data:
      config.json: |-
        {}
    metadata:
      name: ceph-csi-encryption-kms-config
    EOF
    # Create the ConfigMap
    $ kubectl apply -f csi-kms-config-map.yaml
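
A quick, optional check that the three objects defined above exist before continuing (names as created here):

    $ kubectl get configmap ceph-csi-config ceph-csi-encryption-kms-config
    $ kubectl get secret csi-rbd-secret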

4. Create the RBAC objects

    $ kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-provisioner-rbac.yaml
    $ kubectl apply -f https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-nodeplugin-rbac.yaml

5. Install the ceph-csi driver plugins

    # These are the official manifests; note that they have issues (see below)
    $ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
    $ wget https://raw.githubusercontent.com/ceph/ceph-csi/master/deploy/rbd/kubernetes/csi-rbdplugin.yaml

There are two known issues with the official manifests:

  • Issue 1: csi-rbdplugin-provisioner.yaml sets pod anti-affinity and 3 replicas. If your k8s cluster has only two worker nodes, one replica will fail to schedule; my cluster has 1 master and 2 workers, so after applying the manifest one replica stayed Pending. (Don't just drop the replica count to 2; this component appears to rely on leader election, so 3 replicas are still recommended.) Instead, add a toleration so the pod can also be scheduled onto the master. With three or more worker nodes you can ignore this issue.
  • Issue 2: the official manifests pin the quay.io/cephcsi/cephcsi image to the canary tag, i.e. a development/testing build. For production, look up the latest stable release in the official image registry and substitute that version tag.

Here are my two modified manifests (the only changes are the added toleration and the quay.io/cephcsi/cephcsi image tag):

    $ cat csi-rbdplugin-provisioner.yaml
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: csi-rbdplugin-provisioner
      labels:
        app: csi-metrics
    spec:
      selector:
        app: csi-rbdplugin-provisioner
      ports:
        - name: http-metrics
          port: 8080
          protocol: TCP
          targetPort: 8680
    ---
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: csi-rbdplugin-provisioner
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: csi-rbdplugin-provisioner
      template:
        metadata:
          labels:
            app: csi-rbdplugin-provisioner
        spec:
          tolerations:            # this toleration is the only added field
            - effect: NoSchedule
              key: node-role.kubernetes.io/master
              operator: Exists
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - csi-rbdplugin-provisioner
                  topologyKey: "kubernetes.io/hostname"
          serviceAccountName: rbd-csi-provisioner
          priorityClassName: system-cluster-critical
          containers:
            - name: csi-provisioner
              image: k8s.gcr.io/sig-storage/csi-provisioner:v2.0.4
              args:
                - "--csi-address=$(ADDRESS)"
                - "--v=5"
                - "--timeout=150s"
                - "--retry-interval-start=500ms"
                - "--leader-election=true"
                # set it to true to use topology based provisioning
                - "--feature-gates=Topology=false"
                # if fstype is not specified in storageclass, ext4 is default
                - "--default-fstype=ext4"
                - "--extra-create-metadata=true"
              env:
                - name: ADDRESS
                  value: unix:///csi/csi-provisioner.sock
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
            - name: csi-snapshotter
              image: k8s.gcr.io/sig-storage/csi-snapshotter:v4.0.0
              args:
                - "--csi-address=$(ADDRESS)"
                - "--v=5"
                - "--timeout=150s"
                - "--leader-election=true"
              env:
                - name: ADDRESS
                  value: unix:///csi/csi-provisioner.sock
              imagePullPolicy: "IfNotPresent"
              securityContext:
                privileged: true
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
            - name: csi-attacher
              image: k8s.gcr.io/sig-storage/csi-attacher:v3.0.2
              args:
                - "--v=5"
                - "--csi-address=$(ADDRESS)"
                - "--leader-election=true"
                - "--retry-interval-start=500ms"
              env:
                - name: ADDRESS
                  value: /csi/csi-provisioner.sock
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
            - name: csi-resizer
              image: k8s.gcr.io/sig-storage/csi-resizer:v1.0.1
              args:
                - "--csi-address=$(ADDRESS)"
                - "--v=5"
                - "--timeout=150s"
                - "--leader-election"
                - "--retry-interval-start=500ms"
                - "--handle-volume-inuse-error=false"
              env:
                - name: ADDRESS
                  value: unix:///csi/csi-provisioner.sock
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
            - name: csi-rbdplugin
              securityContext:
                privileged: true
                capabilities:
                  add: ["SYS_ADMIN"]
              # for stable functionality replace canary with latest release version
              image: quay.io/cephcsi/cephcsi:v3.3.0
              args:
                - "--nodeid=$(NODE_ID)"
                - "--type=rbd"
                - "--controllerserver=true"
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--v=5"
                - "--drivername=rbd.csi.ceph.com"
                - "--pidlimit=-1"
                - "--rbdhardmaxclonedepth=8"
                - "--rbdsoftmaxclonedepth=4"
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                - name: NODE_ID
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
                # - name: POD_NAMESPACE
                #   valueFrom:
                #     fieldRef:
                #       fieldPath: spec.namespace
                # - name: KMS_CONFIGMAP_NAME
                #   value: encryptionConfig
                - name: CSI_ENDPOINT
                  value: unix:///csi/csi-provisioner.sock
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
                - mountPath: /dev
                  name: host-dev
                - mountPath: /sys
                  name: host-sys
                - mountPath: /lib/modules
                  name: lib-modules
                  readOnly: true
                - name: ceph-csi-config
                  mountPath: /etc/ceph-csi-config/
                - name: ceph-csi-encryption-kms-config
                  mountPath: /etc/ceph-csi-encryption-kms-config/
                - name: keys-tmp-dir
                  mountPath: /tmp/csi/keys
            - name: csi-rbdplugin-controller
              securityContext:
                privileged: true
                capabilities:
                  add: ["SYS_ADMIN"]
              # for stable functionality replace canary with latest release version
              image: quay.io/cephcsi/cephcsi:v3.3.0
              args:
                - "--type=controller"
                - "--v=5"
                - "--drivername=rbd.csi.ceph.com"
                - "--drivernamespace=$(DRIVER_NAMESPACE)"
              env:
                - name: DRIVER_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: ceph-csi-config
                  mountPath: /etc/ceph-csi-config/
                - name: keys-tmp-dir
                  mountPath: /tmp/csi/keys
            - name: liveness-prometheus
              image: quay.io/cephcsi/cephcsi:v3.3.0
              args:
                - "--type=liveness"
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--metricsport=8680"
                - "--metricspath=/metrics"
                - "--polltime=60s"
                - "--timeout=3s"
              env:
                - name: CSI_ENDPOINT
                  value: unix:///csi/csi-provisioner.sock
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
              imagePullPolicy: "IfNotPresent"
          volumes:
            - name: host-dev
              hostPath:
                path: /dev
            - name: host-sys
              hostPath:
                path: /sys
            - name: lib-modules
              hostPath:
                path: /lib/modules
            - name: socket-dir
              emptyDir: {
                medium: "Memory"
              }
            - name: ceph-csi-config
              configMap:
                name: ceph-csi-config
            - name: ceph-csi-encryption-kms-config
              configMap:
                name: ceph-csi-encryption-kms-config
            - name: keys-tmp-dir
              emptyDir: {
                medium: "Memory"
              }
    $ cat csi-rbdplugin.yaml
    ---
    kind: DaemonSet
    apiVersion: apps/v1
    metadata:
      name: csi-rbdplugin
    spec:
      selector:
        matchLabels:
          app: csi-rbdplugin
      template:
        metadata:
          labels:
            app: csi-rbdplugin
        spec:
          serviceAccountName: rbd-csi-nodeplugin
          hostNetwork: true
          hostPID: true
          priorityClassName: system-node-critical
          # to use e.g. Rook orchestrated cluster, and mons' FQDN is
          # resolved through k8s service, set dns policy to cluster first
          dnsPolicy: ClusterFirstWithHostNet
          containers:
            - name: driver-registrar
              # This is necessary only for systems with SELinux, where
              # non-privileged sidecar containers cannot access unix domain socket
              # created by privileged CSI driver container.
              securityContext:
                privileged: true
              image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.0.1
              args:
                - "--v=5"
                - "--csi-address=/csi/csi.sock"
                - "--kubelet-registration-path=/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock"
              env:
                - name: KUBE_NODE_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
                - name: registration-dir
                  mountPath: /registration
            - name: csi-rbdplugin
              securityContext:
                privileged: true
                capabilities:
                  add: ["SYS_ADMIN"]
                allowPrivilegeEscalation: true
              # for stable functionality replace canary with latest release version
              image: quay.io/cephcsi/cephcsi:v3.3.0
              args:
                - "--nodeid=$(NODE_ID)"
                - "--type=rbd"
                - "--nodeserver=true"
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--v=5"
                - "--drivername=rbd.csi.ceph.com"
                # If topology based provisioning is desired, configure required
                # node labels representing the nodes topology domain
                # and pass the label names below, for CSI to consume and advertise
                # its equivalent topology domain
                # - "--domainlabels=failure-domain/region,failure-domain/zone"
              env:
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                - name: NODE_ID
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
                # - name: POD_NAMESPACE
                #   valueFrom:
                #     fieldRef:
                #       fieldPath: spec.namespace
                # - name: KMS_CONFIGMAP_NAME
                #   value: encryptionConfig
                - name: CSI_ENDPOINT
                  value: unix:///csi/csi.sock
              imagePullPolicy: "IfNotPresent"
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
                - mountPath: /dev
                  name: host-dev
                - mountPath: /sys
                  name: host-sys
                - mountPath: /run/mount
                  name: host-mount
                - mountPath: /lib/modules
                  name: lib-modules
                  readOnly: true
                - name: ceph-csi-config
                  mountPath: /etc/ceph-csi-config/
                - name: ceph-csi-encryption-kms-config
                  mountPath: /etc/ceph-csi-encryption-kms-config/
                - name: plugin-dir
                  mountPath: /var/lib/kubelet/plugins
                  mountPropagation: "Bidirectional"
                - name: mountpoint-dir
                  mountPath: /var/lib/kubelet/pods
                  mountPropagation: "Bidirectional"
                - name: keys-tmp-dir
                  mountPath: /tmp/csi/keys
            - name: liveness-prometheus
              securityContext:
                privileged: true
              image: quay.io/cephcsi/cephcsi:v3.3.0
              args:
                - "--type=liveness"
                - "--endpoint=$(CSI_ENDPOINT)"
                - "--metricsport=8680"
                - "--metricspath=/metrics"
                - "--polltime=60s"
                - "--timeout=3s"
              env:
                - name: CSI_ENDPOINT
                  value: unix:///csi/csi.sock
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
              volumeMounts:
                - name: socket-dir
                  mountPath: /csi
              imagePullPolicy: "IfNotPresent"
          volumes:
            - name: socket-dir
              hostPath:
                path: /var/lib/kubelet/plugins/rbd.csi.ceph.com
                type: DirectoryOrCreate
            - name: plugin-dir
              hostPath:
                path: /var/lib/kubelet/plugins
                type: Directory
            - name: mountpoint-dir
              hostPath:
                path: /var/lib/kubelet/pods
                type: DirectoryOrCreate
            - name: registration-dir
              hostPath:
                path: /var/lib/kubelet/plugins_registry/
                type: Directory
            - name: host-dev
              hostPath:
                path: /dev
            - name: host-sys
              hostPath:
                path: /sys
            - name: host-mount
              hostPath:
                path: /run/mount
            - name: lib-modules
              hostPath:
                path: /lib/modules
            - name: ceph-csi-config
              configMap:
                name: ceph-csi-config
            - name: ceph-csi-encryption-kms-config
              configMap:
                name: ceph-csi-encryption-kms-config
            - name: keys-tmp-dir
              emptyDir: {
                medium: "Memory"
              }
    ---
    # This is a service to expose the liveness metrics
    apiVersion: v1
    kind: Service
    metadata:
      name: csi-metrics-rbdplugin
      labels:
        app: csi-metrics
    spec:
      ports:
        - name: http-metrics
          port: 8080
          protocol: TCP
          targetPort: 8680
      selector:
        app: csi-rbdplugin

The two manifests above reference several images from the official Kubernetes registries, so I rented a cloud host and pulled them manually. Before installing the plugins, it is recommended to import the images from my Baidu netdisk (extraction code: 1234) into your k8s environment; if you have another way of pulling the official images, ignore this suggestion.

Once the images are imported, run the following two commands to create the resources:

    $ kubectl apply -f csi-rbdplugin-provisioner.yaml
    $ kubectl apply -f csi-rbdplugin.yaml
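
Before continuing, it is worth confirming that the provisioner Deployment and the node DaemonSet pods come up. A hedged check using the labels defined in the manifests above (exact pod names and counts depend on your cluster):

    $ kubectl get pods -l app=csi-rbdplugin-provisioner
    $ kubectl get pods -l app=csi-rbdplugin -o wide   # one pod per node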

Create the StorageClass

You can define multiple StorageClasses. For example, if your Ceph cluster has both SSD- and HDD-backed pools, you can create one class per storage medium (a hedged sketch of a second class follows the example below).

    # Write the manifest
    $ cat <<EOF > csi-rbd-sc.yaml
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: csi-rbd-sc
    provisioner: rbd.csi.ceph.com   # the CSI driver to use
    parameters:
      clusterID: d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a   # your Ceph cluster id (query with ceph -s)
      pool: kubernetes              # which pool in the Ceph cluster to use
      imageFeatures: layering
      csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret   # the Secret holding the Ceph user and key
      csi.storage.k8s.io/provisioner-secret-namespace: default     # namespace of that Secret
      csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret    # Secret used when staging/mounting (usually the same)
      csi.storage.k8s.io/node-stage-secret-namespace: default      # same as above
      csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
      csi.storage.k8s.io/controller-expand-secret-namespace: default
    reclaimPolicy: Delete           # reclaim policy
    allowVolumeExpansion: true
    mountOptions:
      - discard
    EOF
    # Create the StorageClass
    $ kubectl apply -f csi-rbd-sc.yaml
    # Confirm it exists
    $ kubectl get sc
    NAME         PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    csi-rbd-sc   rbd.csi.ceph.com   Delete          Immediate           true                   2s
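
As a minimal sketch of the "one class per medium" idea mentioned above: assuming a second, hypothetical RBD pool named kubernetes-hdd already exists and has been initialized on the Ceph side, a second class would differ only in its name and pool:

    # Hypothetical second StorageClass for an HDD-backed pool (pool name is an assumption)
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: csi-rbd-sc-hdd
    provisioner: rbd.csi.ceph.com
    parameters:
      clusterID: d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a
      pool: kubernetes-hdd          # assumed HDD-backed pool
      imageFeatures: layering
      csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
      csi.storage.k8s.io/provisioner-secret-namespace: default
      csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
      csi.storage.k8s.io/node-stage-secret-namespace: default
      csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
      csi.storage.k8s.io/controller-expand-secret-namespace: default
    reclaimPolicy: Delete
    allowVolumeExpansion: true

Note that the client.kubernetes user created earlier was only granted caps on the kubernetes pool, so its caps would also need to be extended (or a separate user/Secret created) to cover any additional pool.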

Create a PVC to dynamically request storage

A PVC can request storage at the block level or at the filesystem level. Filesystem mode is usually the right choice: if you request a raw block device it is exposed to the pod as a block device, and you would still have to format it after the pod starts, which is rarely what you want. If you do want a raw block volume, see the official documentation; it is almost identical, except volumeMode becomes Block and the pod consumes it through volumeDevices instead of volumeMounts (a hedged sketch follows the filesystem example below).

    # Request filesystem-level storage
    $ cat <<EOF > rbd-pvc.yaml
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: rbd-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Filesystem
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-rbd-sc
    EOF
    # Create the PVC
    $ kubectl apply -f rbd-pvc.yaml
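
For reference, here is a hedged sketch of the raw-block variant mentioned above (the name raw-rbd-pvc and the /dev/xvda device path are illustrative, not from the original walkthrough):

    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: raw-rbd-pvc              # illustrative name
    spec:
      accessModes:
        - ReadWriteOnce
      volumeMode: Block              # raw block instead of Filesystem
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-rbd-sc
    ---
    # In the consuming pod, use volumeDevices rather than volumeMounts:
    #   volumeDevices:
    #   - name: data
    #     devicePath: /dev/xvda      # illustrative device path inside the container
    #   volumes:
    #   - name: data
    #     persistentVolumeClaim:
    #       claimName: raw-rbd-pvc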

Check the SC, PV, and PVC objects

Once the PVC is created, it asks the StorageClass to provision a matching PV, and the two are bound. From the user's point of view the whole process is automatic.

    $ kubectl get pv,pvc,sc
    NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
    persistentvolume/pvc-74eb57c1-9cb5-442f-8dfe-71f2e869f4df   1Gi        RWO            Delete           Bound    default/rbd-pvc   csi-rbd-sc              18s
    NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/rbd-pvc   Bound    pvc-74eb57c1-9cb5-442f-8dfe-71f2e869f4df   1Gi        RWO            csi-rbd-sc     18s
    NAME                                     PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    storageclass.storage.k8s.io/csi-rbd-sc   rbd.csi.ceph.com   Delete          Immediate           true                   75m
    # On the Ceph cluster there is a matching RBD image
    $ rbd -p kubernetes ls
    csi-vol-1310f455-a180-11eb-9aeb-6e7f74732123

With that, the full PV/PVC/SC workflow is in place; this is also the most practical storage setup for production.

Create a pod that mounts the PVC

Now define a Deployment that mounts the PVC created above.

    $ cat > nginx.yaml << EOF
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-nginx
      labels:
        k8s.cn/layer: web
    spec:
      replicas: 1
      selector:
        matchLabels:
          k8s.cn/layer: web
      template:
        metadata:
          labels:
            k8s.cn/layer: web
        spec:
          containers:
          - image: nginx
            imagePullPolicy: IfNotPresent
            name: nginx
            ports:
            - containerPort: 80
              name: www
              protocol: TCP
            volumeMounts:
            - mountPath: /data
              name: rbd             # must match the volume name defined below
          volumes:
          - name: rbd
            persistentVolumeClaim:
              claimName: rbd-pvc    # name of the PVC to use
    EOF
    # Create the Deployment
    $ kubectl apply -f nginx.yaml

Verify the storage mount

    # Look up the pod name yourself, then open a shell in it
    $ kubectl exec -it web-nginx-7698cd7569-5bjf7 -- /bin/bash
    root@web-nginx-7698cd7569-5bjf7:/# df -hT /data    # check the /data mount
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/rbd0      ext4  976M  2.6M  958M   1% /data

At this point the StorageClass works, but because we verified it with a Deployment its benefits are not fully visible: we still had to create the PVC by hand beforehand. Consuming storage from a StatefulSet is more automatic, so let's try that next.

StorageClass best practice

In one sentence: StorageClass shows its full power when combined with StatefulSets.

The YAML below is adapted from the official documentation.

    # Define the StatefulSet
    $ cat > nginx.yaml << EOF
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      selector:
        matchLabels:
          app: nginx            # has to match .spec.template.metadata.labels
      serviceName: "nginx"
      replicas: 3               # by default is 1
      template:
        metadata:
          labels:
            app: nginx          # has to match .spec.selector.matchLabels
        spec:
          terminationGracePeriodSeconds: 10
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
              name: web
            volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:     # the key field: the volume claim template creates the matching PVCs and PVs
      - metadata:
          name: www
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "csi-rbd-sc"
          resources:
            requests:
              storage: 1Gi
    EOF
    # Create the StatefulSet
    $ kubectl apply -f nginx.yaml
    # Check the created objects (the PVs and PVCs were created and bound automatically)
    $ kubectl get sts,pvc,pv
    NAME                   READY   AGE
    statefulset.apps/web   3/3     20m
    NAME                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/www-web-0   Bound    pvc-aff8fcc7-41bb-467a-a3d7-e29bbdede904   1Gi        RWO            csi-rbd-sc     20m
    persistentvolumeclaim/www-web-1   Bound    pvc-cc929a1b-0401-48aa-bf2a-dc2dca28079a   1Gi        RWO            csi-rbd-sc     20m
    persistentvolumeclaim/www-web-2   Bound    pvc-b077642f-b802-4b1a-b02b-df02d55a8891   1Gi        RWO            csi-rbd-sc     19m
    NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
    persistentvolume/pvc-aff8fcc7-41bb-467a-a3d7-e29bbdede904   1Gi        RWO            Delete           Bound    default/www-web-0   csi-rbd-sc              20m
    persistentvolume/pvc-b077642f-b802-4b1a-b02b-df02d55a8891   1Gi        RWO            Delete           Bound    default/www-web-2   csi-rbd-sc              19m
    persistentvolume/pvc-cc929a1b-0401-48aa-bf2a-dc2dca28079a   1Gi        RWO            Delete           Bound    default/www-web-1   csi-rbd-sc              20m
    $ kubectl get pods | grep web
    web-0   1/1   Running   0   19m
    web-1   1/1   Running   0   19m
    web-2   1/1   Running   0   18m
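
Since the StorageClass points at the kubernetes pool, each replica should now be backed by its own csi-vol image. A hedged way to confirm this from the Ceph side (the exact image names will differ in your cluster):

    $ rbd -p kubernetes ls | grep csi-vol
    $ rbd info kubernetes/<one-of-the-csi-vol-images>   # placeholder: pick any image from the list above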