id: install_cluster-milvusoperator.md
label: Milvus Operator
related_key: Kubernetes
order: 2
group: install_cluster-docker.md
summary: Learn how to install a Milvus cluster on Kubernetes with Milvus Operator.
Install Milvus Cluster
{{fragments/installation_guide_cluster.md}}
{{tab}}
Create a Kubernetes cluster
If you have already deployed a K8s cluster for production, you can skip this step and go straight to deploying Milvus Operator. If not, follow the steps below to quickly create a K8s cluster for testing and use it to install a Milvus cluster. This article covers two ways to create a K8s cluster:
- Create a Kubernetes cluster in a virtual machine using minikube
- Create a Kubernetes cluster in Docker using kind
Create a K8s cluster in a virtual machine using minikube
minikube is a tool that lets you easily run Kubernetes locally.
1. Install minikube
See Prerequisites for more details.
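For reference, one common way to install minikube is to download the binary from its official release channel. This is only a sketch that assumes a Linux amd64 host; adjust the file name for your OS and architecture, or see the minikube documentation for other install methods.
$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
$ sudo install minikube-linux-amd64 /usr/local/bin/minikube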
2. Start a K8s cluster with minikube
After installing minikube, run the following command to start a K8s cluster.
$ minikube start
After the K8s cluster starts successfully, you can see output similar to the following. The exact output may vary depending on your operating system and hypervisor.
😄  minikube v1.21.0 on Darwin 11.4
🎉  minikube 1.23.2 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.23.2
💡  To disable this notice, run: 'minikube config set WantUpdateNotification false'
✨  Automatically selected the docker driver. Other choices: hyperkit, ssh
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
❗  minikube was unable to download gcr.io/k8s-minikube/kicbase:v0.0.23, but successfully downloaded kicbase/stable:v0.0.23 as a fallback image
🔥  Creating docker container (CPUs=2, Memory=8100MB) ...
❗  This container is having trouble accessing https://k8s.gcr.io
💡  To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳  Preparing Kubernetes v1.20.7 on Docker 20.10.7 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
3. Check the K8s cluster status
Run $ kubectl cluster-info to check the status of the K8s cluster you just created, and make sure you can access it via kubectl. The output is as follows:
Kubernetes control plane is running at https://127.0.0.1:63754
KubeDNS is running at https://127.0.0.1:63754/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Create a K8s cluster using kind
kind is a tool for running local Kubernetes clusters using Docker containers as nodes.
1. Create a configuration file
Create the kind.yaml configuration file.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
2. Create a K8s cluster
Create a K8s cluster using the kind.yaml configuration file.
$ kind create cluster --name myk8s --config kind.yaml
After the K8s cluster starts successfully, you can see the following output:
Creating cluster "myk8s" ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦 📦 📦 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
 ✓ Joining worker nodes 🚜
Set kubectl context to "kind-myk8s"
You can now use your cluster with:

kubectl cluster-info --context kind-myk8s

Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
3. Check the K8s cluster status
Run $ kubectl cluster-info to check the status of the K8s cluster you just created, and make sure you can access it via kubectl. The output is as follows:
Kubernetes control plane is running at https://127.0.0.1:55668
CoreDNS is running at https://127.0.0.1:55668/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Deploy Milvus Operator
Milvus Operator helps you deploy the complete Milvus stack on a target K8s cluster, including all Milvus components and third-party dependencies such as etcd, Pulsar, and MinIO. Milvus Operator defines a custom resource for Milvus clusters on top of Kubernetes custom resources. Once the custom resource is defined, you can use the K8s API declaratively to manage the Milvus deployment stack and keep the service scalable and highly available.
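Concretely, a whole Milvus cluster is described by a single MilvusCluster object. Stripped of all configuration, such an object looks roughly like the sketch below (the apiVersion, kind, and release name match the sample manifest applied later in this guide; fields under spec are omitted here because the Operator fills in defaults). Applying, editing, and deleting this one object is how you create, reconfigure, and tear down the entire cluster.
apiVersion: milvus.io/v1alpha1
kind: MilvusCluster
metadata:
  name: my-release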
Prerequisites
- Make sure you can access the K8s cluster via kubectl.
- Make sure a StorageClass is installed. minikube and kind install a default StorageClass out of the box. Run kubectl get sc to check whether a StorageClass is installed. If one is installed, you can see the following output. If not, configure a StorageClass manually; see Change the default StorageClass, and the sketch after the output below.
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  3m36s
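If your cluster already has a suitable StorageClass that is simply not marked as the default, one common fix (a sketch; replace <storageclass-name> with the name reported by kubectl get sc) is to annotate it as the default class:
$ kubectl patch storageclass <storageclass-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'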
1. Install cert-manager
Milvus Operator uses cert-manager to issue certificates for its webhook server. Run the following command to install cert-manager.
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.3/cert-manager.yaml
After the installation succeeds, you can see the following output:
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
namespace/cert-manager created
serviceaccount/cert-manager-cainjector created
serviceaccount/cert-manager created
serviceaccount/cert-manager-webhook created
clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
clusterrole.rbac.authorization.k8s.io/cert-manager-view created
clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
role.rbac.authorization.k8s.io/cert-manager:leaderelection created
role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
service/cert-manager created
service/cert-manager-webhook created
deployment.apps/cert-manager-cainjector created
deployment.apps/cert-manager created
deployment.apps/cert-manager-webhook created
mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
Run $ kubectl get pods -n cert-manager to check whether cert-manager is running. If it is, all of its pods show as running, as shown below:
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-848f547974-gccz8             1/1     Running   0          70s
cert-manager-cainjector-54f4cc6b5-dpj84   1/1     Running   0          70s
cert-manager-webhook-7c9588c76-tqncn      1/1     Running   0          70s
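If you would rather block until cert-manager is ready instead of polling manually, a command along the following lines can be used (a sketch; the 300-second timeout is an arbitrary choice):
$ kubectl wait --namespace cert-manager --for=condition=Ready pods --all --timeout=300s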
2. Install Milvus Operator
Run the following command to install Milvus Operator.
$ kubectl apply -f https://raw.githubusercontent.com/milvus-io/milvus-operator/main/deploy/manifests/deployment.yaml
After the installation succeeds, you can see the following output:
namespace/milvus-operator created
customresourcedefinition.apiextensions.k8s.io/milvusclusters.milvus.io created
serviceaccount/milvus-operator-controller-manager created
role.rbac.authorization.k8s.io/milvus-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/milvus-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/milvus-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/milvus-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/milvus-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/milvus-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/milvus-operator-proxy-rolebinding created
configmap/milvus-operator-manager-config created
service/milvus-operator-controller-manager-metrics-service created
service/milvus-operator-webhook-service created
deployment.apps/milvus-operator-controller-manager created
certificate.cert-manager.io/milvus-operator-serving-cert created
issuer.cert-manager.io/milvus-operator-selfsigned-issuer created
mutatingwebhookconfiguration.admissionregistration.k8s.io/milvus-operator-mutating-webhook-configuration created
validatingwebhookconfiguration.admissionregistration.k8s.io/milvus-operator-validating-webhook-configuration created
Run $ kubectl get pods -n milvus-operator to check whether Milvus Operator is running. If it is, the Milvus Operator pod shows as running, as shown below:
NAME                                                  READY   STATUS    RESTARTS   AGE
milvus-operator-controller-manager-698fc7dc8d-rlmtk   1/1     Running   0          46s
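If the pod does not reach the Running state, inspecting the controller logs is usually the quickest way to find out why. A sketch, using the deployment name from the installation output above:
$ kubectl logs -n milvus-operator deployment/milvus-operator-controller-manager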
Install Milvus Cluster
This article installs a Milvus cluster with the default configuration. All Milvus components run with multiple replicas, which consumes considerable resources. If local resources are limited, you can install a Milvus cluster with the minimum configuration, as sketched below.
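The sketch below only illustrates the idea of shrinking the footprint by lowering replica counts in the MilvusCluster resource. The field names under spec are assumptions inferred from the layout of the default sample, not a verified schema; consult the sample manifests in the milvus-operator repository for the authoritative fields.
apiVersion: milvus.io/v1alpha1
kind: MilvusCluster
metadata:
  name: my-release
spec:
  components:              # assumed field: per-component settings
    proxy:
      replicas: 1          # run a single proxy replica
  dependencies:            # assumed field: etcd/Pulsar/MinIO settings
    etcd:
      inCluster:
        values:
          replicaCount: 1  # single-member etcd, for testing only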
1. Deploy a Milvus cluster
Once Milvus Operator is up, run the following command to deploy a Milvus cluster.
$ kubectl apply -f https://raw.githubusercontent.com/milvus-io/milvus-operator/main/config/samples/milvuscluster_default.yaml
After the deployment completes, you can see the following output:
milvuscluster.milvus.io/my-release created
2. Check the Milvus cluster status
Run the following command to check the Milvus cluster status.
$ kubectl get mc my-release -o yaml
You can confirm the current state of the Milvus cluster from the status field in the output. While the Milvus cluster is still being created, status shows Unhealthy.
apiVersion: milvus.io/v1alpha1
kind: MilvusCluster
metadata:
...
status:
  conditions:
  - lastTransitionTime: "2021-11-02T02:52:04Z"
    message: 'Get "http://my-release-minio.default:9000/minio/admin/v3/info": dial tcp 10.96.78.153:9000: connect: connection refused'
    reason: ClientError
    status: "False"
    type: StorageReady
  - lastTransitionTime: "2021-11-02T02:52:04Z"
    message: connection error
    reason: PulsarNotReady
    status: "False"
    type: PulsarReady
  - lastTransitionTime: "2021-11-02T02:52:04Z"
    message: All etcd endpoints are unhealthy
    reason: EtcdNotReady
    status: "False"
    type: EtcdReady
  - lastTransitionTime: "2021-11-02T02:52:04Z"
    message: Milvus Dependencies is not ready
    reason: DependencyNotReady
    status: "False"
    type: MilvusReady
  endpoint: my-release-milvus.default:19530
  status: Unhealthy
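To watch only the overall state instead of the full object, a jsonpath query such as the following can be handy (a sketch; the field path mirrors the status block shown above):
$ kubectl get mc my-release -o jsonpath='{.status.status}'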
Run the following command to check the current status of the Milvus pods.
$ kubectl get pods
NAME                                  READY   STATUS              RESTARTS   AGE
my-release-etcd-0                     0/1     Running             0          16s
my-release-etcd-1                     0/1     ContainerCreating   0          16s
my-release-etcd-2                     0/1     ContainerCreating   0          16s
my-release-minio-0                    1/1     Running             0          16s
my-release-minio-1                    1/1     Running             0          16s
my-release-minio-2                    0/1     Running             0          16s
my-release-minio-3                    0/1     ContainerCreating   0          16s
my-release-pulsar-bookie-0            0/1     Pending             0          15s
my-release-pulsar-bookie-1            0/1     Pending             0          15s
my-release-pulsar-bookie-init-h6tfz   0/1     Init:0/1            0          15s
my-release-pulsar-broker-0            0/1     Init:0/2            0          15s
my-release-pulsar-broker-1            0/1     Init:0/2            0          15s
my-release-pulsar-proxy-0             0/1     Init:0/2            0          16s
my-release-pulsar-proxy-1             0/1     Init:0/2            0          15s
my-release-pulsar-pulsar-init-d2t56   0/1     Init:0/2            0          15s
my-release-pulsar-recovery-0          0/1     Init:0/1            0          16s
my-release-pulsar-toolset-0           1/1     Running             0          16s
my-release-pulsar-zookeeper-0         0/1     Pending             0          16s
3. Start the Milvus components
Milvus Operator first creates the third-party dependencies, such as etcd, Pulsar, and MinIO, and only then creates the Milvus components. Therefore, at this point you can only see the pods for etcd, Pulsar, and MinIO. Milvus Operator starts the Milvus components once all third-party dependencies are up. The Milvus cluster status looks as follows:
...
status:
  conditions:
  - lastTransitionTime: "2021-11-02T05:59:41Z"
    reason: StorageReady
    status: "True"
    type: StorageReady
  - lastTransitionTime: "2021-11-02T06:06:23Z"
    message: Pulsar is ready
    reason: PulsarReady
    status: "True"
    type: PulsarReady
  - lastTransitionTime: "2021-11-02T05:59:41Z"
    message: Etcd endpoints is healthy
    reason: EtcdReady
    status: "True"
    type: EtcdReady
  - lastTransitionTime: "2021-11-02T06:06:24Z"
    message: '[datacoord datanode indexcoord indexnode proxy querycoord querynode rootcoord] not ready'
    reason: MilvusComponentNotHealthy
    status: "False"
    type: MilvusReady
  endpoint: my-release-milvus.default:19530
  status: Unhealthy
Check the Milvus pod status again.
$ kubectl get pods
NAME                                            READY   STATUS              RESTARTS   AGE
my-release-etcd-0                               1/1     Running             0          6m49s
my-release-etcd-1                               1/1     Running             0          6m49s
my-release-etcd-2                               1/1     Running             0          6m49s
my-release-milvus-datacoord-6c7bb4b488-k9htl    0/1     ContainerCreating   0          16s
my-release-milvus-datanode-5c686bd65-wxtmf      0/1     ContainerCreating   0          16s
my-release-milvus-indexcoord-586b9f4987-vb7m4   0/1     Running             0          16s
my-release-milvus-indexnode-5b9787b54-xclbx     0/1     ContainerCreating   0          16s
my-release-milvus-proxy-84f67cdb7f-pg6wf        0/1     ContainerCreating   0          16s
my-release-milvus-querycoord-865cc56fb4-w2jmn   0/1     Running             0          16s
my-release-milvus-querynode-5bcb59f6-nhqqw      0/1     ContainerCreating   0          16s
my-release-milvus-rootcoord-fdcccfc84-9964g     0/1     Running             0          16s
my-release-minio-0                              1/1     Running             0          6m49s
my-release-minio-1                              1/1     Running             0          6m49s
my-release-minio-2                              1/1     Running             0          6m49s
my-release-minio-3                              1/1     Running             0          6m49s
my-release-pulsar-bookie-0                      1/1     Running             0          6m48s
my-release-pulsar-bookie-1                      1/1     Running             0          6m48s
my-release-pulsar-bookie-init-h6tfz             0/1     Completed           0          6m48s
my-release-pulsar-broker-0                      1/1     Running             0          6m48s
my-release-pulsar-broker-1                      1/1     Running             0          6m48s
my-release-pulsar-proxy-0                       1/1     Running             0          6m49s
my-release-pulsar-proxy-1                       1/1     Running             0          6m48s
my-release-pulsar-pulsar-init-d2t56             0/1     Completed           0          6m48s
my-release-pulsar-recovery-0                    1/1     Running             0          6m49s
my-release-pulsar-toolset-0                     1/1     Running             0          6m49s
my-release-pulsar-zookeeper-0                   1/1     Running             0          6m49s
my-release-pulsar-zookeeper-1                   1/1     Running             0          6m
my-release-pulsar-zookeeper-2                   1/1     Running             0          6m26s
Once all components are up, the Milvus cluster's status shows Healthy.
...
status:
  conditions:
  - lastTransitionTime: "2021-11-02T05:59:41Z"
    reason: StorageReady
    status: "True"
    type: StorageReady
  - lastTransitionTime: "2021-11-02T06:06:23Z"
    message: Pulsar is ready
    reason: PulsarReady
    status: "True"
    type: PulsarReady
  - lastTransitionTime: "2021-11-02T05:59:41Z"
    message: Etcd endpoints is healthy
    reason: EtcdReady
    status: "True"
    type: EtcdReady
  - lastTransitionTime: "2021-11-02T06:12:36Z"
    message: All Milvus components are healthy
    reason: MilvusClusterHealthy
    status: "True"
    type: MilvusReady
  endpoint: my-release-milvus.default:19530
  status: Healthy
Check the Milvus pod status again. You can see that all pods are now running.
$ kubectl get pods
NAME                                            READY   STATUS      RESTARTS   AGE
my-release-etcd-0                               1/1     Running     0          14m
my-release-etcd-1                               1/1     Running     0          14m
my-release-etcd-2                               1/1     Running     0          14m
my-release-milvus-datacoord-6c7bb4b488-k9htl    1/1     Running     0          6m
my-release-milvus-datanode-5c686bd65-wxtmf      1/1     Running     0          6m
my-release-milvus-indexcoord-586b9f4987-vb7m4   1/1     Running     0          6m
my-release-milvus-indexnode-5b9787b54-xclbx     1/1     Running     0          6m
my-release-milvus-proxy-84f67cdb7f-pg6wf        1/1     Running     0          6m
my-release-milvus-querycoord-865cc56fb4-w2jmn   1/1     Running     0          6m
my-release-milvus-querynode-5bcb59f6-nhqqw      1/1     Running     0          6m
my-release-milvus-rootcoord-fdcccfc84-9964g     1/1     Running     0          6m
my-release-minio-0                              1/1     Running     0          14m
my-release-minio-1                              1/1     Running     0          14m
my-release-minio-2                              1/1     Running     0          14m
my-release-minio-3                              1/1     Running     0          14m
my-release-pulsar-bookie-0                      1/1     Running     0          14m
my-release-pulsar-bookie-1                      1/1     Running     0          14m
my-release-pulsar-bookie-init-h6tfz             0/1     Completed   0          14m
my-release-pulsar-broker-0                      1/1     Running     0          14m
my-release-pulsar-broker-1                      1/1     Running     0          14m
my-release-pulsar-proxy-0                       1/1     Running     0          14m
my-release-pulsar-proxy-1                       1/1     Running     0          14m
my-release-pulsar-pulsar-init-d2t56             0/1     Completed   0          14m
my-release-pulsar-recovery-0                    1/1     Running     0          14m
my-release-pulsar-toolset-0                     1/1     Running     0          14m
my-release-pulsar-zookeeper-0                   1/1     Running     0          14m
my-release-pulsar-zookeeper-1                   1/1     Running     0          13m
my-release-pulsar-zookeeper-2                   1/1     Running     0          13m
After installing a Milvus cluster, you can learn how to Manage Milvus Connections.
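As a quick local connectivity check, you can forward the Milvus service port to your machine and point a client at localhost:19530. This is a sketch; the service name and port come from the endpoint my-release-milvus.default:19530 reported in the cluster status above.
$ kubectl port-forward service/my-release-milvus 19530:19530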
Uninstall the Milvus cluster
Run the following command to uninstall the Milvus cluster.
$ kubectl delete mc my-release
Delete the K8s cluster
When you no longer need the K8s cluster in the test environment, you can delete it.
If you installed the K8s cluster with minikube, run $ minikube delete.
If you installed the K8s cluster with kind, run $ kind delete cluster --name myk8s.
What's next
Having installed Milvus, you can:
- Read Hello Milvus to run example code with SDKs in different languages and explore Milvus features.
- Learn the basic operations of Milvus.
- Upgrade Milvus 2.0 using Helm Chart.
- Scale your Milvus cluster.
- Deploy your Milvus cluster on clouds.
- Learn how to import or export data in Milvus with the open-source tool MilvusDM.
- Set up monitoring.
