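Preparation
A docker-swarm cluster consists of manager nodes and worker nodes, so you need several machines with docker installed to act as nodes. I prepared three Linux virtual machines (vm1, vm2, vm3) to build a small docker-swarm cluster with one manager node and two worker nodes.
docker-swarm requires daemon API version 1.24 or later. Run docker version to check the daemon API version; mine is 1.41.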
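The version check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell and a reachable docker daemon; the function names api_version_ok and check_swarm_support are hypothetical helpers introduced here, not part of docker.

```shell
# Succeeds if the given API version (e.g. "1.41") is at least 1.24,
# the minimum daemon API version quoted above.
api_version_ok() {
    major=${1%%.*}   # text before the first dot
    minor=${1#*.}    # text after the first dot
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 24 ]; }
}

# Hypothetical helper: asks the local daemon for its API version and checks it.
check_swarm_support() {
    version=$(docker version --format '{{.Server.APIVersion}}') || return 1
    if api_version_ok "$version"; then
        echo "API version $version: swarm mode supported"
    else
        echo "API version $version: too old for swarm mode"
        return 1
    fi
}
```

Running check_swarm_support on each of the three virtual machines before building the cluster should catch an outdated daemon early.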
$ docker version
Client: Docker Engine - Community
 Version:           20.10.0
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        7287ab3
 Built:             Tue Dec 8 18:57:35 2020
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Server: Docker Engine - Community
 Engine:
  Version:          20.10.0
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       eeddea2
  Built:            Tue Dec 8 18:56:55 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
Building the cluster
Creating the manager node
Docker is already installed on all three virtual machines. I use vm1 as the manager node and run docker swarm init on it.
$ docker swarm init --advertise-addr 192.168.48.128
Swarm initialized: current node (d65uz80dl1y5cf43717fh2m07) is now a manager.
To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-4bfzferp69e97yzd7f83d6cadjjov5sl9klyqr8mhe0tp89m2o-6ic02cl2fzyq3jfvz7n611ujl \
    192.168.48.128:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
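If your Docker host has several network interfaces and therefore several IP addresses, you must pick one with --advertise-addr. The node on which docker swarm init is run automatically becomes a manager node.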
Adding worker nodes
Run the following on vm2 and vm3
$ docker swarm join \
    --token SWMTKN-1-4bfzferp69e97yzd7f83d6cadjjov5sl9klyqr8mhe0tp89m2o-6ic02cl2fzyq3jfvz7n611ujl \
    192.168.48.128:2377
This node joined a swarm as a worker.
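If the join command printed by docker swarm init is no longer on screen, a manager node can print the worker token again. The following is a minimal sketch, assuming it runs on the manager node; worker_join_command is a hypothetical helper name, and the manager address is passed in as an argument rather than detected.

```shell
# Hypothetical helper: rebuilds the worker join command on a manager node.
# $1: manager address as host:port, e.g. 192.168.48.128:2377
worker_join_command() {
    # -q (--quiet) makes join-token print only the token itself
    token=$(docker swarm join-token -q worker) || return 1
    echo "docker swarm join --token $token $1"
}
```

Calling worker_join_command 192.168.48.128:2377 prints a command that can be pasted on a worker. Running docker swarm join-token worker without -q prints the full command directly as well.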
If docker swarm join fails with the following error
Error response from daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 192.168.48.128:2377: connect: no route to host"
the cause is the firewall on the manager node's machine. Check the firewall state with systemctl status firewalld.service
$ systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-12-10 00:36:44 CST; 9h ago
     Docs: man:firewalld(1)
 Main PID: 737 (firewalld)
    Tasks: 2
   Memory: 1.4M
   CGroup: /system.slice/firewalld.service
           └─737 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid
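The firewall is running, so you either have to open port 2377 or stop the firewall altogether. Since this is only a demo environment, I take the simple route and stop it
$ systemctl stop firewalld.service
Once the firewall is stopped, docker swarm join succeeds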
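Outside a demo environment, opening the required ports is usually preferable to stopping the firewall. The following is a sketch, assuming firewalld and the ports listed in the Docker swarm documentation (2377/tcp for cluster management, 7946/tcp and 7946/udp for node communication, 4789/udp for overlay network traffic). The function name swarm_firewall_commands is hypothetical. It only prints the commands so they can be reviewed before anything is changed.

```shell
# Hypothetical helper: prints the firewall-cmd calls that open the swarm ports.
# Nothing is executed here; the output is meant to be reviewed first.
swarm_firewall_commands() {
    for port in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
        echo "firewall-cmd --permanent --add-port=$port"
    done
    # --permanent rules take effect only after a reload
    echo "firewall-cmd --reload"
}
```

After checking the output, it can be applied with swarm_firewall_commands | sudo sh on each node.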
Inspecting the cluster
Run docker node ls on the manager node to list the nodes in the cluster
$ docker node ls
ID                            HOSTNAME                STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
50x6p24w5b1csmcoolrbjeh20     localhost.localdomain   Ready    Active                          20.10.0
d65uz80dl1y5cf43717fh2m07 *   localhost.localdomain   Ready    Active         Leader           20.10.0
oo1py2br7jr96lk8rv7wkcxwt     localhost.localdomain   Ready    Active                          20.10.0
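The node d65uz80dl1y5cf43717fh2m07 is the leader (the manager node, marked with * because the command was run there); the other two, with an empty MANAGER STATUS column, are worker nodes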
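The same listing can be narrowed down by role. The following is a small sketch, assuming it runs on a manager node; count_nodes is a hypothetical helper name, while --filter role=... and --format are options of docker node ls.

```shell
# Hypothetical helper: counts swarm nodes with the given role.
# $1: "manager" or "worker"
count_nodes() {
    # One node ID per line, then count the lines
    docker node ls --filter "role=$1" --format '{{.ID}}' | wc -l | tr -d ' '
}
```

For the cluster built here, count_nodes manager and count_nodes worker would be expected to report one manager and two workers.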
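Managing services
The docker service command manages services in the cluster. It can only be run on a manager node.
Creating a service
Taking nginx as an example, create an nginx service with docker service create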
$ docker service create --replicas 2 -p 80:80 --name nginx nginx:1.19.5-alpine
image nginx:1.19.5-alpine could not be accessed on a registry to record
its digest. Each node will access nginx:1.19.5-alpine independently,
possibly leading to different nodes running different
versions of the image.
e3gn7co90wqp4zbwpse6719bq
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
Once the service has been created, nginx can be reached on port 80 of any node in the cluster
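One way to confirm that every node answers on port 80 is to request the page from each of them. The following is a minimal sketch, assuming curl is installed; check_nodes is a hypothetical helper name, and the node addresses are passed as arguments because only some of them appear in this article.

```shell
# Hypothetical helper: prints "<host> <HTTP status>" for each node given.
check_nodes() {
    for host in "$@"; do
        # -s: silent, -o /dev/null: discard the body,
        # -w '%{http_code}': print only the HTTP status code
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$host/")
        echo "$host $code"
    done
}
```

Calling check_nodes with the three node addresses should print one line per node; a node that cannot be reached is reported by curl with status 000.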
Inspecting services
docker service ls
Use docker service ls to list the services running in the swarm cluster
$ docker service ls
ID             NAME    MODE         REPLICAS   IMAGE                 PORTS
e3gn7co90wqp   nginx   replicated   2/2        nginx:1.19.5-alpine   *:80->80/tcp
docker service inspect
Use docker service inspect to view the detailed configuration of a service. The syntax is
docker service inspect [OPTIONS] SERVICE [SERVICE...]
Example
$ docker service inspect nginx
[
  {
    "ID": "c1u4o956nygnthpclzpvp91r4",
    "Version": {"Index": 973},
    "CreatedAt": "2020-12-11T03:06:22.605951309Z",
    "UpdatedAt": "2020-12-11T03:06:33.981409197Z",
    "Spec": {
      "Name": "nginx",
      "Labels": {},
      "TaskTemplate": {
        "ContainerSpec": {
          "Image": "nginx:1.19.5-alpine@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e",
          "Init": false,
          "StopGracePeriod": 10000000000,
          "DNSConfig": {},
          "Isolation": "default"
        },
        "Resources": {"Limits": {}, "Reservations": {}},
        "RestartPolicy": {"Condition": "any", "Delay": 5000000000, "MaxAttempts": 0},
        "Placement": {
          "Platforms": [
            {"Architecture": "amd64", "OS": "linux"},
            {"OS": "linux"},
            {"OS": "linux"},
            {"Architecture": "arm64", "OS": "linux"},
            {"Architecture": "386", "OS": "linux"},
            {"Architecture": "ppc64le", "OS": "linux"},
            {"Architecture": "s390x", "OS": "linux"}
          ]
        },
        "ForceUpdate": 0,
        "Runtime": "container"
      },
      "Mode": {"Replicated": {"Replicas": 4}},
      "UpdateConfig": {
        "Parallelism": 1,
        "FailureAction": "pause",
        "Monitor": 5000000000,
        "MaxFailureRatio": 0,
        "Order": "stop-first"
      },
      "RollbackConfig": {
        "Parallelism": 1,
        "FailureAction": "pause",
        "Monitor": 5000000000,
        "MaxFailureRatio": 0,
        "Order": "stop-first"
      },
      "EndpointSpec": {
        "Mode": "vip",
        "Ports": [
          {"Protocol": "tcp", "TargetPort": 80, "PublishedPort": 80, "PublishMode": "ingress"}
        ]
      }
    },
    "PreviousSpec": {
      "Name": "nginx",
      "Labels": {},
      "TaskTemplate": {
        "ContainerSpec": {
          "Image": "nginx:1.19.5-alpine@sha256:1e9c503db9913a59156f78c6420f6e2f01c8a3b71ceeeddcd7f604c4db0f045e",
          "Init": false,
          "DNSConfig": {},
          "Isolation": "default"
        },
        "Resources": {"Limits": {}, "Reservations": {}},
        "Placement": {
          "Platforms": [
            {"Architecture": "amd64", "OS": "linux"},
            {"OS": "linux"},
            {"OS": "linux"},
            {"Architecture": "arm64", "OS": "linux"},
            {"Architecture": "386", "OS": "linux"},
            {"Architecture": "ppc64le", "OS": "linux"},
            {"Architecture": "s390x", "OS": "linux"}
          ]
        },
        "ForceUpdate": 0,
        "Runtime": "container"
      },
      "Mode": {"Replicated": {"Replicas": 2}},
      "EndpointSpec": {
        "Mode": "vip",
        "Ports": [
          {"Protocol": "tcp", "TargetPort": 80, "PublishedPort": 80, "PublishMode": "ingress"}
        ]
      }
    },
    "Endpoint": {
      "Spec": {
        "Mode": "vip",
        "Ports": [
          {"Protocol": "tcp", "TargetPort": 80, "PublishedPort": 80, "PublishMode": "ingress"}
        ]
      },
      "Ports": [
        {"Protocol": "tcp", "TargetPort": 80, "PublishedPort": 80, "PublishMode": "ingress"}
      ],
      "VirtualIPs": [
        {"NetworkID": "kj5uhvjip1s25i77eixlsz6fr", "Addr": "10.0.0.8/24"}
      ]
    }
  }
]
The service name can be replaced by the service ID, so this is equivalent to
docker service inspect c1u4o956nygnthpclzpvp91r4
For more usage, see: https://docs.docker.com/engine/reference/commandline/service_inspect/
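When only one field of the JSON is needed, the --format option of docker service inspect takes a Go template. The following is a minimal sketch that reads the configured replica count of a replicated service; service_replicas is a hypothetical helper name.

```shell
# Hypothetical helper: prints the configured replica count of a replicated service.
# $1: service name or ID
service_replicas() {
    docker service inspect --format '{{.Spec.Mode.Replicated.Replicas}}' "$1"
}
```

The template path follows the Spec, Mode, Replicated, Replicas nesting visible in the JSON output. For a human-readable summary instead of JSON, docker service inspect --pretty nginx is another option.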
docker service ps
Use docker service ps to list the tasks (containers) of a service and the node each one runs on
$ docker service ps nginx
ID             NAME      IMAGE                 NODE                    DESIRED STATE   CURRENT STATE         ERROR   PORTS
wkgm6q8jnsfd   nginx.1   nginx:1.19.5-alpine   localhost.localdomain   Running         Running 4 hours ago
6qztt4rxmo20   nginx.2   nginx:1.19.5-alpine   localhost.localdomain   Running         Running 4 hours ago
Besides the tasks that are currently running, the output of docker service ps also shows the task history of the service
$ docker service ps nginx
ID             NAME          IMAGE                 NODE                    DESIRED STATE   CURRENT STATE             ERROR   PORTS
wkx10y8dt219   nginx.1       nginx:1.19.5-alpine   localhost.localdomain   Running         Running 32 minutes ago
sdnkq64hfz5k    \_ nginx.1   nginx:1.19.5-alpine   localhost.localdomain   Shutdown        Complete 33 minutes ago
kijx58koymhi   nginx.2       nginx:1.19.5-alpine   localhost.localdomain   Running         Running 32 minutes ago
Reference: https://docs.docker.com/engine/reference/commandline/service_ps/
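To hide the history entries and see only the tasks that are meant to be running, docker service ps accepts a desired-state filter. The following is a small sketch; running_tasks is a hypothetical helper name, and the --format template is just one possible choice of columns.

```shell
# Hypothetical helper: lists only the tasks whose desired state is "running",
# printing the task name and the node it is placed on.
# $1: service name or ID
running_tasks() {
    docker service ps --filter desired-state=running \
        --format '{{.Name}} {{.Node}}' "$1"
}
```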
docker service logs
Use docker service logs to view the logs of a service. The syntax is
docker service logs [OPTIONS] SERVICE|TASK
Example
$ docker service logs nginx
nginx.2.6qztt4rxmo20@localhost.localdomain | /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
nginx.2.6qztt4rxmo20@localhost.localdomain | /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
nginx.2.6qztt4rxmo20@localhost.localdomain | /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
nginx.2.6qztt4rxmo20@localhost.localdomain | 10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
nginx.2.6qztt4rxmo20@localhost.localdomain | 10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
nginx.2.6qztt4rxmo20@localhost.localdomain | /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
nginx.2.6qztt4rxmo20@localhost.localdomain | /docker-entrypoint.sh: Configuration complete; ready for start up
nginx.1.wkgm6q8jnsfd@localhost.localdomain | /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
nginx.1.wkgm6q8jnsfd@localhost.localdomain | /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
nginx.1.wkgm6q8jnsfd@localhost.localdomain | /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
nginx.1.wkgm6q8jnsfd@localhost.localdomain | 10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
nginx.1.wkgm6q8jnsfd@localhost.localdomain | 10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
nginx.1.wkgm6q8jnsfd@localhost.localdomain | /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
nginx.1.wkgm6q8jnsfd@localhost.localdomain | /docker-entrypoint.sh: Configuration complete; ready for start up
nginx.1.wkgm6q8jnsfd@localhost.localdomain | 10.0.0.4 - - [10/Dec/2020:09:24:29 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36" "-"
nginx.1.wkgm6q8jnsfd@localhost.localdomain | 2020/12/10 09:24:30 [error] 30#30: *1 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 10.0.0.4, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "192.168.48.130", referrer: "http://192.168.48.130/"
nginx.1.wkgm6q8jnsfd@localhost.localdomain | 10.0.0.4 - - [10/Dec/2020:09:24:30 +0000] "GET /favicon.ico HTTP/1.1" 404 555 "http://192.168.48.130/" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36" "-"
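The full log grows quickly, so it is often more practical to limit the output. The following is a minimal sketch using the --since, --tail and --timestamps options of docker service logs; recent_logs is a hypothetical helper name, and the defaults of 10 minutes and 20 lines are arbitrary choices made for this example.

```shell
# Hypothetical helper: shows the last 20 log lines from a recent time window.
# $1: service name or ID
# $2: optional time window such as "10m" or "2h" (defaults to 10m)
recent_logs() {
    docker service logs --since "${2:-10m}" --tail 20 --timestamps "$1"
}
```

For example, recent_logs nginx 1h would look at the last hour. Adding -f to the underlying command keeps following new log lines instead of exiting.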
Scaling a service
Use docker service scale to increase or decrease the number of containers a service runs
