Part One: Cluster Installation Prerequisites

https://dolphinscheduler.apache.org/zh-cn/docs/latest/user_doc/cluster-deployment.html

1: Set up the ZooKeeper cluster

  1. See the ZooKeeper chapter.
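
Before moving on, it is worth confirming the quorum is healthy; a quick sketch, assuming ZooKeeper's bin directory is on each node's PATH:

  zkServer.sh status   # expect Mode: leader on one node, Mode: follower on the others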

2: Create the atguigu user on all three servers and configure passwordless login

  1. See the Hadoop chapter.
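
A minimal sketch of the passwordless-login setup, assuming it is run as atguigu and using the hostnames from step 3 (repeat on each machine):

  ssh-keygen -t rsa                  # accept the defaults
  ssh-copy-id atguigu@hadoop102
  ssh-copy-id atguigu@hadoop103
  ssh-copy-id atguigu@hadoop104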

3: Configure the hostname-to-IP mapping on each of the three servers

  192.168.234.128 hadoop101
  192.168.234.129 hadoop102
  192.168.234.130 hadoop103
  192.168.234.131 hadoop104
  192.168.234.132 hadoop105
  192.168.234.133 hadoop106
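
A quick way to apply and verify the mapping, assuming the standard /etc/hosts file and sudo access:

  sudo vi /etc/hosts       # append the six entries above on every machine
  ping -c 1 hadoop102      # verify the hostname now resolves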

4: Create directories for the software, unpack it, and distribute (a scp sketch follows the commands below)

  sudo mkdir -p /ruozedata/software
  sudo mkdir -p /ruozedata/app
  sudo chmod -R 777 /ruozedata
  tar -zxvf apache-dolphinscheduler-1.3.6-bin.tar.gz -C /ruozedata/software/
  cd /ruozedata/software && mv apache-dolphinscheduler-1.3.6-bin/ dolphinscheduler-bin
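
The distribution step can be scripted; a minimal sketch using scp, assuming the passwordless login from step 2 and that /ruozedata/software already exists on the other machines:

  for host in hadoop103 hadoop104; do
    scp -r /ruozedata/software/dolphinscheduler-bin atguigu@$host:/ruozedata/software/
  done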

5: Change directory ownership on all three machines

  sudo chown -R atguigu:atguigu dolphinscheduler-bin

6: Edit the datasource configuration, put the MySQL client JAR into lib, then distribute

  cd conf/
  vi datasource.properties

The content is:

  # mysql
  spring.datasource.driver-class-name=com.mysql.jdbc.Driver
  spring.datasource.url=jdbc:mysql://192.168.234.129:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
  spring.datasource.username=dol
  spring.datasource.password=dol
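
Before running the table-creation script below, the MySQL driver must be in lib and the target database and account must already exist. A minimal sketch, assuming MySQL 5.7 with root access; the connector filename/version is a placeholder for whatever JAR you downloaded:

  # place the MySQL client JAR into lib, as the step title requires
  cp mysql-connector-java-5.1.47.jar /ruozedata/software/dolphinscheduler-bin/lib/
  # create the database and the dol account used in datasource.properties
  mysql -uroot -p -e "CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
  mysql -uroot -p -e "GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dol'@'%' IDENTIFIED BY 'dol'; FLUSH PRIVILEGES;"  # MySQL 8 needs CREATE USER first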

After editing and saving, run the table-creation and base-data import script under the script directory:

  sh script/create-dolphinscheduler.sh

7: On hadoop102, cd conf/env and edit the environment variable file dolphinscheduler_env.sh, then distribute

  export HADOOP_HOME=/opt/module/hadoop-3.1.3
  export HADOOP_CONF_DIR=/opt/module/hadoop-3.1.3/etc/hadoop
  export JAVA_HOME=/opt/module/jdk1.8.0_212
  export HIVE_HOME=/opt/module/hive
  export PATH=$HADOOP_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
  ###################################### The lines below do NOT go in this file; they are the current machine's environment variable configuration, shown for reference
  #JAVA_HOME
  export JAVA_HOME=/opt/module/jdk1.8.0_212
  export PATH=$PATH:$JAVA_HOME/bin
  #HADOOP_HOME
  export HADOOP_HOME=/opt/module/hadoop-3.1.3
  export PATH=$PATH:$HADOOP_HOME/bin
  export PATH=$PATH:$HADOOP_HOME/sbin
  #HIVE_HOME
  HIVE_HOME=/opt/module/hive
  PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
  export PATH JAVA_HOME HADOOP_HOME HIVE_HOME
  #KAFKA_HOME
  export KAFKA_HOME=/opt/module/kafka
  export PATH=$PATH:$KAFKA_HOME/bin
  ########################################

8: Add a java symlink on all three machines

  sudo ln -s /opt/module/jdk1.8.0_212/bin/java /usr/bin/java
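
To apply this on all three machines in one pass, a sketch assuming passwordless SSH and that atguigu may run sudo (otherwise run the command on each host by hand):

  for host in hadoop102 hadoop103 hadoop104; do
    ssh -t atguigu@$host "sudo ln -s /opt/module/jdk1.8.0_212/bin/java /usr/bin/java"
  done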

9: Edit the parameters in the one-click deployment config file conf/config/install_config.conf; pay special attention to the following, then distribute

  #
  # Licensed to the Apache Software Foundation (ASF) under one or more
  # contributor license agreements. See the NOTICE file distributed with
  # this work for additional information regarding copyright ownership.
  # The ASF licenses this file to You under the Apache License, Version 2.0
  # (the "License"); you may not use this file except in compliance with
  # the License. You may obtain a copy of the License at
  #
  # http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing, software
  # distributed under the License is distributed on an "AS IS" BASIS,
  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  # See the License for the specific language governing permissions and
  # limitations under the License.
  #
  # NOTICE : If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
  # postgresql or mysql
  dbtype="mysql"
  # db config
  # db address and port
  dbhost="192.168.234.129:3306"
  # db username
  username="dol"
  # database name
  dbname="dolphinscheduler"
  # db password
  # NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
  password="dol"
  # zk cluster
  zkQuorum="hadoop102:2181,hadoop103:2181,hadoop104:2181"
  # Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)
  installPath="/ruozedata/app/dolphinscheduler"
  # deployment user
  # Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
  deployUser="atguigu"
  # alert config
  # mail server host
  mailServerHost="smtp.qq.com"
  # mail server port
  # note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.
  mailServerPort="25"
  # sender
  mailSender="505543479@qq.com"
  # user
  mailUser="505543479@qq.com"
  # sender password
  # note: The mail.passwd is email service authorization code, not the email login password.
  mailPassword="zyeeixgwjmuvcafc"
  # TLS mail protocol support
  starttlsEnable="true"
  # SSL mail protocol support
  # only one of TLS and SSL can be in the true state.
  sslEnable="false"
  #note: sslTrust is the same as mailServerHost
  sslTrust="smtp.qq.com"
  # resource storage type: HDFS, S3, NONE
  #resourceStorageType="NONE"
  # if resourceStorageType is HDFS, defaultFS is the namenode address; for HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
  # if S3, write the S3 address (HA), for example: s3a://dolphinscheduler
  # Note: for S3, be sure to create the root directory /dolphinscheduler
  #defaultFS="hdfs://mycluster:8020"
  # if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
  #s3Endpoint="http://192.168.xx.xx:9010"
  #s3AccessKey="xxxxxxxxxx"
  #s3SecretKey="xxxxxxxxxx"
  # if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
  #yarnHaIps="192.168.xx.xx,192.168.xx.xx"
  # if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
  #singleYarnIp="yarnIp1"
  # resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
  resourceUploadPath="/rzdolphinscheduler"
  # who have permissions to create directory under HDFS/S3 root path
  # Note: if kerberos is enabled, please config hdfsRootUser=
  hdfsRootUser="atguigu"
  # kerberos config
  # whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
  #kerberosStartUp="false"
  # kdc krb5 config file path
  #krb5ConfPath="$installPath/conf/krb5.conf"
  # keytab username
  #keytabUserName="hdfs-mycluster@ESZ.COM"
  # username keytab path
  #keytabPath="$installPath/conf/hdfs.headless.keytab"
  # api server port
  apiServerPort="12345"
  # install hosts
  # Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
  ips="hadoop102,hadoop103,hadoop104"
  # ssh port, default 22
  # Note: if ssh port is not default, modify here
  sshPort="22"
  # run master machine
  # Note: list of hosts hostname for deploying master
  masters="hadoop102,hadoop103"
  # run worker machine
  # note: need to write the worker group name of each worker, the default value is "default"
  workers="hadoop102:default,hadoop103:default,hadoop104:default"
  # run alert machine
  # note: list of machine hostnames for deploying alert server
  alertServer="hadoop102"
  # run api machine
  # note: list of machine hostnames for deploying api server
  apiServers="hadoop102,hadoop103"
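
Before running the installer, it is worth a quick sanity check of the values you actually changed; a sketch (adjust the parameter list to taste):

  grep -E "^(dbtype|dbhost|username|dbname|zkQuorum|installPath|deployUser|ips|masters|workers|alertServer|apiServers)=" conf/config/install_config.conf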

If HDFS is configured, the parameters below need attention

  # Where business resource files such as SQL scripts are uploaded; options: HDFS, S3, NONE. On a single machine, set this to HDFS if you want to use the local file system, because the HDFS type also supports the local file system; choose NONE if the resource upload feature is not needed. Note: using the local file system does not require deploying Hadoop.
  resourceStorageType="HDFS"
  # If uploaded resources are to be stored on Hadoop and the cluster's NameNode has HA enabled, put Hadoop's core-site.xml and hdfs-site.xml into the conf directory under the installation path (in this example /opt/soft/dolphinscheduler/conf) and configure the NameNode cluster name; if the NameNode is not HA, just replace mycluster with the actual IP or hostname.
  defaultFS="hdfs://mycluster:8020"
  # If Yarn is not used, keep the default values below; if ResourceManager is HA, set this to the active/standby ResourceManager IPs or hostnames, e.g. "192.168.xx.xx,192.168.xx.xx"; for a single ResourceManager, just set yarnHaIps=""
  yarnHaIps="192.168.xx.xx,192.168.xx.xx"
  # If ResourceManager is HA, or Yarn is not used, keep the default value; for a single ResourceManager, set the real ResourceManager hostname or IP
  singleYarnIp="yarnIp1"
  # Root path for resource uploads; supports HDFS and S3. Since HDFS also supports the local file system, make sure the local directory exists and has read/write permissions
  resourceUploadPath="/data/dolphinscheduler"
  # A user with permission to create resourceUploadPath
  hdfsRootUser="hdfs"
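
The upload root must exist on HDFS and be writable by the deploy user; a minimal sketch, assuming an HDFS superuser named hdfs and the resourceUploadPath configured in step 9:

  sudo -u hdfs hdfs dfs -mkdir -p /rzdolphinscheduler
  sudo -u hdfs hdfs dfs -chown -R atguigu:atguigu /rzdolphinscheduler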

10: One-click deployment

Switch to the atguigu user, then run:

  sh install.sh

After the script finishes, the following 5 services are started; use the jps command (shipped with the Java JDK) to check that they are running:

  MasterServer ----- master service
  WorkerServer ----- worker service
  LoggerServer ----- logger service
  ApiApplicationServer ----- api service
  AlertServer ----- alert service
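
A quick check on hadoop102, where all five roles run under this layout (the PIDs are illustrative, and ZooKeeper or Hadoop processes may also appear):

  $ jps
  21312 MasterServer
  21450 WorkerServer
  21528 LoggerServer
  21603 AlertServer
  21701 ApiApplicationServer
  21820 Jps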

After a successful deployment you can inspect the logs, which are stored together in /ruozedata/app/dolphinscheduler/logs (under the installPath from step 9):

  logs/
  ├── dolphinscheduler-alert-server.log
  ├── dolphinscheduler-master-server.log
  ├── dolphinscheduler-worker-server.log
  ├── dolphinscheduler-api-server.log
  └── dolphinscheduler-logger-server.log

11: Log in to the system

Visit the web UI (substitute your own API-server IP): http://192.168.234.129:12345/dolphinscheduler
Username: admin, password: dolphinscheduler123

12: Starting and stopping services

Stop all cluster services with one command: sh ./bin/stop-all.sh
Start all cluster services with one command: sh ./bin/start-all.sh

Start/stop Master:
  sh ./bin/dolphinscheduler-daemon.sh start master-server
  sh ./bin/dolphinscheduler-daemon.sh stop master-server
Start/stop Worker:
  sh ./bin/dolphinscheduler-daemon.sh start worker-server
  sh ./bin/dolphinscheduler-daemon.sh stop worker-server
Start/stop Api:
  sh ./bin/dolphinscheduler-daemon.sh start api-server
  sh ./bin/dolphinscheduler-daemon.sh stop api-server
Start/stop Logger:
  sh ./bin/dolphinscheduler-daemon.sh start logger-server
  sh ./bin/dolphinscheduler-daemon.sh stop logger-server
Start/stop Alert:
  sh ./bin/dolphinscheduler-daemon.sh start alert-server
  sh ./bin/dolphinscheduler-daemon.sh stop alert-server
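
These commands are run from the installation directory on the relevant node. For example, to bounce only the worker on hadoop103, using the installPath configured in step 9:

  ssh atguigu@hadoop103
  cd /ruozedata/app/dolphinscheduler
  sh ./bin/dolphinscheduler-daemon.sh stop worker-server
  sh ./bin/dolphinscheduler-daemon.sh start worker-server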