Resource planning

Component  LTSR003         LTSR005         LTSR006         LTSR007         LTSR008
OS         ubuntu-16.04    ubuntu-16.04    ubuntu-16.04    ubuntu-16.04    ubuntu-16.04
JDK        jvm             jvm             jvm             jvm             jvm
Azkaban    ExecutorServer  ExecutorServer  ExecutorServer  ExecutorServer  ExecutorServer/WebServer

Installation media

Version: azkaban-3.47.0.tar.gz
Download: https://codeload.github.com/azkaban/azkaban/tar.gz/3.47.0
Installation guide: https://www.cnblogs.com/bujunpeng/p/9093124.html

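Deployment modes

Mode               Name                    Description
solo-server        Standalone mode         Uses the embedded H2 database to store metadata
two-server         Two-server mode         One web server and one executor server on the same host; metadata stored in MySQL
multiple-executor  Multiple-executor mode  One web server and several executor servers on different hosts; metadata stored in MySQL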

Build

    cd ~/software
    git clone https://github.com/azkaban/azkaban.git
    cd azkaban
    ######################### Build a specific version; otherwise the latest code is built #########################
    # git tag -l -n                    # list all tags
    # git branch -vv                   # show the current branch
    # git describe --abbrev=0 --tags   # show the tag the current branch is based on
    # git checkout 3.84.0
    ######################### END #########################
    ./gradlew build -x test

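If the build is very slow or the Gradle download fails, download the Gradle distribution directly (https://services.gradle.org/distributions/gradle-4.6-all.zip) and place it under ~/.gradle/wrapper/dists/gradle-.-all/*/. The Gradle version to download is given in ~/software/azkaban/gradle/wrapper/gradle-wrapper.properties.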
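The version lookup mentioned above can be scripted. The following is a minimal sketch, assuming the wrapper file contains a standard distributionUrl entry in which the colon is escaped as "\:"; the function name wrapper_distribution_url is hypothetical.

```shell
# Hypothetical helper: print the distributionUrl value of a
# gradle-wrapper.properties file, with the escaped colon restored.
wrapper_distribution_url() {
  # $1: path to gradle-wrapper.properties
  sed -n 's/^distributionUrl=//p' "$1" | sed 's/\\:/:/g'
}

# Example:
# wrapper_distribution_url ~/software/azkaban/gradle/wrapper/gradle-wrapper.properties
```

The printed URL is the archive to download by hand when the wrapper cannot fetch it.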
Adding a mirror inside China to Gradle can speed up the build:

    cd ~/.gradle
    vi init.gradle

With the following content:

    allprojects {
        repositories {
            def ALIYUN_REPOSITORY_URL = 'http://maven.aliyun.com/nexus/content/groups/public'
            def ALIYUN_JCENTER_URL = 'http://maven.aliyun.com/nexus/content/repositories/jcenter'
            // Drop the default Maven Central and JCenter repositories
            all { ArtifactRepository repo ->
                if (repo instanceof MavenArtifactRepository) {
                    def url = repo.url.toString()
                    if (url.startsWith('https://repo1.maven.org/maven2')) {
                        project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_REPOSITORY_URL."
                        remove repo
                    }
                    if (url.startsWith('https://jcenter.bintray.com/')) {
                        project.logger.lifecycle "Repository ${repo.url} replaced by $ALIYUN_JCENTER_URL."
                        remove repo
                    }
                }
            }
            // One maven block per mirror: a second url in the same block would override the first
            maven { url ALIYUN_REPOSITORY_URL }
            maven { url ALIYUN_JCENTER_URL }
        }
    }

Before restarting a build, clear the caches left by the previous one:

    rm -rf ~/.gradle/caches/build-cache-1/*
    rm -rf ~/software/azkaban/.gradle/vcsWorkingDirs/*
    ./gradlew clean
    ./gradlew build -x test

List the built packages:

    for tar in ~/software/azkaban/azkaban-*/build/distributions/*.tar.gz; do echo "$tar"; done

Unpack them for later use:

    cd ~/software/azkaban
    # Unpack into the current directory (the directory the command is run from)
    for tar in ~/software/azkaban/azkaban-*/build/distributions/*.tar.gz; do tar xvf "$tar"; done

Metadata initialization

Prerequisite: a MySQL server is available.

    # LTSR006 (the node where MySQL is installed)
    mysql -h LTSR006 -P 3306 -u root -plonton

Create the azkaban database and the azkaban user with its password, grant it remote access, and initialize the metadata.

    CREATE DATABASE azkabandb;
    CREATE USER 'azkauser'@'%' IDENTIFIED BY 'azkauser';
    CREATE USER 'azkauser'@'localhost' IDENTIFIED BY 'azkauser';
    -- The passwords are already set by CREATE USER, so GRANT does not repeat them
    GRANT ALL PRIVILEGES ON azkabandb.* TO 'azkauser'@'%';
    GRANT ALL PRIVILEGES ON azkabandb.* TO 'azkauser'@'localhost';
    FLUSH PRIVILEGES;
    -- Initialize the metadata
    USE azkabandb;
    source ~/software/azkaban/azkaban-db-3.84.0/create-all-sql-3.84.0.sql

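MySQL parameter settings:

    sudo vi /etc/mysql/mysql.conf.d/mysqld.cnf

With the following content:

    # Raise the maximum packet size so that large writes or updates do not fail
    # (1024 bytes is the minimum allowed value and 1024M the maximum;
    # check the current value with: SHOW VARIABLES LIKE '%max_allowed_packet%';)
    max_allowed_packet = 1024M

Restart the MySQL service:

    sudo service mysql restart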

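Installing Azkaban-web-server

According to the resource plan, Azkaban-web-server is installed on node LTSR008.

Generating the key pair and certificate

Keytool is Java's key and certificate management tool; it lets users manage their own public/private key pairs and the associated certificates.

Option               Description
-keystore            Name and location of the keystore (the generated data is stored in the keystore file)
-genkey/-genkeypair  Generate a key pair
-alias               Alias for the generated key pair; defaults to mykey
-keyalg              Key algorithm, RSA or DSA; defaults to DSA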

Run the following commands:

    cd ~/software/azkaban/azkaban-web-server-3.84.0
    keytool -keystore keystore -alias jetty -genkey -keyalg RSA

The interactive session looks like this:

    Enter keystore password:  (enter the password: 000000)
    Re-enter new password:  (enter the password again: 000000)
    What is your first and last name?
      [Unknown]:  (press Enter)
    What is the name of your organizational unit?
      [Unknown]:  (press Enter)
    What is the name of your organization?
      [Unknown]:  (press Enter)
    What is the name of your City or Locality?
      [Unknown]:  (press Enter)
    What is the name of your State or Province?
      [Unknown]:  (press Enter)
    What is the two-letter country code for this unit?
      [Unknown]:  (press Enter)
    Is CN=YY, OU=YY, O=YY, L=shanghai, ST=shanghai, C=CN correct?
      [no]:  y
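The prompts above can also be answered on the command line. The sketch below assumes the standard keytool options -storepass, -keypass and -dname; the helper names build_dname and generate_jetty_keystore are hypothetical, and the sample values simply mirror the confirmation line shown in the session.

```shell
# Hypothetical helper: assemble the distinguished name expected by keytool -dname.
build_dname() {
  # $1..$6: CN, OU, O, L, ST, C
  printf 'CN=%s, OU=%s, O=%s, L=%s, ST=%s, C=%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical wrapper: create the jetty key pair without interactive prompts.
# $1: keystore path, $2: password (used for both the store and the key), $3: dname
generate_jetty_keystore() {
  keytool -genkeypair -alias jetty -keyalg RSA \
    -keystore "$1" -storepass "$2" -keypass "$2" -dname "$3"
}

# Example (run from the web server directory):
# generate_jetty_keystore keystore 000000 "$(build_dname YY YY YY shanghai shanghai CN)"
```

Passing the password on the command line exposes it in the shell history, so this form is best kept to test environments.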

Configuring azkaban.properties

    vi ~/software/azkaban/azkaban-web-server-3.84.0/conf/azkaban.properties

With the following content:

    ################## Customization
    # Server name shown at the top of the UI (Unicode escapes for 龙通科技)
    azkaban.name=\u9f99\u901a\u79d1\u6280
    # Label (Unicode escapes for 作业调度系统, "job scheduling system"; converter: http://www.bejson.com/convert/unicode_chinese)
    azkaban.label=\u4f5c\u4e1a\u8c03\u5ea6\u7cfb\u7edf
    # UI color
    azkaban.color=#33ccff
    # Default web resource directory
    web.resource.dir=/home/lonton/software/azkaban/azkaban-web-server-3.84.0/web/
    # Default time zone
    default.timezone.id=Asia/Shanghai
    ################## User and permission management
    user.manager.xml.file=/home/lonton/software/azkaban/azkaban-web-server-3.84.0/conf/azkaban-users.xml
    ################## Global properties
    executor.global.properties=/home/lonton/software/azkaban/azkaban-web-server-3.84.0/conf/global.properties
    ################## MySQL
    database.type=mysql
    mysql.port=3306
    mysql.host=LTSR006
    mysql.database=azkabandb
    mysql.user=azkauser
    mysql.password=azkauser
    mysql.numconnections=100
    ################## SSL
    jetty.use.ssl=true
    jetty.maxThreads=25
    jetty.port=8081
    jetty.ssl.port=8443
    jetty.keystore=/home/lonton/software/azkaban/azkaban-web-server-3.84.0/keystore
    jetty.password=000000
    jetty.keypassword=000000
    jetty.truststore=/home/lonton/software/azkaban/azkaban-web-server-3.84.0/keystore
    jetty.trustpassword=000000
    ################## Mail
    mail.sender=450733605@qq.com
    mail.host=smtp.qq.com
    mail.user=450733605@qq.com
    mail.password=ktzwkmykjrbpbjjc
    job.failure.email=450733605@qq.com
    job.success.email=450733605@qq.com
    ################## Executor port
    executor.port=12321

Note: all paths must be configured as absolute paths.
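A quick check for the absolute-path rule might look like the sketch below. The function name relative_path_props is hypothetical, and matching path-valued keys by their name suffix (.dir, .file, .properties, keystore, truststore) is a heuristic assumed here for illustration, not something Azkaban defines.

```shell
# Hypothetical helper: list properties that look path-valued but whose
# value does not start with "/". Commented-out lines are ignored.
relative_path_props() {
  # $1: path to azkaban.properties
  grep -E '^[^#=]*(\.dir|\.file|\.properties|keystore|truststore)=' "$1" | grep -Ev '=/'
}

# Example:
# relative_path_props ~/software/azkaban/azkaban-web-server-3.84.0/conf/azkaban.properties
```

No output means every matched property already holds an absolute path.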

Adding an administrator

    vi ~/software/azkaban/azkaban-web-server-3.84.0/conf/azkaban-users.xml

With the following content:

    <azkaban-users>
      <user username="azkaban" password="azkaban" roles="admin" groups="azkaban"/>
      <user username="metrics" password="metrics" roles="metrics"/>
      <!-- Custom administrator (lonton/lonton) -->
      <user username="lonton" password="lonton" roles="admin,metrics" />
      <role name="admin" permissions="ADMIN" />
      <role name="metrics" permissions="METRICS"/>
    </azkaban-users>

Installing Azkaban-exec-server

Configuring azkaban.properties

    vi ~/software/azkaban/azkaban-exec-server-3.84.0/conf/azkaban.properties

With the following content:

    ################## Customization
    # Default time zone
    default.timezone.id=Asia/Shanghai
    ################## Global properties
    executor.global.properties=/home/lonton/software/azkaban/azkaban-exec-server-3.84.0/conf/global.properties
    ################## Azkaban Web Server URL
    azkaban.webserver.url=https://LTSR008:8443
    ################## Plugins
    azkaban.jobtype.plugin.dir=/home/lonton/software/azkaban/azkaban-exec-server-3.84.0/plugins/jobtypes
    ################## MySQL
    database.type=mysql
    mysql.port=3306
    mysql.host=LTSR006
    mysql.database=azkabandb
    mysql.user=azkauser
    mysql.password=azkauser
    mysql.numconnections=100
    ################## Executor port
    executor.port=12321

Configuring commonprivate.properties

    vi ~/software/azkaban/azkaban-exec-server-3.84.0/plugins/jobtypes/commonprivate.properties

With the following content:

    execute.as.user=false
    azkaban.native.lib=false

Adding derby.jar

Many JDK releases after jdk-8u121-linux-x64.tar.gz no longer ship db/lib/derby.jar. With such a JDK, running Azkaban throws the following exception: Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.AutoloadedDriver40
Fix:
Unpack jdk-8u121-linux-x64.tar.gz, locate derby.jar, copy it into ${Web-Server-HOME}/lib and ${Exec-Server-HOME}/lib, and restart the servers.
Attachment: derby.jar
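The copy step can be scripted roughly as follows. This is a sketch under assumptions: the helper names find_derby_jar and install_derby_jar are hypothetical, and the directories passed in stand for wherever the JDK archive was unpacked and wherever the Azkaban servers live.

```shell
# Hypothetical helper: print the first derby.jar found under a JDK directory, if any.
find_derby_jar() {
  # $1: unpacked JDK directory
  find "$1" -type f -name derby.jar 2>/dev/null | head -n 1
}

# Hypothetical helper: copy derby.jar into a server's lib directory.
install_derby_jar() {
  # $1: unpacked JDK directory, $2: Azkaban server home (web or exec)
  jar=$(find_derby_jar "$1")
  [ -n "$jar" ] || { echo "derby.jar not found under $1" >&2; return 1; }
  cp "$jar" "$2/lib/"
}

# Example:
# install_derby_jar /path/to/unpacked-jdk ~/software/azkaban/azkaban-web-server-3.84.0
# install_derby_jar /path/to/unpacked-jdk ~/software/azkaban/azkaban-exec-server-3.84.0
```

The second helper returns a non-zero status when the JDK has no derby.jar, which makes it easy to tell whether an older JDK archive is needed.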

Distribution

    xsync ~/software/azkaban/azkaban-exec-server-*

Startup and verification

Start the executors first and the web server afterwards; otherwise the Web Server may fail to start because it cannot find any executor.

  • Start the executor

      cd ~/software/azkaban/azkaban-exec-server-3.84.0/bin
      # Start
      ./start-exec.sh
      # Activate the executor on each node (for example, LTSR008)
      curl "http://LTSR008:12321/executor?action=activate"
      curl "http://LTSR003:12321/executor?action=activate"
      curl "http://LTSR005:12321/executor?action=activate"
      curl "http://LTSR006:12321/executor?action=activate"
      curl "http://LTSR007:12321/executor?action=activate"
      # Stop
      ./shutdown-exec.sh
  • Start the Web Server

      cd ~/software/azkaban/azkaban-web-server-3.84.0/bin
      # Start
      ./start-web.sh
      # Stop
      ./shutdown-web.sh
  • Verify the service

https://ltsr008:8443 (lonton/lonton)
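The per-node activation calls listed above can be folded into a loop. The sketch below is only illustrative: the helper names executor_activate_url and activate_executors are hypothetical, and the default port 12321 is taken from the executor.port setting in this guide.

```shell
# Hypothetical helper: build the activation URL for one executor.
executor_activate_url() {
  # $1: host, $2: port (defaults to 12321)
  printf 'http://%s:%s/executor?action=activate\n' "$1" "${2:-12321}"
}

# Hypothetical helper: activate every executor host passed as an argument.
activate_executors() {
  for host in "$@"; do
    # --fail makes curl return a non-zero status on HTTP errors
    curl --silent --fail "$(executor_activate_url "$host")" \
      || echo "activation failed: $host" >&2
  done
}

# Example:
# activate_executors LTSR003 LTSR005 LTSR006 LTSR007 LTSR008
```

Keeping the URL construction in its own function makes the loop easy to reuse if the executor port changes.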

Improving the executor

When using Azkaban, some jobs need to run on a specific executor. The executor server is selected through its id (useExecutor).
The exec_id can be looked up in the database:

    -- mysql -h LTSR006 -P 3306 -u root -plonton
    use azkabandb;
    select * from executors;

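However, if the cluster goes down for some reason, an executor receives a new auto-incremented exec_id when it restarts. This is troublesome, because every job pinned to the old id has to be reconfigured.
One way around this: after starting, the executor registers itself in the executors table of the metadata database. A shell script (start-exec-upgrade.sh) first waits until the executor has registered and then updates its row in the database.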

    #!/bin/bash
    # Start the Azkaban executor and pin its id in the metadata database
    # Base environment
    SCRIPTS_DIR=$(dirname "$0")
    HOSTNAME=$(hostname -f)
    # MySQL connection settings
    PORT="3306"
    USERNAME="azkauser"
    PASSWORD="azkauser"
    DBNAME="azkabandb"
    HOST="LTSR006"
    # Azkaban activation status
    ACTIVE=1
    # Fixed executor id for this node (custom, e.g. the last part of its IP address)
    ID=3
    # Pass command-line arguments along to the internal launch script
    "${SCRIPTS_DIR}/internal/internal-start-executor.sh" "$@" >"executorServerLog__$(date +%F+%T).out" 2>&1 &
    # MySQL client command (-N suppresses the column-name header, -P sets the port)
    MYSQL="mysql -N -u${USERNAME} -p${PASSWORD} -h${HOST} -P${PORT} ${DBNAME}"
    select_sql="select count(1) from executors where host='${HOSTNAME}'"
    update_sql="update executors set id=${ID},active=${ACTIVE} where host='${HOSTNAME}';"
    # Wait until the executor has registered itself in the executors table
    while true
    do
        res=$($MYSQL -e "$select_sql" 2>/dev/null)
        # Quoted comparison: an empty result (database unreachable) does not break the test
        if [ "$res" = "1" ]; then
            break
        fi
        sleep 3
        echo "...."
    done
    # Pin the id and mark the executor as active
    if $MYSQL -e "$update_sql" 2>/dev/null; then
        echo "start ......"
    else
        echo "stop ......"
    fi

Distribute start-exec-upgrade.sh to every executor node, set the id for each node, then restart and activate the executors.

    # Distribute
    xsync ~/software/azkaban/azkaban-exec-server-*
    # Make the script executable
    xcall chmod 755 ~/software/azkaban/azkaban-exec-server-3.84.0/bin/start-exec-upgrade.sh
    # Set the exec_id for each node
    cd ~/software/azkaban/azkaban-exec-server-3.84.0/bin
    vi start-exec-upgrade.sh
    ############ LTSR003
    # ID=3
    ############ LTSR005
    # ID=5
    ############ LTSR006
    # ID=6
    ############ LTSR007
    # ID=7
    ############ LTSR008
    # ID=8
    # Start
    bash start-exec-upgrade.sh
    # Activate (without activation, job execution fails with: executor became inactive before setting up the flow xx)
    curl "http://LTSR008:12321/executor?action=activate"
    curl "http://LTSR003:12321/executor?action=activate"
    curl "http://LTSR005:12321/executor?action=activate"
    curl "http://LTSR006:12321/executor?action=activate"
    curl "http://LTSR007:12321/executor?action=activate"
    # Stop
    ./shutdown-exec.sh
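Editing the ID by hand on every node is easy to get wrong. As one possible alternative, the id could be derived from the hostname. The sketch below assumes the id always equals the trailing digits of the host name, as the LTSR003 = 3 ... LTSR008 = 8 mapping above suggests; the function name exec_id_from_host is hypothetical.

```shell
# Hypothetical helper: derive an executor id from the trailing digits of a hostname.
exec_id_from_host() {
  # $1: hostname such as LTSR003 or LTSR003.example.com (only the short name is used)
  short=${1%%.*}
  digits=$(printf '%s\n' "$short" | sed -n 's/^.*[^0-9]\([0-9][0-9]*\)$/\1/p')
  # Fail when the short name does not end in digits
  [ -n "$digits" ] || return 1
  # Strip leading zeros so that 003 becomes 3 (a lone 0 is kept)
  printf '%s\n' "$digits" | sed 's/^0*\([0-9]\)/\1/'
}

# Example, replacing the hard-coded ID in start-exec-upgrade.sh:
# ID=$(exec_id_from_host "$(hostname -f)")
```

With this in place the same script could be distributed unchanged to all nodes, provided the naming convention holds.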

Job reference

Attachment: Azkaban-jobs.zip