References
Oozie Shell Action Extension - [http://oozie.apache.org/docs/4.0.0/DG_ShellActionExtension.html](http://oozie.apache.org/docs/4.0.0/DG_ShellActionExtension.html)
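The exec element must contain the path of the shell command to execute. One or more argument elements can then be used to specify the command's arguments. The argument elements, if present, contain the arguments passed to the shell command.
The Oozie installation directory contains oozie-examples.tar.gz, which bundles a number of examples.
Extract it into the current directory and look at the contents. Then create an oozie-apps directory to hold the test examples.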
$ tar -zxf oozie-examples.tar.gz
$ cd examples/
$ ls
apps input-data src
$ ls apps
aggregator cron custom-main demo hadoop-el hive map-reduce pig sla sqoop-freeform streaming
bundle cron-schedule datelist-java-main distcp hcatalog java-main no-op shell sqoop ssh subwf
$ cd ..
$ mkdir oozie-apps
Under the installation directory, the example for scheduling a shell script is at examples/apps/shell.
Oozie commands
Run a workflow job
$ oozie job -oozie http://192.168.32.130:11000/oozie -config oozie-apps/shell/job.properties -run
Check the status of a workflow job
$ oozie job -oozie http://192.168.32.130:11000/oozie -info 0000000-200420185350972-oozie-jack-W
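To check the workflow job status in the Oozie web console, open "http://localhost:11000/oozie" in a browser.
To avoid passing the Oozie URL with the -oozie option on every Oozie command, set the OOZIE_URL environment variable to the Oozie URL in your shell environment. For example: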
$ export OOZIE_URL="http://192.168.32.130:11000/oozie"
$
$ oozie job -info 0000000-200420185350972-oozie-jack-W
··· ···
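As a small sketch of the same idea, a wrapper script could resolve the URL once and fall back to a default when OOZIE_URL is unset. The function name resolve_oozie_url and the fallback value (the web console address mentioned earlier) are illustrative assumptions, not part of Oozie.

```shell
#!/bin/sh
# Hypothetical helper: print the Oozie URL a script should use.
# Prefers the OOZIE_URL environment variable; the localhost fallback is an assumption.
resolve_oozie_url() {
    printf '%s\n' "${OOZIE_URL:-http://localhost:11000/oozie}"
}

# Example: show which server subsequent oozie commands would talk to.
echo "Using Oozie at: $(resolve_oozie_url)"
```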
Kill a workflow job
$ oozie job -oozie http://192.168.32.130:11000/oozie -kill 0000000-200420185350972-oozie-jack-W
Scheduling a shell command
Following the shell workflow example under examples, edit the job.properties and workflow.xml files, then run a workflow job that executes a shell command.
$ cp -r examples/apps/shell/ oozie-apps/
$ ls oozie-apps/shell
job.properties workflow.xml
job.properties
# job.properties
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
nameNode=hdfs://192.168.32.130:8020
# yarn.resourcemanager.address ${yarn.resourcemanager.hostname}:8032
jobTracker=192.168.32.130:8032
queueName=default
examplesRoot=oozie-apps
# Application path; resolves to hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
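To make the variable substitution in oozie.wf.application.path concrete, the sketch below repeats it in plain shell. It assumes ${user.name} resolves to the submitting user (jack in this walkthrough). Shell variable names cannot contain dots, so user_name stands in for it.

```shell
#!/bin/sh
# Values copied from the job.properties file.
nameNode="hdfs://192.168.32.130:8020"
examplesRoot="oozie-apps"
# Stand-in for ${user.name}; "jack" is the user in this walkthrough (assumption).
user_name="jack"

# Same concatenation as oozie.wf.application.path.
app_path="${nameNode}/user/${user_name}/${examplesRoot}/shell"
echo "$app_path"
# prints hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell
```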
workflow.xml
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements. See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership. The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>echo</exec>
            <argument>my_output=Hello Oozie</argument>
            <capture-output/>
        </shell>
        <ok to="check-output"/>
        <error to="fail"/>
    </action>
    <decision name="check-output">
        <switch>
            <case to="end">${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}</case>
            <default to="fail-output"/>
        </switch>
    </decision>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <kill name="fail-output">
        <message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
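With capture-output, Oozie reads the command's standard output as key=value pairs (Java properties format). That is why the argument is written as my_output=Hello Oozie and why the decision node can look up my_output. The sketch below imitates that lookup locally in plain shell. Both action_stdout and lookup are hypothetical helpers, and the parsing is a simplification of real properties parsing.

```shell
#!/bin/sh
# Imitates what the shell action writes to stdout: echo my_output=Hello Oozie
action_stdout() {
    echo "my_output=Hello Oozie"
}

# Hypothetical helper: print the value stored under key $1 in key=value input.
lookup() {
    key=$1
    sed -n "s/^${key}=//p"
}

# Roughly what wf:actionData('shell-node')['my_output'] evaluates to.
action_stdout | lookup my_output
# prints Hello Oozie
```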
Run
$ # Upload the job
$ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps oozie-apps
$ # List the directory
$ ~/Documents/hadoop/bin/hadoop fs -ls /user/jack/oozie-apps/shell/
Found 2 items
-rw-r--r-- 1 jack supergroup 1047 2020-04-21 00:48 /user/jack/oozie-apps/shell/job.properties
-rw-r--r-- 1 jack supergroup 2075 2020-04-21 00:48 /user/jack/oozie-apps/shell/workflow.xml
$ # Run the workflow job
$ bin/oozie job -oozie http://192.168.32.130:11000/oozie -config oozie-apps/shell/job.properties -run
job: 0000000-200420185350972-oozie-jack-W
$ # Check the workflow job status; this can also be done in the Oozie web console at http://localhost:11000/oozie
$ bin/oozie job -oozie http://192.168.32.130:11000/oozie -info 0000000-200420185350972-oozie-jack-W
Job ID : 0000000-200420185350972-oozie-jack-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : shell-wf
App Path : hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell
Status : SUCCEEDED
Run : 0
User : jack
Group : -
Created : 2020-04-21 07:58 GMT
Started : 2020-04-21 07:58 GMT
Last Modified : 2020-04-21 07:58 GMT
Ended : 2020-04-21 07:58 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000000-200420185350972-oozie-jack-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000000-200420185350972-oozie-jack-W@shell-node OK job_1586921478592_0021 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000000-200420185350972-oozie-jack-W@check-output OK - end -
------------------------------------------------------------------------------------------------------------------------------------
0000000-200420185350972-oozie-jack-W@end OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
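When scripting these steps, the job ID printed by -run usually needs to be captured for the later -info call. A minimal sketch, assuming the "job: <id>" output format shown in this walkthrough; extract_job_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "oozie job ... -run" on stdin
# and print only the job ID from the "job: <id>" line.
extract_job_id() {
    sed -n 's/^job: *//p'
}

# Sample line copied from the run in this walkthrough.
echo "job: 0000000-200420185350972-oozie-jack-W" | extract_job_id
# prints 0000000-200420185350972-oozie-jack-W
```

In a script this might be used as job_id=$(oozie job -config oozie-apps/shell/job.properties -run | extract_job_id), assuming OOZIE_URL is set.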
Scheduling a shell script
Starting from the shell command workflow above, edit job.properties and workflow.xml, create a shell script, and build a workflow job that runs the script.
$ cp -r examples/apps/shell oozie-apps/shell-script
$ ls oozie-apps/shell-script
job.properties workflow.xml
$ vi oozie-apps/shell-script/batch.sh
#!/bin/bash
echo "Hello! It's time to run. [`date`]" > `p=/tmp/oozie;[[ ! -d "${p}" ]] && mkdir -p ${p};echo ${p}/workflow.log`
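The one-line batch.sh packs the directory check, the mkdir, and the output path into a single command substitution. The sketch below spells out the same behavior step by step. The function name write_greeting and its directory parameter are hypothetical additions for illustration; the default directory /tmp/oozie and the message come from the script above.

```shell
#!/bin/bash
# Hypothetical, more readable equivalent of batch.sh.
# $1 is the log directory; it defaults to /tmp/oozie as in the original script.
write_greeting() {
    log_dir="${1:-/tmp/oozie}"
    mkdir -p "$log_dir"    # creates the directory only if it is missing
    echo "Hello! It's time to run. [$(date)]" > "$log_dir/workflow.log"
}

# Demonstrate with a throwaway directory so nothing else is touched.
demo_dir="$(mktemp -d)"
write_greeting "$demo_dir"
cat "$demo_dir/workflow.log"
```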
job.properties
nameNode=hdfs://192.168.32.130:8020
# yarn.resourcemanager.address ${yarn.resourcemanager.hostname}:8032
jobTracker=192.168.32.130:8032
queueName=default
examplesRoot=oozie-apps
# Application path; resolves to hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell-script
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell-script
EXEC=batch.sh
workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC}</exec>
            <file>${EXEC}#${EXEC}</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
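The file element takes the form path#name. The part before # is the file to ship with the action (relative to the application directory here), and the part after # is the name it gets in the action's working directory, which is what exec then refers to. The sketch below only shows how such a value splits. The helper split_file_spec is hypothetical and not an Oozie tool.

```shell
#!/bin/sh
# Hypothetical helper: split a "path#name" value into its two parts.
split_file_spec() {
    spec=$1
    echo "source=${spec%%#*}"   # everything before the first '#'
    echo "link=${spec#*#}"      # everything after the first '#'
}

# With EXEC=batch.sh, ${EXEC}#${EXEC} becomes batch.sh#batch.sh.
split_file_spec "batch.sh#batch.sh"
```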
Run
$ # Upload the job (use -f to overwrite)
$ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps/shell-script oozie-apps/
$ # List the directory
$ ~/Documents/hadoop/bin/hadoop fs -ls /user/jack/oozie-apps/shell-script/
Found 3 items
-rw-r--r-- 1 jack supergroup 131 2020-04-21 02:30 /user/jack/oozie-apps/shell-script/batch.sh
-rw-r--r-- 1 jack supergroup 1164 2020-04-21 02:30 /user/jack/oozie-apps/shell-script/job.properties
-rw-r--r-- 1 jack supergroup 1640 2020-04-21 02:30 /user/jack/oozie-apps/shell-script/workflow.xml
$ export OOZIE_URL="http://192.168.32.130:11000/oozie"
$ # Run the workflow job
$ bin/oozie job -config oozie-apps/shell-script/job.properties -run
job: 0000001-200420185350972-oozie-jack-W
$ # Check the workflow job status; this can also be done in the Oozie web console at http://localhost:11000/oozie
$ bin/oozie job -info 0000001-200420185350972-oozie-jack-W
Job ID : 0000001-200420185350972-oozie-jack-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : shell-wf
App Path : hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell-script
Status : SUCCEEDED
Run : 0
User : jack
Group : -
Created : 2020-04-21 09:32 GMT
Started : 2020-04-21 09:32 GMT
Last Modified : 2020-04-21 09:32 GMT
Ended : 2020-04-21 09:32 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000001-200420185350972-oozie-jack-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000001-200420185350972-oozie-jack-W@shell-node OK job_1586921478592_0022 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000001-200420185350972-oozie-jack-W@end OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
$ # View the output
$ cat /tmp/oozie/workflow.log
Hello! It's time to run. [Tue Apr 21 02:33:44 PDT 2020]
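Note: on a fully distributed cluster, the YARN ResourceManager allocates a container to run the job, so the script may run on any node. To find out where it actually ran, check the JobHistory.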
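One way to make the execution node visible, offered here as an assumption rather than something the example does, is to have the script record the node name next to its message. The function name stamp_line is hypothetical; uname -n prints the host's network node name.

```shell
#!/bin/sh
# Hypothetical variant of the greeting line that also records the node name.
stamp_line() {
    echo "Hello! It's time to run. [$(date)] on host $(uname -n)"
}

stamp_line
```

Written to the log file instead of the terminal, a line like this shows which node's /tmp/oozie directory holds the output.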
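Scheduling multiple shell jobs in sequence
Copy the oozie-apps/shell-script folder and create a log.sh script.
Using multiple actions, build a workflow job that runs several shell scripts one after another.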
$ cp -r oozie-apps/shell-script oozie-apps/shell-scripts
$ ls oozie-apps/shell-scripts
batch.sh job.properties workflow.xml
$ vi oozie-apps/shell-scripts/log.sh
#!/bin/bash
# echo -n would suppress the trailing newline
echo "bytes length of current file: `cat /tmp/oozie/workflow.log | wc -c`" >> /tmp/oozie/workflow.log
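To see what log.sh appends, the sketch below runs the same byte count against a throwaway file instead of /tmp/oozie/workflow.log. The function name append_size and the demo file are hypothetical. The arithmetic expansion is only there to drop the padding that some wc implementations print.

```shell
#!/bin/sh
# Hypothetical helper: append the current byte length of file $1 to that file.
append_size() {
    file=$1
    # Count bytes first; $(( ... )) strips any whitespace padding from wc.
    size=$(( $(wc -c < "$file") ))
    echo "bytes length of current file: $size" >> "$file"
}

demo_file="$(mktemp)"
printf 'Hello\n' > "$demo_file"   # 6 bytes, including the newline
append_size "$demo_file"
cat "$demo_file"
# prints:
# Hello
# bytes length of current file: 6
```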
job.properties
nameNode=hdfs://192.168.32.130:8020
# yarn.resourcemanager.address ${yarn.resourcemanager.hostname}:8032
jobTracker=192.168.32.130:8032
queueName=default
examplesRoot=oozie-apps
# Application path; resolves to hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell-scripts
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell-scripts
EXEC1=batch.sh
EXEC2=log.sh
workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="shell-node1"/>
    <action name="shell-node1">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC1}</exec>
            <file>${EXEC1}#${EXEC1}</file>
            <capture-output/>
        </shell>
        <ok to="shell-node2"/>
        <error to="fail"/>
    </action>
    <action name="shell-node2">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC2}</exec>
            <file>${EXEC2}#${EXEC2}</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
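The control flow of this workflow is: start at shell-node1, continue to shell-node2 on ok, reach end when both succeed, and jump to the fail kill node on any error. As a rough local analogy only (Oozie runs each action as a separate job on the cluster), the same transitions can be sketched in shell. The names run_chain, step_ok, and step_bad are hypothetical.

```shell
#!/bin/sh
# Hypothetical analogy of the ok/error transitions: run steps in order,
# stop at the first failure, and report "end" only if every step succeeds.
run_chain() {
    for step in "$@"; do
        if ! "$step"; then
            echo "fail at $step"   # corresponds to <error to="fail"/>
            return 1
        fi
    done
    echo "end"                     # corresponds to <ok to="end"/> on the last action
}

step_ok()  { return 0; }
step_bad() { return 1; }

run_chain step_ok step_ok            # prints: end
run_chain step_ok step_bad || true   # prints: fail at step_bad
```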
Run
$ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps/shell-scripts oozie-apps/shell-scripts
$ ~/Documents/hadoop/bin/hadoop fs -ls /user/jack/oozie-apps/shell-scripts
Found 4 items
-rw-r--r-- 1 jack supergroup 131 2020-04-21 04:49 /user/jack/oozie-apps/shell-scripts/batch.sh
-rw-r--r-- 1 jack supergroup 1179 2020-04-21 04:49 /user/jack/oozie-apps/shell-scripts/job.properties
-rw-r--r-- 1 jack supergroup 117 2020-04-21 04:49 /user/jack/oozie-apps/shell-scripts/log.sh
-rw-r--r-- 1 jack supergroup 2241 2020-04-21 04:49 /user/jack/oozie-apps/shell-scripts/workflow.xml
$ export OOZIE_URL="http://192.168.32.130:11000/oozie"
$ bin/oozie job -config oozie-apps/shell-scripts/job.properties -run
job: 0000004-200420185350972-oozie-jack-W
$ bin/oozie job -info 0000004-200420185350972-oozie-jack-W
Job ID : 0000004-200420185350972-oozie-jack-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : shell-wf
App Path : hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell-scripts
Status : RUNNING
Run : 0
User : jack
Group : -
Created : 2020-04-21 11:51 GMT
Started : 2020-04-21 11:51 GMT
Last Modified : 2020-04-21 11:52 GMT
Ended : -
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@shell-node1 OK job_1586921478592_0025 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@shell-node2 RUNNING job_1586921478592_0026 RUNNING -
------------------------------------------------------------------------------------------------------------------------------------
$ bin/oozie job -info 0000004-200420185350972-oozie-jack-W
Job ID : 0000004-200420185350972-oozie-jack-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : shell-wf
App Path : hdfs://192.168.32.130:8020/user/jack/oozie-apps/shell-scripts
Status : SUCCEEDED
Run : 0
User : jack
Group : -
Created : 2020-04-21 11:51 GMT
Started : 2020-04-21 11:51 GMT
Last Modified : 2020-04-21 11:52 GMT
Ended : 2020-04-21 11:52 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@shell-node1 OK job_1586921478592_0025 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@shell-node2 OK job_1586921478592_0026 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000004-200420185350972-oozie-jack-W@end OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
$ cat /tmp/oozie/workflow.log
Hello! It's time to run. [Tue Apr 21 04:52:10 PDT 2020]
bytes length of current file: 56
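Instead of rerunning -info by hand until the job leaves RUNNING, the check can be scripted. The sketch below is an assumption-laden outline: parse_status and wait_for_job are hypothetical helpers, the loop assumes OOZIE_URL is set and the oozie CLI is on PATH, and the 10-second interval is arbitrary. Only the parser is exercised at the end, against lines in the format printed by -info in this walkthrough.

```shell
#!/bin/sh
# Hypothetical helper: read "oozie job -info" output on stdin and print the Status value.
parse_status() {
    sed -n 's/^Status[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Hypothetical polling loop: wait until the job is no longer PREP or RUNNING,
# then print the final status. Not called here because it needs a live Oozie server.
wait_for_job() {
    job_id=$1
    while :; do
        status=$(oozie job -info "$job_id" | parse_status)
        case "$status" in
            PREP|RUNNING) sleep 10 ;;
            *) echo "$status"; return 0 ;;
        esac
    done
}

# Local check of the parser with sample lines in the -info format.
printf 'Workflow Name : shell-wf\nStatus : RUNNING\nRun : 0\n' | parse_status
# prints RUNNING
```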
