Scheduling Periodic Tasks with the Oozie Coordinator

Oozie time settings

Set the following property in oozie-site.xml. Its definition and default value can be found in oozie-default.xml.

    <property>
        <name>oozie.processing.timezone</name>
        <value>GMT+0800</value>
        <!-- <value>UTC</value> -->
        <description>
            Oozie server timezone. Valid values are UTC and GMT(+/-)####, for example 'GMT+0530' would be India
            timezone. All dates parsed and generated dates by Oozie Coordinator/Bundle will be done in the specified
            timezone. The default value of 'UTC' should not be changed under normal circumstances. If for any reason
            is changed, note that GMT(+/-)#### timezones do not observe DST changes.
        </description>
    </property>
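The description above restricts the value to `UTC` or `GMT(+/-)####`. The following is a minimal sketch of a format check to run before editing oozie-site.xml. The function name `is_valid_processing_tz` is made up for illustration, and the pattern is derived only from the quoted description, not from Oozie's own validation code.

```shell
# Hypothetical helper: succeeds if the argument matches the format the
# description allows (UTC, or GMT followed by a sign and exactly four digits).
is_valid_processing_tz() {
  case "$1" in
    UTC) return 0 ;;
    GMT[+-][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a few candidate values.
for tz in UTC GMT+0800 GMT+8 Asia/Shanghai; do
  if is_valid_processing_tz "$tz"; then
    echo "$tz: ok"
  else
    echo "$tz: rejected"
  fi
done
```

Under this check, `GMT+8` and `Asia/Shanghai` are rejected because neither follows the four-digit offset form.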

Change the default timezone in the web console's JavaScript code, at around line 17 of oozie/oozie-server/webapps/oozie/oozie-console.js.

    function getTimeZone() {
        Ext.state.Manager.setProvider(new Ext.state.CookieProvider());
        // Original default:
        // return Ext.state.Manager.get("TimezoneId","GMT");
        return Ext.state.Manager.get("TimezoneId","GMT+0800");
    }

Restart the Oozie service, then restart the browser and clear its cache.

    $ bin/oozied.sh stop
    $ bin/oozied.sh start

Reference

    $ date --utc   # print the current time in UTC
    Wed Apr 22 06:52:52 UTC 2020
    $ date         # print the local time; the local timezone is PDT (GMT/UTC -0700)
    Tue Apr 21 23:52:53 PDT 2020
    $ date -R      # print the date and time in RFC 2822 format
    Tue, 21 Apr 2020 23:52:55 -0700

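- GMT
  - Greenwich Mean Time (G.M.T.)
- UTC
  - Coordinated Universal Time
  - Also called world standard time or universal time; it takes Greenwich time (GMT) as its reference
- For practical purposes, UTC and GMT are nearly identical

## job.properties

Oozie processes coordinator jobs in a fixed timezone with no DST (typically UTC). This timezone is called the "Oozie processing timezone".

The Oozie processing timezone is used to resolve the start and end times of coordinator jobs, job pause times, and the initial instance of datasets. In addition, all coordinator dataset instance URI templates are resolved to a datetime in the Oozie processing timezone.

All datetimes used in coordinator applications, and in the job parameters passed to them, must be specified in the Oozie processing timezone. If the Oozie processing timezone is UTC, the qualifier is `Z`. If it is not UTC, the qualifier must be the GMT offset, `(+/-)####`.

For example, a datetime in UTC is written `2012-08-12T00:00Z`, and the same datetime in GMT+5:30 is written `2012-08-12T05:30+0530`.

The job.properties used here leaves the processing timezone unconfigured, so `oozie.processing.timezone` keeps its default value of `UTC` and all times are interpreted in UTC. `date -u` reports the current time as `Wed Apr 22 07:33:11 UTC 2020`, so the value of `start` must be later than that and must use the `Z` qualifier.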
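The qualifier rule can be checked from the command line. The following is a rough sketch, assuming GNU coreutils `date` (for the `-d` option) and a libc that accepts POSIX-style `TZ` strings. The function name `to_oozie_time` and the placeholder zone name `OOZ` are illustrative only and are not part of Oozie.

```shell
# Hypothetical helper: format a UTC datetime for a given Oozie processing timezone.
#   $1: UTC datetime, e.g. "2012-08-12 00:00"
#   $2: qualifier, either "Z" or a GMT offset of the form (+/-)HHMM, e.g. "+0530"
to_oozie_time() {
  local utc="$1" offset="$2"
  if [ "$offset" = "Z" ]; then
    date -u -d "${utc} UTC" '+%Y-%m-%dT%H:%MZ'
    return
  fi
  local digits="${offset#[+-]}"   # "0530"
  local hh="${digits%??}"         # "05"
  local mm="${digits#??}"         # "30"
  # POSIX TZ strings invert the sign: "OOZ-05:30" means 5h30m east of UTC.
  local posix_sign='-'
  case "$offset" in
    -*) posix_sign='+' ;;
  esac
  TZ="OOZ${posix_sign}${hh}:${mm}" date -d "${utc} UTC" '+%Y-%m-%dT%H:%M%z'
}

to_oozie_time "2012-08-12 00:00" Z       # prints 2012-08-12T00:00Z
to_oozie_time "2012-08-12 00:00" +0530   # prints 2012-08-12T05:30+0530
```

Both calls describe the same instant. Only the qualifier and the wall-clock fields differ, which is the relationship between the two example datetimes above.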

    nameNode=hdfs://192.168.32.130:8020
    jobTracker=192.168.32.130:8032
    queueName=default
    examplesRoot=oozie-apps

    oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/cron

    # start must be set to a future time, otherwise the job fails
    # Wed Apr 22 07:33:11 UTC 2020
    start=2020-04-22T07:36Z
    end=2020-04-22T07:42Z

    workflowAppUri=${nameNode}/user/${user.name}/${examplesRoot}/cron

    EXEC=log.sh
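Because `start` has to lie in the future, typing it by hand is error-prone. The following is a minimal sketch that derives a `start`/`end` pair from a reference time, assuming GNU `date` and a processing timezone of UTC. The function name `coord_window` and its lead and duration parameters are invented for this illustration.

```shell
# Hypothetical helper: print start/end lines for job.properties.
#   $1: reference time understood by GNU date (default: now)
#   $2: minutes between the reference time and start (default: 3)
#   $3: minutes between start and end (default: 6)
coord_window() {
  local now="${1:-now}" lead="${2:-3}" duration="${3:-6}"
  local now_epoch start_epoch end_epoch
  now_epoch=$(date -u -d "${now}" '+%s') || return 1
  # Do the arithmetic on epoch seconds so nothing is parsed as a zone offset.
  start_epoch=$(( now_epoch + lead * 60 ))
  end_epoch=$(( start_epoch + duration * 60 ))
  echo "start=$(date -u -d "@${start_epoch}" '+%Y-%m-%dT%H:%MZ')"
  echo "end=$(date -u -d "@${end_epoch}" '+%Y-%m-%dT%H:%MZ')"
}

# Reproduces the window used in this example:
coord_window "2020-04-22 07:33:11 UTC" 3 6
# prints:
# start=2020-04-22T07:36Z
# end=2020-04-22T07:42Z
```

Called without arguments, it uses the current time, and the two printed lines can be pasted into job.properties.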

## log.sh

```shell
#!/bin/bash
# Append a timestamped line to the log, creating the log directory if needed.
log_dir=/tmp/oozie
mkdir -p "${log_dir}"
echo "Hello! It's time to run. [$(date)]" >> "${log_dir}/coordinator_wf.log"
```

coordinator.xml

  • Set the execution frequency with frequency="${coord:minutes(5)}".
  • The effective timezone is the value of the oozie.processing.timezone property in oozie-site.xml, or its default from oozie-default.xml.
    <coordinator-app name="cron-coord" frequency="${coord:minutes(5)}"
                     start="${start}" end="${end}" timezone="UTC"
                     xmlns="uri:oozie:coordinator:0.2">
        <action>
            <workflow>
                <app-path>${workflowAppUri}</app-path>
                <configuration>
                    <property>
                        <name>jobTracker</name>
                        <value>${jobTracker}</value>
                    </property>
                    <property>
                        <name>nameNode</name>
                        <value>${nameNode}</value>
                    </property>
                    <property>
                        <name>queueName</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
            </workflow>
        </action>
    </coordinator-app>

workflow.xml

    <workflow-app xmlns="uri:oozie:workflow:0.5" name="one-op-wf">
        <start to="action1"/>
        <action name="action1">
            <shell xmlns="uri:oozie:shell-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <exec>${EXEC}</exec>
                <file>${EXEC}#${EXEC}</file>
            </shell>
            <ok to="end"/>
            <error to="end"/>
        </action>
        <end name="end"/>
    </workflow-app>

Running the job

    $ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps/cron oozie-apps/
    $ ~/Documents/hadoop/bin/hadoop fs -ls oozie-apps/cron/   # local timezone is GMT-0700
    Found 3 items
    -rw-r--r-- 1 jack supergroup 1595 2020-04-21 23:56 oozie-apps/cron/coordinator.xml
    -rw-r--r-- 1 jack supergroup 1179 2020-04-21 23:56 oozie-apps/cron/job.properties
    -rw-r--r-- 1 jack supergroup 1449 2020-04-21 23:56 oozie-apps/cron/workflow.xml
    $ export OOZIE_URL="http://192.168.32.130:11000/oozie"
    $ bin/oozie job -config oozie-apps/cron/job.properties -run
    job: 0000010-200420185350972-oozie-jack-C
    $ bin/oozie job -info 0000010-200420185350972-oozie-jack-C
    Job ID : 0000010-200420185350972-oozie-jack-C
    ------------------------------------------------------------------------------------------------------------------------------------
    Job Name    : cron-coord
    App Path    : hdfs://192.168.32.130:8020/user/jack/oozie-apps/cron
    Status      : RUNNING
    Start Time  : 2020-04-22 07:36 GMT
    End Time    : 2020-04-22 07:42 GMT
    Pause Time  : -
    Concurrency : 1
    ------------------------------------------------------------------------------------------------------------------------------------
    ID                                       Status   Ext ID  Err Code  Created               Nominal Time
    0000010-200420185350972-oozie-jack-C@1   WAITING  -       -         2020-04-22 07:35 GMT  2020-04-22 07:36 GMT
    ------------------------------------------------------------------------------------------------------------------------------------
    0000010-200420185350972-oozie-jack-C@2   WAITING  -       -         2020-04-22 07:35 GMT  2020-04-22 07:41 GMT
    ------------------------------------------------------------------------------------------------------------------------------------
    $ cat /tmp/oozie/coordinator_wf.log   # before the first action runs: no output
    $ cat /tmp/oozie/coordinator_wf.log   # after the job has finished successfully
    Hello! It's time to run. [Wed Apr 22 00:36:20 PDT 2020]
    Hello! It's time to run. [Wed Apr 22 00:41:11 PDT 2020]

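As the output shows, Oozie automatically created two actions within the start/end window, spaced 5 minutes apart. The log timestamps are in local time (PDT, GMT-0700), so 00:36 and 00:41 correspond to the nominal times 07:36 and 07:41 GMT.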
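The count of two follows from the window and the frequency. The following is a small sketch that lists the expected nominal times, assuming GNU `date` and assuming the end time is exclusive (an action is created only if its nominal time is earlier than `end`). The function name `nominal_times` is hypothetical and only mimics the arithmetic; it does not call Oozie.

```shell
# Hypothetical helper: list nominal times for a fixed-minute frequency.
#   $1: start time, $2: end time (both understood by GNU date), $3: frequency in minutes
nominal_times() {
  local start="$1" end="$2" freq_min="$3"
  local t end_epoch
  t=$(date -u -d "${start}" '+%s') || return 1
  end_epoch=$(date -u -d "${end}" '+%s') || return 1
  # Step from start in freq_min increments while still before end.
  while [ "${t}" -lt "${end_epoch}" ]; do
    date -u -d "@${t}" '+%Y-%m-%d %H:%M GMT'
    t=$(( t + freq_min * 60 ))
  done
}

nominal_times "2020-04-22 07:36 UTC" "2020-04-22 07:42 UTC" 5
# prints:
# 2020-04-22 07:36 GMT
# 2020-04-22 07:41 GMT
```

A third action would have a nominal time of 07:46, which falls outside the window, so only two actions appear in the `-info` listing.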