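The Oozie installation directory contains oozie-examples.tar.gz, which bundles a set of example applications.
Extract it into the current directory and list its contents, then create an oozie-apps directory to hold the examples used for testing.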
    $ tar -zxf oozie-examples.tar.gz
    $ cd examples/
    $ ls
    apps  input-data  src
    $ ls apps
    aggregator  cron           custom-main         demo    hadoop-el  hive       map-reduce  pig    sla    sqoop-freeform  streaming
    bundle      cron-schedule  datelist-java-main  distcp  hcatalog   java-main  no-op       shell  sqoop  ssh             subwf
    $ cd ..
    $ mkdir oozie-apps
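Scheduling an MR job: the Oozie example
This walkthrough uses the bundled example examples/apps/map-reduce and the data files in examples/input-data. Copy both into oozie-apps, then edit the configuration files.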
    $ cp -r examples/apps/map-reduce oozie-apps/map-reduce
    $ cp -r examples/input-data oozie-apps/input-data
    $ ls oozie-apps/map-reduce/
    job.properties  job-with-config-class.properties  lib  workflow-with-config-class.xml  workflow.xml
    $ ls oozie-apps/map-reduce/lib
    oozie-examples-4.0.0-cdh5.3.6.jar
job.properties
    nameNode=hdfs://192.168.32.130:8020
    # yarn.resourcemanager.address ${yarn.resourcemanager.hostname}:8032
    jobTracker=192.168.32.130:8032
    queueName=default
    examplesRoot=oozie-apps
    # Application path; resolves to hdfs://192.168.32.130:8020/user/jack/oozie-apps/map-reduce/workflow.xml
    oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce/workflow.xml
    outputDir=map-reduce
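To see how oozie.wf.application.path expands to the path shown in the comment, the substitution can be imitated in plain Java. This is only a minimal sketch of ${name} replacement, not Oozie's actual resolver; the class PropertyResolver and its methods are hypothetical, and user.name is assumed to be jack to match the resolved path in the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: imitates ${name} substitution in job.properties values.
class PropertyResolver {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${name} that has a value in props; unknown names are left untouched.
    static String resolve(String value, Map<String, String> props) {
        String current = value;
        // Bounded number of passes so that nested or cyclic references cannot loop forever.
        for (int pass = 0; pass < 10; pass++) {
            Matcher m = VAR.matcher(current);
            StringBuffer sb = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String replacement = props.get(m.group(1));
                if (replacement == null) {
                    replacement = m.group(0); // keep the unresolved placeholder as-is
                } else {
                    changed = true;
                }
                m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(sb);
            current = sb.toString();
            if (!changed) {
                break;
            }
        }
        return current;
    }

    // Values taken from job.properties above; user.name=jack is an assumption.
    static Map<String, String> sampleProps() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("nameNode", "hdfs://192.168.32.130:8020");
        props.put("user.name", "jack");
        props.put("examplesRoot", "oozie-apps");
        return props;
    }

    public static void main(String[] args) {
        String raw = "${nameNode}/user/${user.name}/${examplesRoot}/map-reduce/workflow.xml";
        // prints hdfs://192.168.32.130:8020/user/jack/oozie-apps/map-reduce/workflow.xml
        System.out.println(resolve(raw, sampleProps()));
    }
}
```

Running the sketch under a different user name changes only the /user/... segment, which is why the same job.properties can be shared between accounts.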
workflow.xml
    <workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
        <start to="mr-node"/>
        <action name="mr-node">
            <map-reduce>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <prepare>
                    <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
                </prepare>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                    <property>
                        <name>mapred.mapper.class</name>
                        <value>org.apache.oozie.example.SampleMapper</value>
                    </property>
                    <property>
                        <name>mapred.reducer.class</name>
                        <value>org.apache.oozie.example.SampleReducer</value>
                    </property>
                    <property>
                        <name>mapred.map.tasks</name>
                        <value>1</value>
                    </property>
                    <property>
                        <name>mapred.input.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                    </property>
                    <property>
                        <name>mapred.output.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                    </property>
                </configuration>
            </map-reduce>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
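The prepare element lists the steps to perform before the action starts. Here it deletes the output directory, so the job can be re-run without failing on an existing output path.
The classes named by mapred.mapper.class and mapred.reducer.class are packaged as a jar (oozie-examples-4.0.0-cdh5.3.6.jar) in the lib directory under the workflow application directory.
The source code for these classes is in the examples/src directory.
mapred.input.dir and mapred.output.dir give the input and output directories of the MapReduce job. The input data is in examples/input-data and must be uploaded to HDFS under /user/jack/oozie-apps/.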
SampleMapper.java
This class uses the old MapReduce API (org.apache.hadoop.mapred).
    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package org.apache.oozie.example;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    import java.io.IOException;

    public class SampleMapper implements Mapper<LongWritable, Text, LongWritable, Text> {

        public void configure(JobConf jobConf) {
        }

        public void map(LongWritable key, Text value,
                OutputCollector<LongWritable, Text> collector, Reporter reporter)
                throws IOException {
            // key: byte offset of the line; value: the line text. Emitted unchanged.
            collector.collect(key, value);
        }

        public void close() throws IOException {
        }
    }
SampleReducer.java
    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package org.apache.oozie.example;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    import java.io.IOException;
    import java.util.Iterator;

    public class SampleReducer implements Reducer<LongWritable, Text, LongWritable, Text> {

        public void configure(JobConf jobConf) {
        }

        public void reduce(LongWritable key, Iterator<Text> values,
                OutputCollector<LongWritable, Text> collector, Reporter reporter)
                throws IOException {
            // Emit every value under its key, without further processing.
            while (values.hasNext()) {
                collector.collect(key, values.next());
            }
        }

        public void close() throws IOException {
        }
    }
Running job.properties
    $ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps/input-data oozie-apps/input-data
    $ ~/Documents/hadoop/bin/hadoop fs -put oozie-apps/map-reduce/* oozie-apps/map-reduce
    $ ~/Documents/hadoop/bin/hadoop fs -ls oozie-apps/map-reduce
    Found 5 items
    -rw-r--r--   1 jack supergroup       1028 2020-04-21 05:59 oozie-apps/map-reduce/job-with-config-class.properties
    -rw-r--r--   1 jack supergroup       1019 2020-04-21 05:59 oozie-apps/map-reduce/job.properties
    drwxr-xr-x   - jack supergroup          0 2020-04-21 05:59 oozie-apps/map-reduce/lib
    -rw-r--r--   1 jack supergroup       2274 2020-04-21 05:59 oozie-apps/map-reduce/workflow-with-config-class.xml
    -rw-r--r--   1 jack supergroup       2559 2020-04-21 05:59 oozie-apps/map-reduce/workflow.xml
    $ # Run the job
    $ export OOZIE_URL="http://192.168.32.130:11000/oozie"
    $ bin/oozie job -config oozie-apps/map-reduce/job.properties -run
    job: 0000007-200420185350972-oozie-jack-W
    $ bin/oozie job -info 0000007-200420185350972-oozie-jack-W
    Job ID : 0000007-200420185350972-oozie-jack-W
    --------------------------------------------------------------------------------
    Workflow Name : map-reduce-wf
    App Path      : hdfs://192.168.32.130:8020/user/jack/oozie-apps/map-reduce/workflow.xml
    Status        : SUCCEEDED
    Run           : 0
    User          : jack
    Group         : -
    Created       : 2020-04-21 13:00 GMT
    Started       : 2020-04-21 13:00 GMT
    Last Modified : 2020-04-21 13:01 GMT
    Ended         : 2020-04-21 13:01 GMT
    CoordAction ID: -

    Actions
    --------------------------------------------------------------------------------
    ID                                             Status  Ext ID                  Ext Status  Err Code
    --------------------------------------------------------------------------------
    0000007-200420185350972-oozie-jack-W@:start:   OK      -                       OK          -
    --------------------------------------------------------------------------------
    0000007-200420185350972-oozie-jack-W@mr-node   OK      job_1586921478592_0029  SUCCEEDED   -
    --------------------------------------------------------------------------------
    0000007-200420185350972-oozie-jack-W@end       OK      -                       OK          -
    --------------------------------------------------------------------------------
    $ ~/Documents/hadoop/bin/hadoop fs -ls oozie-apps/output-data/map-reduce
    Found 2 items
    -rw-r--r--   1 jack supergroup          0 2020-04-21 06:01 oozie-apps/output-data/map-reduce/_SUCCESS
    -rw-r--r--   1 jack supergroup       1547 2020-04-21 06:01 oozie-apps/output-data/map-reduce/part-00000
    $ # The mapper and reducer pass records through unchanged
    $ ~/Documents/hadoop/bin/hadoop fs -cat oozie-apps/output-data/map-reduce/*
    0       To be or not to be, that is the question;
    42      Whether 'tis nobler in the mind to suffer
    84      The slings and arrows of outrageous fortune,
    129     Or to take arms against a sea of troubles,
    172     And by opposing, end them. To die, to sleep;
    217     No more; and by a sleep to say we end
    255     The heart-ache and the thousand natural shocks
    302     That flesh is heir to ? 'tis a consummation
    346     Devoutly to be wish'd. To die, to sleep;
    387     To sleep, perchance to dream. Ay, there's the rub,
    438     For in that sleep of death what dreams may come,
    487     When we have shuffled off this mortal coil,
    531     Must give us pause. There's the respect
    571     That makes calamity of so long life,
    608     For who would bear the whips and scorns of time,
    657     Th'oppressor's wrong, the proud man's contumely,
    706     The pangs of despised love, the law's delay,
    751     The insolence of office, and the spurns
    791     That patient merit of th'unworthy takes,
    832     When he himself might his quietus make
    871     With a bare bodkin? who would fardels bear,
    915     To grunt and sweat under a weary life,
    954     But that the dread of something after death,
    999     The undiscovered country from whose bourn
    1041    No traveller returns, puzzles the will,
    1081    And makes us rather bear those ills we have
    1125    Than fly to others that we know not of?
    1165    Thus conscience does make cowards of us all,
    1210    And thus the native hue of resolution
    1248    Is sicklied o'er with the pale cast of thought,
    1296    And enterprises of great pitch and moment
    1338    With this regard their currents turn awry,
    1381    And lose the name of action.
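The number in front of each output line is the key that SampleMapper receives: the byte offset at which the line starts in the input file. The sketch below recomputes those offsets without Hadoop, as a rough illustration only. The class LineOffsets is hypothetical, and it assumes UTF-8 input with a single '\n' terminating each line.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: recomputes the byte-offset keys seen in the job output.
class LineOffsets {

    // Returns the starting byte offset of each line, assuming one '\n' after every line.
    static List<Long> offsets(List<String> lines) {
        List<Long> result = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            result.add(offset);
            // Line length in bytes plus one byte for the newline terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "To be or not to be, that is the question;",
                "Whether 'tis nobler in the mind to suffer",
                "The slings and arrows of outrageous fortune,");
        List<Long> keys = offsets(lines);
        for (int i = 0; i < lines.size(); i++) {
            // Prints the offsets 0, 42 and 84, each followed by its line.
            System.out.println(keys.get(i) + "\t" + lines.get(i));
        }
    }
}
```

Because both the mapper and the reducer emit their input unchanged, the job output is simply the input file with each line prefixed by its offset.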
job-with-config-class.properties
    nameNode=hdfs://192.168.32.130:8020
    jobTracker=192.168.32.130:8032
    queueName=default
    examplesRoot=oozie-apps
    oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce/workflow-with-config-class.xml
    outputDir=map-reduce
workflow-with-config-class.xml
    <workflow-app xmlns="uri:oozie:workflow:0.5" name="map-reduce-wf">
        <start to="mr-node"/>
        <action name="mr-node">
            <map-reduce>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <prepare>
                    <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
                </prepare>
                <!-- most of the <configuration> properties are being set by SampleOozieActionConfigurator -->
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                    <!-- These two are not Hadoop properties, but SampleOozieActionConfigurator can use them -->
                    <property>
                        <name>examples.root</name>
                        <value>${examplesRoot}</value>
                    </property>
                    <property>
                        <name>output.dir.name</name>
                        <value>${outputDir}</value>
                    </property>
                </configuration>
                <config-class>org.apache.oozie.example.SampleOozieActionConfigurator</config-class>
            </map-reduce>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
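The difference from workflow.xml is that most of the job configuration is centralized through the config-class element, which names the configurator class org.apache.oozie.example.SampleOozieActionConfigurator.
The source code for this class is in the examples/src directory.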
SampleOozieActionConfigurator.java
    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package org.apache.oozie.example;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.oozie.action.hadoop.OozieActionConfigurator;
    import org.apache.oozie.action.hadoop.OozieActionConfiguratorException;

    public class SampleOozieActionConfigurator implements OozieActionConfigurator {

        @Override
        public void configure(JobConf actionConf) throws OozieActionConfiguratorException {
            if (actionConf.getUser() == null) {
                throw new OozieActionConfiguratorException("No user set");
            }
            if (actionConf.get("examples.root") == null) {
                throw new OozieActionConfiguratorException("examples.root not set");
            }
            if (actionConf.get("output.dir.name") == null) {
                throw new OozieActionConfiguratorException("output.dir.name not set");
            }

            actionConf.setMapperClass(SampleMapper.class);
            actionConf.setReducerClass(SampleReducer.class);
            actionConf.setNumMapTasks(1);
            FileInputFormat.setInputPaths(actionConf,
                    new Path("/user/" + actionConf.getUser() + "/"
                            + actionConf.get("examples.root") + "/input-data/text"));
            FileOutputFormat.setOutputPath(actionConf,
                    new Path("/user/" + actionConf.getUser() + "/"
                            + actionConf.get("examples.root") + "/output-data/"
                            + actionConf.get("output.dir.name")));
        }
    }
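This implementation of the OozieActionConfigurator interface sets the mapper class, the reducer class, the number of map tasks, and the input and output directories. It also fails early if the user, examples.root, or output.dir.name is missing from the action configuration.
Running job-with-config-class.properties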
    $ # Update job-with-config-class.properties
    $ ~/Documents/hadoop/bin/hadoop fs -put -f oozie-apps/map-reduce/job-with-config-class.properties oozie-apps/map-reduce/
    $ ~/Documents/hadoop/bin/hadoop fs -ls oozie-apps/map-reduce
    Found 5 items
    -rw-r--r--   1 jack supergroup       1028 2020-04-21 18:14 oozie-apps/map-reduce/job-with-config-class.properties
    -rw-r--r--   1 jack supergroup       1019 2020-04-21 05:59 oozie-apps/map-reduce/job.properties
    drwxr-xr-x   - jack supergroup          0 2020-04-21 05:59 oozie-apps/map-reduce/lib
    -rw-r--r--   1 jack supergroup       2274 2020-04-21 05:59 oozie-apps/map-reduce/workflow-with-config-class.xml
    -rw-r--r--   1 jack supergroup       2559 2020-04-21 05:59 oozie-apps/map-reduce/workflow.xml
    $ # Run the job (note the -config file: the App Path below shows workflow-with-config-class.xml)
    $ export OOZIE_URL="http://192.168.32.130:11000/oozie"
    $ bin/oozie job -config oozie-apps/map-reduce/job-with-config-class.properties -run
    job: 0000008-200420185350972-oozie-jack-W
    $ bin/oozie job -info 0000008-200420185350972-oozie-jack-W
    Job ID : 0000008-200420185350972-oozie-jack-W
    --------------------------------------------------------------------------------
    Workflow Name : map-reduce-wf
    App Path      : hdfs://192.168.32.130:8020/user/jack/oozie-apps/map-reduce/workflow-with-config-class.xml
    Status        : RUNNING
    Run           : 0
    User          : jack
    Group         : -
    Created       : 2020-04-22 01:18 GMT
    Started       : 2020-04-22 01:18 GMT
    Last Modified : 2020-04-22 01:18 GMT
    Ended         : -
    CoordAction ID: -

    Actions
    --------------------------------------------------------------------------------
    ID                                             Status   Ext ID                  Ext Status  Err Code
    --------------------------------------------------------------------------------
    0000008-200420185350972-oozie-jack-W@:start:   OK       -                       OK          -
    --------------------------------------------------------------------------------
    0000008-200420185350972-oozie-jack-W@mr-node   RUNNING  job_1586921478592_0031  RUNNING     -
    --------------------------------------------------------------------------------
    $ bin/oozie job -info 0000008-200420185350972-oozie-jack-W
    Job ID : 0000008-200420185350972-oozie-jack-W
    --------------------------------------------------------------------------------
    Workflow Name : map-reduce-wf
    App Path      : hdfs://192.168.32.130:8020/user/jack/oozie-apps/map-reduce/workflow-with-config-class.xml
    Status        : SUCCEEDED
    Run           : 0
    User          : jack
    Group         : -
    Created       : 2020-04-22 01:18 GMT
    Started       : 2020-04-22 01:18 GMT
    Last Modified : 2020-04-22 01:19 GMT
    Ended         : 2020-04-22 01:19 GMT
    CoordAction ID: -

    Actions
    --------------------------------------------------------------------------------
    ID                                             Status   Ext ID                  Ext Status  Err Code
    --------------------------------------------------------------------------------
    0000008-200420185350972-oozie-jack-W@:start:   OK       -                       OK          -
    --------------------------------------------------------------------------------
    0000008-200420185350972-oozie-jack-W@mr-node   OK       job_1586921478592_0031  SUCCEEDED   -
    --------------------------------------------------------------------------------
    0000008-200420185350972-oozie-jack-W@end       OK       -                       OK          -
    --------------------------------------------------------------------------------
    $ ~/Documents/hadoop/bin/hadoop fs -ls oozie-apps/output-data/map-reduce
    Found 2 items
    -rw-r--r--   1 jack supergroup          0 2020-04-21 18:18 oozie-apps/output-data/map-reduce/_SUCCESS
    -rw-r--r--   1 jack supergroup       1547 2020-04-21 18:18 oozie-apps/output-data/map-reduce/part-00000
    $ # The mapper and reducer pass records through unchanged
    $ ~/Documents/hadoop/bin/hadoop fs -cat oozie-apps/output-data/map-reduce/*
    0       To be or not to be, that is the question;
    42      Whether 'tis nobler in the mind to suffer
    84      The slings and arrows of outrageous fortune,
    129     Or to take arms against a sea of troubles,
    172     And by opposing, end them. To die, to sleep;
    217     No more; and by a sleep to say we end
    255     The heart-ache and the thousand natural shocks
    302     That flesh is heir to ? 'tis a consummation
    346     Devoutly to be wish'd. To die, to sleep;
    387     To sleep, perchance to dream. Ay, there's the rub,
    438     For in that sleep of death what dreams may come,
    487     When we have shuffled off this mortal coil,
    531     Must give us pause. There's the respect
    571     That makes calamity of so long life,
    608     For who would bear the whips and scorns of time,
    657     Th'oppressor's wrong, the proud man's contumely,
    706     The pangs of despised love, the law's delay,
    751     The insolence of office, and the spurns
    791     That patient merit of th'unworthy takes,
    832     When he himself might his quietus make
    871     With a bare bodkin? who would fardels bear,
    915     To grunt and sweat under a weary life,
    954     But that the dread of something after death,
    999     The undiscovered country from whose bourn
    1041    No traveller returns, puzzles the will,
    1081    And makes us rather bear those ills we have
    1125    Than fly to others that we know not of?
    1165    Thus conscience does make cowards of us all,
    1210    And thus the native hue of resolution
    1248    Is sicklied o'er with the pale cast of thought,
    1296    And enterprises of great pitch and moment
    1338    With this regard their currents turn awry,
    1381    And lose the name of action.
WordCount example
The example jar is at hadoop/share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar. Copy it into the lib directory, modify job.properties and workflow.xml, then run the job.
workflow.xml
    <workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
        <start to="mr-node"/>
        <action name="mr-node">
            <map-reduce>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <prepare>
                    <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
                </prepare>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                    <property>
                        <name>mapred.mapper.class</name>
                        <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
                    </property>
                    <property>
                        <name>mapred.reducer.class</name>
                        <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                    </property>
                    <property>
                        <name>mapred.map.tasks</name>
                        <value>1</value>
                    </property>
                    <property>
                        <name>mapred.input.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                    </property>
                    <property>
                        <name>mapred.output.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                    </property>
                </configuration>
            </map-reduce>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
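TokenizerMapper and IntSumReducer are nested classes of WordCount. When the jar is compiled they become WordCount$TokenizerMapper.class and WordCount$IntSumReducer.class, so the names to reference are org.apache.hadoop.examples.WordCount$TokenizerMapper and org.apache.hadoop.examples.WordCount$IntSumReducer: the class after the $ is nested inside the class before it.
Note that, unlike SampleMapper and SampleReducer, these two classes are written against the new MapReduce API (org.apache.hadoop.mapreduce). Oozie's documentation describes enabling new-API classes by setting mapred.mapper.new-api and mapred.reducer.new-api to true and naming the classes through mapreduce.map.class and mapreduce.reduce.class, so the configuration above may need those changes depending on the Hadoop and Oozie versions in use.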
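The $ naming convention can be checked with a few lines of plain Java. This is a small sketch using a made-up stand-in, WordCountLike, rather than Hadoop's real WordCount; it shows the difference between the binary name, which is what class-name properties and Class.forName expect, and the dotted name used in source code.

```java
// Hypothetical stand-in for a class with a static nested mapper, such as WordCount.
class WordCountLike {

    // Compiled to WordCountLike$TokenizerMapper.class
    static class TokenizerMapper {
    }

    public static void main(String[] args) throws Exception {
        // Binary name: outer and nested class joined by '$'.
        // prints WordCountLike$TokenizerMapper (when compiled in the default package)
        System.out.println(TokenizerMapper.class.getName());

        // Canonical name: the dotted form written in source code.
        // prints WordCountLike.TokenizerMapper
        System.out.println(TokenizerMapper.class.getCanonicalName());

        // Reflection-based loading needs the binary ('$') form.
        Class<?> loaded = Class.forName(TokenizerMapper.class.getName());
        // prints true
        System.out.println(loaded == TokenizerMapper.class);
    }
}
```

Writing the dotted form in a class-name property would make the loader look for a top-level class named TokenizerMapper in a package ending in WordCountLike, which does not exist.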
