Reference

This guide uses the prebuilt CDH release: unpack the archive, configure it, and install.

Building Oozie

Build the Oozie binary distribution from source.
This step can be skipped when using a vendor release tarball such as Cloudera's.

Server Installation

System Requirements

  • Unix (tested in Linux and Mac OS X)
    The version used here:

    $ head -n 1 /etc/issue
    CentOS release 6.4 (Final)
  • Java 1.6+
    The version used here:

    $ java -version
    java version "1.7.0_67"
    Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
    Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
  • Hadoop
    The version used here:

    $ Documents/hadoop/bin/hadoop version
    Hadoop 2.5.0-cdh5.3.6
    Subversion http://github.com/cloudera/hadoop -r 6743ef286bfdd317b600adbdb154f982cf2fac7a
    Compiled by jenkins on 2015-07-28T22:14Z
    Compiled with protoc 2.5.0
    From source with checksum 9c7775296a534f91809cc23d2d15ffcc
    This command was run using /home/jack/Documents/hadoop/share/hadoop/common/hadoop-common-2.5.0-cdh5.3.6.jar
  • ExtJS library (optional, to enable Oozie webconsole)
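The Java requirement above can be checked before installing. The following is a minimal sketch, not part of the Oozie distribution: the helper name java_version_ok is hypothetical, and it only compares the version string printed by java -version against the 1.6 minimum.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a Java version string such as
# "1.7.0_67" (or a newer-style "11.0.2") is at least 1.6.
java_version_ok() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # text after the first dot
  minor=${rest%%.*}         # second version component
  case "$major" in ''|*[!0-9]*) return 1 ;; esac
  case "$minor" in ''|*[!0-9]*) minor=0 ;; esac
  if [ "$major" -gt 1 ]; then
    return 0
  fi
  [ "$major" -eq 1 ] && [ "$minor" -ge 6 ]
}

# Report the installed Java version when java is on the PATH.
if command -v java >/dev/null 2>&1; then
  # `java -version` prints to stderr, e.g.: java version "1.7.0_67"
  found=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
  if java_version_ok "$found"; then
    echo "Java $found is new enough"
  else
    echo "Java $found is too old (need 1.6+)"
  fi
fi
```

The function takes the version as an argument so it can be tried against any string without a JVM installed.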

Server Installation

IMPORTANT: Oozie ignores any value set for OOZIE_HOME; it computes its home directory automatically.

  • Build an Oozie binary distribution
  • Download a Hadoop binary distribution

  • Download ExtJS library (it must be version 2.2)

NOTE: The ExtJS library is not bundled with Oozie because it uses a different license.
NOTE: It is recommended to use an Oozie Unix user for the Oozie server.

With the Cloudera release, unpacking and configuring the archive is all that is needed.

Hadoop Environment

NOTE: Configure the Hadoop cluster with proxyuser for the Oozie process.
The Hadoop core-site.xml needs the following two properties:

  <!-- OOZIE -->
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].hosts</name>
    <value>[OOZIE_SERVER_HOSTNAME]</value>
  </property>
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].groups</name>
    <value>[USER_GROUPS_THAT_ALLOW_IMPERSONATION]</value>
  </property>

Replace the capitalized placeholders with actual values, then restart Hadoop.
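The placeholder substitution can be scripted. The sketch below is illustrative only: proxyuser_properties is a hypothetical helper, and the user, host, and group passed in the example call are made-up values, not ones taken from this setup.

```shell
#!/bin/sh
# Hypothetical helper: print the two proxyuser properties for core-site.xml.
#   $1 = Unix user that runs the Oozie server
#   $2 = host(s) the Oozie server runs on
#   $3 = group(s) whose members Oozie may impersonate
proxyuser_properties() {
  cat <<EOF
<!-- OOZIE -->
<property>
  <name>hadoop.proxyuser.$1.hosts</name>
  <value>$2</value>
</property>
<property>
  <name>hadoop.proxyuser.$1.groups</name>
  <value>$3</value>
</property>
EOF
}

# Example values only; substitute the real user, host, and groups.
proxyuser_properties oozie oozie-host.example.com users
```

The output is meant to be pasted inside the configuration element of core-site.xml; the script does not edit the file itself.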

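This setup uses an existing Hadoop environment, a pseudo-distributed cluster (see the Hadoop cluster configuration section for details).

  • Proxy user configuration in core-site.xml
  • MapReduce JobHistory Server configuration in mapred-site.xml
  • … …

Edit core-site.xml to configure the proxy user. The user running Oozie here is the same as the Hadoop superuser (jack).
For how proxy users work, see the official documentation, for example: hadoop 3.2.1 - Proxy user - Superusers Acting On Behalf Of Other Users -[https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html)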

  <property>
    <name>hadoop.proxyuser.jack.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.jack.groups</name>
    <value>*</value>
  </property>

Start the Hadoop cluster; the running processes are as follows.

  $ jps -l
  3355 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
  3064 org.apache.hadoop.hdfs.server.namenode.NameNode
  3631 org.apache.hadoop.yarn.server.nodemanager.NodeManager
  3525 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
  57245 sun.tools.jps.Jps
  4019 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
  3191 org.apache.hadoop.hdfs.server.datanode.DataNode
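Checking the process list by eye is easy to get wrong, so the comparison can be automated. This is a rough sketch under the assumption that the six daemons shown in the listing above are the ones expected on a pseudo-distributed node; missing_daemons is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `jps -l` output on stdin and print the name of
# every expected Hadoop daemon that is absent. The exit status is 0 only
# when all of them are present.
missing_daemons() {
  listing=$(cat)
  rc=0
  for daemon in NameNode SecondaryNameNode DataNode \
                ResourceManager NodeManager JobHistoryServer; do
    # Match the simple class name at the end of a line, right after a dot,
    # so that "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$listing" | grep -q "\.$daemon\$"; then
      echo "$daemon"
      rc=1
    fi
  done
  return $rc
}

# Usage: jps -l | missing_daemons
```

Reading from stdin keeps the helper independent of where jps is installed.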

Oozie Environment

Unpack oozie-4.0.0-cdh5.3.6.tar.gz

  $ cd ~/Documents
  $ tar zxvf ~/Downloads/oozie-4.0.0-cdh5.3.6.tar.gz -C .
  $ mv oozie-4.0.0-cdh5.3.6 oozie
  $ cd oozie
  $ ls
  bin lib NOTICE.txt oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz src
  conf libtools oozie-core oozie-server oozie.war
  docs LICENSE.txt oozie-examples.tar.gz oozie-sharelib-4.0.0-cdh5.3.6.tar.gz release-log.txt

IMPORTANT: All Oozie server scripts (oozie-setup.sh, oozied.sh, oozie-start.sh, oozie-run.sh and oozie-stop.sh) run only under the Unix user that owns the Oozie installation directory; if necessary, use sudo -u OOZIE_USER when invoking the scripts.

As of Oozie 3.3.2, use of oozie-start.sh, oozie-run.sh, and oozie-stop.sh has been deprecated and will print a warning. The oozied.sh script should be used instead; passing it start, run, or stop as an argument will perform the behaviors of oozie-start.sh, oozie-run.sh, and oozie-stop.sh respectively.

Create a libext/ directory in the directory where Oozie was expanded.

If using a version of Hadoop bundled in Oozie hadooplibs/, copy the corresponding Hadoop JARs from hadooplibs/ to the libext/ directory. If using a different version of Hadoop, copy the required Hadoop JARs from that version into the libext/ directory.

Unpack oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz.
Create the libext directory and copy the required JARs into it.

  $ cd ~/Documents/oozie
  $ mkdir libext
  $ tar zxvf oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz
  $ cp oozie-4.0.0-cdh5.3.6/hadooplibs/hadooplib-2.5.0-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* libext/

If using the ExtJS library copy the ZIP file to the libext/ directory.

Copy ext-2.2.zip to libext/

  $ cp ~/Downloads/ext-2.2.zip libext/

Edit the Oozie configuration file (oozie-site.xml) so that Oozie stores its metadata in a MySQL database named oozie.

  <property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://192.168.32.130:3306/oozie</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>root</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>123456</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/home/jack/Documents/hadoop/etc/hadoop</value>
    <description>Points Oozie at the Hadoop configuration files; the "*=" prefix must not be removed.</description>
  </property>

Copy the MySQL connector JAR into libext/.

  $ cp ~/Downloads/connector/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar libext/

Create the oozie database.

  $ mysql -uroot -p123456
  Welcome to the MySQL monitor. Commands end with ; or \g.
  Your MySQL connection id is 9446
  Server version: 5.1.73 Source distribution
  Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
  Oracle is a registered trademark of Oracle Corporation and/or its
  affiliates. Other names may be trademarks of their respective
  owners.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  mysql> create database oozie;
  mysql> quit;


Initializing Oozie

When running oozie-setup.sh, the sharelib create|upgrade -fs fs_default_name [-locallib sharelib] command uploads a new sharelib into HDFS or upgrades an existing one. The first argument is the default file system name and the second is the Oozie sharelib to install, either a tarball or its expanded form. If the second argument is omitted, the Oozie sharelib tarball from the Oozie installation directory is used.

The prepare-war [-d directory] command creates the WAR file for Oozie, optionally taking extensions from a directory other than libext.

The db create|upgrade|postupgrade -run [-sqlfile <FILE>] command creates, upgrades, or post-upgrades the Oozie database, with an optional SQL file.

Run the oozie-setup.sh script to configure Oozie with all the components added to the libext/ directory.

  $ bin/oozie-setup.sh prepare-war [-d directory] [-secure]
                       sharelib create -fs <FS_URI> [-locallib <PATH>]
                       sharelib upgrade -fs <FS_URI> [-locallib <PATH>]
                       db create|upgrade|postupgrade -run [-sqlfile <FILE>]

The -secure option will configure Oozie to use HTTPS (SSL); refer to Setting Up Oozie with HTTPS (SSL) for more details.

Upload the sharelib files to HDFS, then check the share folder under the current user's HDFS home directory to confirm what was uploaded. Oozie references this content when it runs workflow jobs. (The tarball does not need to be unpacked first; it is unpacked automatically during the upload. The upload must be run as the Oozie service user.)
Create the oozie.sql file and initialize the oozie metadata database in MySQL.
Build the WAR file for the Oozie server. It is placed in oozie-server/webapps/ under the Oozie installation directory and is used to start the web service.

  $ bin/oozie-setup.sh sharelib create -fs hdfs://192.168.32.130:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
  ··· ···
  $ bin/oozie-setup.sh db create -run -sqlfile oozie.sql
  ··· ···
  $ bin/oozie-setup.sh prepare-war
  ··· ···
  INFO: Adding extension: /home/jack/Documents/oozie/libext/stax-api-1.0-2.jar
  INFO: Adding extension: /home/jack/Documents/oozie/libext/xmlenc-0.52.jar
  INFO: Adding extension: /home/jack/Documents/oozie/libext/xz-1.0.jar
  INFO: Adding extension: /home/jack/Documents/oozie/libext/zookeeper-3.4.5-cdh5.3.6.jar
  New Oozie WAR file with added 'ExtJS library, JARs' at /home/jack/Documents/oozie/oozie-server/webapps/oozie.war
  INFO: Oozie is ready to be started
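To confirm the sharelib upload, the share directory in HDFS can be listed. The sketch below makes two assumptions: the HADOOP_CMD variable is introduced here for illustration and should point at the hadoop launcher in use, and show_sharelib is a hypothetical helper rather than an Oozie or Hadoop command.

```shell
#!/bin/sh
# HADOOP_CMD is an assumption: point it at the hadoop launcher in use,
# for example ~/Documents/hadoop/bin/hadoop.
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Hypothetical helper: list the uploaded sharelib. A relative HDFS path
# resolves against the HDFS home directory of the current user.
show_sharelib() {
  if "$HADOOP_CMD" fs -test -d share; then
    "$HADOOP_CMD" fs -ls share
  else
    echo "no share directory found in the HDFS home of $(id -un)" >&2
    return 1
  fi
}

# Usage (as the Oozie service user): show_sharelib
```

Run it as the same user that executed oozie-setup.sh sharelib create, since the relative path depends on that user's HDFS home.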

Starting and Stopping

Start command

  $ bin/oozied.sh start
  Setting OOZIE_HOME: /home/jack/Documents/oozie
  Setting OOZIE_CONFIG: /home/jack/Documents/oozie/conf
  Sourcing: /home/jack/Documents/oozie/conf/oozie-env.sh
  setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
  Setting OOZIE_CONFIG_FILE: oozie-site.xml
  Setting OOZIE_DATA: /home/jack/Documents/oozie/data
  Setting OOZIE_LOG: /home/jack/Documents/oozie/logs
  Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
  Setting OOZIE_LOG4J_RELOAD: 10
  Setting OOZIE_HTTP_HOSTNAME: master
  Setting OOZIE_HTTP_PORT: 11000
  Setting OOZIE_ADMIN_PORT: 11001
  Setting OOZIE_HTTPS_PORT: 11443
  Setting OOZIE_BASE_URL: http://master:11000/oozie
  Setting CATALINA_BASE: /home/jack/Documents/oozie/oozie-server
  Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/jack/.keystore
  Setting OOZIE_HTTPS_KEYSTORE_PASS: password
  Setting OOZIE_INSTANCE_ID: master
  Setting CATALINA_OUT: /home/jack/Documents/oozie/logs/catalina.out
  Setting CATALINA_PID: /home/jack/Documents/oozie/oozie-server/temp/oozie.pid
  Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/jack/Documents/oozie/logs/derby.log
  Adding to CATALINA_OPTS: -Doozie.home.dir=/home/jack/Documents/oozie -Doozie.config.dir=/home/jack/Documents/oozie/conf -Doozie.log.dir=/home/jack/Documents/oozie/logs -Doozie.data.dir=/home/jack/Documents/oozie/data -Doozie.instance.id=master -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=master -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://master:11000/oozie -Doozie.https.keystore.file=/home/jack/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
  Using CATALINA_BASE: /home/jack/Documents/oozie/oozie-server
  Using CATALINA_HOME: /home/jack/Documents/oozie/oozie-server
  Using CATALINA_TMPDIR: /home/jack/Documents/oozie/oozie-server/temp
  Using JRE_HOME: /usr/java/jdk/jre
  Using CLASSPATH: /home/jack/Documents/oozie/oozie-server/bin/bootstrap.jar
  Using CATALINA_PID: /home/jack/Documents/oozie/oozie-server/temp/oozie.pid

Stop command

  $ bin/oozied.sh stop
  ··· ···
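The start script returns before the server is necessarily ready, so a script that depends on Oozie may want to wait for it. The sketch below polls the oozie admin -status command, which reports the system mode. OOZIE_CMD, the default URL, and the wait_for_oozie helper are assumptions made for this example.

```shell
#!/bin/sh
# OOZIE_CMD and the default URL are assumptions; adjust them to the install.
OOZIE_CMD=${OOZIE_CMD:-bin/oozie}
OOZIE_URL=${OOZIE_URL:-http://localhost:11000/oozie}

# Hypothetical helper: poll the server until it reports NORMAL mode.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
wait_for_oozie() {
  attempts=$1
  while [ "$attempts" -gt 0 ]; do
    # A healthy server answers with a line such as "System mode: NORMAL".
    if "$OOZIE_CMD" admin -oozie "$OOZIE_URL" -status 2>/dev/null | grep -q NORMAL; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$2"
  done
  return 1
}

# Usage: wait_for_oozie 30 2 && echo "Oozie is up"
```

A bounded number of attempts keeps the loop from hanging forever when the server fails to start; logs/catalina.out is the place to look in that case.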

Access the web console
http://192.168.32.130:11000/oozie/

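Troubleshooting

Environment variables interfering across frameworks

The user environment can be polluted by other tools. Sqoop2 also uses Tomcat, and in this case CATALINA_BASE was set in the environment for it. When that happens, building the WAR file picks up the Sqoop2 installation directory from CATALINA_BASE and fails.

  $ bin/oozie-setup.sh prepare-war
  setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
  ··· ···
  File/Dir does no exist: /home/jack/Documents/sqoop2/server/conf/ssl/server.xml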
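One way to sidestep the conflict, without editing the shell profile, is to run the Oozie script with CATALINA_BASE removed from its environment. This is a sketch of that idea rather than a fix taken from the Oozie documentation; without_catalina is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: run a command with CATALINA_BASE removed from its
# environment. The subshell keeps the calling shell's value untouched, so
# Sqoop2 continues to see its own setting.
without_catalina() {
  (
    unset CATALINA_BASE
    "$@"
  )
}

# Usage: without_catalina bin/oozie-setup.sh prepare-war
```

With the variable unset, the startup log shown earlier suggests Oozie falls back to its own oozie-server directory.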

Client Installation

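Copy and expand the oozie-client TAR.GZ file bundled with the distribution. Add the bin/ directory to the PATH.

For the complete reference on the oozie command-line tool, see Command Line Interface Utilities.
NOTE: The Oozie server installation includes the Oozie client. The Oozie client should be installed only on remote machines.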
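On a client machine the PATH change can go into a shell profile. The following sketch assumes the client archive was expanded to $HOME/oozie-client, which is a made-up location, and uses a hypothetical add_to_path helper. It also sets OOZIE_URL, which the oozie command-line tool reads when the -oozie option is not given.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to PATH unless it is already there.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

# Assumed location of the expanded oozie-client archive; adjust as needed.
add_to_path "$HOME/oozie-client/bin"

# The oozie CLI falls back to OOZIE_URL when -oozie is not given.
export OOZIE_URL=http://192.168.32.130:11000/oozie
```

Checking for an existing entry keeps PATH from growing each time the profile is sourced.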

Oozie Share Lib Installation

Skipped here; see the sharelib upload step (oozie-setup.sh sharelib create) in section 1.2.3.

Expand the oozie-sharelib TAR.GZ file bundled with the distribution.
The share/ directory must be copied to the Oozie HOME directory in HDFS:
$ hadoop fs -put share share
IMPORTANT: This must be done using the Oozie Hadoop (HDFS) user. If a share directory already exists in HDFS, it must be deleted before copying it again.
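The delete-then-copy rule for the manual upload can be wrapped in a small function. This is a minimal sketch: HADOOP_CMD is a variable introduced for this example, replace_sharelib is a hypothetical helper, and it assumes the Hadoop 2 form of the delete command (fs -rm -r).

```shell
#!/bin/sh
# HADOOP_CMD is an assumption: point it at the hadoop launcher in use.
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Hypothetical helper: replace the share directory in the HDFS home of the
# current user with the local directory given as $1.
replace_sharelib() {
  if "$HADOOP_CMD" fs -test -d share; then
    # An existing copy must be deleted before uploading again.
    "$HADOOP_CMD" fs -rm -r share || return 1
  fi
  "$HADOOP_CMD" fs -put "$1" share
}

# Usage (as the Oozie HDFS user): replace_sharelib share
```

The function stops before uploading if the delete fails, so a partial old copy is never mixed with the new one.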

See the Workflow Functional Specification for more information about the Oozie ShareLib.