Using Oozie

1. Scheduling a shell script with Oozie

1.1 Extract the official example templates

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ tar -zxvf oozie-examples.tar.gz

1.2 Create a working directory

mkdir oozie-apps

1.3 Copy the task template into oozie-apps

cp -r examples/apps/shell/ oozie-apps

1.4 Write the script p1.sh

vi oozie-apps/shell/p1.sh

Its contents are:

#!/bin/bash
date > /home/hadoop/cdh/p1.log
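
Before uploading, it can help to confirm that the script at least parses and runs. The sketch below is a local check only: check_script is a hypothetical helper (not part of Oozie), and the demo logs into a temporary directory instead of /home/hadoop/cdh so that it can run anywhere.

```shell
#!/bin/bash
# Hypothetical pre-upload check; check_script is an illustrative helper, not an Oozie command.
check_script() {
    script="$1"
    # The script is executed as a command, so it should start with a shebang line.
    if ! head -n 1 "$script" | grep -q '^#!'; then
        echo "missing shebang: $script"
        return 1
    fi
    # bash -n parses the script without executing it.
    if ! bash -n "$script"; then
        echo "syntax error: $script"
        return 1
    fi
    echo "ok: $script"
}

# Demo: the same two lines as p1.sh, but logging into a temporary directory (assumption).
tmp_dir=$(mktemp -d)
printf '#!/bin/bash\ndate > "%s/p1.log"\n' "$tmp_dir" > "$tmp_dir/p1.sh"
check_result=$(check_script "$tmp_dir/p1.sh")
echo "$check_result"
bash "$tmp_dir/p1.sh"
cat "$tmp_dir/p1.log"
```

When Oozie runs the real p1.sh, the log file is written on whichever cluster node executes the action, which is not necessarily the node the job was submitted from.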

1.5 Edit the configuration files

1.5.1 job.properties

# HDFS address
nameNode=hdfs://hadoop01:8020
# ResourceManager address
jobTracker=hadoop03:8032
# Queue name
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
EXEC=p1.sh
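
To see where Oozie will look for the workflow, the variables in oozie.wf.application.path can be expanded by hand. This is a minimal sketch, assuming the job is submitted by the user hadoop; user_name is a stand-in for ${user.name}, which is filled in from the submitting user.

```shell
#!/bin/bash
# Values copied from job.properties above.
nameNode="hdfs://hadoop01:8020"
examplesRoot="oozie-apps"
# Assumption: the submitting user is hadoop (stands in for ${user.name}).
user_name="hadoop"

# Full application path as Oozie would resolve it.
app_path="${nameNode}/user/${user_name}/${examplesRoot}/shell"
echo "$app_path"

# The same location without the scheme and host, as used with hdfs dfs commands.
hdfs_dir="${app_path#"$nameNode"}"
echo "$hdfs_dir"
```

The resulting directory is the one that step 1.6 uploads to, and the file path in workflow.xml should sit under it.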

1.5.2 workflow.xml

<workflow-app
	xmlns="uri:oozie:workflow:0.4" name="shell-wf">
	<start to="shell-node"/>
	<action name="shell-node">
		<shell
			xmlns="uri:oozie:shell-action:0.2">
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
			</configuration>
			<exec>${EXEC}</exec>
			<!-- <argument>my_output=Hello Oozie</argument> -->
			<file>/user/hadoop/oozie-apps/shell/${EXEC}#${EXEC}</file>
			<capture-output/>
		</shell>
		<ok to="end"/>
		<error to="fail"/>
	</action>
	<decision name="check-output">
		<switch>
			<case to="end">
            ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
        </case>
			<default to="fail-output"/>
		</switch>
	</decision>
	<kill name="fail">
		<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
	</kill>
	<kill name="fail-output">
		<message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
	</kill>
	<end name="end"/>
</workflow-app>
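
A common mistake with this template is a file path that points to a different user's home directory than the one the job was uploaded to. The following sketch lists the HDFS paths referenced by file elements and checks them against the application directory. extract_file_paths and check_paths are hypothetical helpers, and the demo uses a one-line sample fragment written to a temporary file.

```shell
#!/bin/bash
# Print the HDFS path (the part before '#') of every <file> element in a workflow.xml.
extract_file_paths() {
    sed -n 's/.*<file>\([^#<]*\).*<\/file>.*/\1/p' "$1"
}

# Report whether each path lies under the application directory given as $2.
check_paths() {
    extract_file_paths "$1" | while IFS= read -r path; do
        case "$path" in
            "$2"/*) echo "ok: $path" ;;
            *)      echo "mismatch: $path" ;;
        esac
    done
}

# Demo with a sample fragment (the quoted heredoc keeps ${EXEC} literal).
tmp_wf=$(mktemp)
cat > "$tmp_wf" <<'EOF'
<file>/user/hadoop/oozie-apps/shell/${EXEC}#${EXEC}</file>
EOF
paths_report=$(check_paths "$tmp_wf" /user/hadoop/oozie-apps/shell)
echo "$paths_report"
```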

1.6 Upload the job to HDFS

Create the target directory

hdfs dfs -mkdir /user/hadoop/oozie-apps/

Upload

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/shell /user/hadoop/oozie-apps/

1.7 Run the job

Start Oozie (from the Oozie installation directory)

bin/oozied.sh start

Submit and run the job

bin/oozie job -oozie http://hadoop03:11000/oozie -config oozie-apps/shell/job.properties -run

On success, the command prints the job ID.
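
The ID can be captured for later commands. In the sketch below, run_output is a sample string in the "job: <id>" form printed by the Oozie CLI; in practice it would hold the output of the submit command above.

```shell
#!/bin/bash
# Sample output line (assumption for the demo). In real use:
#   run_output=$(bin/oozie job -oozie http://hadoop03:11000/oozie -config oozie-apps/shell/job.properties -run)
run_output="job: 0000000-200216155457484-oozie-hado-W"

# Strip the "job: " prefix to get the bare job ID.
job_id=$(printf '%s\n' "$run_output" | sed -n 's/^job: //p')
echo "$job_id"

# The ID can then be passed to other subcommands, for example:
#   bin/oozie job -oozie http://hadoop03:11000/oozie -info "$job_id"
```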

To kill a job, pass its ID to -kill:

bin/oozie job -oozie http://hadoop03:11000/oozie -kill 0000000-200216155457484-oozie-hado-W

2. Chaining multiple jobs with Oozie

2.1 Add a second script

vi oozie-apps/shell/p2.sh

Its contents are:

#!/bin/bash
date > /home/hadoop/cdh/p2.log

2.2 Edit the configuration files

2.2.1 job.properties

# HDFS address
nameNode=hdfs://hadoop01:8020
# ResourceManager address
jobTracker=hadoop03:8032
# Queue name
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
EXEC1=p1.sh
EXEC2=p2.sh

2.2.2 workflow.xml

<workflow-app
	xmlns="uri:oozie:workflow:0.4" name="shell-wf">
	<start to="p1-shell-node"/>
	<action name="p1-shell-node">
		<shell
			xmlns="uri:oozie:shell-action:0.2">
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
			</configuration>
			<exec>${EXEC1}</exec>
			<file>/user/hadoop/oozie-apps/shell/${EXEC1}#${EXEC1}</file>
			<!-- <argument>my_output=Hello Oozie</argument>-->
			<capture-output/>
		</shell>
		<ok to="p2-shell-node"/>
		<error to="fail"/>
	</action>
	<action name="p2-shell-node">
		<shell
			xmlns="uri:oozie:shell-action:0.2">
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
			</configuration>
			<exec>${EXEC2}</exec>
			<file>/user/hadoop/oozie-apps/shell/${EXEC2}#${EXEC2}</file>
			<!-- <argument>my_output=Hello Oozie</argument>-->
			<capture-output/>
		</shell>
		<ok to="end"/>
		<error to="fail"/>
	</action>
	<decision name="check-output">
		<switch>
			<case to="end">
                ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
            </case>
			<default to="fail-output"/>
		</switch>
	</decision>
	<kill name="fail">
		<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
	</kill>
	<kill name="fail-output">
		<message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
	</kill>
	<end name="end"/>
</workflow-app>
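
The ok and error transitions above form a simple chain: p2 runs only if p1 succeeds, and any failure jumps to the kill node. As a rough local analogy (not how Oozie actually executes actions), the control flow can be sketched in shell. run_chain and the temporary scripts are illustrative only.

```shell
#!/bin/bash
# Run scripts in order and stop at the first failure, like following <error to="fail"/>.
run_chain() {
    for step in "$@"; do
        if ! bash "$step"; then
            echo "fail: $step"
            return 1
        fi
    done
    # Every step succeeded, which corresponds to reaching <end name="end"/>.
    echo "end"
}

# Demo with stand-ins for p1.sh and p2.sh that log into a temporary directory.
tmp_dir=$(mktemp -d)
printf '#!/bin/bash\ndate > "%s/p1.log"\n' "$tmp_dir" > "$tmp_dir/p1.sh"
printf '#!/bin/bash\ndate > "%s/p2.log"\n' "$tmp_dir" > "$tmp_dir/p2.sh"
chain_result=$(run_chain "$tmp_dir/p1.sh" "$tmp_dir/p2.sh")
echo "$chain_result"
```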

2.3 Upload the job to HDFS

First delete the directory uploaded earlier from HDFS

/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -rm -r -f /user/hadoop/oozie-apps/shell

Then upload the new version

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/shell /user/hadoop/oozie-apps/

2.4 Run the job

Start Oozie if it is not already running (from the Oozie installation directory)

bin/oozied.sh start

Submit and run the job

bin/oozie job -oozie http://hadoop03:11000/oozie -config oozie-apps/shell/job.properties -run

3. Scheduling a MapReduce job with Oozie

3.1 Copy the official example template

cp -r /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6/examples/apps/map-reduce/ oozie-apps/

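3.2 Test the MapReduce jar

Here I use the wordcount jar from the official examples; you can upload your own jar instead.

vi wc.txt

Contents of wc.txt:

Spark
Spark Hadoop
Spark Hadoop Hive
Spark Hadoop Hive Oozie

Create the input and output directories, upload the file, and run wordcount:

/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -mkdir /user/hadoop/mr_input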
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -mkdir /user/hadoop/mr_output
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put wc.txt /user/hadoop/mr_input
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/yarn jar /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /user/hadoop/mr_input /user/hadoop/mr_output/wcout.txt
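
For a file this small, the expected counts can be computed locally and compared with the job output. This sketch uses standard text tools; the temporary directory is an assumption of the demo, and the format (word, tab, count, sorted by word) mirrors what wordcount writes.

```shell
#!/bin/bash
# Recreate wc.txt in a temporary directory (demo assumption).
tmp_dir=$(mktemp -d)
printf '%s\n' 'Spark' 'Spark Hadoop' 'Spark Hadoop Hive' 'Spark Hadoop Hive Oozie' > "$tmp_dir/wc.txt"

# One word per line, sort in byte order, count duplicates, print "word<TAB>count".
expected=$(tr -s ' ' '\n' < "$tmp_dir/wc.txt" | LC_ALL=C sort | uniq -c \
    | awk '{printf "%s\t%s\n", $2, $1}')
printf '%s\n' "$expected"

# Compare with the job output on the cluster, for example:
#   hdfs dfs -cat '/user/hadoop/mr_output/wcout.txt/part-r-*'
```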

Once the jar is confirmed to work, continue to the next step.

3.3 Edit the configuration files

Only two files need to change.

3.3.1 job.properties

nameNode=hdfs://hadoop01:8020
jobTracker=hadoop03:8032
queueName=default
examplesRoot=oozie-apps
#hdfs://hadoop01:8020/user/hadoop/oozie-apps/map-reduce/workflow.xml
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce/workflow.xml
outputDir=map-reduce

3.3.2 workflow.xml

<workflow-app
	xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
	<start to="mr-node"/>
	<action name="mr-node">
		<map-reduce>
			<job-tracker>${jobTracker}</job-tracker>
			<name-node>${nameNode}</name-node>
			<prepare>
				<delete path="${nameNode}/user/hadoop/mr_output/"/>
			</prepare>
			<configuration>
				<property>
					<name>mapred.job.queue.name</name>
					<value>${queueName}</value>
				</property>
				<!-- Use the new MapReduce API when scheduling the MR job -->
				<property>
					<name>mapred.mapper.new-api</name>
					<value>true</value>
				</property>
				<property>
					<name>mapred.reducer.new-api</name>
					<value>true</value>
				</property>
				<!-- Output key class of the job -->
				<property>
					<name>mapreduce.job.output.key.class</name>
					<value>org.apache.hadoop.io.Text</value>
				</property>
				<!-- Output value class of the job -->
				<property>
					<name>mapreduce.job.output.value.class</name>
					<value>org.apache.hadoop.io.IntWritable</value>
				</property>
				<!-- Input path -->
				<property>
					<name>mapred.input.dir</name>
					<value>/user/hadoop/mr_input/</value>
				</property>
				<!-- Output path -->
				<property>
					<name>mapred.output.dir</name>
					<value>/user/hadoop/mr_output/</value>
				</property>
				<!-- Mapper class -->
				<property>
					<name>mapreduce.job.map.class</name>
					<value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
				</property>
				<!-- Reducer class -->
				<property>
					<name>mapreduce.job.reduce.class</name>
					<value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
				</property>
				<property>
					<name>mapred.map.tasks</name>
					<value>1</value>
				</property>
			</configuration>
		</map-reduce>
		<ok to="end"/>
		<error to="fail"/>
	</action>
	<kill name="fail">
		<message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
	</kill>
	<end name="end"/>
</workflow-app>

3.4 Copy the jar to be executed

cp -a  /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar oozie-apps/map-reduce/lib

3.5 Upload to HDFS

/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/map-reduce/ /user/hadoop/oozie-apps

3.6 Run the job

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozie job -oozie http://hadoop03:11000/oozie -config oozie-apps/map-reduce/job.properties -run

The job is submitted successfully.

Take a look at the job in the Oozie Web UI:

http://hadoop03:11000/oozie/

4. Oozie scheduled jobs

4.1 Check the time zone

date -R

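If the offset shown is not +0800 and the time does not match the current local time, set the cluster nodes to the Asia/Shanghai time zone: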

sudo cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
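
The same check can be scripted, for instance to repeat it on every node. In this small sketch, check_offset is a hypothetical helper that compares the output of date +%z with the expected offset. The demo forces the offset with the POSIX TZ string CST-8 (eight hours ahead of UTC) so that it does not depend on the machine's own setting.

```shell
#!/bin/bash
# Compare the current UTC offset with the expected one (for example +0800).
check_offset() {
    expected="$1"
    actual=$(date +%z)
    if [ "$actual" = "$expected" ]; then
        echo "timezone ok: $actual"
    else
        echo "timezone mismatch: $actual (expected $expected)"
        return 1
    fi
}

# Demo inside a command substitution so the TZ override does not leak into the current shell.
offset_demo=$(TZ='CST-8'; export TZ; check_offset +0800)
echo "$offset_demo"

# On a real node, call it without overriding TZ:
#   check_offset +0800
```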

4.2 Edit oozie-site.xml

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ vi /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6/conf/oozie-site.xml

Add the oozie.processing.timezone property

<property>
	<name>oozie.processing.timezone</name>
	<value>GMT+0800</value>
	<description>
            Oozie server timezone. Valid values are UTC and GMT(+/-)####, for example 'GMT+0530' would be India
            timezone. All dates parsed and generated by Oozie Coordinator/Bundle will be in the specified
            timezone. The default value of 'UTC' should not be changed under normal circumstances. If for any reason
            it is changed, note that GMT(+/-)#### timezones do not observe DST changes.
        </description>
</property>
    

4.3 Edit the time-zone code in the web console JavaScript

vi /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/webapps/oozie/oozie-console.js

Change getTimeZone as follows:

function getTimeZone() {
    Ext.state.Manager.setProvider(new Ext.state.CookieProvider());
    return Ext.state.Manager.get("TimezoneId","GMT+0800");
}

4.4 Restart the Oozie service

Note: clear the browser cache and restart the browser as well!

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozied.sh stop
[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozied.sh start

4.5 Copy the official example template

[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ cp -r examples/apps/cron/ oozie-apps/

4.6 Edit the configuration files

4.6.1 job.properties

nameNode=hdfs://hadoop01:8020
jobTracker=hadoop03:8032
queueName=default
examplesRoot=oozie-apps

oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/cron
# start must be set to a time in the future, otherwise the job fails
start=2020-02-19T18:00+0800
end=2020-02-19T18:30+0800
workflowAppUri=${nameNode}/user/${user.name}/${examplesRoot}/cron

EXEC=p1.sh
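
Because start has to lie in the future, it can be convenient to generate both values instead of typing them. This sketch assumes a date command that accepts -d @<epoch> (GNU coreutils does) and that the system time zone is already +0800 as set up in 4.1; the 10-minute lead and 30-minute window are arbitrary choices.

```shell
#!/bin/bash
# Timestamp format used in job.properties above, for example 2020-02-19T18:00+0800.
fmt='+%Y-%m-%dT%H:%M%z'
now=$(date +%s)

# Start 10 minutes from now and end 30 minutes after that (both values are arbitrary).
start=$(date -d "@$((now + 600))" "$fmt")
end=$(date -d "@$((now + 600 + 1800))" "$fmt")

# Lines ready to paste into job.properties.
echo "start=$start"
echo "end=$end"
```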

4.6.2 coordinator.xml

Note that the smallest frequency allowed here is 5 minutes.

<coordinator-app name="cron-coord" frequency="${coord:minutes(5)}" start="${start}" end="${end}" timezone="GMT+0800"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>

4.6.3 workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="p1-shell-node"/>
    <action name="p1-shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC}</exec>
            <!--<argument>my_output=Hello Oozie</argument>-->
            <file>/user/hadoop/oozie-apps/cron/${EXEC}#${EXEC}</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
	
    <decision name="check-output">
        <switch>
            <case to="end">
                ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
            </case>
            <default to="fail-output"/>
        </switch>
    </decision>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <kill name="fail-output">
        <message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

4.6.4 p1.sh

vi oozie-apps/cron/p1.sh

#!/bin/bash
date > /home/hadoop/cdh/time.log

4.7 Upload to HDFS

/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/cron/ /user/hadoop/oozie-apps

4.8 Run the job

bin/oozie job -oozie http://hadoop03:11000/oozie -config oozie-apps/cron/job.properties -run

The job is submitted successfully.

The coordinator job then shows as running.

After a while, the Web UI shows that the run succeeded.

Check the result:

[hadoop@hadoop03 cdh]$ cat time.log


Reposted from blog.csdn.net/a805814077/article/details/104398679