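How can you view debug logs that a Job prints through System.out while it runs?
Add log output statements
- Add log output code to the source
Add a log line to the Map class: System.out.println(String.format("k2:%s,v2:%s", k2.toString(), v2.toString()));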
......
public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    ......
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Split each input line on single spaces
        String[] words = value.toString().split(" ");
        // Iterate over the words produced by the split
        for (String word : words) {
            Text k2 = new Text(word);
            LongWritable v2 = new LongWritable(1L);
            // Debug output: ends up in the stdout log of this map task
            System.out.println(String.format("k2:%s,v2:%s", k2.toString(), v2.toString()));
            // Emit the <word, 1> pair
            context.write(k2, v2);
        }
    }
}
Add a log line to the Reduce class: System.out.println(String.format("k3:%s,v3:%s", k3.toString(), v3.toString()));
......
public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    ......
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
        // Sum the occurrences of the same key (primitive long avoids boxing on every addition)
        long sum = 0L;
        for (LongWritable value : values) {
            sum += value.get();
        }
        // Assemble the result
        Text k3 = key;
        LongWritable v3 = new LongWritable(sum);
        // Debug output: ends up in the stdout log of this reduce task
        System.out.println(String.format("k3:%s,v3:%s", k3.toString(), v3.toString()));
        // Emit the <word, count> pair
        context.write(k3, v3);
    }
}
- Repackage the job and upload it to the Hadoop cluster
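Before repackaging, it can help to know what the two println statements will produce. The following is only a local sketch in plain Java, without any Hadoop classes: the class name WordCountLogPreview and the sample line are hypothetical, and a TreeMap is used as a stand-in for the key grouping that the shuffle phase performs on the cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountLogPreview {

    // Mirrors the mapper's debug line: one "k2:<word>,v2:1" entry per token.
    static List<String> mapLogLines(String line) {
        List<String> lines = new ArrayList<>();
        for (String word : line.split(" ")) {
            lines.add(String.format("k2:%s,v2:%s", word, 1L));
        }
        return lines;
    }

    // Mirrors the reducer's debug line: one "k3:<word>,v3:<count>" entry per distinct word.
    // The TreeMap only approximates the sorted grouping done by the real shuffle.
    static List<String> reduceLogLines(String line) {
        Map<String, Long> counts = new TreeMap<>();
        for (String word : line.split(" ")) {
            counts.merge(word, 1L, Long::sum);
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            lines.add(String.format("k3:%s,v3:%s", entry.getKey(), entry.getValue()));
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "hello world hello"; // hypothetical input line
        mapLogLines(sample).forEach(System.out::println);
        reduceLogLines(sample).forEach(System.out::println);
    }
}
```

On the cluster, the k2 lines appear in the logs of the map tasks and the k3 lines in the logs of the reduce tasks, so they will not be interleaved in one place as they are in this local preview.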
Modify the Hadoop cluster configuration
- Edit the configuration file: $HADOOP_HOME/etc/hadoop/yarn-site.xml
Add the following properties
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop101:19888/jobhistory/logs/</value>
</property>
Sync the configuration file to the other machines in the cluster
- Stop the cluster
- Restart the cluster
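A typo in yarn-site.xml is easy to miss until the logs link fails, so a quick check of the edited file can save a restart cycle. The sketch below uses only the JDK's XML parser and is not part of Hadoop. The class name YarnSiteCheck is hypothetical, and the sample assumes the usual layout in which the properties sit inside the file's <configuration> root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class YarnSiteCheck {

    // Sample content; assumes the properties live under a <configuration> root.
    static final String SAMPLE = String.join("\n",
            "<configuration>",
            "  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>",
            "  <property><name>yarn.log.server.url</name>",
            "    <value>http://hadoop101:19888/jobhistory/logs/</value></property>",
            "</configuration>");

    // Collects every <property> name/value pair from Hadoop-style configuration XML.
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                Node name = property.getElementsByTagName("name").item(0);
                Node value = property.getElementsByTagName("value").item(0);
                if (name != null && value != null) {
                    props.put(name.getTextContent().trim(), value.getTextContent().trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    // True when log aggregation is enabled and a log server URL is set.
    static boolean logViewingConfigured(Map<String, String> props) {
        return "true".equals(props.get("yarn.log-aggregation-enable"))
                && props.containsKey("yarn.log.server.url");
    }

    public static void main(String[] args) throws Exception {
        // Pass the path to yarn-site.xml as the first argument, or fall back to the sample.
        String xml = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : SAMPLE;
        System.out.println("log viewing configured: " + logViewingConfigured(readProperties(xml)));
    }
}
```

The check only confirms that the two properties are present in the file. It does not confirm that the running daemons have picked them up, which is why the restart is still needed.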
Start the log server
Start the Job history server. Start this service on every machine in the cluster.
[root@hadoop101 hadoop-3.2.1]# mr-jobhistory-daemon.sh start historyserver
WARNING: Use of this script to start the MR JobHistory daemon is deprecated.
WARNING: Attempting to execute replacement "mapred --daemon start" instead.
Check the running processes with jps
[root@hadoop101 hadoop-3.2.1]# jps
6085 NameNode
6997 JobHistoryServer
7094 Jps
6553 ResourceManager
6300 SecondaryNameNode
Here, JobHistoryServer is the log server that was just started.
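If this check needs to be repeated on several machines, it can be scripted. The following is a minimal sketch, assuming jps is on the PATH of the user running it. The class name JpsCheck is hypothetical, and the parsing relies only on the "pid name" line shape shown in the output above.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.stream.Collectors;

public class JpsCheck {

    // jps prints one "<pid> <name>" pair per line; this looks for an exact name match.
    static boolean isRunning(String jpsOutput, String daemonName) {
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2 && parts[1].equals(daemonName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Runs the local jps command and captures everything it prints.
        Process process = new ProcessBuilder("jps").redirectErrorStream(true).start();
        String output;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            output = reader.lines().collect(Collectors.joining("\n"));
        }
        process.waitFor();
        System.out.println(isRunning(output, "JobHistoryServer")
                ? "JobHistoryServer is running"
                : "JobHistoryServer not found in jps output");
    }
}
```

Note that jps lists only the JVMs visible to the current user, so the sketch should be run as the same user that started the daemon.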
Stop the log server
[root@hadoop101 soft]# mr-jobhistory-daemon.sh stop historyserver
WARNING: Use of this script to stop the MR JobHistory daemon is deprecated.
WARNING: Attempting to execute replacement "mapred --daemon stop" instead.
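View the logs
- Rerun the Job
- Visit: http://hadoop101:8088/cluster
- Find the record of the most recently run Job and click the History link at the end of its row
- Open the task list page through the Map tasks / Reduce tasks links on the left, or through the hyperlinks at the bottom of the page
- Then click a specific task name to open the task details
- Click the logs link to view the detailed log output of that task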
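Once the stdout text of a reduce task has been copied from the logs page, the debug lines can be pulled out of it programmatically. The sketch below assumes the lines keep the exact "k3:...,v3:..." shape produced by the println in the reducer. The class name DebugLineParser and the sample text are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DebugLineParser {

    // Matches a whole line of the form "k3:<key>,v3:<number>".
    private static final Pattern K3_V3 = Pattern.compile("k3:(.*),v3:(\\d+)");

    // Extracts key/count pairs from log text, ignoring lines that do not match.
    static Map<String, Long> parseReducerLines(String stdoutText) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : stdoutText.split("\\R")) {
            Matcher matcher = K3_V3.matcher(line.trim());
            if (matcher.matches()) {
                result.put(matcher.group(1), Long.parseLong(matcher.group(2)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt of a reduce task's stdout log
        String sample = "k3:hello,v3:2\nsome unrelated line\nk3:world,v3:1";
        System.out.println(parseReducerLines(sample)); // prints {hello=2, world=1}
    }
}
```

The same approach works for the mapper's k2/v2 lines with the pattern adjusted, although a key can then appear many times, so a list of pairs would fit better than a map.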