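How can debug logs printed with System.out be viewed while a Job is running?

Add log output statements

  1. Add log output code to the program

Add a log statement to the Map class: System.out.println(String.format("k2:%s,v2:%s",k2.toString(),v2.toString()));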

......
public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
......
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Split each input line on spaces
        String[] words = value.toString().split(" ");
        // Iterate over the words produced by the split
        for (String word : words) {
            Text k2 = new Text(word);
            LongWritable v2 = new LongWritable(1L);
            System.out.println(String.format("k2:%s,v2:%s",k2.toString(),v2.toString()));
            // Emit the result
            context.write(k2, v2);
        }
    }
}
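To see what these log lines will look like before running on the cluster, the split-and-format logic can be reproduced in plain Java. This is only a sketch: `MapLogPreview` and `logLines` are hypothetical names that are not part of the job, and a plain `long` stands in for `LongWritable` on the assumption that both print as the same number.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the WordCount job: it mirrors the mapper's
// split-and-format logic so the expected stdout lines can be checked locally.
final class MapLogPreview {

    // Returns the lines the mapper would print for one line of input.
    static List<String> logLines(String line) {
        List<String> lines = new ArrayList<>();
        for (String word : line.split(" ")) {
            // A plain long stands in for LongWritable(1L) here.
            lines.add(String.format("k2:%s,v2:%s", word, 1L));
        }
        return lines;
    }

    public static void main(String[] args) {
        logLines("hello hadoop hello").forEach(System.out::println);
    }
}
```

Because the split uses a single space, two consecutive spaces in the input produce an empty word, which shows up in the log as `k2:,v2:1`.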

Add a log statement to the Reduce class: System.out.println(String.format("k3:%s,v3:%s", k3.toString(), v3.toString()));

......
public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
......
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
        // Sum the occurrences of the same key
        long sum = 0L;
        for (LongWritable value : values) {
            sum = sum + value.get();
        }
        // Assemble the result
        Text k3 = key;
        LongWritable v3 = new LongWritable(sum);
        System.out.println(String.format("k3:%s,v3:%s", k3.toString(), v3.toString()));
        // Emit the result
        context.write(k3, v3);
    }
}

  2. Repackage the job and upload it to the Hadoop cluster
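Before uploading, the reducer's log line can be previewed locally in the same way. The sketch below is stdlib-only; `ReduceLogPreview` and `logLine` are hypothetical names, and a `List<Long>` stands in for the `Iterable<LongWritable>` that Hadoop passes to `reduce`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the WordCount job: it mirrors the reducer's
// sum-and-format logic for one key.
final class ReduceLogPreview {

    // Returns the line the reducer would print for one key and its values.
    static String logLine(String key, List<Long> values) {
        long sum = 0L;
        for (long value : values) {
            sum += value;
        }
        return String.format("k3:%s,v3:%s", key, sum);
    }

    public static void main(String[] args) {
        // prints "k3:hello,v3:3"
        System.out.println(logLine("hello", Arrays.asList(1L, 1L, 1L)));
    }
}
```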

Modify the Hadoop cluster configuration

  1. Edit the configuration file $HADOOP_HOME/etc/hadoop/yarn-site.xml
  2. Add the following properties

     <property>
         <name>yarn.log-aggregation-enable</name>
         <value>true</value>
     </property>
     <property>
         <name>yarn.log.server.url</name>
         <value>http://hadoop101:19888/jobhistory/logs/</value>
     </property>
    
  3. Sync the configuration file to the other machines in the cluster

  4. Stop the cluster
  5. Start the cluster again
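After syncing, it can be worth confirming that each machine's yarn-site.xml really carries the two properties. The following is a minimal sketch that uses only the JDK's XML parser rather than any Hadoop API; `YarnSiteCheck`, `parse`, and `text` are hypothetical names, and the file path is assumed to be passed as the first command-line argument.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical checker, not a Hadoop API: reads <property> entries from a
// Hadoop-style configuration file using only the JDK's XML parser.
final class YarnSiteCheck {

    // Parses the XML text and returns name -> value for every <property>.
    static Map<String, String> parse(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("property");
            Map<String, String> props = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                props.put(text(property, "name"), text(property, "value"));
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable configuration file", e);
        }
    }

    // Trimmed text of the first descendant element with the given tag name.
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to yarn-site.xml on the machine being checked.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Map<String, String> props = parse(xml);
        System.out.println("yarn.log-aggregation-enable = " + props.get("yarn.log-aggregation-enable"));
        System.out.println("yarn.log.server.url = " + props.get("yarn.log.server.url"));
    }
}
```

A value of `null` for either line means the property is missing from that machine's copy of the file.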

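Start the log server

  1. Start the Job history server. The original setup starts it on every machine in the cluster; at a minimum it must run on hadoop101, the host that yarn.log.server.url points to. As the warnings below show, the script is deprecated in Hadoop 3.x and delegates to "mapred --daemon start".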

    [root@hadoop101 hadoop-3.2.1]# mr-jobhistory-daemon.sh start historyserver
    WARNING: Use of this script to start the MR JobHistory daemon is deprecated.
    WARNING: Attempting to execute replacement "mapred --daemon start" instead.
    
  2. Check the running processes with jps

    [root@hadoop101 hadoop-3.2.1]# jps
    6085 NameNode
    6997 JobHistoryServer
    7094 Jps
    6553 ResourceManager
    6300 SecondaryNameNode
    

    Here, JobHistoryServer is the log server process that was just started.
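If this check needs to be repeated on several machines, it can be scripted. The sketch below is one possible approach, not something from the original setup: `JpsCheck` and `isRunning` are hypothetical names, and it assumes `jps` is on the PATH and prints one `<pid> <name>` pair per line, as in the output above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper: scans jps output for a process name.
final class JpsCheck {

    // True if some line of jps output ("<pid> <name>") names the given process.
    static boolean isRunning(String jpsOutput, String processName) {
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2 && parts[1].equals(processName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Run jps on the local machine and collect its output.
        Process jps = new ProcessBuilder("jps").redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(jps.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        System.out.println(isRunning(output.toString(), "JobHistoryServer")
                ? "JobHistoryServer is running"
                : "JobHistoryServer is NOT running");
    }
}
```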

Stop the log server

    [root@hadoop101 soft]# mr-jobhistory-daemon.sh stop historyserver
    WARNING: Use of this script to stop the MR JobHistory daemon is deprecated.
    WARNING: Attempting to execute replacement "mapred --daemon stop" instead.

View the logs

  1. Run the job again
  2. Open http://hadoop101:8088/cluster

  3. Find the most recently run Job and click the History link at the end of its row

  4. Open the task list page through the Map tasks / Reduce tasks links on the left, or through the hyperlinks at the bottom of the page

  5. Click a specific task name to open the task details

  6. Click the logs link to view the detailed log output of that task
