Environment used in this article

Windows 10 x64
IntelliJ IDEA 2019.1.1
JDK 8

Goal of this article

Run the Hadoop example program WordCount inside the IDE

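Notes

A Hadoop program can run on a single machine or in a (pseudo-)distributed environment; the only difference is some XML configuration.
This article uses no special configuration. By default, Hadoop MapReduce runs directly on the local machine and does not access any (pseudo-)distributed services.
After following the setup in this article, you can later configure the XML files so that the program runs on a specific distributed HDFS and/or YARN system; you can also start developing and testing Hadoop MapReduce programs locally.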

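Create the project

Create a Maven project as shown below, then click Next
1.png

Fill in the basic project information, which is used as the Maven project information, as shown below, then click Next
2.png
Enter the project name, which is used for display in IntelliJ IDEA and as the project folder name; then click Finish
3.png
Once the project opens, IntelliJ IDEA detects the Maven project. You can allow it to import changes automatically here, so that you do not have to click refresh after every change
4.png
The window then looks like this
5.png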

Configure Maven

Add the compiler plugin so that Maven knows we are compiling for Java 8
Add the following to pom.xml, before the closing </project> tag

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.0</version>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>

Add the Hadoop dependency. A property is used here to specify the Hadoop version, which makes it easy to keep several dependencies on the same version
Add it between the </build> tag from the previous step and </project>

  <properties>
    <hadoopversion>3.1.1</hadoopversion>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoopversion}</version>
    </dependency>
  </dependencies>

As shown below
7.png


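Create the class (source file)

Create a new Java Class as shown
8.png
9.png

Write the source code

Use the example from the official tutorial as is: paste it into the file, replacing all of the existing content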

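First run

Click the green triangle to the left of the code, then the green triangle in the menu that appears, to run it directly
a.png
And... it fails, as shown below. Click the line that reports the error
b.png
It tells you what went wrong and where, and jumps to that location
c.png
Obviously, yes, obviously, the program needs two arguments: an input folder and an output folder
That's what you get for not reading the official site...
Not that reading it would have helped, because the example there is not run in an IDE; it is run from the command line, on a machine with HDFS configured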

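Prepare the input file

In the project root, add an input directory named data
d.png
In that folder, create a File named example.txt with the following content; you can of course write something else instead

  hello world
  hello again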
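
For orientation, the counting that WordCount performs on this input can be sketched in plain Java, without Hadoop. This is only an illustration under stated assumptions: the class name WordCountSketch and its count method are hypothetical and are not part of the tutorial code, and splitting on whitespace with StringTokenizer is assumed to match what the tutorial's mapper does.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java illustration only; this is not the Hadoop WordCount class.
final class WordCountSketch {

    // Counts whitespace-separated tokens; a TreeMap keeps the words sorted.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // "Map" step: split the text into tokens, each worth a count of 1.
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            // "Reduce" step: add up the 1s for each distinct word.
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same content as data/example.txt.
        String input = "hello world\nhello again";
        count(input).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the two example lines, this sketch counts hello twice and the other two words once each.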

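Configure the run arguments

After the first attempt to run with the green triangle (even though it failed), the project has a run configuration. By default it has the same name as the class: WordCount

Open the run configuration, as shown below
f.png
In Program arguments, enter: data output
The space in the middle means the left part is the first argument and the right part is the second. This way the program receives its two arguments.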
h.png
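
How the two Program arguments reach the program can be sketched as follows. This is a hypothetical helper (ArgsCheck is not part of the tutorial's WordCount) and only shows that the first argument arrives as args[0] and the second as args[1], with a usage message when they are missing.

```java
// Hypothetical helper for illustration; not part of the tutorial's WordCount.
final class ArgsCheck {

    // Describes the arguments, or returns a usage hint if there are not exactly two.
    static String describe(String[] args) {
        if (args.length != 2) {
            return "usage: WordCount <input dir> <output dir>";
        }
        // args[0] is the text left of the space, args[1] the text right of it.
        return "input=" + args[0] + ", output=" + args[1];
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Run with the Program arguments data output, this prints input=data, output=output; run with no arguments, it prints the usage line.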

Try running again

This time, HADOOP_HOME cannot be found
e.png
The full output is shown below, with the long -classpath value abbreviated

  "C:\Program Files\Java\jdk1.8.0_212\bin\java.exe" -javaagent:C:\Users\cdarling\AppData\Local\JetBrains\Toolbox\apps\IDEA-JDK11\ch-0\191.6707.61\lib\idea_rt.jar=65326:C:\Users\cdarling\AppData\Local\JetBrains\Toolbox\apps\IDEA-JDK11\ch-0\191.6707.61\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_212\jre\lib\charsets.jar;...;C:\Users\cdarling\IdeaProjects\hadoop-mr-1\target\classes;C:\Users\cdarling\.m2\repository\org\apache\hadoop\hadoop-client\3.1.1\hadoop-client-3.1.1.jar;...;C:\Users\cdarling\.m2\repository\org\apache\hadoop\hadoop-annotations\3.1.1\hadoop-annotations-3.1.1.jar" WordCount data output
  SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
  SLF4J: Defaulting to no-operation (NOP) logger implementation
  SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
  log4j:WARN No appenders could be found for logger (org.apache.htrace.core.Tracer).
  log4j:WARN Please initialize the log4j system properly.
  log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
  Exception in thread "main" java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
      at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:737)
      at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:272)
      at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:288)
      at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:840)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:522)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:562)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:534)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:561)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:539)
      at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:332)
      at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:162)
      at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:151)
      at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
      at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
      at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
      at WordCount.main(WordCount.java:59)
  Caused by: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
      at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:549)
      at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:570)
      at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:593)
      at org.apache.hadoop.util.Shell.<clinit>(Shell.java:690)
      at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
      at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1665)
      at org.apache.hadoop.security.SecurityUtil.setConfigurationInternal(SecurityUtil.java:102)
      at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:86)
      at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:315)
      at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:303)
      at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1827)
      at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:709)
      at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:659)
      at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:570)
      at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:72)
      at org.apache.hadoop.mapreduce.Job.<init>(Job.java:150)
      at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:193)
      at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:212)
      at WordCount.main(WordCount.java:50)
  Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
      at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:469)
      at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:440)
      at org.apache.hadoop.util.Shell.<clinit>(Shell.java:517)
      ... 15 more
  Process finished with exit code 1

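Prepare winutils.exe and hadoop.dll

Running Hadoop programs on Windows requires winutils.exe and hadoop.dll, and each Hadoop version needs its own matching build of these two files. They are compiled from the Hadoop source code, and the build process is fairly involved, so a download link is provided here instead.
After downloading, note that winutils.exe, hadoop.dll and the other files must sit in a bin folder inside the version folder; do not remove the bin folder. On my machine, I put the folder for version 3.1.1 at C:\files\hadoop\311, so winutils.exe and hadoop.dll are in C:\files\hadoop\311\bin.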
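
The folder layout described above can be checked with a small plain-Java sketch before running the job. The class name WinutilsLayoutCheck is hypothetical, and the default path is only the example location from this article; pass your own folder as the first argument.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: checks that <hadoopHome>\bin holds the two required files.
final class WinutilsLayoutCheck {

    // Returns the names of required files that are missing from the bin folder.
    static List<String> missingFiles(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        for (String name : new String[] {"winutils.exe", "hadoop.dll"}) {
            if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example location from this article; adjust to your own folder.
        Path home = Paths.get(args.length > 0 ? args[0] : "C:\\files\\hadoop\\311");
        List<String> missing = missingFiles(home);
        System.out.println(missing.isEmpty()
                ? "bin folder looks complete"
                : "missing from bin: " + missing);
    }
}
```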

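Configure the HADOOP_HOME environment variable

In the run configuration, add two environment variables
Environment variables is the field to use: click the details button on its right and, in the small window that opens, click the plus sign

g.png
Then add the two environment variables, pointing to the winutils.exe and hadoop.dll files for the Hadoop version you downloaded.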
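
The error message earlier names two settings, HADOOP_HOME and hadoop.home.dir. A minimal sketch of how a program can look them up is given below; it assumes the JVM system property hadoop.home.dir is consulted first and the HADOOP_HOME environment variable second. The class HadoopEnvCheck is hypothetical, and this sketch does not show which two variables the screenshot contains.

```java
// Hypothetical helper: reports which Hadoop home setting the JVM can see.
final class HadoopEnvCheck {

    // Picks the system property if set, otherwise the environment variable (assumed order).
    static String resolveHome(String homeProperty, String homeEnv) {
        if (homeProperty != null && !homeProperty.isEmpty()) {
            return homeProperty;
        }
        if (homeEnv != null && !homeEnv.isEmpty()) {
            return homeEnv;
        }
        return null; // neither setting is available
    }

    public static void main(String[] args) {
        String home = resolveHome(System.getProperty("hadoop.home.dir"),
                                  System.getenv("HADOOP_HOME"));
        System.out.println(home == null ? "Hadoop home is not set" : "Hadoop home: " + home);
    }
}
```

Running this with the same run configuration as WordCount is a quick way to confirm that the environment variable actually reaches the program.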

A successful run

Click the green Run button. The output is as follows, with the long -classpath value abbreviated. It is all in red text, which you do not need to look into for now; it is all log output from Hadoop.

  "C:\Program Files\Java\jdk1.8.0_212\bin\java.exe" -javaagent:C:\Users\cdarling\AppData\Local\JetBrains\Toolbox\apps\IDEA-JDK11\ch-0\191.6707.61\lib\idea_rt.jar=49229:C:\Users\cdarling\AppData\Local\JetBrains\Toolbox\apps\IDEA-JDK11\ch-0\191.6707.61\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_212\jre\lib\charsets.jar;...;C:\Users\cdarling\IdeaProjects\hadoop-mr-1\target\classes;C:\Users\cdarling\.m2\repository\org\apache\hadoop\hadoop-client\3.1.1\hadoop-client-3.1.1.jar;...;C:\Users\cdarling\.m2\repository\org\apache\hadoop\hadoop-annotations\3.1.1\hadoop-annotations-3.1.1.jar" WordCount data output
  SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
  SLF4J: Defaulting to no-operation (NOP) logger implementation
  SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
  log4j:WARN No appenders could be found for logger (org.apache.htrace.core.Tracer).
  log4j:WARN Please initialize the log4j system properly.
  log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
  Process finished with exit code 0

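In the project folder, a new output folder has appeared, and it already contains the program's output, as shown below
i.png
In the file contents, hello 2 means that hello appears twice in the original input file. The other two words each appear only once.
_SUCCESS indicates that the run completed successfully, and part-r-00000 is the result, numbered 00000. In a distributed run, a large number of output files may be generated; when reading the results, you need to concatenate them to get the complete program output. The crc files are checksums used to verify file integrity, and you do not need to look into them for now.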
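
Concatenating the result files can be sketched in plain Java as follows. The class name MergeParts is hypothetical, and the sketch assumes the outputs sit in a local folder and are small enough to read into memory; it picks up only files whose names start with part-r-, so _SUCCESS and the crc files are skipped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: reads all part-r-* files of a local output folder in name order.
final class MergeParts {

    static List<String> readAll(Path outputDir) {
        List<Path> parts = new ArrayList<>();
        // The glob matches part-r-00000, part-r-00001, ...; other file names do not match.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir, "part-r-*")) {
            for (Path p : stream) {
                parts.add(p);
            }
            Collections.sort(parts); // directory listing order is not guaranteed
            List<String> lines = new ArrayList<>();
            for (Path p : parts) {
                lines.addAll(Files.readAllLines(p, StandardCharsets.UTF_8));
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "output" is the folder name used as the second program argument in this article.
        readAll(Paths.get(args.length > 0 ? args[0] : "output")).forEach(System.out::println);
    }
}
```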

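Running successfully again

To run again, remember to delete the output folder first.
MapReduce jobs are generally large and take a long time to produce their results, so Hadoop does not casually overwrite data that has already been generated.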
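
If deleting the folder by hand becomes tedious during development, the cleanup can be sketched in plain Java. The class name CleanOutput is hypothetical and the code permanently deletes the given folder with everything in it, so point it only at a folder you are sure you can lose.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper: removes a local output folder so the job can be run again.
final class CleanOutput {

    // Deletes dir and everything below it; returns false if dir did not exist.
    static boolean deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits files and subfolders before their parent folder.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) {
        // "output" is the folder name used as the second program argument in this article.
        Path dir = Paths.get(args.length > 0 ? args[0] : "output");
        System.out.println(deleteRecursively(dir) ? "deleted " + dir : "nothing to delete");
    }
}
```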