Single-node setup

See r2.10.1-SingleCluster.html in the official Hadoop documentation.

HA cluster setup

Setting up YARN on top of HDFS

Set up the HDFS environment first; see [HDFS environment setup].

Machine plan

HOST       NN    JNN   DN    ZKFC  ZK   RM   NM
master-01  name  JNN   -     zkfc  zk   rm   -
master-02  name  -     data  zkfc  -    rm   nm
node-01    -     JNN   data  -     zk   -    nm
node-02    -     JNN   data  -     zk   -    nm

mapred-site.xml

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Hadoop 3.x requires the following settings; without them MapReduce jobs cannot run -->
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>

yarn-site.xml

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>hadoop-ha</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>master01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master01:2181,node01:2181,node02:2181</value>
  </property>

  <!-- Clients use this address to submit application operations to the RM -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>master01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>node01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>master01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>node01:8030</value>
  </property>
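
Once both ResourceManagers are running, the HA settings above can be checked from the command line. The following is a minimal sketch, assuming the yarn command is on PATH and that the IDs passed in match yarn.resourcemanager.ha.rm-ids (rm1 and rm2 here). The helper name rm_ha_states and the "unreachable" label are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Print the HA state of each ResourceManager ID given as an argument.
# Assumption: the IDs match yarn.resourcemanager.ha.rm-ids in yarn-site.xml.
rm_ha_states() {
  local id state
  for id in "$@"; do
    # `yarn rmadmin -getServiceState <id>` prints the state, e.g. active or standby.
    # A non-zero exit status is reported here as "unreachable" (our own label).
    state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null) || state="unreachable"
    echo "$id: $state"
  done
}
```

Calling `rm_ha_states rm1 rm2` on a healthy HA pair should report one ID as active and the other as standby.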

Starting the services

  • cp mapred-site.xml.template mapred-site.xml
  • Update the mapred-site.xml and yarn-site.xml shown above on every node:
    scp mapred-site.xml yarn-site.xml node02:`pwd`
  • Start or stop YARN: start-yarn.sh or stop-yarn.sh
  • Start an RM: yarn-daemon.sh start resourcemanager
  • Stop an RM: yarn-daemon.sh stop resourcemanager
  • Web UI: http://172.16.179.150:8088/cluster/cluster
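
The scp step in the list above has to be repeated for every node, which a small loop can handle. This is a sketch under assumptions: passwordless SSH is already configured, the Hadoop configuration directory has the same path on every node, and the host names passed in are whatever your machines are actually called. The helper name sync_conf is hypothetical.

```shell
#!/usr/bin/env bash
# Copy mapred-site.xml and yarn-site.xml from the current directory
# to the same directory on each host given as an argument.
# Assumption: the configuration directory path is identical on all nodes.
sync_conf() {
  local host dir
  dir=$(pwd)
  for host in "$@"; do
    # Stop at the first host that fails so the problem is not overlooked.
    scp mapred-site.xml yarn-site.xml "$host:$dir/" || return 1
  done
}
```

Run it from the Hadoop configuration directory, for example `sync_conf node01 node02`.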

Verifying the service

Loading test data

hdfs dfs -mkdir -p /user/root/input
hdfs dfs -D dfs.blocksize=1048576 -put data.txt /user/root/input

cd /opt/bigdata/hadoop/share/hadoop/mapreduce
# /user/root/output must not exist, otherwise the job fails with an error
hadoop jar hadoop-mapreduce-examples-2.10.1.jar wordcount /user/root/input /user/root/output
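
Because the job refuses to start when the output directory already exists, rerunning it usually means deleting that directory first. A minimal sketch, assuming the hdfs command is on PATH and that the old output is safe to delete; the helper name clean_output is hypothetical.

```shell
#!/usr/bin/env bash
# Remove an HDFS output directory if it already exists, so that a
# MapReduce job can be rerun against the same output path.
clean_output() {
  local out_dir=$1
  # `hdfs dfs -test -d` exits 0 only when the path is an existing directory.
  if hdfs dfs -test -d "$out_dir"; then
    hdfs dfs -rm -r "$out_dir"
  fi
}
```

For the example above, call `clean_output /user/root/output` before the hadoop jar command.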

The job prints its execution log to the console (screenshot: image.png).

Viewing the results

hdfs dfs -ls /user/root/output

Viewing the details

hdfs dfs -cat /user/root/output/part-r-00000
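
The wordcount example writes one "word<TAB>count" pair per line, so the raw file is easier to read once sorted by count. A small sketch, assuming that tab-separated format; the helper name top_words is hypothetical.

```shell
#!/usr/bin/env bash
# Read wordcount output ("word<TAB>count" per line) on stdin and print
# the N most frequent words, highest count first (N defaults to 10).
top_words() {
  local n=${1:-10}
  # Sort numerically and descending on the count column, then by word for ties.
  sort -t "$(printf '\t')" -k2,2nr -k1,1 | head -n "$n"
}
```

It can be fed directly from HDFS: `hdfs dfs -cat /user/root/output/part-r-00000 | top_words 10`.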

Downloading to the local machine

hdfs dfs -get /user/root/output/part-r-00000

Basic commands

yarn node -list                                # list the nodes
yarn application -list                         # list running applications
yarn application -kill application_10000_100   # kill an application by its ID
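
To find the ID to pass to the kill command, the IDs can be pulled out of the listing. A minimal sketch, assuming the usual layout of `yarn application -list` in which each data row begins with the application ID; the helper name running_app_ids is hypothetical.

```shell
#!/usr/bin/env bash
# Print only the application IDs reported by `yarn application -list`.
# Data rows start with "application_"; summary and header lines do not.
running_app_ids() {
  yarn application -list 2>/dev/null | awk '$1 ~ /^application_/ { print $1 }'
}
```

Each printed ID can then be given to `yarn application -kill`.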