Cluster preparation

Overview: a Hadoop HA cluster depends on ZooKeeper, so three of the hosts double as the ZooKeeper quorum (a fourth joins as an observer later). Four hosts are prepared in total: hadoop1, hadoop2, hadoop3, and hadoop4. hadoop1 and hadoop2 handle NameNode active/standby failover; hadoop3 and hadoop4 handle ResourceManager active/standby failover.
The four machines:
[Figure 1: the four hosts]

Server preparation

1. Set the hostname
2. Set the IP address
3. Add hostname-to-IP mappings to /etc/hosts
4. Create a regular user, hadoop, and grant it sudoer privileges
5. Set the system runlevel
6. Disable the firewall and SELinux
7. Install the JDK

There are two ways to prepare the machines:
1. Configure every node individually. This is tedious; in a production environment you can script it.
2. In a virtual-machine environment, finish the 7 steps above once and then clone the VM.
After that, configure passwordless SSH login between the nodes and set up a time synchronization service.
1. As the hadoop user (the fencing configuration later references /home/hadoop/.ssh/id_rsa), run ssh-keygen -t rsa and press Enter through every prompt.
2. The keys land in ~/.ssh/: id_rsa (private key) and id_rsa.pub (public key). Append the public key to authorized_keys and give authorized_keys 600 permissions:

  cat id_rsa.pub >> authorized_keys
  chmod 600 authorized_keys

3. Do the same on every other node, then copy each node's public key into authorized_keys on the master node.
4. Overwrite each node's authorized_keys with the master's copy so that all nodes can SSH to one another without passwords; a loop-based shortcut is sketched below.
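Steps 3 and 4 can be automated with ssh-copy-id, which appends the local public key to the remote authorized_keys and fixes its permissions. A minimal sketch, assuming the hadoop user and its key pair exist on all four hosts and password authentication is still enabled; run it once on each node:

for host in hadoop1 hadoop2 hadoop3 hadoop4; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@${host}
done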
8. Configure passwordless SSH login (as above)
9. Synchronize the servers' clocks
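The guide does not show the time-sync commands. A minimal sketch using ntpdate plus cron, assuming the nodes can reach a public NTP server (ntp.aliyun.com here; swap in an internal server if they cannot):

sudo ntpdate ntp.aliyun.com
# Keep clocks in sync: re-sync every 10 minutes via the system crontab
echo '*/10 * * * * root /usr/sbin/ntpdate ntp.aliyun.com' | sudo tee -a /etc/crontab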

Cluster installation

1. Install the ZooKeeper cluster
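The original leaves this step bare, so here is a minimal sketch: install ZooKeeper (3.4.10 appears later in this guide) on all four nodes, give every node the same zoo.cfg, and write a unique myid per node. The dataDir below is an assumption based on the zkdata directory that shows up later under /home/hadoop/data; the server IPs come from the observer section further down:

# conf/zoo.cfg (identical on every node; the observer-only peerType line is covered below)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/data/zkdata
clientPort=2181
server.1=10.4.7.111:2888:3888
server.2=10.4.7.112:2888:3888
server.3=10.4.7.113:2888:3888
server.4=10.4.7.114:2888:3888:observer

# myid: 1 on hadoop1, 2 on hadoop2, and so on
echo 1 > /home/hadoop/data/zkdata/myid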

2. Install the Hadoop cluster

2.1 Download the package

The rest of this guide uses Hadoop 2.7.5, so fetch that release (older versions live on the Apache archive):

  wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.5/hadoop-2.7.5.tar.gz

2.2 Extract

  [hadoop@hadoop1 ~]$ tar -zxvf hadoop-2.7.5-centos-6.7.tar.gz -C /usr/local/

Note: later steps (2.4 and 2.5) assume the install lives under /home/hadoop/apps/hadoop-2.7.5; whichever location you pick, use it consistently on every node.

2.3 Edit the configuration files

  [hadoop@hadoop1 ~]$ cd /usr/local/hadoop-2.7.5/etc/hadoop/
  [hadoop@hadoop1 hadoop]$ echo $JAVA_HOME
  /usr/local/jdk1.8.0_73
  [hadoop@hadoop1 hadoop]$ vi hadoop-env.sh
  export JAVA_HOME=/usr/local/jdk1.8.0_73

2.3.1 Edit core-site.xml

  [hadoop@hadoop1 hadoop]$ vi core-site.xml
  <configuration>
  <!-- Set the HDFS nameservice to myha01 -->
  <property>
  <name>fs.defaultFS</name>
  <value>hdfs://myha01/</value>
  </property>
  <!-- Hadoop temporary directory -->
  <property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/data/hadoopdata/</value>
  </property>
  <!-- ZooKeeper quorum addresses -->
  <property>
  <name>ha.zookeeper.quorum</name>
  <value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
  </property>
  <!-- Timeout for Hadoop's ZooKeeper sessions -->
  <property>
  <name>ha.zookeeper.session-timeout.ms</name>
  <value>1000</value>
  <description>ms</description>
  </property>
  </configuration>

2.3.2 Edit hdfs-site.xml

  [hadoop@hadoop1 hadoop]$ vi hdfs-site.xml
  <configuration>
  <!-- Replication factor -->
  <property>
  <name>dfs.replication</name>
  <value>2</value>
  </property>
  <!-- Working (data storage) directories for the NameNode and DataNode -->
  <property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/data/hadoopdata/dfs/name</value>
  </property>
  <property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/data/hadoopdata/dfs/data</value>
  </property>
  <!-- Enable WebHDFS -->
  <property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
  </property>
  <!-- Set the HDFS nameservice to myha01; must match core-site.xml.
  dfs.ha.namenodes.[nameservice id] assigns a unique identifier to each NameNode in the nameservice:
  a comma-separated list of NameNode IDs, which is how DataNodes recognize all the NameNodes.
  Here "myha01" is the nameservice ID and "nn1" and "nn2" are the NameNode identifiers.
  -->
  <property>
  <name>dfs.nameservices</name>
  <value>myha01</value>
  </property>
  <!-- myha01 has two NameNodes: nn1 and nn2 -->
  <property>
  <name>dfs.ha.namenodes.myha01</name>
  <value>nn1,nn2</value>
  </property>
  <!-- RPC address of nn1 -->
  <property>
  <name>dfs.namenode.rpc-address.myha01.nn1</name>
  <value>hadoop1:9000</value>
  </property>
  <!-- HTTP address of nn1 -->
  <property>
  <name>dfs.namenode.http-address.myha01.nn1</name>
  <value>hadoop1:50070</value>
  </property>
  <!-- RPC address of nn2 -->
  <property>
  <name>dfs.namenode.rpc-address.myha01.nn2</name>
  <value>hadoop2:9000</value>
  </property>
  <!-- HTTP address of nn2 -->
  <property>
  <name>dfs.namenode.http-address.myha01.nn2</name>
  <value>hadoop2:50070</value>
  </property>
  <!-- Shared storage for the NameNode edits metadata, i.e. the JournalNode list.
  URL format: qjournal://host1:port1;host2:port2;host3:port3/journalId
  The nameservice is the recommended journalId; the default port is 8485. -->
  <property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/myha01</value>
  </property>
  <!-- Where each JournalNode stores its data on local disk -->
  <property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/hadoop/data/journaldata</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
  </property>
  <!-- Failover proxy provider implementation -->
  <property>
  <name>dfs.client.failover.proxy.provider.myha01</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods; separate multiple methods with newlines, one method per line -->
  <property>
  <name>dfs.ha.fencing.methods</name>
  <value>
  sshfence
  shell(/bin/true)
  </value>
  </property>
  <!-- The sshfence method requires passwordless SSH -->
  <property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <!-- Timeout for the sshfence method -->
  <property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
  </property>
  <property>
  <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
  <value>60000</value>
  </property>
  </configuration>
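Once the environment variables from section 2.5 are in place, hdfs getconf offers a quick sanity check that Hadoop parses the edited files (a convenience check, not part of the original steps):

  [hadoop@hadoop1 hadoop]$ hdfs getconf -confKey dfs.nameservices
  myha01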

2.3.3 Edit mapred-site.xml

  [hadoop@hadoop1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
  [hadoop@hadoop1 hadoop]$ vi mapred-site.xml
  <configuration>
  <!-- Run MapReduce on YARN -->
  <property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  </property>
  <!-- MapReduce job history server address -->
  <property>
  <name>mapreduce.jobhistory.address</name>
  <value>hadoop1:10020</value>
  </property>
  <!-- Job history server web address -->
  <property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hadoop1:19888</value>
  </property>
  </configuration>

2.3.4 Edit yarn-site.xml

  [hadoop@hadoop1 hadoop]$ vi yarn-site.xml
  <configuration>
  <!-- Enable ResourceManager HA -->
  <property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
  </property>
  <!-- Cluster ID of the ResourceManagers -->
  <property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yrc</value>
  </property>
  <!-- ResourceManager IDs -->
  <property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
  </property>
  <!-- Host of each ResourceManager -->
  <property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>hadoop3</value>
  </property>
  <property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>hadoop4</value>
  </property>
  <!-- ZooKeeper cluster addresses -->
  <property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
  <property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  </property>
  <property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
  </property>
  <property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>86400</value>
  </property>
  <!-- Enable automatic recovery -->
  <property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
  </property>
  <!-- Store the ResourceManager state in the ZooKeeper cluster -->
  <property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  </configuration>

2.3.5 Edit slaves

[hadoop@hadoop1 hadoop]$ vi slaves 

hadoop1
hadoop2
hadoop3
hadoop4

2.4 Distribute the Hadoop package to the other cluster nodes

Important: the Hadoop install directory and its configuration must be identical on every server.

[hadoop@hadoop1 apps]$ scp -r hadoop-2.7.5/ hadoop2:$PWD
[hadoop@hadoop1 apps]$ scp -r hadoop-2.7.5/ hadoop3:$PWD
[hadoop@hadoop1 apps]$ scp -r hadoop-2.7.5/ hadoop4:$PWD
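Equivalently, a small loop (assuming the same working directory, here ~/apps, exists on every node) saves typing:

for host in hadoop2 hadoop3 hadoop4; do
  scp -r hadoop-2.7.5/ ${host}:$PWD
done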

2.5 Configure Hadoop environment variables

Important:
1. If you install as root, edit /etc/profile (system-wide variables).
2. If you install as a regular user, edit ~/.bashrc (per-user variables).
This guide installs as the hadoop user.

[hadoop@hadoop1 ~]$ vi .bashrc
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.7.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[Figure 2: the .bashrc additions]
Apply the environment variables:

[hadoop@hadoop1 bin]$ source ~/.bashrc

2.6 Check the Hadoop version

[hadoop@hadoop4 ~]$ hadoop version
Hadoop 2.7.5
Subversion Unknown -r Unknown
Compiled by root on 2017-12-24T05:30Z
Compiled with protoc 2.5.0
From source with checksum 9f118f95f47043332d51891e37f736e9
This command was run using /home/hadoop/apps/hadoop-2.7.5/share/hadoop/common/hadoop-common-2.7.5.jar
[hadoop@hadoop4 ~]$

Hadoop cluster initialization

1. Start the ZooKeeper servers

Start ZooKeeper on every server. In this setup the roles end up as one leader, two followers, and one observer.

[hadoop@hadoop1 conf]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/apps/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@hadoop1 conf]$ jps
2674 Jps
2647 QuorumPeerMain
[hadoop@hadoop1 conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/apps/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[hadoop@hadoop1 conf]$

Observer node setup

[root@hadoop1 logs]# vim /usr/local/zookeeper/conf/zoo.cfg

# Add this line only on the observer node itself; the others don't need it
peerType=observer

server.1=10.4.7.111:2888:3888
server.2=10.4.7.112:2888:3888
server.3=10.4.7.113:2888:3888
# Tag the observer entry like this in every node's zoo.cfg
server.4=10.4.7.114:2888:3888:observer

2. Start the JournalNode process on each configured node

Per the plan above, start it on hadoop1, hadoop2, and hadoop3. On hadoop1:

[hadoop@hadoop1 conf]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-journalnode-hadoop1.out
[hadoop@hadoop1 conf]$ jps
2739 JournalNode
2788 Jps
2647 QuorumPeerMain
[hadoop@hadoop1 conf]$
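Repeat the same command on hadoop2 and hadoop3, or run it remotely from hadoop1. The loop below assumes non-interactive SSH shells pick up the Hadoop environment variables; if they don't, use the full path to hadoop-daemon.sh:

for host in hadoop2 hadoop3; do
  ssh ${host} "hadoop-daemon.sh start journalnode"
done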

3. Format the NameNode

Pick one NameNode (hadoop1) and format it there, once only, with the JournalNodes already running. (hadoop namenode -format still works in 2.7.5 but is deprecated in favor of hdfs namenode -format.)

[hadoop@hadoop1 ~]$ hadoop namenode -format

[Figure 3: NameNode format output]
If the output says the filesystem has been successfully formatted, it worked.

4. Copy the metadata generated on hadoop1 to the other NameNode

Afterwards, check that the directory contents are identical on both nodes; otherwise the standby NameNode will fail to start.

[hadoop@hadoop1 ~]$ cd data/
[hadoop@hadoop1 data]$ ls
hadoopdata  journaldata  zkdata
[hadoop@hadoop1 data]$ scp -r hadoopdata/ hadoop2:$PWD
VERSION                                        100%  206     0.2KB/s   00:00
fsimage_0000000000000000000.md5                100%   62     0.1KB/s   00:00
fsimage_0000000000000000000                    100%  323     0.3KB/s   00:00
seen_txid                                      100%    2     0.0KB/s   00:00
[hadoop@hadoop1 data]$
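An equivalent approach is to have the standby pull the metadata itself: start the freshly formatted NameNode on hadoop1 first (hadoop-daemon.sh start namenode), then run the following on hadoop2, which fetches the fsimage from hadoop1 over HTTP:

[hadoop@hadoop2 ~]$ hdfs namenode -bootstrapStandby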

5. Format ZKFC

Important: run this on a NameNode node only.

[hadoop@hadoop1 data]$ hdfs zkfc -formatZK
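To confirm the HA znode was created, connect from any ZooKeeper node; /hadoop-ha should now contain the nameservice:

[hadoop@hadoop1 ~]$ zkCli.sh -server hadoop1:2181
[zk: hadoop1:2181(CONNECTED) 0] ls /hadoop-ha
[myha01]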

[Figure 4: zkfc -formatZK output]

Starting the cluster

1. Start HDFS

The startup log shows which daemons get launched:

[hadoop@hadoop1 ~]$ start-dfs.sh
Starting namenodes on [hadoop1 hadoop2]
hadoop2: starting namenode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-namenode-hadoop2.out
hadoop1: starting namenode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-namenode-hadoop1.out
hadoop3: starting datanode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-datanode-hadoop3.out
hadoop4: starting datanode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-datanode-hadoop4.out
hadoop2: starting datanode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-datanode-hadoop2.out
hadoop1: starting datanode, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-datanode-hadoop1.out
Starting journal nodes [hadoop1 hadoop2 hadoop3]
hadoop3: journalnode running as process 16712. Stop it first.
hadoop2: journalnode running as process 3049. Stop it first.
hadoop1: journalnode running as process 2739. Stop it first.
Starting ZK Failover Controllers on NN hosts [hadoop1 hadoop2]
hadoop2: starting zkfc, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-zkfc-hadoop2.out
hadoop1: starting zkfc, logging to /home/hadoop/apps/hadoop-2.7.5/logs/hadoop-hadoop-zkfc-hadoop1.out
[hadoop@hadoop1 ~]$

Check that each node is running the expected processes. (The "Stop it first" messages above are harmless: those JournalNodes were already running from the initialization step.)
[Figures 5-8: jps output on hadoop1 through hadoop4]

2. Start YARN

Run the start script on either of the two planned ResourceManager hosts:

[hadoop@hadoop4 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-resourcemanager-hadoop4.out
hadoop3: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-hadoop3.out
hadoop2: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-hadoop2.out
hadoop4: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-hadoop4.out
hadoop1: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-hadoop1.out
[hadoop@hadoop4 ~]$

After a normal start, check each node's processes.
[Figures 9-12: jps output on hadoop1 through hadoop4]
If the standby ResourceManager did not come up (start-yarn.sh only starts the local one), start it manually; here on hadoop3:

[hadoop@hadoop3 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.7.5/logs/yarn-hadoop-resourcemanager-hadoop3.out
[hadoop@hadoop3 ~]$ jps
17492 ResourceManager
16612 QuorumPeerMain
16712 JournalNode
17532 Jps
17356 NodeManager
16830 DataNode
[hadoop@hadoop3 ~]$

3. Start the MapReduce job history server

[hadoop@hadoop1 ~]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hadoop/apps/hadoop-2.7.5/logs/mapred-hadoop-historyserver-hadoop1.out
[hadoop@hadoop1 ~]$ jps
4016 NodeManager
2739 JournalNode
4259 Jps
3844 DFSZKFailoverController
2647 QuorumPeerMain
3546 DataNode
4221 JobHistoryServer
3407 NameNode
[hadoop@hadoop1 ~]$

Only hadoop1 needs the JobHistoryServer process.
[Figure 13: jps output on hadoop1]

4. Check the state of each master daemon

HDFS

[hadoop@hadoop1 ~]$ hdfs haadmin -getServiceState nn1
standby
[hadoop@hadoop1 ~]$ hdfs haadmin -getServiceState nn2
active
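To verify that automatic failover actually works, a common smoke test (not part of the original steps) is to kill the active NameNode and re-check the states; the standby should become active within seconds:

[hadoop@hadoop2 ~]$ jps | grep NameNode     # note the active NameNode's pid (nn2 is active here)
[hadoop@hadoop2 ~]$ kill -9 <pid>
[hadoop@hadoop1 ~]$ hdfs haadmin -getServiceState nn1
active

Afterwards, restart the killed NameNode with hadoop-daemon.sh start namenode.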

YARN

[hadoop@hadoop4 ~]$ yarn rmadmin -getServiceState rm1
standby
[hadoop@hadoop4 ~]$ yarn rmadmin -getServiceState rm2
active

5. Check the web UIs

HDFS
hadoop1: http://hadoop1:50070/dfshealth.html#tab-overview
hadoop2: http://hadoop2:50070/dfshealth.html#tab-overview
YARN
The standby node's web UI automatically redirects to the active node.
MapReduce job history server:
http://hadoop1:19888/jobhistory

Shutting down the cluster

Shutting down the Hadoop cluster is also done from the master node:
stop-yarn.sh
stop-dfs.sh
mr-jobhistory-daemon.sh stop historyserver
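Once the Hadoop daemons are down, stop ZooKeeper on every node; it should go down last, since the failover controllers depend on it:

zkServer.sh stop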