I. Preparation

1. Stop all running services


2. Take snapshots of all nodes
3. Create the ha directory under /opt
sudo mkdir /opt/ha


4. Change its owner and group to atguigu
sudo chown atguigu:atguigu /opt/ha


5. Copy the original Hadoop installation into ha
cp -r /opt/module/hadoop-3.1.3/ /opt/ha/


6. In the Hadoop copy under ha, delete the data and logs directories (the /tmp cleanup is done in step 10)
rm -rf /opt/ha/hadoop-3.1.3/data /opt/ha/hadoop-3.1.3/logs


7. Distribute ha to the other nodes
sudo scp -r /opt/ha root@hadoop103:/opt/
sudo scp -r /opt/ha root@hadoop104:/opt/


8. On hadoop103 and hadoop104, change the owner and group of ha to atguigu (run on each of those nodes)
sudo chown -R atguigu:atguigu /opt/ha


9. Update the environment variables on all three nodes so that HADOOP_HOME points at the new copy
sudo vim /etc/profile.d/my_env.sh
export HADOOP_HOME=/opt/ha/hadoop-3.1.3
source /etc/profile.d/my_env.sh
echo $HADOOP_HOME
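
For reference, a minimal sketch of what /etc/profile.d/my_env.sh could look like after the change (the JAVA_HOME path below is an assumed example from a typical setup, not taken from this document):

#JAVA_HOME (assumed example path; keep whatever your existing setup uses)
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin
#HADOOP_HOME now points at the HA copy
export HADOOP_HOME=/opt/ha/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin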


10. Clear /tmp to remove leftover files from the old cluster
sudo rm -rf /tmp/*
(must be done on all three nodes)





II. Configuring multiple NameNodes (manual failover)

core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>

<property>
    <name>hadoop.data.dir</name>
    <value>/opt/ha/hadoop-3.1.3/data</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/ha/hadoop-3.1.3/data</value>
</property>




hdfs-site.xml

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.data.dir}/name</value>
</property>

<property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.data.dir}/data</value>
</property>

<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>

<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2,nn3</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop102:9820</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop103:9820</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.mycluster.nn3</name>
    <value>hadoop104:9820</value>
</property>

<property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop102:9870</value>
</property>

<property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop103:9870</value>
</property>

<property>
    <name>dfs.namenode.http-address.mycluster.nn3</name>
    <value>hadoop104:9870</value>
</property>

<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop102:8485;hadoop103:8485;hadoop104:8485/mycluster</value>
</property>

<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>

<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/atguigu/.ssh/id_rsa</value>
</property>

<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>${hadoop.data.dir}/jn</value>
</property>


================================= Distribute with xsync =================================
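
xsync here refers to the course's custom rsync-based distribution script; assuming it is on the PATH, the edited configuration directory can be pushed to the other nodes with something like:
xsync /opt/ha/hadoop-3.1.3/etc/hadoop/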

1. On the JournalNode nodes (hadoop102, hadoop103, hadoop104), run the following command to start the journalnode service
hdfs --daemon start journalnode
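
To confirm the daemons came up, check the process list on each node:
jps    # each of the three nodes should now list a JournalNode process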

2. On hadoop102, format the NameNode and start it
hdfs namenode -format
hdfs --daemon start namenode


3. On hadoop103 and hadoop104, sync nn1's metadata
hdfs namenode -bootstrapStandby

4. Start the NameNode on hadoop103 and hadoop104
hdfs --daemon start namenode
(after starting, you can try opening each NameNode's web page)
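Based on the dfs.namenode.http-address settings above, those pages are http://hadoop102:9870, http://hadoop103:9870 and http://hadoop104:9870; at this point all three NameNodes should show up as standby.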


5. Switch hadoop102 (nn1) to Active
hdfs haadmin -transitionToActive nn1


6. Check whether it is Active
hdfs haadmin -getServiceState nn1
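The command prints the queried NameNode's current state, so nn1 should report active after the previous step, while nn2 and nn3 queried the same way report standby.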


III. Configuring multiple NameNodes (automatic failover)

(1) Add to hdfs-site.xml:

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

(2) Add to core-site.xml:

<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
================================= Distribute with xsync =================================

(1) Stop all HDFS services:
stop-dfs.sh
(2) Start the Zookeeper cluster:
zkCluster.sh start
(3) Initialize the HA state in Zookeeper:
hdfs zkfc -formatZK
(4) Start the HDFS services:
start-dfs.sh
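
As a quick sanity check after start-dfs.sh, jps on each NameNode host should also list a DFSZKFailoverController (zkfc) process, and the service states can be queried again; with automatic failover enabled, exactly one NameNode should report active:
jps
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -getServiceState nn3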