Resource Planning

Component | LTSR003                           | LTSR005         | LTSR006         | LTSR007         | LTSR008
OS        | CentOS 7.6                        | CentOS 7.6      | CentOS 7.6      | CentOS 7.6      | CentOS 7.6
JDK       | JVM                               | JVM             | JVM             | JVM             | JVM
HDFS      | DataNode/HTTPFS                   | DataNode/HTTPFS | DataNode/HTTPFS | DataNode/HTTPFS | NameNode/DataNode/HTTPFS
YARN      | NodeManager                       | NodeManager     | NodeManager     | NodeManager     | ResourceManager/NodeManager/mr-jobhistory
Hive      | HiveServer2/Metastore/CLI/Beeline | CLI/Beeline     | N/A             | N/A             | N/A

Installation Media

Version: apache-hive-2.3.4-bin.tar.gz
Download: http://archive.apache.org/dist

Environment Preparation

Install JDK

See: "01. Install JDK"

Install Hadoop

See: "06. Install Hadoop"

Install MySQL

See: "03. Install MySQL"

Hive Server Installation

Download and extract

  cd ~/software/
  wget http://archive.apache.org/dist/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz
  tar -zxvf apache-hive-2.3.4-bin.tar.gz -C ~/modules/
  rm apache-hive-2.3.4-bin.tar.gz

Create related directories

  cd ~/modules/apache-hive-2.3.4-bin/conf
  cp hive-env.sh.template hive-env.sh
  cp hive-default.xml.template hive-site.xml
  cp hive-log4j2.properties.template hive-log4j2.properties
  mkdir ~/modules/apache-hive-2.3.4-bin/logs
  mkdir ~/modules/apache-hive-2.3.4-bin/tmpdir

Configure Hive

  1. Configure hive-log4j2.properties.

    vi ~/modules/apache-hive-2.3.4-bin/conf/hive-log4j2.properties

    Configure as follows:

    # The log directory must be created in advance
    property.hive.log.dir=/home/bigdata/modules/apache-hive-2.3.4-bin/logs
  2. Configure hive-env.sh.

    vi ~/modules/apache-hive-2.3.4-bin/conf/hive-env.sh
    

    Configure as follows:

    export HADOOP_HOME=/home/bigdata/modules/hadoop-2.7.2
    export HIVE_CONF_DIR=/home/bigdata/modules/apache-hive-2.3.4-bin/conf
    export HIVE_AUX_JARS_PATH=/home/bigdata/modules/apache-hive-2.3.4-bin/lib
    

  3. Configure hive-site.xml.

    vi ~/modules/apache-hive-2.3.4-bin/conf/hive-site.xml
    

    Server-side configuration:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
     <property>
         <name>system:java.io.tmpdir</name>
         <value>/home/bigdata/modules/apache-hive-2.3.4-bin/tmpdir</value>
     </property>
     <property>
         <name>system:user.name</name>
         <value>bigdata</value>
     </property>
     <property>
         <name>hive.metastore.warehouse.dir</name>
         <value>/user/hive/warehouse</value>
     </property>
    <!-- Hive server-side configuration -->
     <property>
         <name>javax.jdo.option.ConnectionURL</name>
         <value>jdbc:mysql://LTSR001:3306/hive2_metadata?createDatabaseIfNotExist=true&amp;useSSL=false</value>
     </property>
     <property>
         <name>javax.jdo.option.ConnectionDriverName</name>
         <value>com.mysql.jdbc.Driver</value>
     </property>
     <property>
         <name>javax.jdo.option.ConnectionUserName</name>
         <value>hive2</value>
     </property>
     <property>
         <name>javax.jdo.option.ConnectionPassword</name>
         <value>hive2</value>
     </property>
    </configuration>
    

    Client-side configuration:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
     <property>
         <name>system:java.io.tmpdir</name>
         <value>/home/bigdata/modules/apache-hive-2.3.4-bin/tmpdir</value>
     </property>
     <property>
         <name>system:user.name</name>
         <value>bigdata</value>
     </property>
    <!-- Hive client-side configuration -->
     <property>
         <name>hive.metastore.warehouse.dir</name>
         <value>/user/hive/warehouse</value>
     </property>
     <property>
         <name>hive.metastore.uris</name>
         <value>thrift://LTSR003:9083</value>
     </property>
     <property>
         <name>hive.metastore.local</name>
         <value>false</value>
     </property>
    </configuration>
    

Configure Hadoop

  1. Configure core-site.xml.

    vi ~/modules/hadoop-2.7.2/etc/hadoop/core-site.xml
    

    Configure as follows:

     <!-- HiveServer2 performs proxy-user (impersonation) checks, so the proxy user must be configured in Hadoop -->
     <property>
         <name>hadoop.proxyuser.bigdata.hosts</name>
         <value>*</value>
     </property>
     <property>
         <name>hadoop.proxyuser.bigdata.groups</name>
         <value>*</value>
     </property>
    

  2. Configure hdfs-site.xml.

    vi ~/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
    

    Configure as follows:

     <property>  
         <name>dfs.webhdfs.enabled</name>  
         <value>true</value>  
     </property>
    

    After modifying the configuration files, remember to distribute them to the other cluster nodes.

    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/core-site.xml bigdata@LTSR005:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml bigdata@LTSR005:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/core-site.xml bigdata@LTSR006:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml bigdata@LTSR006:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/core-site.xml bigdata@LTSR007:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml bigdata@LTSR007:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/core-site.xml bigdata@LTSR008:~/modules/hadoop-2.7.2/etc/hadoop/
    scp -r ~/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml bigdata@LTSR008:~/modules/hadoop-2.7.2/etc/hadoop/
    

    Then create the Hive warehouse directories on HDFS.

    # Start HDFS first, then create the Hive directories
    cd ~/modules/hadoop-2.7.2/
    # Create the Hive paths and grant permissions
    bin/hdfs dfs -mkdir -p /user/hive/warehouse
    bin/hdfs dfs -mkdir -p /user/hive/tmp
    bin/hdfs dfs -mkdir -p /user/hive/log
    bin/hdfs dfs -chmod -R 777 /user/hive/warehouse
    bin/hdfs dfs -chmod -R 777 /user/hive/tmp
    bin/hdfs dfs -chmod -R 777 /user/hive/log
    

Environment Variables

    vi ~/.bashrc   # :$ jumps to the end of the file; append there
    

    Configure as follows:

    export HIVE_HOME=/home/bigdata/modules/apache-hive-2.3.4-bin
    export PATH=$HIVE_HOME/bin:$PATH
    

    Apply the environment variables:

    source ~/.bashrc
    

Integrate MySQL

  1. Upload the MySQL JDBC driver to ${HIVE_HOME}/lib; mysql-connector-java-5.1.40.jar or newer is recommended. (Download: https://mvnrepository.com/artifact/mysql/mysql-connector-java)

    cd ~/software/
    wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.47/mysql-connector-java-5.1.47.jar
    cp mysql-connector-java-5.1.47.jar ~/modules/apache-hive-2.3.4-bin/lib/
    

  2. Create the metastore database user.

    # On LTSR001 (the MySQL node)
    mysql -uroot -p123456
    CREATE USER 'hive2'@'%' IDENTIFIED BY 'hive2';
    CREATE user 'hive2'@'localhost' IDENTIFIED BY 'hive2';
    GRANT ALL PRIVILEGES ON *.* TO 'hive2'@'%' WITH GRANT OPTION;
    GRANT ALL PRIVILEGES ON *.* TO 'hive2'@'localhost' with grant option;
    flush privileges; 
    quit;
    

  3. Initialize the metastore schema. When you see "schemaTool completed", initialization has succeeded.

    # On node LTSR003 (make sure ltsr001 is configured in /etc/hosts)
    cd /home/bigdata/modules/apache-hive-2.3.4-bin
    schematool -initSchema -dbType mysql -verbose
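
    Optionally, you can confirm that the schema was created by listing the metastore tables from the MySQL node. This is just a sanity-check sketch, assuming the hive2 user created in step 2 and the hive2_metadata database name from the connection URL configured above:

    # On LTSR001 (the MySQL node)
    mysql -uhive2 -phive2 -e "USE hive2_metadata; SHOW TABLES;"
    # Metastore tables such as DBS, TBLS and VERSION should appear in the output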
    

  4. Grant privileges on the metastore database.

    # On LTSR001 (the MySQL node)
    mysql -uroot -p123456
    use mysql;
    select User, Host from user;
    update user set host='%' where host='localhost';
    # Remove the root user's other host entries (anything other than '%')
    use hive2_metadata;
    grant all on hive2_metadata.* to hive2@'%' identified by 'hive2';
    grant all on hive2_metadata.* to hive2@localhost identified by 'hive2';
    ALTER DATABASE hive2_metadata CHARACTER SET latin1;
    flush privileges;
    quit;
    

  5. Configure hive-site.xml (already done in the sections above).
  6. Passwordless SSH: make sure this node can log in to the other cluster nodes without a password (a minimal sketch follows).
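
    A minimal sketch of setting up passwordless SSH from this node to the others, assuming the bigdata user and the host names from the resource plan:

    # Generate a key pair if one does not already exist (accept the defaults)
    ssh-keygen -t rsa
    # Copy the public key to each of the other cluster nodes
    ssh-copy-id bigdata@LTSR005
    ssh-copy-id bigdata@LTSR006
    ssh-copy-id bigdata@LTSR007
    ssh-copy-id bigdata@LTSR008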

Hive Service Startup and Test

    # HDFS and YARN must be running beforehand
    # Create a sample data file
    vi ~/datas/stu.txt
    

    Contents as follows (note: make sure the column delimiter is a tab character, \t; a quick check is shown right after the sample data):

    00001    zhangsan
    00002    lisi
    00003    wangwu
    00004    zhaoliu
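
    A quick way to verify the delimiter, assuming the file path above (cat -A prints tab characters as ^I and line endings as $):

    cat -A ~/datas/stu.txt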
    

    Create the table and load the data into Hive:

    cd ~/modules/apache-hive-2.3.4-bin/bin
    ./hive
    # Or start the CLI with DEBUG logging
    ./hive -hiveconf hive.root.logger=DEBUG,console
    # Create the table
    hive> CREATE TABLE stu(id INT,name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
    # Load the data
    hive> load data local inpath '/home/bigdata/datas/stu.txt' into table stu;
    # Query the table
    hive> select * from stu;
    

Server-Side Verification

Connect to Hive locally with the CLI

    cd ~/modules/apache-hive-2.3.4-bin
    # Start the CLI; the Metastore service is started automatically
    bin/hive
    # List databases
    hive> show databases;
    # Use the default database
    hive> use default;
    # List tables
    hive> show tables;
    

Connect to Hive locally with Beeline

    cd ~/modules/apache-hive-2.3.4-bin
    # 1) Start HiveServer2; the Metastore service is started automatically
    bin/hiveserver2 >/dev/null 2>&1 &
    # Check that HiveServer2 started properly
    ps -aux| grep hiveserver2
    # Check the HiveServer2 port
    # sudo yum install net-tools   # installs netstat if missing
    netstat -nl|grep 10000
    # Check that the Metastore started properly
    ps -aux| grep metastore
    # 2) Start Beeline.
    # Option 1
    bin/beeline
    beeline> !connect jdbc:hive2://localhost:10000 hive2 hive2
    # Option 2
    bin/beeline -u jdbc:hive2://localhost:10000
    

Hive Client Installation

    # Distribute Hive to the client node
    scp -r ~/modules/apache-hive-2.3.4-bin bigdata@LTSR005:~/modules/
    

    Note: after distribution, the Hive configuration on the client node must be modified (see the sketch below).
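
    For example, on LTSR005 rework hive-site.xml into the client-side configuration shown earlier: keep hive.metastore.uris pointing at thrift://LTSR003:9083 and hive.metastore.local=false, and drop the server-only javax.jdo.option.Connection* properties. A minimal sketch:

    # On LTSR005
    vi ~/modules/apache-hive-2.3.4-bin/conf/hive-site.xml
    # Keep:   hive.metastore.uris  -> thrift://LTSR003:9083
    #         hive.metastore.local -> false
    # Remove: javax.jdo.option.Connection* (server-only metastore settings)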

Client-Side Verification

Connect to Hive remotely with the CLI

    # The Metastore service must be running on the server
    cd ~/modules/apache-hive-2.3.4-bin
    bin/hive --service metastore  >/dev/null 2>&1 &
    # Start the CLI on the client
    bin/hive
    

Connect to Hive remotely with Beeline

    # HiveServer2 must be running on the server (it starts the Metastore service automatically)
    cd ~/modules/apache-hive-2.3.4-bin
    # Start Beeline on the client
    # Option 1
    bin/beeline
    beeline> !connect jdbc:hive2://LTSR003:10000 hive2 hive2
    # Option 2
    bin/beeline -u jdbc:hive2://LTSR003:10000
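
    For reference, HiveServer2 is started on the server node (LTSR003) the same way as in the local Beeline test above:

    # On LTSR003
    cd ~/modules/apache-hive-2.3.4-bin
    bin/hiveserver2 >/dev/null 2>&1 &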