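Download
http://archive.apache.org/dist/hbase/
Files with src in the name are HBase source packages and have to be compiled manually. To install, download the binary .tar.gz package (the -bin.tar.gz file) directly.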
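As a rough sketch of the download step, the helper below builds the URL of a binary tarball under the archive above. The `<version>/hbase-<version>-bin.tar.gz` layout and the example version 1.2.4 are assumptions; older releases use different directory names, so check the archive listing before relying on it.

```shell
#!/usr/bin/env bash
# Build the download URL of an HBase binary tarball.
# Assumption: the archive stores releases as <version>/hbase-<version>-bin.tar.gz.
hbase_tarball_url() {
  local version="$1"
  echo "http://archive.apache.org/dist/hbase/${version}/hbase-${version}-bin.tar.gz"
}

# Usage (not executed here): download and unpack into the current directory.
#   curl -fLO "$(hbase_tarball_url 1.2.4)"
#   tar -xzf hbase-1.2.4-bin.tar.gz
```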
Compatibility
When choosing a version, check that it is compatible with the other tools in your stack, such as Hive and HDFS. Otherwise you may run into missing dependency JARs or features that do not work together.
JDK compatibility
Hadoop compatibility
ZooKeeper compatibility
An Apache ZooKeeper quorum is required. The exact version depends on your version of HBase, though the minimum ZooKeeper version is 3.4.x due to the useMulti feature made default in 1.0.0.
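A version check like the ZooKeeper minimum above can be scripted. This is a minimal sketch, assuming GNU `sort -V` is available; the `version_ge` helper and the sample version string are hypothetical and not part of HBase or ZooKeeper.

```shell
#!/usr/bin/env bash
# Return success if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare a ZooKeeper version string against the 3.4 minimum.
zk_version="3.4.6"   # hypothetical value; read it from your own installation
if version_ge "$zk_version" "3.4"; then
  echo "ZooKeeper $zk_version meets the 3.4.x minimum"
else
  echo "ZooKeeper $zk_version is too old"
fi
```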
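Deployment
Before deploying HBase in distributed mode, install Hadoop and ZooKeeper first: HBase stores its data on HDFS, and coordination between HBase cluster nodes relies on ZooKeeper.
HBase can be installed in three modes: standalone, pseudo-distributed, and fully distributed.
Standalone
Standalone mode runs on a single machine without a distributed storage system; data is read from and written to the local file system directly.
- Install the JDK
- Extract the HBase .tar.gz package
- Configure the HBase environment variables
- Set JAVA_HOME in hbase-env.sh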
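The steps above can be sketched in shell roughly as follows. All paths are placeholders, and `set_java_home` is a hypothetical helper written for this example, not something shipped with HBase.

```shell
#!/usr/bin/env bash
# Append a JAVA_HOME export to hbase-env.sh unless one is already present.
# Arguments: $1 = HBase conf directory, $2 = JDK installation path.
set_java_home() {
  local conf_dir="$1" java_home="$2"
  local env_file="${conf_dir}/hbase-env.sh"
  if ! grep -q '^export JAVA_HOME=' "$env_file" 2>/dev/null; then
    echo "export JAVA_HOME=${java_home}" >> "$env_file"
  fi
}

# Usage (not executed here; adjust the placeholder paths to your machine):
#   tar -xzf hbase-1.2.4-bin.tar.gz -C /opt
#   export HBASE_HOME=/opt/hbase-1.2.4
#   export PATH="$PATH:$HBASE_HOME/bin"
#   set_java_home "$HBASE_HOME/conf" /path/to/jdk
```

The guard keeps the helper idempotent, so running it twice does not add a second export line.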
Edit the configuration file hbase-site.xml and add the following three parameters:
<configuration>
<!-- Where HBase stores the data it produces -->
<property>
<name>hbase.rootdir</name>
<value>file:///home/jinge/data/hbase</value>
</property>
<!-- HBase depends on ZooKeeper; this is where ZooKeeper stores its data -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/jinge/data/zookeeper</value>
</property>
<!-- Controls whether HBase enforces stream capability checks; must be false in standalone mode -->
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
</configuration>
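Start HBase: start-hbase.sh
Running jps now shows an additional HMaster process.
Enter the HBase interactive shell: bin/hbase shell
Running jps again shows an additional process named Main.
That completes the standalone deployment. It is very simple and is normally used for local development and debugging. Production environments do not use this mode, so a general understanding is enough.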
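Pseudo-distributed
Pseudo-distributed mode still runs on a single machine, but it starts a separate daemon for each HBase cluster role: HMaster, RegionServer, and ZooKeeper each run in their own JVM process. This mode uses the ZooKeeper instance bundled with HBase.
Edit hbase-site.xml
<!-- Point hbase.rootdir at the HDFS instance instead of the local file system, so HBase data is stored on the HDFS cluster -->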
<property>
<name>hbase.rootdir</name>
<value>hdfs://hmaster1:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/jinge/data/zookeeper</value>
</property>
<!-- Tell HBase to run in distributed mode -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
Start HDFS:
start-dfs.sh
Start HBase:
start-hbase.sh
The daemons start in the order ZooKeeper (HQuorumPeer) -> master (HMaster) -> RegionServer (HRegionServer).
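To confirm that all three daemons came up, the jps output can be checked mechanically. The helper below is a sketch; `missing_daemons` is a hypothetical name, and it only assumes that jps prints one `PID Name` pair per line.

```shell
#!/usr/bin/env bash
# Print each expected daemon name that is absent from jps-style output.
# $1 = text made of "PID Name" lines (as printed by jps); remaining args = expected names.
missing_daemons() {
  local jps_output="$1"; shift
  local name
  for name in "$@"; do
    # Compare against the second column only, matching the whole name.
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$name"; then
      echo "$name"
    fi
  done
}

# Usage on a pseudo-distributed node (not executed here):
#   missing_daemons "$(jps)" HQuorumPeer HMaster HRegionServer
```

An empty result means every expected daemon was found.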
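Fully distributed
In a fully distributed cluster, each node runs one or more HBase daemons: the active and backup masters, RegionServers, and ZooKeeper leader and follower nodes.
A distributed cluster can be laid out as follows:
Node | HMaster | ZooKeeper | RegionServer |
---|---|---|---|
mainer | active | yes | no |
node1 | backup | yes | yes |
node2 | no | yes | yes |
Building on the pseudo-distributed setup, make sure passwordless SSH is configured between the machines in the cluster.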
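Passwordless SSH is usually set up with `ssh-keygen` and `ssh-copy-id`. The sketch below reads hosts from a file with one host per line and, by default, only prints the commands it would run. The function name and the `DRY_RUN` switch are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Print (default) or run the commands that set up passwordless SSH from this machine
# to every host listed in the given file. Set DRY_RUN=0 to actually execute them.
setup_passwordless_ssh() {
  local hosts_file="$1" host run="echo"
  if [ "${DRY_RUN:-1}" = "0" ]; then run=""; fi
  # Generate a key pair once; -N '' means an empty passphrase.
  if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    $run ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"
  fi
  # Read hosts on file descriptor 3 so the ssh tools cannot consume the host list from stdin.
  while read -r host <&3; do
    if [ -n "$host" ]; then
      $run ssh-copy-id "$host"
    fi
  done 3< "$hosts_file"
}

# Usage (not executed here):
#   setup_passwordless_ssh "$HBASE_HOME/conf/regionservers"
```

Since start-hbase.sh on the master connects to the other nodes over SSH, the keys need to be distributed from the node where the start script is run.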
Edit HBASE_HOME/conf/regionservers and list the hosts of the cluster nodes:
idc-rhcluster-65
idc-rhcluster-66
idc-rhcluster-67
idc-rhcluster-68
idc-rhcluster-59
idc-rhcluster-60
idc-rhcluster-62
idc-rhcluster-63
Edit HBASE_HOME/conf/backup-masters to configure a backup master for high availability:
idc-rhcluster2-66
Edit hbase-env.sh so that HBase does not manage its own bundled ZooKeeper:
export HBASE_MANAGES_ZK=false
Edit hbase-site.xml and point it at the external ZooKeeper cluster:
<property>
<name>hbase.zookeeper.quorum</name>
<value>idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181</value>
</property>
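In the quorum value above only the last host carries an explicit port. As far as I understand, hosts listed without a port fall back to the client port configured via hbase.zookeeper.property.clientPort (2181 by default); treat that as an assumption and verify it for your HBase version. The hypothetical helper below shows how such a string expands under that assumption.

```shell
#!/usr/bin/env bash
# Expand a comma-separated quorum string into one host:port pair per line.
# Hosts without an explicit port get the default port ($2, or 2181 if omitted).
expand_quorum() {
  local quorum="$1" default_port="${2:-2181}"
  printf '%s\n' "$quorum" | tr ',' '\n' | while read -r entry; do
    case "$entry" in
      *:*) echo "$entry" ;;
      *)   echo "${entry}:${default_port}" ;;
    esac
  done
}

# Usage:
#   expand_quorum "idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181"
```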
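Start ZooKeeper (on each ZooKeeper node):
ZOOKEEPER_HOME/bin/zkServer.sh start
Start Hadoop:
start-all.sh
Start HBase:
- Start or stop the whole cluster: sh HBASE_HOME/bin/start-hbase.sh (stop with sh HBASE_HOME/bin/stop-hbase.sh)
- Start or stop a single master: sh HBASE_HOME/bin/hbase-daemon.sh start master (or stop master)
- Start or stop a single RegionServer: sh HBASE_HOME/bin/hbase-daemon.sh start regionserver (or stop regionserver)
After startup, running jps on each node lists the daemon processes.
Follow the order strictly when starting or stopping:
Startup order: ZooKeeper -> Hadoop -> HBase
Shutdown order: HBase -> Hadoop -> ZooKeeper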
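One way to avoid mixing up the order is to keep it in a small script. The sketch below only prints the commands in the order given above so they can be reviewed before being run. `cluster_commands` is a hypothetical helper, and it assumes ZOOKEEPER_HOME and HBASE_HOME are set and that the Hadoop scripts are on PATH.

```shell
#!/usr/bin/env bash
# Print the cluster start or stop commands in the required order.
# Note: zkServer.sh only controls the local ZooKeeper server, so it has to be
# run on every ZooKeeper node.
cluster_commands() {
  case "$1" in
    start)
      echo "$ZOOKEEPER_HOME/bin/zkServer.sh start"
      echo "start-all.sh"
      echo "$HBASE_HOME/bin/start-hbase.sh"
      ;;
    stop)
      echo "$HBASE_HOME/bin/stop-hbase.sh"
      echo "stop-all.sh"
      echo "$ZOOKEEPER_HOME/bin/zkServer.sh stop"
      ;;
    *)
      echo "usage: cluster_commands start|stop" >&2
      return 1
      ;;
  esac
}

# Usage:
#   cluster_commands start
#   cluster_commands stop
```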
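Appendix: production configuration
hbase-site.xml
Detailed description of each parameter on the official site: https://hbase.apache.org/book.html#config.files (Default Configuration)
Chinese translation: http://abloz.com/hbase/book.html (HBase default configuration)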
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- /** * * Licensed to the Apache Software Foundation (ASF) under one *
or more contributor license agreements. See the NOTICE file * distributed
with this work for additional information * regarding copyright ownership.
The ASF licenses this file * to you under the Apache License, Version 2.0
(the * "License"); you may not use this file except in compliance * with
the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0
* * Unless required by applicable law or agreed to in writing, software *
distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the
License for the specific language governing permissions and * limitations
under the License. */ -->
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ns1/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181</value>
</property>
<property>
<name>hbase.client.write.buffer</name>
<value>8388608</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
</property>
<property>
<name>hbase.zookeeper.property.tickTime</name>
<value>6000</value>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>20</value>
</property>
<property>
<name>hbase.hregion.memstore.mslab.enabled</name>
<value>true</value>
</property>
<!-- The next three properties (hbase.regionserver.maxlogs and the two compaction thread counts) are defined again later in this file; when a name is repeated, the later value takes effect -->
<property>
<name>hbase.regionserver.maxlogs</name>
<value>32</value>
</property>
<property>
<name>hbase.regionserver.thread.compaction.large</name>
<value>5</value>
</property>
<property>
<name>hbase.regionserver.thread.compaction.small</name>
<value>5</value>
</property>
<property>
<name>hbase.rpc.timeout</name>
<value>300000</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
<description>Time difference of regionserver from master</description>
</property>
<property>
<name>index.builder.threads.max</name>
<value>40</value>
</property>
<property>
<name>index.writer.threads.max</name>
<value>40</value>
</property>
<property>
<name>index.tablefactory.cache.size</name>
<value>20</value>
</property>
<property>
<name>hbase.client.scanner.timeout.period</name>
<value>6000000</value>
</property>
<property>
<name>hbase.client.operation.timeout</name>
<value>6000000</value>
</property>
<property>
<name>hbase.ipc.server.max.callqueue.size</name>
<value>2140000000</value>
</property>
<property>
<name>phoenix.query.timeoutMs</name>
<value>600000000</value>
</property>
<property>
<name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
<value>300000</value>
</property>
<property>
<name>phoenix.query.threadPoolSize</name>
<value>300</value>
</property>
<property>
<name>phoenix.index.mutableBatchSizeThreshold</name>
<value>10</value>
</property>
<property>
<name>dfs.client.socket-timeout</name>
<value>300000</value>
</property>
<property>
<name>hbase.wal.provider</name>
<value>multiwal</value>
</property>
<property>
<name>hbase.regionserver.global.memstore.upperLimit</name>
<value>0.4</value>
</property>
<property>
<name>hbase.regionserver.global.memstore.lowerLimit</name>
<value>0.35</value>
</property>
<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>268435456</value>
</property>
<property>
<name>hbase.hregion.memstore.block.multiplier</name>
<value>12</value>
</property>
<property>
<name>hbase.regionserver.thread.compaction.large</name>
<value>20</value>
</property>
<property>
<name>hbase.regionserver.thread.compaction.small</name>
<value>20</value>
</property>
<property>
<name>hbase.hregion.majorcompaction</name>
<value>0</value>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.4</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>10737418240</value>
</property>
<property>
<name>hbase.hlog.split.skip.errors</name>
<value>true</value>
</property>
<property>
<name>hbase.regionserver.maxlogs</name>
<value>60</value>
</property>
<property>
<name>hbase.regionserver.executor.openregion.threads</name>
<value>100</value>
</property>
<property>
<name>hbase.coprocessor.user.region.classes</name>
<value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
</property>
<property>
<name>hbase.regionserver.wal.codec</name>
<value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
<name>hbase.region.server.rpc.scheduler.factory.class</name>
<value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
<name>hbase.rpc.controllerfactory.class</name>
<value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
</configuration>
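A couple of the values above are easier to judge after some arithmetic. HBase expects the global memstore fraction and the block cache fraction to leave at least 20% of the heap for everything else, so the two 0.4 settings sit exactly at that limit. The byte-valued settings convert to 256 MiB (flush size) and 10 GiB (maximum region file size). The helpers below are a hypothetical sketch of those checks, not an HBase tool.

```shell
#!/usr/bin/env bash
# Succeed if the memstore and block cache heap fractions together stay within 0.8.
# A tiny epsilon guards against floating-point rounding at the boundary.
heap_fractions_ok() {
  awk -v m="$1" -v b="$2" 'BEGIN { exit !((m + b) <= 0.8 + 1e-9) }'
}

# Convert a byte count into whole mebibytes.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Usage:
#   heap_fractions_ok 0.4 0.4 && echo "memstore + block cache fit"
#   bytes_to_mib 268435456      # hbase.hregion.memstore.flush.size
#   bytes_to_mib 10737418240    # hbase.hregion.max.filesize
```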
hbase-env.sh
JAVA_HOME=/home/jdk1.8.0_172
export JAVA_HOME
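# HBASE_OPTS is passed to every HBase JVM, so the collector is chosen per daemon below.
# A JDK 8 RegionServer JVM that receives both the CMS and the G1 flag refuses to start.
export HBASE_LIBRARY_PATH=/home/hadoopadmin/hadoopNative
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:+UseConcMarkSweepGC -Xmx8192m -Xms8192m -XX:CMSInitiatingOccupancyFraction=70"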
export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -Xmx20480m -Xms20480m -XX:ParallelGCThreads=28 -XX:ConcGCThreads=16 -XX:G1HeapRegionSize=32M -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=85"
export HBASE_PID_DIR=/home/hadoopadmin/hbase-1.2.4/pid
export HBASE_MANAGES_ZK=false