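Download

http://archive.apache.org/dist/hbase/

The src package contains the HBase source code and has to be compiled manually. To install, just download the binary .tar.gz package.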
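
The download step can be scripted. The sketch below assumes the archive keeps each release under a directory named after its version, with the binary package called hbase-<version>-bin.tar.gz; check the directory listing before relying on that. The function names and the use of curl are illustrative choices, not part of the original instructions.

```shell
# Build the download URL for a binary HBase release on the Apache archive.
# Assumption: releases are laid out as <base>/<version>/hbase-<version>-bin.tar.gz.
hbase_tarball_url() {
    version="$1"
    echo "http://archive.apache.org/dist/hbase/${version}/hbase-${version}-bin.tar.gz"
}

# Download one release and unpack it into a target directory (hypothetical helper).
fetch_hbase() {
    version="$1"
    target_dir="$2"
    mkdir -p "$target_dir" &&
    curl -fSL "$(hbase_tarball_url "$version")" -o "$target_dir/hbase-${version}-bin.tar.gz" &&
    tar -xzf "$target_dir/hbase-${version}-bin.tar.gz" -C "$target_dir"
}

# Example (not executed here): fetch_hbase 1.2.4 "$HOME/opt"
```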

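Compatibility

When choosing a version, check that it is compatible with the versions of the other tools you run, such as Hive and HDFS. Otherwise you may hit missing dependency JARs or features that do not work across components.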

JDK compatibility

[image: JDK compatibility table]

Hadoop compatibility

[image: Hadoop compatibility table]

ZooKeeper compatibility

An Apache ZooKeeper quorum is required. The exact version depends on your version of HBase, though the minimum ZooKeeper version is 3.4.x due to the useMulti feature made default in 1.0.0.

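Deployment

Before deploying HBase in distributed mode, install Hadoop and ZooKeeper first: HBase stores its data physically on HDFS, and coordination between the HBase cluster nodes relies on ZooKeeper.
HBase can be installed in three modes: standalone, pseudo-distributed, and fully distributed.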

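Standalone

Standalone mode runs on a single machine and does not use a distributed storage system; files are read from and written to the local file system directly.

  • Install the JDK
  • Extract the HBase tar.gz package
  • Set the HBase environment variables
  • Set JAVA_HOME in hbase-env.sh
  • Edit hbase-site.xml, adding mainly the following three parameters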

    <configuration>
      <!-- Where HBase stores the data it produces -->
      <property>
        <name>hbase.rootdir</name>
        <value>file:///home/jinge/data/hbase</value>
      </property>
      <!-- HBase depends on ZooKeeper; where ZooKeeper stores its data -->
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/jinge/data/zookeeper</value>
      </property>
      <!-- Controls whether HBase checks for stream capabilities (hflush/hsync); must be false in standalone mode -->
      <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
      </property>
    </configuration>
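  • Start HBase: start-hbase.sh
    jps now shows an additional HMaster process.

  • Open the HBase interactive shell: bin/hbase shell
    jps now shows an additional process named Main.

That completes the standalone deployment. It is very simple and is normally used for local development and debugging; production environments do not use this mode, so a basic understanding is enough.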
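
The two environment steps in the list above (HBase environment variables and JAVA_HOME in hbase-env.sh) can be sketched as follows. This is only one possible approach: the helper names, the profile file argument, and the example paths are assumptions for illustration. Lines are appended only when they are not already present, so the function can be run repeatedly.

```shell
# Append a line to a file unless that exact line is already present.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Record JAVA_HOME in hbase-env.sh and expose HBase through a shell profile file.
configure_hbase_env() {
    hbase_home="$1"   # directory the tar.gz was extracted to
    java_home="$2"    # JDK installation directory
    profile="$3"      # e.g. ~/.bashrc (assumed; use whichever profile your shell reads)
    append_once "export JAVA_HOME=${java_home}" "${hbase_home}/conf/hbase-env.sh" &&
    append_once "export HBASE_HOME=${hbase_home}" "$profile" &&
    append_once 'export PATH=$PATH:$HBASE_HOME/bin' "$profile"
}

# Example (not executed here):
# configure_hbase_env /home/jinge/hbase /home/jdk1.8.0_172 "$HOME/.bashrc"
```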

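Pseudo-distributed

Pseudo-distributed mode still runs on a single machine, but it starts separate daemons for each role of an HBase cluster: HMaster, RegionServer, and ZooKeeper each run in their own JVM process. The ZooKeeper bundled with HBase is used.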

  • Edit hbase-site.xml

    <!-- Point hbase.rootdir at the HDFS instance instead of the local file system, so HBase data lives on the HDFS cluster -->
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://hmaster1:9000/hbase</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/home/jinge/data/zookeeper</value>
    </property>
    <!-- Tell HBase to run in distributed mode -->
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
  • Start HDFS
    start-dfs.sh

  • Start HBase
    start-hbase.sh
    Startup order: ZooKeeper (HQuorumPeer) -> master (HMaster) -> regionserver (HRegionServer)
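
To confirm that all three daemons came up, the jps output can be checked mechanically. The sketch below assumes jps prints one "<pid> <class name>" pair per line; the function name is hypothetical.

```shell
# Read `jps` output on stdin and report which expected HBase daemons are missing.
# Prints nothing and returns 0 when all three are present.
missing_hbase_daemons() {
    jps_output="$(cat)"
    missing=""
    for daemon in HQuorumPeer HMaster HRegionServer; do
        # Match "<pid> <daemon>" exactly so that e.g. HMaster does not match a longer name.
        if ! printf '%s\n' "$jps_output" | grep -qE "^[0-9]+ ${daemon}\$"; then
            missing="${missing} ${daemon}"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:${missing}"
        return 1
    fi
}

# Usage (not executed here): jps | missing_hbase_daemons
```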

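Fully distributed

In a fully distributed cluster, each node can run one or more HBase daemons, including the active and backup HMaster, RegionServers, and ZooKeeper leader and follower nodes.

A distributed cluster can be laid out as follows:

Node     HMaster          ZooKeeper    RegionServer
mainer
node1    backup master
node2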

On top of the pseudo-distributed setup, make sure passwordless SSH is configured between the machines in the cluster.
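
A common way to set up passwordless SSH is to generate a key pair once and copy the public key to every node. The sketch below only prints the commands so they can be reviewed first; the user name and host names in the example are placeholders, and the function name is hypothetical.

```shell
# Print the commands that set up passwordless SSH from this machine to each host.
# Nothing is executed; review the output before running it.
print_ssh_setup() {
    user="$1"
    shift
    # One key pair for this machine, with an empty passphrase.
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for host in "$@"; do
        # Installs the public key in the remote authorized_keys file.
        echo "ssh-copy-id ${user}@${host}"
    done
}

# Example (not executed here): print_ssh_setup hadoopadmin mainer node1 node2
```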

  • Edit $HBASE_HOME/conf/regionservers and list the hosts that run a RegionServer, one per line

    idc-rhcluster-65
    idc-rhcluster-66
    idc-rhcluster-67
    idc-rhcluster-68
    idc-rhcluster-59
    idc-rhcluster-60
    idc-rhcluster-62
    idc-rhcluster-63

  • Edit $HBASE_HOME/conf/backup-masters to configure a backup master for high availability

    idc-rhcluster2-66

  • Edit hbase-env.sh so that HBase does not manage its bundled ZooKeeper

    export HBASE_MANAGES_ZK=false

  • Edit hbase-site.xml and configure the external ZooKeeper ensemble

    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181</value>
    </property>
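
Before starting HBase it can help to list the quorum members as explicit host:port pairs, for example to probe each one. The sketch below assumes that an entry without a port listens on ZooKeeper's usual client port 2181; the function name is hypothetical.

```shell
# Expand a comma-separated quorum string into one host:port pair per line.
# Entries that already carry a port are kept; the others get the default port.
expand_quorum() {
    quorum="$1"
    default_port="${2:-2181}"   # assumption: ZooKeeper's usual client port
    printf '%s\n' "$quorum" | tr ',' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        case "$entry" in
            *:*) echo "$entry" ;;
            *)   echo "${entry}:${default_port}" ;;
        esac
    done
}

# Example (not executed here):
# expand_quorum "idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181"
```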
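  • Start ZooKeeper (on every ZooKeeper node)
    $ZOOKEEPER_HOME/bin/zkServer.sh start

  • Start Hadoop
    start-all.sh
  • Start HBase
    • Start/stop the whole cluster
      $HBASE_HOME/bin/start-hbase.sh
      $HBASE_HOME/bin/stop-hbase.sh
    • Start/stop a single master
      $HBASE_HOME/bin/hbase-daemon.sh start master
      $HBASE_HOME/bin/hbase-daemon.sh stop master
    • Start/stop a single regionserver
      $HBASE_HOME/bin/hbase-daemon.sh start regionserver
      $HBASE_HOME/bin/hbase-daemon.sh stop regionserver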


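Process information:
[image: process list screenshot]
Follow the order strictly when starting or stopping the cluster.
Startup order: ZooKeeper -> Hadoop -> HBase
Shutdown order: HBase -> Hadoop -> ZooKeeper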
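
The ordering can be captured in a small wrapper so that it is not left to memory. This is a sketch under several assumptions: ZOOKEEPER_HOME and HBASE_HOME are set, the Hadoop scripts are on the PATH, and zkServer.sh only acts on the local node, so a real cluster still needs it run on every ZooKeeper host. The function names and the DRY_RUN switch are hypothetical.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Startup order: ZooKeeper -> Hadoop -> HBase. Stops at the first failure.
cluster_start() {
    run "$ZOOKEEPER_HOME/bin/zkServer.sh" start &&
    run start-all.sh &&
    run "$HBASE_HOME/bin/start-hbase.sh"
}

# Shutdown order: HBase -> Hadoop -> ZooKeeper.
cluster_stop() {
    run "$HBASE_HOME/bin/stop-hbase.sh" &&
    run stop-all.sh &&
    run "$ZOOKEEPER_HOME/bin/zkServer.sh" stop
}

# Example (not executed here): DRY_RUN=1 cluster_start
```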


Appendix: production configuration

hbase-site.xml

Official description of each parameter: https://hbase.apache.org/book.html#config.files (Default Configuration)

Chinese version: http://abloz.com/hbase/book.html (HBase default configuration)

  <?xml version="1.0" encoding="utf-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!--
  /**
   *
   * Licensed to the Apache Software Foundation (ASF) under one
   * or more contributor license agreements. See the NOTICE file
   * distributed with this work for additional information
   * regarding copyright ownership. The ASF licenses this file
   * to you under the Apache License, Version 2.0 (the
   * "License"); you may not use this file except in compliance
   * with the License. You may obtain a copy of the License at
   *
   * http://www.apache.org/licenses/LICENSE-2.0
   *
   * Unless required by applicable law or agreed to in writing, software
   * distributed under the License is distributed on an "AS IS" BASIS,
   * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   * See the License for the specific language governing permissions and
   * limitations under the License.
   */
  -->
  <configuration>
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://ns1/hbase</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>idc-rhcluster-65,idc-rhcluster-66,idc-rhcluster-67:2181</value>
    </property>
    <property>
      <name>hbase.client.write.buffer</name>
      <value>8388608</value>
    </property>
    <property>
      <name>zookeeper.session.timeout</name>
      <value>1200000</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.tickTime</name>
      <value>6000</value>
    </property>
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>20</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.mslab.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.regionserver.maxlogs</name>
      <value>32</value>
    </property>
    <property>
      <name>hbase.regionserver.thread.compaction.large</name>
      <value>5</value>
    </property>
    <property>
      <name>hbase.regionserver.thread.compaction.small</name>
      <value>5</value>
    </property>
    <property>
      <name>hbase.rpc.timeout</name>
      <value>300000</value>
    </property>
    <property>
      <name>hbase.master.maxclockskew</name>
      <value>180000</value>
      <description>Time difference of regionserver from master</description>
    </property>
    <property>
      <name>index.builder.threads.max</name>
      <value>40</value>
    </property>
    <property>
      <name>index.writer.threads.max</name>
      <value>40</value>
    </property>
    <property>
      <name>index.tablefactory.cache.size</name>
      <value>20</value>
    </property>
    <property>
      <name>hbase.client.scanner.timeout.period</name>
      <value>6000000</value>
    </property>
    <property>
      <name>hbase.client.operation.timeout</name>
      <value>6000000</value>
    </property>
    <property>
      <name>hbase.ipc.server.max.callqueue.size</name>
      <value>2140000000</value>
    </property>
    <property>
      <name>phoenix.query.timeoutMs</name>
      <value>600000000</value>
    </property>
    <property>
      <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
      <value>300000</value>
    </property>
    <property>
      <name>phoenix.query.threadPoolSize</name>
      <value>300</value>
    </property>
    <property>
      <name>phoenix.index.mutableBatchSizeThreshold</name>
      <value>10</value>
    </property>
    <property>
      <name>dfs.client.socket-timeout</name>
      <value>300000</value>
    </property>
    <property>
      <name>hbase.wal.provider</name>
      <value>multiwal</value>
    </property>
    <property>
      <name>hbase.regionserver.global.memstore.upperLimit</name>
      <value>0.4</value>
    </property>
    <property>
      <name>hbase.regionserver.global.memstore.lowerLimit</name>
      <value>0.35</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>268435456</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>12</value>
    </property>
    <property>
      <name>hbase.regionserver.thread.compaction.large</name>
      <value>20</value>
    </property>
    <property>
      <name>hbase.regionserver.thread.compaction.small</name>
      <value>20</value>
    </property>
    <property>
      <name>hbase.hregion.majorcompaction</name>
      <value>0</value>
    </property>
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.4</value>
    </property>
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>10737418240</value>
    </property>
    <property>
      <name>hbase.hlog.split.skip.errors</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.regionserver.maxlogs</name>
      <value>60</value>
    </property>
    <property>
      <name>hbase.regionserver.executor.openregion.threads</name>
      <value>100</value>
    </property>
    <property>
      <name>hbase.coprocessor.user.region.classes</name>
      <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
    </property>
    <property>
      <name>hbase.regionserver.wal.codec</name>
      <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
    </property>
    <property>
      <name>hbase.region.server.rpc.scheduler.factory.class</name>
      <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
      <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
    <property>
      <name>hbase.rpc.controllerfactory.class</name>
      <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
      <description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
    </property>
  </configuration>
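
The listing above defines hbase.regionserver.maxlogs and both hbase.regionserver.thread.compaction.* keys twice with different values. When a key is repeated within one file, the value read last should be the one that takes effect, so the earlier entries are easy to mistake for the live setting. A quick way to spot such repeats is sketched below; it assumes each <name> element sits on its own line, and the function name is hypothetical.

```shell
# List property names that appear more than once in a Hadoop-style XML config file.
# Assumes one <name>...</name> element per line.
duplicate_properties() {
    sed -n 's:.*<name>\(.*\)</name>.*:\1:p' "$1" | sort | uniq -d
}

# Example (not executed here): duplicate_properties "$HBASE_HOME/conf/hbase-site.xml"
```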

hbase-env.sh

  export JAVA_HOME=/home/jdk1.8.0_172
  export HBASE_LIBRARY_PATH=/home/hadoopadmin/hadoopNative
  # HBASE_OPTS is passed to every HBase JVM, so the collector is chosen per daemon below;
  # selecting both CMS and G1 for the same JVM prevents it from starting.
  # Master: CMS
  export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:+UseConcMarkSweepGC -Xmx8192m -Xms8192m -XX:CMSInitiatingOccupancyFraction=70"
  # RegionServer: G1 (boolean JVM flags need the plus sign: -XX:+UseG1GC)
  export HBASE_REGIONSERVER_OPTS="-XX:+UseG1GC -Xmx20480m -Xms20480m -XX:ParallelGCThreads=28 -XX:ConcGCThreads=16 -XX:G1HeapRegionSize=32M -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=85"
  export HBASE_PID_DIR=/home/hadoopadmin/hbase-1.2.4/pid
  export HBASE_MANAGES_ZK=false
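
Because HBASE_OPTS is combined with the per-daemon option variables when a daemon starts, it is easy to end up with two garbage collectors selected for one JVM. The sketch below checks a combined option string for the CMS plus G1 case only; the function name is hypothetical and other conflicting combinations are not covered.

```shell
# Return 0 when the given JVM options select both CMS and G1, 1 otherwise.
gc_conflict() {
    # Pad with spaces so that each flag is matched as a whole word.
    case " $* " in
        *" -XX:+UseConcMarkSweepGC "*) ;;
        *) return 1 ;;
    esac
    case " $* " in
        *" -XX:+UseG1GC "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not executed here):
# gc_conflict "$HBASE_OPTS" "$HBASE_REGIONSERVER_OPTS" && echo "CMS and G1 both selected"
```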