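Large number of RITs after an HBase upgrade
1. Background
Upgrade from HDP 2.6 to HDP 3.1
which corresponds to
HBase 1.x -> HBase 2.1.6
• After startup, a large number of regions in transition (RIT) timed out
• Many AssignRegion procedures were stuck in the Procedure list
• For some tables, a scan with LIMIT 10 returned 0 rows
• For some tables, a scan with LIMIT 10 failed because the RegionServer (RS) was reported as not online
2. Checking the RS logs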
2020-10-24 20:07:53,316 WARN [RS_OPEN_REGION-regionserver/xxxxxx:60020-14] regionserver.HRegion: Failed initialize of region= xxx_CUSTOMER_GROUP_20201021,,1603044003394.18e8a61d5e580c0c8f767b6dd710a40a., starting to roll back memstore
java.io.IOException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.PREFIX_TREE
at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1096)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:944)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:900)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7274)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7232)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7204)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7162)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7113)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.PREFIX_TREE
at java.lang.Enum.valueOf(Enum.java:238)
at org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.valueOf(DataBlockEncoding.java:31)
at org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.lambda$getDataBlockEncoding$2(ColumnFamilyDescriptorBuilder.java:801)
at org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.lambda$getStringOrDefault$0(ColumnFamilyDescriptorBuilder.java:703)
at org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getOrDefault(ColumnFamilyDescriptorBuilder.java:711)
at org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getStringOrDefault(ColumnFamilyDescriptorBuilder.java:703)
at org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getDataBlockEncoding(ColumnFamilyDescriptorBuilder.java:800)
at org.apache.hadoop.hbase.regionserver.HStore.
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5746)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1060)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1057)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1057)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
… 3 more
2020-10-24 20:08:28,637 ERROR [RS_OPEN_REGION-regionserver/xxxxxx-0614:60020-12] handler.OpenRegionHandler: Failed open of region=xxx_CUSTOMER_GROUP_20201026,,1603476004706.fde73d133e593736716715de39cef600.
java.io.IOException: The new max sequence id 1 is less than the old max sequence id 5
at org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:697)
at org.apache.hadoop.hbase.regionserver.HRegion.writeRegionCloseMarker(HRegion.java:1167)
at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1702)
at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1516)
at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1466)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7287)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7232)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7204)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7162)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7113)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2020-10-24 20:08:29,176 INFO [RpcServer.priority.FPBQ.Fifo.handler=18,queue=0,port=60020] regionserver.RSRpcServices: Open xxx_CUSTOMER_GROUP_20201023,,1603216803826.b1ddf4aeec528550fb45d39713d2980d.
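When many regions fail to open, it can help to reduce the log to the list of affected tables. The following is a minimal sketch, assuming the failure lines have the same format as the "Failed initialize of region=" and "Failed open of region=" lines above; the function name `list_failed_tables` is hypothetical and not part of HBase.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a RegionServer log on stdin and print the
# distinct table names whose regions failed to initialize or open.
# Assumes the region name format "<table>,<startkey>,<timestamp>.<encoded>."
# so the table name is everything between "region=" and the first comma.
list_failed_tables() {
  grep -E 'Failed (initialize|open) of region=' \
    | sed -E 's/.*region= ?([^,]+),.*/\1/' \
    | sort -u
}
```

It could be used as `list_failed_tables < regionserver.log`, where the log file name is a placeholder for the actual RegionServer log.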
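3. Solution
Before upgrading, check whether the DataBlockEncoding of every table is compatible. The following command validates all column families and prints any incompatibilities: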
./bin/hbase pre-upgrade validate-dbe
If an incompatible table exists, the output looks like this:
2018-07-13 09:58:32,028 WARN [main] tool.DataBlockEncodingValidator: Incompatible DataBlockEncoding for table: t, cf: f, encoding: PREFIX_TREE
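This means that the data block encoding of column family f in table t is incompatible. To fix it, use the alter command in the HBase shell:
alter 't', { NAME => 'f', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
or
alter 't', { NAME => 'f', DATA_BLOCK_ENCODING => 'NONE' }
If all tables are compatible, the output looks like this: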
2020-10-25 04:17:22,153 INFO [main] tool.DataBlockEncodingValidator: The used Data Block Encodings are compatible with HBase 2.0.
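If the validator reports many incompatible column families, writing each alter statement by hand is error-prone. The following is a minimal sketch that turns the validator warnings into alter statements, assuming the warning lines have exactly the format shown above; the function name `gen_alter_statements` is hypothetical, and the target encoding defaults to FAST_DIFF.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read validate-dbe output on stdin and print one
# HBase shell alter statement per incompatible table/column family pair.
# $1 is the replacement encoding (defaults to FAST_DIFF).
# Lines that are not "Incompatible DataBlockEncoding" warnings are ignored.
gen_alter_statements() {
  local target="${1:-FAST_DIFF}"
  sed -n "s/.*Incompatible DataBlockEncoding for table: \([^,]*\), cf: \([^,]*\), encoding: .*/alter '\1', { NAME => '\2', DATA_BLOCK_ENCODING => '${target}' }/p"
}
```

For example, `./bin/hbase pre-upgrade validate-dbe 2>&1 | gen_alter_statements NONE` would print the statements for review before they are pasted into the HBase shell.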
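DataBlockEncoding is an attribute set on a table's column families; its possible values are shown in the figure below.
The figure above shows the code for the DataBlockEncoding attribute in HBase 2.1.6. Because the PREFIX_TREE value was unstable in 1.x, it has been commented out and removed in 2.x. That is why the RS logs above report "No enum constant ... DataBlockEncoding.PREFIX_TREE" when opening regions of tables that still use it.
**So if you upgraded HBase without noticing that some tables have the DataBlockEncoding attribute set to PREFIX_TREE,
you can alter the attribute after the upgrade (though checking and changing it before the upgrade is the safer approach).**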