## Problem Description

1. We are in the middle of a cluster migration and are using distcp to copy historical data to the new cluster.

2. On the old cluster, the Hive data is periodically archived into har files with the hive command to avoid accumulating too many small files. The commands are as follows:

```sql
hive> set hive.archive.enabled=true;
hive> set hive.archive.har.parentdir.settable=true;
hive> set har.partfile.size=1099511627776;

hive> ALTER TABLE table_name ARCHIVE PARTITION (partition_col = partition_col_value, partition_col = partition_col_value, ...)
```

Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Archiving

3. After distcp copied the har files into the corresponding tables' partition directories, the files were not recognized, and count(*) also returned 0, roughly as sketched below.
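
The following is a minimal sketch of the migration step and the symptom. The source nameservice name `old-nameservice` and the bare distcp invocation are assumptions; only the target path and table follow the example used later in this post.

```shell
# Copy the historical data (including the data.har archives) to the new cluster.
# "old-nameservice" is a placeholder for the old cluster's nameservice.
hadoop distcp \
    hdfs://old-nameservice/warehouse/dwd_yunc_transaction/dwd_posdm_center__posdm_invoice_item \
    hdfs://nameservice1/warehouse/dwd_yunc_transaction/dwd_posdm_center__posdm_invoice_item

# On the new cluster the copied har archives are not picked up by Hive:
hive -e "select count(*) from dwd_yunc_transaction.dwd_posdm_center__posdm_invoice_item;"
# -> returns 0
```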
  2. <a name="BIwj4"></a>
## Root Cause

1. When a partition is archived with the hive command, the partition location of the table in the Hive metastore is changed to har://xxxx, for example:

    har://hdfs-nameservice1/warehouse/dwd_yunc_transaction/dwd_posdm_center__posdm_invoice_item/sdt=20200105/data.har

2. distcp only copies the data over; it does not change the Hive metadata, so the partition location still points to hdfs://xxxx, for example:

    hdfs://nameservice1/warehouse/dwd_yunc_transaction/dwd_posdm_center__posdm_invoice_item/sdt=20200105
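
A quick way to see which form of location the metastore currently holds for a partition is DESCRIBE FORMATTED with a partition spec. This is only a sketch: it assumes the database and table names match the directory names in the example paths above.

```shell
# Print the partition location recorded in the metastore;
# database/table/partition names are taken from the example paths above.
hive -e "DESCRIBE FORMATTED dwd_yunc_transaction.dwd_posdm_center__posdm_invoice_item PARTITION (sdt='20200105');" \
    | grep -i 'location'
```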
    

## Solution

### Fix Approach

1. Tables with a single partition column and tables with two partition columns need different fixes. The reason is that for a two-partition-column table, msck cannot generate the partition metadata after the har archives have been copied over, so the metadata cannot be fixed by editing it directly.

2. Fix for a single partition column: directly edit the metadata and change the partition location to the correct value. How do we get the metadata that corresponds to a partition? As follows:

```sql

-- First get the DB_ID; replace db_name with the database name
select DB_ID from metastore.DBS where name='db_name';

-- Then use the DB_ID to get the TBL_ID; replace table_name with the table name
SELECT TBL_ID FROM TBLS WHERE DB_ID=537085 AND TBL_NAME='table_name';

-- Then use the TBL_ID to get the partition's SD_ID; replace partition_name with the partition name, e.g. sdt=20171030
SELECT SD_ID FROM PARTITIONS WHERE TBL_ID=8410308 AND PART_NAME='partition_name';

-- Finally, use the SD_ID to update the partition's location
UPDATE metastore.SDS SET LOCATION='har://hdfs-nameservice1/warehouse/dm_yhsh_yunc_transaction/dm_biz_yhsh_manage/sdt=20171123/data.har' WHERE SD_ID=xxx;
```
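
A hedged alternative to updating metastore.SDS directly: once the partition exists in the metastore (e.g. after msck repair), the same change can be expressed in HiveQL with ALTER TABLE ... SET LOCATION. This is only a sketch using the example table and partition from the UPDATE above (assuming the database name matches the directory name in the path); it is not the method used in the scripts below.

```shell
# Re-point the partition at the har archive via HiveQL instead of a direct
# metastore UPDATE; the table and partition come from the example above.
hive -e "ALTER TABLE dm_yhsh_yunc_transaction.dm_biz_yhsh_manage PARTITION (sdt='20171123')
         SET LOCATION 'har://hdfs-nameservice1/warehouse/dm_yhsh_yunc_transaction/dm_biz_yhsh_manage/sdt=20171123/data.har';"
```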

<a name="LmMjq"></a>
### Tables with a Single Partition Column

1. The fix uses the scripts listed below, run from top to bottom (an overview of the run order follows the list). Back up the Hive metastore before fixing anything, and replace the HDFS paths in the scripts with your own. The scripts will be polished later.
```
fix_table_list.txt : the list of tables to fix, one db_name.table_name per line, e.g. st_yunc_transaction.rpt_day_shop_goods_wxs_yhdj
generate_tbl_id.sh : queries the metastore and writes db_name.DB_ID.table_name.TBL_ID lines to fix_table_list_tbl_id.txt
generate_and_check_partition.sh : uses the Solr search tool (HdfsFindTool) to quickly find the partitions in HDFS that need fixing, saved to db_name/table_name.txt
generate_update_sql.sh : generates the MySQL statements that fix the location
fix_msck.sh : repairs the partitions (msck repair)
exec_update_sql.sh : takes a database name and applies the metadata fix at database granularity
```
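
Putting it together, the intended run order looks roughly like this. This is a sketch: it assumes the scripts live in the current directory, and uses st_yunc_transaction (the example database from the list below) as the argument to exec_update_sql.sh.

```shell
# Run the fix scripts top to bottom, as listed above.
./generate_tbl_id.sh
./generate_and_check_partition.sh
./generate_update_sql.sh
./fix_msck.sh
./exec_update_sql.sh st_yunc_transaction
```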
2. fix_table_list.txt

    st_yunc_transaction.st_whole_after_sale_order_detail_all_wtd_di
    st_yunc_transaction.st_yhdj_bravo_mini_memmber_heatmap_1d_di
    st_yunc_transaction.st_yhdj_bravo_mini_memmber_heatmap_1w_di
    st_yunc_transaction.st_yhdj_bravo_mini_memmber_heatmap_mtd_di
    st_yunc_transaction.st_yunda_hxjs_goods_sale_di
    st_yunc_transaction.st_yunda_hxjs_shop_sale_di
    
3. generate_tbl_id.sh

```shell
#!/bin/bash

for fix_list in `cat fix_table_list.txt`; do
    db_name=`echo $fix_list | awk -F . '{print $1}'`
    table_name=`echo $fix_list | awk -F . '{print $2}'`

    # Get the DB_ID
    db_id=`mysql -u root --password='xxx' -s -e "select DB_ID from metastore.DBS where name='${db_name}';"`

    # Get the TBL_ID
    tbl_id=`mysql -u root --password='xxx' -s -e "SELECT TBL_ID FROM metastore.TBLS WHERE DB_ID=${db_id} AND TBL_NAME='${table_name}';"`

    echo "${db_name}.${db_id}.${table_name}.${tbl_id}" >> fix_table_list_tbl_id.txt

done
```
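
The two metastore lookups above can also be collapsed into a single join, halving the number of MySQL round-trips per table. This is only a sketch against the standard metastore schema (the same DBS/TBLS tables and the ${db_name}/${table_name} variables used in the loop above).

```shell
# Fetch DB_ID and TBL_ID in one query instead of two.
ids=`mysql -u root --password='xxx' -s -e \
    "SELECT CONCAT(d.DB_ID, ' ', t.TBL_ID) FROM metastore.DBS d JOIN metastore.TBLS t ON t.DB_ID = d.DB_ID WHERE d.NAME = '${db_name}' AND t.TBL_NAME = '${table_name}';"`
db_id=`echo $ids | awk '{print $1}'`
tbl_id=`echo $ids | awk '{print $2}'`
```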


4. generate_and_check_partition.sh
```shell
#!/bin/bash


for tbl_list in `cat fix_table_list_tbl_id.txt` ;do
    db_name=`echo $tbl_list | awk -F . '{print $1}'`
    table_name=`echo $tbl_list | awk -F . '{print $3}'`

    # Make sure the per-database output directory exists before appending to it
    mkdir -p ./${db_name}
    sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr.jar org.apache.solr.hadoop.HdfsFindTool -find /warehouse/$db_name/$table_name  -name '*.har' >> ./${db_name}/${table_name}.txt

    sleep 2
done
```
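
If the CDH search-mr HdfsFindTool jar is not available, a plain recursive listing can serve as a hedged drop-in for the hadoop jar line inside the loop above (slower on large trees). Passing the full hdfs:// URI keeps the output in the same form as HdfsFindTool's, which the later awk -F / field index relies on.

```shell
# Alternative to HdfsFindTool: recursive listing filtered for data.har paths.
sudo -u hdfs hdfs dfs -ls -R hdfs://nameservice1/warehouse/${db_name}/${table_name} \
    | awk '{print $NF}' | grep 'data\.har$' >> ./${db_name}/${table_name}.txt
```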
5. generate_update_sql.sh

```shell
#!/bin/bash

for tbl_list in `cat fix_table_list_tbl_id_bak.txt` ;do
    db_name=`echo $tbl_list | awk -F . '{print $1}'`
    table_name=`echo $tbl_list | awk -F . '{print $3}'`
    tbl_id=`echo $tbl_list | awk -F . '{print $4}'`

    for table_path in `cat ./${db_name}/${table_name}.txt`; do
        partition_name=`echo $table_path | awk -F / '{print $7}'`
        echo "UPDATE metastore.SDS SET LOCATION='har://hdfs-nameservice1/warehouse/${db_name}/${table_name}/${partition_name}/data.har' WHERE SD_ID=(SELECT SD_ID FROM PARTITIONS WHERE TBL_ID=${tbl_id} AND PART_NAME='${partition_name}');" >> ./${db_name}/${table_name}.sql
    done

done
```

6. fix_msck.sh
```shell
#!/bin/bash

for tbl_list in `cat fix_table_list_tbl_id.txt` ;do
    db_name=`echo $tbl_list | awk -F . '{print $1}'`
    table_name=`echo $tbl_list | awk -F . '{print $3}'`

    echo "msck repair table ${db_name}.${table_name};" >> ./msck_table.hql
done

beeline -u jdbc:hive2://10.1.53.214:10000/default -n xxx -p xxx -f msck_table.hql --force=true &>> log/msck.log
```
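
fix_msck.sh runs before exec_update_sql.sh because msck repair is what creates the partition entries in the metastore for the directories distcp copied over; the generated UPDATE statements then re-point those entries at the har archives. A quick hedged check that the partitions were picked up (same beeline connection as above, table taken from fix_table_list.txt):

```shell
# Confirm msck created the partitions before applying the UPDATE statements.
beeline -u jdbc:hive2://10.1.53.214:10000/default -n xxx -p xxx \
    -e "show partitions st_yunc_transaction.st_yunda_hxjs_goods_sale_di;"
```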
7. exec_update_sql.sh

```shell
#!/bin/bash

db_name=$1

cur_path="/root/tianbo/script"

if [ $# -lt 1 ]; then
    echo "Please pass in a database name"
    exit 255
fi

sql_list=`ls -l $cur_path/$db_name/*.sql | awk '{print $9}'`

for sql_name in $sql_list; do
    mysql -u root -pxxx metastore < $sql_name
    echo $sql_name
    sleep 5
done
```

<a name="WTrzh"></a>
### 
<a name="Ef1wH"></a>
### Tables with Two Partition Columns

1. Because msck cannot regenerate the partition metadata for tables with two partition columns, the only option is to restore the archived files back into regular files. The paths need to be changed to your own. The script is as follows:
```shell
#!/bin/bash

db_name=$1
table_name=$2

for table_path in `cat ./${db_name}/${table_name}.txt`;do
    partition_name=`echo $table_path | awk -F / '{print $7}'`
    sudo -u hdfs  hdfs dfs -cp har:///warehouse/${db_name}/${table_name}/${partition_name}/data.har/* hdfs:///warehouse/${db_name}/${table_name}/${partition_name}/
    sleep 5
done
```
2. Finally, just run msck repair, roughly as sketched below.
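
A hedged sketch of that last step, reusing the same $db_name and $table_name arguments the restore script above takes, plus a count check mirroring the original symptom:

```shell
# Regenerate the partition metadata for the restored files, then verify that
# the data is visible again (count should no longer be 0).
hive -e "msck repair table ${db_name}.${table_name};"
hive -e "select count(*) from ${db_name}.${table_name};"
```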