Sqoop—MySQL至Hive
参考信息
单表Sqoop脚本
sqoop import --connect jdbc:mysql://172.26.1.234:3306/PointAssSIT --username root --password Pass2017 --table AircraftStandPlaceMonitNodePlaceRelations \
--split-by Id \
--target-dir /user/hive/warehouse/pointass \
--fields-terminated-by "," \
--hive-import \
--create-hive-table \
--hive-table pointass.ods_AircraftStandPlaceMonitNodePlaceRelations
- —split-by: Column of the table used to split work units. Cannot be used with
--autoreset-to-one-mapper
. 用来分片的列 - —target-dir: HDFS destination dir
- —delete-target-dir:Delete the import target directory if it exists。target目录如果存在就删除
- —hive-import:Import tables into Hive (Uses Hive’s default delimiters if none are set.)
- —hive-overwrite:Overwrite existing data in the Hive table. 当然也可以使用这个
- —create-hive-table:If set, then the job will fail if the target hive table exits。如果存在表,job会失败
- —hive-table :Sets the table name to use when importing to Hive.
所有表导入脚本
对shell脚本的使用不熟悉,说明一下。
循环数组语法:
arr=(a b c d e f)
for item in ${arr[@]}
do
echo "print=>$item"
done
函数中使用参数:
function test(){
echo "test function param1=> $1,param2=>$2"
}
test hello world
//打印结果test function param1=>hello,param2=>world
所以最终的脚本如下:
#!/bin/bash
#所有表名
arr=(AircraftStandPlaceMonitNodePlaceRelations AircraftStandPlaces AirLineProcess
AirLineProcessNodeHistorys AlgorithmParameterSettings AlgorithmTargetNodes CameraAlgorithmParameters CameraDevices
CodeCategorys CoordinatePoints CustomAuditLogs CustomNavigations Equipments MonitNodePlaces NotificationMessages
NotificationUsers ProtectionMainNodes ProtectionNodes ProtectionTaskSchemeRelations ProtectionTaskStartLogs ProtectionTasks
SchemeDetails Schemes TaskMachDetails TaskNodeDetails VideoStreamInfos)
function import(){
sqoop import --connect jdbc:mysql://172.26.1.234:3306/PointAssSIT --username root --password Pass2017 --table $1 \
--split-by Id \
--target-dir /dtInsight/hive/warehouse/pointasssit.db/$1 \
--fields-terminated-by "," \
--hive-import \
--create-hive-table \
--hive-table pointasssit.ods_$1
}
for tl_name in ${arr[@]}
do
import $tl_name
done
echo "end import all tables"