Sqoop: MySQL to Hive

References

See the official Sqoop user guide.

A blog post from overseas.

Single-table Sqoop script

  sqoop import --connect jdbc:mysql://172.26.1.234:3306/PointAssSIT --username root --password Pass2017 --table AircraftStandPlaceMonitNodePlaceRelations \
  --split-by Id \
  --target-dir /user/hive/warehouse/pointass \
  --fields-terminated-by "," \
  --hive-import \
  --create-hive-table \
  --hive-table pointass.ods_AircraftStandPlaceMonitNodePlaceRelations
  • --split-by: column of the table used to split work units; cannot be used together with --autoreset-to-one-mapper.
  • --target-dir: HDFS destination directory.
  • --delete-target-dir: delete the import target directory if it already exists.
  • --hive-import: import tables into Hive (uses Hive's default delimiters if none are set).
  • --hive-overwrite: overwrite existing data in the Hive table; an alternative to --create-hive-table for jobs that need to be re-run (see the sketch after this list).
  • --create-hive-table: if set, the job will fail if the target Hive table already exists.
  • --hive-table: sets the table name to use when importing to Hive.
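
Because of --create-hive-table, the script above fails if it is run a second time. A minimal re-runnable variant, as a sketch assuming the same connection details, swaps in --delete-target-dir and --hive-overwrite, and adds --num-mappers to control how many parallel tasks split the work on Id (4 is an arbitrary choice here, tune it to the table size):

  sqoop import --connect jdbc:mysql://172.26.1.234:3306/PointAssSIT --username root --password Pass2017 --table AircraftStandPlaceMonitNodePlaceRelations \
  --split-by Id \
  --num-mappers 4 \
  --delete-target-dir \
  --target-dir /user/hive/warehouse/pointass \
  --fields-terminated-by "," \
  --hive-import \
  --hive-overwrite \
  --hive-table pointass.ods_AircraftStandPlaceMonitNodePlaceRelations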

Script to import all tables

I'm not very familiar with shell scripting, so a couple of notes first.

Looping over an array:

  arr=(a b c d e f)
  for item in "${arr[@]}"
  do
  echo "print=>$item"
  done

Using positional parameters inside a function:

  # note: naming a function "test" shadows the shell builtin of the same name
  function test(){
  echo "test function param1=>$1,param2=>$2"
  }
  test hello world
  # prints: test function param1=>hello,param2=>world

Putting those together, the final script:

  #!/bin/bash
  # names of all tables to import
  arr=(AircraftStandPlaceMonitNodePlaceRelations AircraftStandPlaces AirLineProcess
  AirLineProcessNodeHistorys AlgorithmParameterSettings AlgorithmTargetNodes CameraAlgorithmParameters CameraDevices
  CodeCategorys CoordinatePoints CustomAuditLogs CustomNavigations Equipments MonitNodePlaces NotificationMessages
  NotificationUsers ProtectionMainNodes ProtectionNodes ProtectionTaskSchemeRelations ProtectionTaskStartLogs ProtectionTasks
  SchemeDetails Schemes TaskMachDetails TaskNodeDetails VideoStreamInfos)
  # import one MySQL table ($1) into Hive as pointasssit.ods_<table>
  function import(){
  sqoop import --connect jdbc:mysql://172.26.1.234:3306/PointAssSIT --username root --password Pass2017 --table $1 \
  --split-by Id \
  --target-dir /dtInsight/hive/warehouse/pointasssit.db/$1 \
  --fields-terminated-by "," \
  --hive-import \
  --create-hive-table \
  --hive-table pointasssit.ods_$1
  }
  for tl_name in "${arr[@]}"
  do
  import "$tl_name"
  done
  echo "end import all tables"
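
Save the script under any name (say, import_all.sh) and run it with bash import_all.sh. One caveat: the MySQL password is hardcoded on the command line, where it shows up in the process list and shell history. Sqoop also supports --password-file, which reads the password from a file (on HDFS by default); a minimal sketch, with the HDFS path being an assumption:

  # write the password (no trailing newline) and put it on HDFS; path is an assumption
  echo -n "Pass2017" > sqoop.pwd
  hdfs dfs -put sqoop.pwd /user/root/sqoop.pwd
  hdfs dfs -chmod 400 /user/root/sqoop.pwd
  # then replace "--password Pass2017" in the import function with:
  #   --password-file /user/root/sqoop.pwd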