建库建表、加载数据 :
建库建表:
—如果数据库已存在就删除 drop database if exists db_msg cascade ;
—创建数据库 create database db_msg ;
—切换数据库 use db_msg ;
—列举数据库 show databases ;
--如果表已存在就删除drop table if exists db_msg.tb_msg_source ;--建表create table db_msg.tb_msg_source(msg_time string comment "消息发送时间", sender_name string comment "发送人昵称", sender_account string comment "发送人账号", sender_sex string comment "发送人性别", sender_ip string comment "发送人ip地址", sender_os string comment "发送人操作系统", sender_phonetype string comment "发送人手机型号", sender_network string comment "发送人网络类型", sender_gps string comment "发送人的GPS定位", receiver_name string comment "接收人昵称", receiver_ip string comment "接收人IP", receiver_account string comment "接收人账号", receiver_os string comment "接收人操作系统", receiver_phonetype string comment "接收人手机型号", receiver_network string comment "接收人网络类型", receiver_gps string comment "接收人的GPS定位", receiver_sex string comment "接收人性别", msg_type string comment "消息类型", distance string comment "双方距离", message string comment "消息内容")--指定分隔符为制表符row format delimited fields terminated by '\t' ;
加载数据 :
(1)HDFS上创建目录hdfs dfs -mkdir -p /momo/data
(2)上传到HDFS hdfs dfs -put /export/data/data1.tsv /momo/data/hdfs dfs -put /export/data/data2.tsv /momo/data/
(3) 加载到Hive表中 load data inpath '/momo/data/data1.tsv' into table db_msg.tb_msg_source;load data inpath '/momo/data/data2.tsv' into table db_msg.tb_msg_source;
(4)验证结果 select msg_time,sender_name,sender_ip,sender_phonetype,receiver_name,receiver_network from tb_msg_source limit 10;
