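- Task: define a UDF that takes a datetime and returns the time of day it falls in: 2:00-5:00 is 凌晨 (early morning), 5:00-12:00 is 上午 (morning), 12:00-14:00 is 中午 (noon), 14:00-17:00 is 下午 (afternoon), 17:00-19:00 is 傍晚 (dusk), 19:00-23:00 is 晚上 (evening), and 23:00-2:00 is 深夜 (late night).

Create a file named date.csv locally and write the following data into it, one datetime per line:

```
2021-12-12 12:00:00
2021-12-22 13:03:00
2021-12-23 19:37:29
```

In Hive, create a table and load the data from date.csv into it:

```sql
use test;
create table datetime(datetime string);
load data local inpath '/home/hadoop/date.csv' into table datetime;
```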

<a name="pBj2l"></a>
## 1. Set up the Java environment for Hive
1. Create a new Java project named "hive_test".
2. Add the required dependency jars to the project:
   - all jars in the `lib` directory of the Hive installation
   - all jars under the following paths of the Hadoop installation:
     - share/hadoop/common
     - share/hadoop/common/lib
     - share/hadoop/hdfs
     - share/hadoop/hdfs/lib

   Dependency bundle: [依赖包.rar](https://www.yuque.com/attachments/yuque/0/2021/rar/22283287/1638940549854-80cfc949-f452-4380-b425-4b3c215650f3.rar?_lake_card=%7B%22src%22%3A%22https%3A%2F%2Fwww.yuque.com%2Fattachments%2Fyuque%2F0%2F2021%2Frar%2F22283287%2F1638940549854-80cfc949-f452-4380-b425-4b3c215650f3.rar%22%2C%22name%22%3A%22%E4%BE%9D%E8%B5%96%E5%8C%85.rar%22%2C%22size%22%3A253878896%2C%22type%22%3A%22%22%2C%22ext%22%3A%22rar%22%2C%22status%22%3A%22done%22%2C%22taskId%22%3A%22ud381073a-2589-452a-bbbd-b38a4343ed3%22%2C%22taskType%22%3A%22upload%22%2C%22id%22%3A%22u2e61819f%22%2C%22card%22%3A%22file%22%7D)
3. Create a new class named "dateUDF".
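<a name="bBUdz"></a>
## 2. Write the custom function
```java
package hive_test;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import org.apache.hadoop.hive.ql.exec.UDF;

public class dateUDF extends UDF {
    // Parses `datetime` with the SimpleDateFormat pattern `dateformat`
    // and maps its hour of day to a time-of-day label.
    public String evaluate(String datetime, String dateformat) throws ParseException {
        // NULL column values arrive as null; return NULL instead of failing.
        if (datetime == null || dateformat == null) return null;
        SimpleDateFormat sd = new SimpleDateFormat(dateformat);
        Date date = sd.parse(datetime);
        // Date.getHours() is deprecated; read the hour (0-23) through Calendar.
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        if (hour >= 2 && hour < 5) return "凌晨";        // early morning
        else if (hour >= 5 && hour < 12) return "上午";  // morning
        else if (hour >= 12 && hour < 14) return "中午"; // noon
        else if (hour >= 14 && hour < 17) return "下午"; // afternoon
        else if (hour >= 17 && hour < 19) return "傍晚"; // dusk
        else if (hour >= 19 && hour < 23) return "晚上"; // evening
        else return "深夜";                               // late night, 23:00-2:00
    }
}
```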

Once the class is written, export it as a jar named "dateudf.jar".
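Before deploying the jar, the hour boundaries can be sanity-checked locally without any Hive dependency. The following is a minimal sketch, not part of the original project: `PeriodCheck` and `periodOf` are hypothetical names, and it uses `java.time` instead of `SimpleDateFormat`, assuming the pattern `yyyy-MM-dd HH:mm:ss` used later in this note.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class PeriodCheck {
    // Same hour boundaries as dateUDF.evaluate, minus the Hive dependency.
    // PeriodCheck and periodOf are hypothetical names used only for this sketch.
    static String periodOf(String datetime, String pattern) {
        LocalDateTime t = LocalDateTime.parse(datetime, DateTimeFormatter.ofPattern(pattern));
        int hour = t.getHour(); // 0-23
        if (hour >= 2 && hour < 5) return "凌晨";
        if (hour >= 5 && hour < 12) return "上午";
        if (hour >= 12 && hour < 14) return "中午";
        if (hour >= 14 && hour < 17) return "下午";
        if (hour >= 17 && hour < 19) return "傍晚";
        if (hour >= 19 && hour < 23) return "晚上";
        return "深夜"; // 23:00-2:00 wraps past midnight
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss";
        String[] rows = {
            "2021-12-12 12:00:00",
            "2021-12-22 13:03:00",
            "2021-12-23 19:37:29"
        };
        // Print each sample row from date.csv next to its label.
        for (String row : rows) {
            System.out.println(row + "\t" + periodOf(row, pattern));
        }
    }
}
```

For the three rows in date.csv this gives 中午, 中午 and 晚上, which is what the Hive query should return as well once the function is registered.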

## 3. Create the custom function in Hive

Create a temporary function:

1. Upload the custom UDF jar to the Linux machine.
2. In the Hive CLI, run: `add jar /home/hadoop/dateudf.jar;`
3. In the Hive CLI, run: `create temporary function datetotime as 'hive_test.dateUDF';`

Query with the newly created function:

    select datetime, datetotime(datetime, 'yyyy-MM-dd HH:mm:ss') from datetime;
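The sample rows all match the pattern, but `evaluate` is declared with `throws ParseException`, so a row that does not match it would probably fail the whole query. One possible alternative is to return NULL for such rows. The sketch below shows that idea in plain Java under the same assumptions as the rest of this note; `SafePeriod` and `periodOrNull` are hypothetical names, not part of the original UDF.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class SafePeriod {
    // Hypothetical variant: yields null for null or unparsable input
    // instead of propagating an exception.
    static String periodOrNull(String datetime, String pattern) {
        if (datetime == null || pattern == null) return null;
        int hour;
        try {
            hour = LocalDateTime.parse(datetime, DateTimeFormatter.ofPattern(pattern)).getHour();
        } catch (IllegalArgumentException | DateTimeParseException e) {
            return null; // invalid pattern, or a value that does not match it
        }
        if (hour >= 2 && hour < 5) return "凌晨";
        if (hour >= 5 && hour < 12) return "上午";
        if (hour >= 12 && hour < 14) return "中午";
        if (hour >= 14 && hour < 17) return "下午";
        if (hour >= 17 && hour < 19) return "傍晚";
        if (hour >= 19 && hour < 23) return "晚上";
        return "深夜";
    }
}
```

Inside a UDF, the same try/catch around the parsing step would let well-formed rows through while malformed ones show up as NULL in the result.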

Task: how do you create a permanent function?