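- Task: define a UDF that takes a date-time string and returns the period of the day it falls in: 02:00-05:00 is early morning (凌晨), 05:00-12:00 is morning (上午), 12:00-14:00 is noon (中午), 14:00-17:00 is afternoon (下午), 17:00-19:00 is dusk (傍晚), 19:00-23:00 is evening (晚上), and 23:00-02:00 is late night (深夜). Each range includes its start hour and excludes its end hour.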
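
Create a local file `date.csv` containing the following data, one timestamp per line:

```
2021-12-12 12:00:00
2021-12-22 13:03:00
2021-12-23 19:37:29
```

In Hive, create a table and load `date.csv` into it (note the straight single quotes around the path; typographic quotes are not valid here):

```plsql
use test;
create table datetime(datetime string);
load data local inpath '/home/hadoop/date.csv' into table datetime;
```
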
## 1. Set up the Java environment for Hive
1. Create a new Java project named "hive_test".
1. Add the required dependency JARs to the project:
   - all JARs in the `lib` directory of the Hive installation
   - all JARs under the following paths of the Hadoop installation:
     - `share/hadoop/common`
     - `share/hadoop/common/lib`
     - `share/hadoop/hdfs`
     - `share/hadoop/hdfs/lib`

   Dependency archive: [依赖包.rar](https://www.yuque.com/attachments/yuque/0/2021/rar/22283287/1638940549854-80cfc949-f452-4380-b425-4b3c215650f3.rar)
1. Create a new class named "dateUDF".

## 2. Write the custom function
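```java
package hive_test;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import org.apache.hadoop.hive.ql.exec.UDF;

// Maps a date-time string to the named period of the day it falls in.
public class dateUDF extends UDF {
    // datetime: the timestamp text; dateformat: its SimpleDateFormat pattern.
    public String evaluate(String datetime, String dateformat) throws ParseException {
        if (datetime == null || dateformat == null) return null;
        SimpleDateFormat sd = new SimpleDateFormat(dateformat);
        Date date = sd.parse(datetime);
        // Date.getHours() is deprecated; read the hour (0-23) through Calendar.
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        if (hour >= 2 && hour < 5) return "凌晨";        // early morning
        else if (hour >= 5 && hour < 12) return "上午";  // morning
        else if (hour >= 12 && hour < 14) return "中午"; // noon
        else if (hour >= 14 && hour < 17) return "下午"; // afternoon
        else if (hour >= 17 && hour < 19) return "傍晚"; // dusk
        else if (hour >= 19 && hour < 23) return "晚上"; // evening
        else return "深夜";                              // late night, 23:00-02:00
    }
}
```
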
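
The hour-to-period mapping used by the UDF can be checked locally before packaging the JAR. The following is a minimal sketch, assuming a plain JDK 8+ with no Hive dependency; the class `PeriodOfDaySketch` and its `periodOf` methods are hypothetical names introduced only for illustration, and `java.time` is used here in place of `SimpleDateFormat`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the original project: the same bucketing
// rules as the UDF, without the Hive base class, so it runs on a plain JDK.
class PeriodOfDaySketch {

    // Maps an hour of the day (0-23) to the period names used in the task.
    static String periodOf(int hour) {
        if (hour >= 2 && hour < 5) return "凌晨";   // early morning
        if (hour >= 5 && hour < 12) return "上午";  // morning
        if (hour >= 12 && hour < 14) return "中午"; // noon
        if (hour >= 14 && hour < 17) return "下午"; // afternoon
        if (hour >= 17 && hour < 19) return "傍晚"; // dusk
        if (hour >= 19 && hour < 23) return "晚上"; // evening
        return "深夜";                              // late night, 23:00-02:00
    }

    // Parses the timestamp with the given pattern and classifies its hour.
    static String periodOf(String datetime, String pattern) {
        LocalDateTime parsed =
                LocalDateTime.parse(datetime, DateTimeFormatter.ofPattern(pattern));
        return periodOf(parsed.getHour());
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss";
        String[] samples = {
            "2021-12-12 12:00:00", "2021-12-22 13:03:00", "2021-12-23 19:37:29"
        };
        // Prints one "timestamp -> period" line per sample row of date.csv.
        for (String s : samples) {
            System.out.println(s + " -> " + periodOf(s, pattern));
        }
    }
}
```

For the three sample rows in `date.csv`, this classifies the first two as 中午 (noon) and the third as 晚上 (evening). The late-night bucket is the fall-through case because it wraps around midnight (hours 23, 0, and 1).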
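## 3. Create the custom function in Hive

Create a temporary function:

1. Export the project as a JAR and upload it to the Linux machine.
2. In the Hive CLI, run: `add jar /home/hadoop/dateudf.jar;`
3. In the Hive CLI, run: `create temporary function datetotime as 'hive_test.dateUDF';`

Query with the new function:

`select datetime, datetotime(datetime, 'yyyy-MM-dd HH:mm:ss') from datetime;`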
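
The query calls the function once per row, and parsing throws on text that does not match the pattern, so a blank or malformed row in `date.csv` would likely fail the query. One possible way to tolerate such rows is sketched below, assuming JDK 8+; `SafeHourSketch` and `hourOrMinusOne` are hypothetical names, and returning `-1` for unparseable input is an illustrative choice rather than something the original UDF does.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the original project: extracts the hour
// from a timestamp string without throwing on bad input.
class SafeHourSketch {

    // Returns the hour (0-23), or -1 when the input is null or does not
    // match the given pattern.
    static int hourOrMinusOne(String datetime, String pattern) {
        if (datetime == null) return -1;
        try {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
            // trim() drops stray leading or trailing whitespace from the row.
            return LocalDateTime.parse(datetime.trim(), formatter).getHour();
        } catch (DateTimeParseException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss";
        System.out.println(hourOrMinusOne("2021-12-23 19:37:29", pattern)); // prints 19
        System.out.println(hourOrMinusOne("", pattern));                    // prints -1
    }
}
```

Inside a UDF, the same idea would amount to returning `null` for a row that cannot be parsed, so the remaining rows are still classified.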