ROWS BETWEEN :windows子句
- UNBOUNDED PRECEDING:起点 往前所有行
- UNBOUNDED FOLLOWING:终点 往后所有行
- CURRENT ROW:当前行
- 3 PRECEDING:往前3行
- 1 FOLLOWING:往后3行
```sql
SELECT
cookieid,
createtime,
pv,
— 默认为从起点到当前行sum相加
SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime) AS pv1,
—从起点到当前行,结果同pv1
SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS pv2,
—分组内所有行
SUM(pv) OVER(PARTITION BY cookieid) AS pv3,
—当前行+往前3行 SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS pv4,
—当前行+往前3行+往后1行 SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) AS pv5,
—当前行+往后所有行 SUM(pv) OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS pv6
FROM lxw1234; —SUM(1) OVER(PARTITION BY NULL) AS rn12, —分组内总行数
cookieid createtime pv pv1 pv2 pv3 pv4 pv5 pv6
cookie1 2015-04-10 1 1 1 26 1 6 26 cookie1 2015-04-11 5 6 6 26 6 13 25 cookie1 2015-04-12 7 13 13 26 13 16 20 cookie1 2015-04-13 3 16 16 26 16 18 13 cookie1 2015-04-14 2 18 18 26 17 21 10 cookie1 2015-04-15 4 22 22 26 16 20 8 cookie1 2015-04-16 4 26 26 26 13 13 4
<a name="PspWQ"></a>### rank、dense_rank、row_number- 这几个函数不支持windws子句```sqlSELECTcookieid,createtime,pv,RANK() OVER(PARTITION BY cookieid ORDER BY pv desc) AS rn1,DENSE_RANK() OVER(PARTITION BY cookieid ORDER BY pv desc) AS rn2,ROW_NUMBER() OVER(PARTITION BY cookieid ORDER BY pv DESC) AS rn3FROM lxw1234WHERE cookieid = 'cookie1';cookieid day pv rn1 rn2 rn3--------------------------------------------------cookie1 2015-04-12 7 1 1 1cookie1 2015-04-11 5 2 2 2cookie1 2015-04-15 4 3 3 3cookie1 2015-04-16 4 3 3 4cookie1 2015-04-13 3 5 4 5cookie1 2015-04-14 2 6 5 6cookie1 2015-04-10 1 7 6 7
lag、lead、first_value、last_value
- 这几个函数不支持windows子句
- first_value和last_value必须加order by
如果想要分组排序最后一个 用FIRST_VALUE(url) OVER(PARTITION BY cookieid ORDER BY createtime DESC)
lag
lag(字段,往上第几行,默认值 不写为null) ```sql SELECT cookieid, createtime, url, ROW_NUMBER() OVER(PARTITION BY cookieid ORDER BY createtime) AS rn, LAG(createtime,1,’1970-01-01 00:00:00’) OVER(PARTITION BY cookieid ORDER BY createtime) AS last_1_time, LAG(createtime,2) OVER(PARTITION BY cookieid ORDER BY createtime) AS last_2_time FROM lxw1234;
cookieid createtime url rn last_1_time last_2_time
cookie1 2015-04-10 10:00:00 url1 1 1970-01-01 00:00:00 NULL cookie1 2015-04-10 10:00:02 url2 2 2015-04-10 10:00:00 NULL cookie1 2015-04-10 10:03:04 url3 3 2015-04-10 10:00:02 2015-04-10 10:00:00 cookie1 2015-04-10 10:10:00 url4 4 2015-04-10 10:03:04 2015-04-10 10:00:02
<a name="jJ6ip"></a>
#### lead
```sql
阿里面试题:统计累计充值金额最多的top10用户,并计算每个用户比他后一名多充值多少钱。
流水号 用户user_id 日期 充值金额(元)amt
xxxxx01 u_001 2017/1/1 10
xxxxx02 u_001 2017/1/2 150
xxxxx09 u_002 2017/1/1 10
xxxxx10 u_002 2017/1/2 150
select
user_id,
amts,
amts - lead(amts,1,0) over(sort by amts desc)
from
(select user_id,sum(amt) as amts
from t4
group by user_id
order by amts desc
limit 10
) t1
first_value
- 取分组内排序后,截止到当前行,第一个值 ```sql SELECT cookieid, createtime, url, ROW_NUMBER() OVER(PARTITION BY cookieid ORDER BY createtime) AS rn, FIRST_VALUE(url ignore nulls) OVER(PARTITION BY cookieid ORDER BY createtime) AS first1 FROM lxw1234;
cookieid createtime url rn first1
cookie1 2015-04-10 10:00:00 url1 1 url1 cookie1 2015-04-10 10:00:02 url2 2 url1 cookie2 2015-04-10 10:00:00 url11 1 url11 cookie2 2015-04-10 10:00:02 url22 2 url11 cookie2 2015-04-10 10:03:04 1url33 3 url11
<a name="6YY84"></a>
#### last_value
- 取分组内排序后,截止到当前行,最后一个值
```sql
SELECT
cookieid,
createtime,
url,
ROW_NUMBER() OVER(PARTITION BY cookieid ORDER BY createtime) AS rn,
LAST_VALUE(url) OVER(PARTITION BY cookieid ORDER BY createtime) AS last1
FROM lxw1234;
cookieid createtime url rn last1
-----------------------------------------------
cookie1 2015-04-10 10:00:00 url1 1 url1
cookie1 2015-04-10 10:00:02 url2 2 url2
cookie1 2015-04-10 10:03:04 url3 3 url3
// 分组内最后一个值 需要变通一下
FIRST_VALUE(url) OVER(PARTITION BY cookieid ORDER BY createtime DESC) AS last2
cookieid createtime url rn last1
-----------------------------------------------
cookie1 2015-04-10 10:00:00 url1 1 url3
cookie1 2015-04-10 10:00:02 url2 2 url3
cookie1 2015-04-10 10:03:04 url3 3 url3
ntile、cume_dist、percent_rank
- 这几个函数不支持windws子句
ntile
```sql 统计一个cookie,pv数最多的前1/3的天 SELECT cookieid, createtime, pv, NTILE(3) OVER(PARTITION BY cookieid ORDER BY pv DESC) AS rn FROM lxw1234; —rn = 1 的记录,就是我们想要的结果
cookieid day pv rn
cookie1 2015-04-12 7 1 cookie1 2015-04-11 5 1 cookie1 2015-04-15 4 1 cookie1 2015-04-16 4 2 cookie1 2015-04-13 3 2 cookie1 2015-04-14 2 3 cookie1 2015-04-10 1 3 cookie2 2015-04-15 9 1 cookie2 2015-04-16 7 1 cookie2 2015-04-13 6 1 cookie2 2015-04-12 5 2 cookie2 2015-04-14 3 2 cookie2 2015-04-11 3 3 cookie2 2015-04-10 2 3
<a name="S2yHm"></a>
#### cume_dist
```sql
SELECT
dept,
userid,
sal,
CUME_DIST() OVER(ORDER BY sal) AS rn1,
CUME_DIST() OVER(PARTITION BY dept ORDER BY sal) AS rn2
FROM lxw1234;
dept userid sal rn1 rn2
-------------------------------------------
d1 user1 1000 0.2 0.3333333333333333
d1 user2 2000 0.4 0.6666666666666666
d1 user3 3000 0.6 1.0
d2 user4 4000 0.8 0.5
d2 user5 5000 1.0 1.0
rn1: 没有partition,所有数据均为1组,总行数为5,
第一行:小于等于1000的行数为1,因此,1/5=0.2
第三行:小于等于3000的行数为3,因此,3/5=0.6
rn2: 按照部门分组,dpet=d1的行数为3,
第二行:小于等于2000的行数为2,因此,2/3=0.6666666666666666
percent_rank
SELECT
dept,
userid,
sal,
PERCENT_RANK() OVER(ORDER BY sal) AS rn1, --分组内
PERCENT_RANK() OVER(PARTITION BY dept ORDER BY sal) AS rn2,
RANK() OVER(ORDER BY sal) AS rn11, --分组内RANK值
SUM(1) OVER(PARTITION BY NULL) AS rn12 --分组内总行数
FROM lxw1234;
dept userid sal rn1 rn2 rn11 rn12
---------------------------------------------------
d1 user1 1000 0.0 0.0 1 5
d1 user2 2000 0.25 0.5 2 5
d1 user3 3000 0.5 1.0 3 5
d2 user4 4000 0.75 0.0 4 5
d2 user5 5000 1.0 1.0 5 5
rn1: rn1 = (rn11-1) / (rn12-1)
第一行,(1-1)/(5-1)=0/4=0
第二行,(2-1)/(5-1)=1/4=0.25
第四行,(4-1)/(5-1)=3/4=0.75
rn2: 按照dept分组,
dept=d1的总行数为3
第一行,(1-1)/(3-1)=0
第三行,(3-1)/(3-1)=1
