User_id,subject_id,score
(1)求每个学科下的学生平均分数
(2)求每个学科下最高分数的学生

  1. select subject_id, avg(score) as avg_score
  2. from table_a
  3. group by subject_id
  1. select *
  2. from (
  3. select *,rank()over(partition by subject_id order by score desc) as flag
  4. from table_a
  5. ) as t1
  6. where flag=1
  7. order by subject_id;

*因为窗口函数的执行顺序在where之后,所以窗口函数不能出现在where筛选语句中,必须调用表格,再筛选。

查询每个科目下,前1名同学的成绩和当前同学成绩的差值

  1. select stud_id, lesson_id,score,pre_score,(score-pre_score) as diff
  2. from (
  3. select stud_id,lesson_id,score,lag(score)over(partition by lesson_id order by score) as pre_score
  4. from table_a
  5. ) as t
  6. order by lesson_id, score;

截止到当前成绩,按照日期排序查询第1个和最后1个的同学分数

第一种方法:窗口函数first_value和last_value
头尾函数:FIRST_VALUE(expr),LAST_VALUE(expr)
用途:返回第一个(FIRST_VALUE(expr))或最后一个(LAST_VALUE(expr))expr的值

  1. select stud_id,lesson_id,score,create_time,
  2. first_value(score)over(partition by lesson_id order by create_time) as first_score,
  3. last_value(score)over(partition by lesson_id order by create_time) as last_score
  4. from table_a
  5. order by lesson_id;

第二种方法:找出每门课下满足日期最大、最小的学生,利用union联合两张表

  1. select stud_id,lesson_id,score,create_time
  2. from table_a as a1
  3. where create_time = (
  4. select max(create_time) from table_a as a2 where a1.lesson_id=a2.lesson_id)#相关子查询, 找出每门课的成绩最晚日期
  5. union
  6. select stud_id,lesson_id,score,create_time
  7. from table_a as t1
  8. where create_time=(
  9. select min(create_time) from table_a t2 where t1.lesson_id=t2.lesson_id)#相关子查询, 找出每门课的成绩最早日期

第三种方法不可行(尝试在第二种的基础上不用union,而是在where中用in使得create_time 满足日期最大、或者日期最小)因为max和min生成了2列,而不是1列多结果

  1. select stud_id,lesson_id,score,create_time
  2. from table_a as a1
  3. where create_time in (
  4. select max(create_time), min(create_time)
  5. from table_a as a2
  6. where a1.lesson_id=a2.lesson_id
  7. )
  8. #相关子查询
  9. #找出每门课成绩记录中的最早、最晚日期

case专题-商品订单数据

数据表:
订单表orders,大概字段有(orderid’订单号,’_user_id‘用户编号’, order_pay‘订单金额’ , order_time‘下单时间’,’商品一级类目commodity_level_I’,’商品二级类目commodity_level_2’)
1.求最近7天内每一个一级类目下成交总额排名前3的二级类目:

  1. select *
  2. from (
  3. select commodity_level1,
  4. commodity_level2,
  5. total_money,
  6. rank()over(partition by commodity_level1 order by total_money desc) as flag
  7. from (
  8. select commodity_level1,
  9. commodity_level2,
  10. sum(order_pay) as total_money
  11. from table_a
  12. where datediff(now(),order_time)<=7
  13. group by commodity_level2
  14. ) as t1
  15. ) as t2
  16. where flag<=3
  17. order by commodity_level1, flag;

注:now()=’2020-01-29 09:10:00’
curdate()=’2020-01-29’
curtime()=’09:10:00’
datediff是计算两个日期之前的天数差别
形式为datediff(date1,date2)或者datediff(日期时间1,日期时间2)
差值为前-后
timestampdiff是计算两个时间之间的间隔差别,可以自定义差值以秒/分/小时/日/月/年等等间隔来计算
形式为datediff(minute, 小日期时间,大日期时间)
差值为后-前,
eg:可以用来计算年龄,当前时间减去出生日期

小疑问:group by应该是不能和窗口函数连用,因为group by对每组只返回一个值,而窗口函数是对每组中的每条记录都返回一个值

  1. select act_id,
  2. count(ord_id) as '总订单数',
  3. datediff(now(),min(a.create_time))+1 as '时间间隔',
  4. count(ord_id)/(datediff(now(), min(a.create_time))+1) as '平均每日产生的订单数'
  5. from act_usr a
  6. join ord as o
  7. on a.user_id=o.user_id and a.create_time<=o.create_time
  8. group by act_id;
  9. select date(reg_datetime) as reg_date, count(distinct user_id) as new_reg_amt
  10. from (
  11. select user_id, min(log_time) as reg_datetime
  12. from tracking_log
  13. group by user_id
  14. ) as t1
  15. group by date(reg_datetime)
  16. order by date(reg_datetime);
  17. craete table reg_log as
  18. select user_id,date(min(log_time)) as reg_date
  19. from tracking_log
  20. group by user_id;
  21. select reg_date,
  22. datediff(date(log_time),r.reg_date) as '时间间隔',
  23. count(distinct t.user_id) as '留存数',
  24. count(distinct t.user_id)/(select count(*) from reg_log as r1 where r.reg_date=r1.reg_date) as '留存率'
  25. from reg_log as r join tracking_log as t
  26. on r.user_id=t.user_id and datediff(date(log_time),r.reg_date) in (1,29)
  27. group by reg_date,datediff(date(log_time),r.reg_date)
  28. select reg_date,'时间间隔','留存数'/ '新增访客数' as '留存率'
  29. from (
  30. select reg_date,count(*) as '新增访客数'
  31. from reg_log
  32. group by reg
  33. ) as t1
  34. join (
  35. select reg_date,
  36. datediff(date(t.log_time),r.reg_date) as '时间间隔',
  37. count(distinct t.user_id) as '留存数'
  38. from reg_log as r
  39. join tracking_log t
  40. on r.user_id=t.user_id and datediff(date(t.log_time),r.reg_date) in (1,29)
  41. group by reg, datediff(date(t.log_time),r.reg_date)
  42. ) as t2
  43. on t1.reg_date=t2.reg_date
  44. order by reg_date'时间间隔'

case专题-学生成绩分析

查询每一科目成绩最高和最低分数的学生,输出courseid,studentid,score

  1. select *
  2. from enrollments as t1
  3. where grade=(select max(grade) from enrollments as t2 where t1.course=t2.course)
  4. union
  5. select *
  6. from enrollments as a1
  7. where grade=(select min(grade) from enrollments as a2 where a1.course=a2.course)

case专题-学生做题情况分析

表t:做题日期(time),学生id(stuid),_题目id(exer_id)
统计10.1-10.10每天做新题的人的数量,重点在每天
思路:
1)找到每个学生每道题的第一次做题时间,即找出min(time)eg:用户登记表,最小登陆日期作为新注册表
2)作为临时表,查找做新题日期数=10
3)筛选出10内每天做新题的学生后,计算学生的人数

  1. select count(*) cnt
  2. from (
  3. select stu_id, count(distinct first_time) as date_cnt
  4. from (
  5. select stu_id,exer_id,min(time) as first_time
  6. from exer_log
  7. where time between '2020-10-01' and '2020-10-10'
  8. group by stu_id,exer_id
  9. ) as t1
  10. group by stu_id
  11. having date_cnt=10
  12. ) as t2