1、问题

条件如下

  • 已知一个流水表中有两个流水类型 【开/关】
  • 【开】类型的流水能连续发生
  • 【关】类型的流水不能连续发生
  • 有【关】类型的流水不能认为一定有开逻辑,必须表里有才算

已知图表如下,使用SQL求发生过打开关闭后又打开过的设备数量,即发生了【开-关-开】的设备数量

flow_id device_id action flow_time
1 A 20180101
2 A 20180102
3 A 20180103
4 B 20180101
5 B 20180102
6 B 20180103
7 B 20180103
8 C 20180102
9 C 20180103
10 C 20180103

A设备发生了 【开-开-关 】不满足要求
B设备发生了 【开-关-开-关 】满足要求
C设备发生了 【关-开-开 】不满足要求

2、解决方案

在解决问题之前先分享一招,我工作中常使用的使用unionwith生成测试表

  1. with flow_table as (
  2. select 1 as flow_id,'A' as device_id,'开' as action,20180101 as flow_time
  3. union all
  4. select 2 as flow_id,'A' as device_id,'开' as action,20180102 as flow_time
  5. union all
  6. select 3 as flow_id,'A' as device_id,'关' as action,20180103 as flow_time
  7. union all
  8. select 4 as flow_id,'B' as device_id,'开' as action,20180101 as flow_time
  9. union all
  10. select 5 as flow_id,'B' as device_id,'关' as action,20180102 as flow_time
  11. union all
  12. select 6 as flow_id,'B' as device_id,'开' as action,20180103 as flow_time
  13. union all
  14. select 7 as flow_id,'B' as device_id,'开' as action,20180103 as flow_time
  15. union all
  16. select 8 as flow_id,'C' as device_id,'关' as action,20180102 as flow_time
  17. union all
  18. select 9 as flow_id,'C' as device_id,'开' as action,20180103 as flow_time
  19. union all
  20. select 10 as flow_id,'C' as device_id,'开' as action,20180103 as flow_time
  21. )
  22. select * from flow_table order by flow_id;

image.png

思路:
只要一个设备的序列中,存在 【开-关】、【关-开】两种类型就说明该设备是【发生过打开,关闭后又打开过】
将action列按照device_id分组,使用lead函数生成action_next列

  1. select
  2. flow_id,
  3. device_id,
  4. action,
  5. lead(action) over(partition by device_id order by flow_time) as action_next,
  6. flow_time
  7. from
  8. flow_table order by flow_id;

image.png
之前分析了只有B满足情况,也只有B有【开-关】、【关-开】两种都有的情况

下一步:
过滤掉action_next IS NULL 和 action = action_next的情况,分组聚合后数据量大于2的就是满足要求的数据

  1. with_action_next as (
  2. select
  3. flow_id,
  4. device_id,
  5. action,
  6. lead(action) over(partition by device_id order by flow_time) as action_next,
  7. flow_time
  8. from
  9. flow_table order by flow_id
  10. )
  11. select device_id,action,action_next from with_action_next
  12. where
  13. action_next is not null
  14. and action <> action_next
  15. group by device_id,action,action_next
  16. ;

image.png

结果完整SQL:

  1. with flow_table as (
  2. select 1 as flow_id,'A' as device_id,'开' as action,20180101 as flow_time
  3. union all
  4. select 2 as flow_id,'A' as device_id,'开' as action,20180102 as flow_time
  5. union all
  6. select 3 as flow_id,'A' as device_id,'关' as action,20180103 as flow_time
  7. union all
  8. select 4 as flow_id,'B' as device_id,'开' as action,20180101 as flow_time
  9. union all
  10. select 5 as flow_id,'B' as device_id,'关' as action,20180102 as flow_time
  11. union all
  12. select 6 as flow_id,'B' as device_id,'开' as action,20180103 as flow_time
  13. union all
  14. select 7 as flow_id,'B' as device_id,'开' as action,20180103 as flow_time
  15. union all
  16. select 8 as flow_id,'C' as device_id,'关' as action,20180102 as flow_time
  17. union all
  18. select 9 as flow_id,'C' as device_id,'开' as action,20180103 as flow_time
  19. union all
  20. select 10 as flow_id,'C' as device_id,'开' as action,20180103 as flow_time
  21. )
  22. , with_action_next as (
  23. select
  24. flow_id,
  25. device_id,
  26. action,
  27. lead(action) over(partition by device_id order by flow_time) as action_next,
  28. flow_time
  29. from
  30. flow_table order by flow_id
  31. )
  32. , result_table as (
  33. select device_id,action,action_next from with_action_next
  34. where
  35. action_next is not null
  36. and action <> action_next
  37. group by device_id,action,action_next
  38. )
  39. select count(*) from (
  40. select
  41. device_id
  42. from result_table
  43. group by device_id having count(*) > 1
  44. ) t
  45. ;

结果:
image.png

搞定

3、总结