题目1:每日留存率(一个user_register表,另一个是user_active表)统计指定日期新客数及这些新客的次日留存、七日留存
步骤:1.算每日新增客数2. 每日留存数量3.留存率
SELECT register_date, interval, active_num/new_register_num as retention rate
FROM (SELECT register_date, count(*) as new_register_num
FROM user_register
WHERE register_date ='2020-09-01'
) as t1
JOIN (SELECT register_date,datediff(active_date,register_date) as interval ,count(*) as active_num
FROM user_register
JOIN user_active
ON user_register.user_id=user_active.user_id
WHERE register_date = '2020-09-01' and datediff(active_date,register_date)<=7
GROUP BY active_date
ORDER BY active_date
)as t2
ON t1.register_date=t2.register_date
ORDER BY interval
题目2: 计算用户留存率,留存率=(昨天活跃用户∩今天活跃用户)/昨天活跃用户
用户活跃表:dws_user_active_t
ds bigint comment ‘日期,分区,yyyymmdd’,
device int comment ‘设备类型’,
user_id bigint comment ‘用户id’,
active_score double comment ‘活跃度’
计算留存率=(昨天活跃用户∩今天活跃用户)/昨天活跃用户
步骤:
1.昨天活跃用户2.昨天活跃用户∩今天活跃用户3./
select t1.ds,'第二天仍活跃用户'/'昨天活跃用户' as '留存率'
from (
select ds,count(distinct user_id) as '昨天活跃用户'
from dws_user_active_t
where ds = '2020-12-11' and active_score > 0
) t1
join (
select d1.ds, count(distinct d2.user_id) as '第二天仍活跃用户'
from dws_user_active_t d1
join dws_user_active_t d2
on datediff(d2.ds,d1.ds)=1 and d1.user_id = d2.user_id
where d1.ds = '2020-12-11' and d1.active_score > 0 and d2.active_score > 0
) t2
on t1.ds = t2.ds
优化-相关子查询
select ds,
count(distinct user_id) as '活跃用户',
(
select count(distinct t2.user_id) as '第二天仍活跃用户'
from dws_user_active_t t2
where datediff(ds,t.ds)=1 and t1.active_score > 0 and t2.active_score > 0
),
'第二天仍活跃用户'/'昨天活跃用户' as '留存率'
from dws_user_active_t t1
where ds = '2020-12-11' and active_score > 0;