数据采集

埋点数据校验

  • 对事件和属性的正确性、顺序性、完整性及数据类型进行校验
  • 对用户关联情况进行校验
  • 验证 App 与 H5 打通(做了打通的情况下)

对事件和属性的正确性:
是否有数据
内容和格式
对事件和属性的顺序性:
有顺序的行为,上发时间是否符合顺序
对事件和属性的完整性:
需要针对所有场景进行测试
反馈的类型是否是我们需要的
对用户的关联情况进行校验:
不同类型的用户,属性信息是否正常
验证app与H5打通:
为什么打通,因为H5是匿名的,主要是校验用户信息传输正确性

怎么做数据校验

抓包测试
埋点平台sql抽取

宽表测试

  • 多个事件到宽表的过程
  • 表数据是否完全收集
  • 灰度数据验收

多个事件到宽表的过程

  1. 共有属性使用共有字段
  2. 特有属性使用特有字段

表数据是否完全收集

  1. 宽表字段是否包含所有事件属性
  2. 表数据是否符合正确性,顺序性,完整性

灰度数据验收

  1. 新功能测试,要改宽表
  2. 验证不同组数据反馈,是否和灰度验证一致

    指标体系搭建工作(之前已经完成)


进入数据采集阶段之后,我们已经有了宽表,希望同学自己去导入

从宽表中梳理我们需要的表

数据已经完成,我们要从杂乱无章的数据中梳理我们想要的数据
导入备份数据库:
在已有的连接中新建一个数据库:

企业指标体系搭建(三)数据采集 - 图1
在弹出的窗口中填入数据库名称edu_company:
企业指标体系搭建(三)数据采集 - 图2
打开数据库edu_company,右键备份,点击还原备份从,选择课程提供的.nb3备份文件。

企业指标体系搭建(三)数据采集 - 图3

用户拉新

作用:存储留存
原数据:
主要数据源
补充数据源
连接条件
表字段计算
企业指标体系搭建(三)数据采集 - 图4

统计主体:决定我们的统计粒度
分渠道的每日用户数留存情况
每日整体新用户留存情况(使用窗口函数)

字段名 备注
install_date 安装APP日期
downloadchannel 下载APP渠道
u_cnt 安装用户数
retention_u_cnt 次留用户数
rention_pp 次留率=(次留用户数/安装用户数)
retention_u_cnt_7 7日留存用户数
rention_pp_7 7日留存率 =(7日留存用户数/安装用户数)
retention_u_cnt_14 14日留存用户数
rention_pp_14 14日留存率 = 14日留存用户数/安装用户数


u_cnt_total
日安装用户总数(不分渠道,按照日为粒度的统计数据,用窗口函数计算)
retention_u_cnt_total 次日留存新用户总数(不分渠道)


rention_pp_modi
次日新用户整体留存率= 次日新用户整体留存总数/日安装用户总数
(不分渠道)
retention_u_cnt_total_7 7日留存新用户总数(不分渠道)


rention_pp_modi_7
7次日新用户整体留存率= 7次日留存总数/日安装用户总数(不分渠道)
retention_u_cnt_total_14 14日留存新用户总书(不分渠道)


rention_pp_modi_14
14次日新用户整体留存率= 14次日留存总数/日安装用户总数(不分渠道)

导入数据:

SQL怎么看

先看主要数据源from部分
再看补充数据源join部分
再看计算select部分

  1. create table 用户拉新 as select *,
  2. sum(u_cnt) over( PARTITION by install_date) as u_cnt_total, sum(retention_u_cnt) over( PARTITION by install_date) as retention_u_cnt_total, if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi,
  3. sum(retention_u_cnt_7) over( PARTITION by install_date) as retention_u_cnt_total_7,
  4. if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_7) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_7,
  5. sum(retention_u_cnt_14) over( PARTITION by install_date) as retention_u_cnt_total_14,
  6. if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_14) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_14
  7. from(
  8. select install_date,downloadchannel,count(distinct user_id) as u_cnt, COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_u_cnt,
  9. if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as rention_pp,
  10. COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_u_cnt_7,
  11. if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
  12. COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_u_cnt_14,
  13. if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
  14. from(
  15. SELECT distinct T_Install.user_id,
  16. T_Install.downloadchannel, T_Install.date as install_date, T_Start.Date action_date
  17. from (
  18. select * from event_ok where `event` = 'AppInstall'
  19. -- 时间字段
  20. and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
  21. DAY)) ) as T_Install -- 待补日期
  22. left join (
  23. select * from event_ok where `event` = '$AppStart'
  24. -- 时间字段
  25. and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15') as T_Start -- 待补日期
  26. on T_Install.user_id = T_Start.user_id and
  27. ( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Install.date)) as T_Install_Start
  28. group by Install_date,downloadchannel) as add_date_sta;
  1. 主要数据源
  2. 补充数据源
  3. 连接条件

由于字段较多,我们不使用update,而是使用delete
注意

表名规则

  1. 指名层级 公司+数据层级

ec_(ods/mds/wt)

  1. 指名用途

retention

  1. 指名内容

new_user_statistic_day

完整表名:
ec_mds_retention_new_user_statistic_day

如果我们要做写入语句,我们则需要这么改:

  1. DELETE FROM 用户拉新 where install_date in (DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14 DAY));
  2. insert 用户拉新
  3. select *,
  4. sum(u_cnt) over( PARTITION by install_date) as u_cnt_total,
  5. sum(retention_u_cnt) over( PARTITION by install_date) as retention_u_cnt_total, if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi,
  6. sum(retention_u_cnt_7) over( PARTITION by install_date) as retention_u_cnt_total_7,
  7. if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_7) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_7,
  8. sum(retention_u_cnt_14) over( PARTITION by install_date) as retention_u_cnt_total_14,
  9. if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_14) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_14
  10. from(
  11. select install_date,downloadchannel,count(distinct user_id) as u_cnt, COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_u_cnt,
  12. if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as rention_pp,
  13. COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_u_cnt_7,
  14. if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
  15. COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_u_cnt_14,
  16. if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
  17. from(
  18. SELECT distinct T_Install.user_id,
  19. T_Install.downloadchannel, T_Install.date as install_date, T_Start.Date action_date
  20. from (
  21. select * from event_ok where `event` = 'AppInstall'
  22. -- 时间字段
  23. and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
  24. DAY)) ) as T_Install -- 待补日期
  25. left join (
  26. select * from event_ok where `event` = '$AppStart'
  27. -- 时间字段
  28. and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15') as T_Start -- 待补日期
  29. on T_Install.user_id = T_Start.user_id and
  30. ( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Install.date or
  31. DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Install.date)) as T_Install_Start
  32. group by Install_date,downloadchannel) as add_date_sta;

用户活跃

用户活跃留存
企业指标体系搭建(三)数据采集 - 图5

字段 备注
install_date 活跃日期
dau 日活用户数
first_dau 第一次启动APP用户数
first_PP 第一次启动APP比率 = 第一次启动APP用户数/日活用户数
retention_dau_1 日活次日留存用户数
retention_pp_1 日活次日留存率=日活次日留存用户数/日活用户数
retention_dau_7 7日活次日留存用户数
rention_pp_7 7日活次日留存率=日活次日留存用户数/日活用户数
retention_dau_14 14日活次日留存用户数
rention_pp_14 14日活次日留存率=日活次日留存用户数/日活用户数

导入数据:

  1. create table 用户活跃_用户活跃留存 as
  2. select install_date,count(distinct user_id) as dau,
  3. count(distinct case when first_flag = 1 then user_id else null end) as first_dau,
  4. if(count(distinct user_id) =0,0,count(distinct case when first_flag = 1 then user_id else null end)/count(distinct user_id)) as first_PP,
  5. COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_dau_1,
  6. if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as retention_pp_1,
  7. COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_dau_7,
  8. if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
  9. COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_dau_14,
  10. if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
  11. from(
  12. SELECT distinct T_Action.user_id,T_Action.date as install_date, T_Action.$is_first_day as first_flag,T_Start.Date action_date from (
  13. select user_id,$is_first_day,Date from event_ok where `event` = '$AppStart'
  14. -- 时间字段
  15. and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
  16. DAY)) -- 待补日期
  17. group by user_id,$is_first_day,Date
  18. ) as T_Action left join (
  19. select * from event_ok where `event` = '$AppStart'
  20. -- 时间字段
  21. and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15' ) as T_Start -- 待补日期 ,第二个日期为第一个日期加14
  22. on T_Action.user_id = T_Start.user_id and
  23. ( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Action.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Action.date or DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Action.date)) as T_Action_Start
  24. group by Install_date

用户在线时长

企业指标体系搭建(三)数据采集 - 图6

字段 备注
Install_date 统计日期
dau 日活人数
total_duration 活跃总时长
duration_mean 人均活跃时长

导入数据:

create table 用户活跃_用户在线时长 as select
Install_date,
count(distinct T_Action.user_id) as dau, sum(duration) as total_duration,
if(count(distinct T_Action.user_id)=0,0,sum(duration)/count(distinct T_Action.user_id)) as duration_mean
from (
select user_id,Date as install_date    from event_ok where `event` = '$AppStart'
-- 时间字段
and date = '2020-08-15' -- 待补日期,当天日期
group by user_id,Date
) as T_Action
left join (
select user_id, date as end_date, sum($event_duration) as duration    from event_ok
where `event` = '$AppEnd'
-- 时间字段
and date = '2020-08-15' -- 待补日期,当天日期
group by user_id, date
) as T_end
on T_Action.user_id = T_end.user_id and install_date = end_date
group by Install_date;

收入

当groupby sh使用数字的时候,如果列名不是数字,那么对应的是select的列名
产品收入
企业指标体系搭建(三)数据采集 - 图7

字段 备注
Install_date 统计日期
pay_user 付费用户数
total_amount 付费总金额
amount_mean 人均付费
pay_user_xly 训练营付费人数
total_amount_xly 训练营付费总金额
amount_mean_xly 训练营人均付费
pay_user_zl 专栏付费人数
total_amount_zl 专栏付费总金额
amount_mean_zl 专栏人均付费
pay_user_member 会员付费人数
total_amount_member 会员付费总金额
amount_mean_member 会员人均付费

导入数据:

create table 产品收入_总收入统计 as select
Install_date,
count(distinct user_id) as pay_user, sum(actual_amount_total) as total_amount,
if(count(distinct user_id) =0,0,sum(actual_amount_total)/count(distinct user_id)) as amount_mean,
#训练营
count(distinct case when income_type =1 then user_id else null end) as pay_user_xly,
sum(case when income_type =1 then actual_amount_total else 0 end) as total_amount_xly,
if(count(distinct case when income_type =1 then    user_id else null end)
=0,0,coalesce(sum(case when income_type =1 then actual_amount_total else 0 end)/count(distinct case when income_type =1 then    user_id else null end),0)) as amount_mean_xly,
#专栏
count(distinct case when income_type =2 then    user_id else null end) as pay_user_zl,
sum(case when income_type =2 then actual_amount_total else 0 end) as total_amount_zl,
if(count(distinct case when income_type =2 then    user_id else null end)=0,0,coalesce(sum(case when income_type =2 then actual_amount_total else 0 end)/count(distinct case when income_type =2 then    user_id else null end),0)) as amount_mean_zl,
#会员
count(distinct case when income_type =3 then    user_id else null end) as pay_user_member,
sum(case when income_type =3 then actual_amount_total else 0 end) as total_amount_member,
if(count(distinct case when income_type =3 then    user_id else null end) <> 0,sum(case when income_type =3 then actual_amount_total else 0 end)/count(distinct case when income_type =3 then    user_id else null end),0) as amount_mean_member
from
(
select user_id,
date as Install_date,
case when actual_amount > 200 and (order_type = '课程' or order_type is null) then 1
when actual_amount >=0 and actual_amount <= 200 and (order_type = '课程' or order_type is null) then 2
when order_type = '会员' then 3 else -1 end as income_type,
sum(actual_amount) as actual_amount_total from event_ok
where event = 'PayOrderSucceed'
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by 1,3,2
) PayOrderSucceed
where actual_amount_total>0 group by
Install_date order by Install_date

训练营收入统计

梳理顺序步骤

  1. 确认页面

广告位
商详页

付费

  1. 找到埋点数据的元数据

曝光,点击以及各环节的转化率

  1. 计算

企业指标体系搭建(三)数据采集 - 图8

字段 备注
date 统计日期
exp_xly_cnt 广告曝光人数
exp_xly_pv 广告曝光次数
exp_xly_step_mean 广告人均曝光数
clk_xly_cnt 广告点击人数
clk_xly_pv 广告点击次数
xly_ctr_uv 广告点击率
xly_sx_cnt_app APP端进入训练营详情页用户数


clk2sx_pp
进入训练营详情页转化率 = APP端进入详情页用户数/广告点击人数
(本应该是进入训练营详情页的人数/广告点击人数,但是由于进入详 情页的路径并不唯一所以转化率大于1了,而埋点上报数据并没有收 集来源)
xly_sx_cnt 进入训练营详情页的人数
xly_appointment_cnt 训练营预约用户数(点击了领取课程大纲)
sx2appointment_pp 预约转化率=训练营预约用户数/进入训练营详情页的人数
pay_user_xly_cnt 训练营付费用户数
clk2pay_rate_xly 训练营付费用户转化率 = 训练营付费用户/训练营预约用户数
pay_xly_order_amount 训练营付费总金额
xly_arppu 人均付费金额 =训练营付费总金额/训练营付费用户数
value_per_sx_uv_xly 单商详UV价值 = 训练营付费总金额/进入训练营详情页的人数

导入数据:

create table 产品收入_训练营收入统计 as
-- 神策数据select exp.*,
if(exp.exp_xly_cnt =0,0,exp.exp_xly_pv/exp.exp_xly_cnt) as exp_xly_step_mean, clk.clk_xly_cnt,
clk.clk_xly_pv,
-- 点击率
if(exp.exp_xly_cnt=0,0,clk.clk_xly_cnt/exp.exp_xly_cnt) as xly_ctr_uv,
-- app 用户数,点击率
exp_1.xly_sx_cnt_app, if(clk.clk_xly_cnt=0,0,exp_1.xly_sx_cnt_app/clk.clk_xly_cnt) as clk2sx_pp,
exp_1.xly_sx_cnt, clk_1.xly_appointment_cnt,
if(exp_1.xly_sx_cnt=0,0,clk_1.xly_appointment_cnt/exp_1.xly_sx_cnt) as sx2appointment_pp,
pay.pay_user_xly_cnt, if(clk_1.xly_appointment_cnt=0,0,pay.pay_user_xly_cnt/clk_1.xly_appointment_cnt)  as clk2pay_rate_xly,
pay.pay_xly_order_amount,
if(pay.pay_user_xly_cnt =0,0,pay.pay_xly_order_amount/pay.pay_user_xly_cnt) as xly_arppu,
if(exp_1.xly_sx_cnt =0,0,pay.pay_xly_order_amount/exp_1.xly_sx_cnt) as value_per_sx_uv_xly
from (
select date,count(distinct user_id) as exp_xly_cnt,count(1) as exp_xly_pv from event_ok
-- 运营位曝光    新增埋点
where event = 'ViewOperatingPosition' and operating_type='训练营'
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by date
) exp
left outer join (
select date, count(distinct user_id) as clk_xly_cnt, count(user_id) as clk_xly_pv
from event_ok
-- 运营位点击
where event = 'ClickOperatingPosition'    and operating_type='训练营'
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by date
) clk
on exp.date = clk.date
-- 训练营商详用户数
left    join (
select date,    count(distinct user_id) as xly_sx_cnt,
count(distinct case when platform_type in ('Android','iOS') then user_id else null end) as xly_sx_cnt_app
from event_ok
where event = 'WebView' and page_id in
('1pb3','1pb4','1pu4','1pu5','1pxi','1qb3','1qea','1qeb','1qg1','1qhg','1qjk','1 qph','1qzq')
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by date
) exp_1
on exp.date = exp_1.date
-- 训练营预约用户数(报名)
left outer join (
select date,count(distinct user_id) as xly_appointment_cnt from event_ok
-- web点击
where event = 'WebButtonClick' and page_id in
('1pb3','1pb4','1pu4','1pu5','1pxi','1qb3','1qea','1qeb','1qg1','1qhg','1qjk','1 qph','1qzq')
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by    date
) clk_1
on clk.date = clk_1.date
-- 训练营付费left outer join (
select date , count(distinct user_id) as pay_user_xly_cnt, sum(actual_amount) as pay_xly_order_amount
from event_ok
-- 支付成功
where event = 'PayOrderSucceed' and actual_amount > 200
and order_type != '会员'
-- 时间字段
and date = '2020-08-15'    -- 待补时间,当天日期
group by date
) pay
on exp.date    = pay.date order by
exp.date

专栏收入统计

作用 存储每日专栏付费用户数,广告位的啋光数,点击数及各环节转化率

找到所有埋点事件力ViewOperat ingPosition(广告位噤光事件)数据表vop

专栏收入表

表原数据

表字段计算
找到所有埋点字件力ClickOperalingPosi tion(广告位点击子件)数据 表cop 找到所有埋点工件为ViewCourseDetai (l 浏览课程详 t’fi)数据表vod
找到所有埋点平件为CreateOrde(r 创建订单行为)数据 表co

找到所有埋点于件为PayOrderSu ccee(d 付费行为)并且0「der_type 1=” 会员 的数据 表pay 5个表进行关联, 关联条件为日期相等(由于曝光力第—事件)所以以exp 力主表

广告位名光用户数 表vop 每日去王user_id 用户数广告位谒光次数 表vop 每日记录数
入均限光且 运宫位隄光次数/运芦位呕光用户数广告位点击人数 表cop 每日去重use「_id 用户数广告位点击次数 表cop 母日记录数
用户卢击转化率 广告位点击人数1广告位噤光用户数 商详浏览入数 表vod 每日去三user_心 用户数
APP 商洋浏览人数 表vod中 platform_type in (‘Android’,’iOS’) 每日去玉user_id 用户数

APP浏览商详用户转化率 APP 商详浏览人数1广告位点击人数创建订单用户数 表co 每日去巠use「_id 用户数
创建订单用户转化率 创建订单用户数商详浏览人数付费用户数 表pos 每日去豆user_id 用户数
付费用户转化率 付费用户数i预约用户数付费总金额 表pos sum(actual_amount) 入均付费金额 付费总金额/付费用户数
商详UV价值 入均付费金额席消浏览入数

字段 备注
date 统计日期
exp_zl_cnt 专栏广告位曝光人数
exp_zl_pv 专栏广告位曝光量
exp_zl_step_mean 人均广告位曝光量
clk_zl_cnt 专栏广告位点击人数
clk_zl_pv 专栏广告位点击次数
zl_ctr_uv 广告位点击率= 专栏广告位点击人数/专栏广告位曝光人数
zl_sx_uv 浏览课程详情页人数
zl_sx_uv_app APP端进入课程详情页人数
zl_create_order_cnt 创建订单用户数
open2create_order_pp 订单转化率 = 创建订单用户数/浏览课程详情页人数
pay_user_zl_cnt 付费用户数
clk2pay_rate_zl 付费用户数转化率=付费用户数/创建订单用户数
pay_zl_order_amount 付费总金额
zl_arppu 人均付费金额=付费总金额/付费用户数
value_per_sx_uv_zl 单商详UV价值 = 付费总金额/浏览课程详情页人数

导入数据:

— 神策数据
create table 产品收入_专栏收入统计 as select
vop.date, exp_zl_cnt, exp_zl_pv,
if(exp_zl_cnt=0,0,exp_zl_pv/exp_zl_cnt) as exp_zl_step_mean,

clk_zl_cnt, clk_zl_pv,
if(exp_zl_cnt=0,0,clk_zl_cnt/exp_zl_cnt) as zl_ctr_uv,

zl_sx_uv, zl_sx_uv_app,
— t21.zl_sx_uv_app/t2.clk_zl_cnt as clk2open_pp, zl_create_order_cnt,
if(zl_sx_uv=0,0,zl_create_order_cnt/zl_sx_uv) as open2create_order_pp,

pay_user_zl_cnt, if(zl_create_order_cnt=0,0,pay_user_zl_cnt/zl_create_order_cnt) as clk2pay_rate_zl,

pay_zl_order_amount,

if(pay_user_zl_cnt=0,0,pay_zl_order_amount/pay_user_zl_cnt) as zl_arppu, if(zl_sx_uv=0,0,pay_zl_order_amount/zl_sx_uv) as value_per_sx_uv_zl

from (
select date, count(distinct user_id) as exp_zl_cnt, count(1) as exp_zl_pv from event_ok
— 运营位曝光
where event = ‘ViewOperatingPosition’ and operating_type=’课程列表页’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) vop

left join (
select date , count(distinct user_id) as clk_zl_cnt,count(user_id) as clk_zl_pv
from event_ok
where event = ‘ClickOperatingPosition’ and operating_type=’课程列表页’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) cop
on vop.date = cop.date

left join (
select
date,count(distinct user_id) as zl_sx_uv,
count(distinct case when platform_type in (‘Android’,’iOS’) then user_id else null end) as zl_sx_uv_app
from event_ok
— 浏览售前课程详情页
where event = ‘ViewCourseDetail’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) vcd
on vop.date = vcd.date

left join (
select date , count(distinct user_id) as zl_create_order_cnt from event_ok
— 创建订单
where event = ‘CreateOrder’ and order_amount < 200
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) co
on vop.date = co.date

— 专栏付费用户

left outer join (

select date, count(distinct user_id) as pay_user_zl_cnt, sum(actual_amount) as pay_zl_order_amount
from event_ok
— 支付成功
where event = ‘PayOrderSucceed’ and actual_amount <= 200 and actual_amount>0
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) pos
on vop.date = pos.date
order by vop.date;

会员收入统计

企业指标体系搭建(三)数据采集 - 图9

字段 备注
date 统计日期
member_sx_cnt 浏览会员商详用户数
receive_vip_user_cnt 领取会员用户数(7日免费会员)
receive_pp 领取会员转化率 = 领取会员用户数/浏览会员商详用户数
member_clk_cnt 会员点击支付用户数(¥688 开通365天会员)
clk_try2pay_pp 点击支付转化率 = 会员点击支付用户数/浏览会员商详用户数
pay_user_hy_cnt 购买会员用户数
try2scucess_pp 支付转化率 = 购买会员用户数/会员点击支付用户数
pay_hy_order_amount 购买总金额
arppu_hy 人均消费 =购买总金额/购买会员用户数

导入数据:

create table 产品收入_会员收入统计 as select
rvf.date,
pv_k.member_sx_cnt,— 会员商详用户数
rvf.receive_vip_user_cnt,— 领取会员用户数
if(pv_k.member_sx_cnt =0,0,rvf.receive_vip_user_cnt/pv_k.member_sx_cnt) as receive_pp, — 领取转化率
member_clk_cnt,— 会员点击支付用户数
if(member_sx_cnt=0,0,member_clk_cnt/member_sx_cnt) as clk_try2pay_pp, — 点击转化率
coalesce(pay_user_hy_cnt,0) as pay_user_hy_cnt, — 购买会员用户数if(member_clk_cnt=0,0,coalesce(pay_user_hy_cnt/member_clk_cnt,0) )as try2scucess_pp, — 支付转化率
coalesce(pay_hy_order_amount,0) as pay_hy_order_amount, — 购买总金额
if(pay_user_hy_cnt=0,0,coalesce(pay_hy_order_amount/pay_user_hy_cnt,0)) as arppu_hy — 人均消费
from (
select date ,count(distinct user_id) as receive_vip_user_cnt from event_ok
— 领取会员成功
where event = ‘ReceiveVipFreely’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) rvf

— 进入会员详细页
left join (
select date,count(distinct user_id) as member_sx_cnt from event_ok
where event = ‘$pageview’ and $url like ‘%kaiwu.lagou.com/member%’
— 时间字段

and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
https://kaiwu.lagou.com/member/index
) pv_k
on rvf.date = pv_k.date
— 会员点击支付按钮
left join (
select date, count(distinct user_id) as member_clk_cnt from event_ok
where event = ‘$WebClick’
— and $url in (‘https://kaiwu.lagou.com/member/index','https://kaiwu.lagou.com/member/index? appType=C&lagoufrom=ios&appVersion=7.0.1&randomize=704021827’)
and $url like ‘%kaiwu.lagou.com/member%’ and $element_content = ‘¥688 开通365天会员’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) wc_k
on rvf.date = wc_k.date

— 付费成功left join (
select date , count(distinct user_id) as pay_user_hy_cnt, sum(actual_amount) as pay_hy_order_amount
from event_ok
where event = ‘PayOrderSucceed’ and actual_amount>0 and order_type = ‘会员’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) pay
on rvf.date = pay.date order by rvf.date;

转化分析

下载渠道付费转化率

企业指标体系搭建(三)数据采集 - 图10

字段 备注
channel 渠道
num 安装人数
pay_user 付费人数
pay_rate 付费转化率= 付费人数/安装人数
total_amount 总付费金额
amount_mean 人均付费金额=总付费金额/付费人数

导入数据:

create table 下载渠道付费转化率 as select
start_date, end_date,
channel,num,pay_user,
if(num=0,0,pay_user/num) as pay_rate,total_amount,amount_mean from
(
select if(DownloadChannel is null ,’iOS’,DownloadChannel) as channel, count(distinct user_id) num
from event_ok
where event = ‘AppInstall’
— 时间字段
and date = DATE_SUB(‘2020-08-15’, Interval 14 DAY) — 待补日期,当天日期
group by 1
) appinstall

安装并付费的用户信息left join

(
select sub_AI.channel_2,
count(distinct sub_AI.user_id) as pay_user, sum(pay.actual_amount_total) as total_amount, if(count(distinct sub_AI.user_id) =0,0,
sum(pay.actual_amount_total)/count(distinct sub_AI.user_id)) as amount_mean, max(pay.Install_date) as end_date,
min(pay.Install_date) as start_date from
(
select user_id, if(DownloadChannel is null ,’iOS’,DownloadChannel) as channel_2,date as Install_date
from event_ok
where event = ‘AppInstall’
— 时间字段
and date = DATE_SUB(‘2020-08-15’, Interval 14 DAY) — 待补日期,当天日期
group by 1, 2,3
) sub_AI

left join (
select user_id, date as Install_date,
case when actual_amount > 200 and order_type = ‘课程’ then 1
when actual_amount >0 and actual_amount <= 200 and order_type =

‘课程’ then 2

when order_type = ‘会员’ then 3 else -1 end as income_type,

sum(actual_amount) as actual_amount_total from event_ok
where event = ‘PayOrderSucceed’
— 时间字段
and date >= DATE_SUB(‘2020-08-15’, Interval 14 DAY) and date <= ‘2020-08-15’
— 待补日期 两周转化周期
group by 1,3,2
) pay on sub_AI.user_id = pay.user_id

where pay.actual_amount_total>0 group by 1
order by 2 desc

)temp on temp.channel_2 = appinstall.channel order by total_amount desc;

统计日报与周报

作用 以每周一为日期, 统计当前周的拉新,活 跃留存, 产品收入信息

用户拉新

用户活跃留存用户在线时长

周报 表原数据
总收入统计

训练营收入统计专栏收入统计 会员收入统计
以上各表以日期进行关联

表字段计算 对所有上述表进行汇总/7, 计算—周内日均数据

字段 备注
统计日期 日报为当日日期,周报为当前周周一
激活人数
新用户次日留存人数
新用户次日留存率
新用户7日留存人数
新用户7日留存率
新用户14日留存人数
新用户14日留存率
DAU
首次活跃人数
首次活跃用户占比
活跃用户次日留存人数
活跃用户次日留存率
活跃用户7日留存人数
活跃用户7日留存率
活跃用户14日留存人数
活跃用户14日留存率
平均单日使用时长(分)
总付费人数
总付费金额
客单价
付费人数_训练营
付费金额_训练营
客单价_训练营
付费人数_专栏
付费金额_专栏
客单价_专栏
付费人数_会员
付费金额_会员
客单价_会员
字段 备注
曝光人数_训练营
曝光次数_训练营
平均曝光次数_训练营
点击人数训练营app
点击次数训练营app
点击率训练营uv
点击率训练营pv
浏览商详人数_训练营
浏览商详人数训练营app
预约人数_训练营
浏览商详-预约转化率_训练营
预约-付费转化率_训练营
单商详uv价值_训练营
曝光人数_专栏
曝光次数_专栏
人均曝光次数_专栏
点击人数_专栏
点击率专栏uv
点击率专栏pv
浏览商详人数_专栏
浏览商详人数专栏app
下单人数_专栏
浏览商详-下单转化率_专栏
下单-付费转化率_专栏
单商详uv价值_专栏
浏览商详人数_会员
领取会员人数
浏览商详-领取会员转化率_会员
点击支付人数_会员

导入数据:

日报:

create table 日报 as select
— 用户拉新表内
用户活跃_用户活跃留存.install_date as ‘日期’, u_cnt_total as ‘ 激 活 人 数 ‘, retention_u_cnt_total as ‘新用户次日留存人数’,
if(u_cnt_total =0,0,retention_u_cnt_total/u_cnt_total) as “新用户次日留存率”,

retention_u_cnt_total_7 as “ 新 用 户 7 日 留 存 人 数 “, if(u_cnt_total=0,0,retention_u_cnt_total_7/u_cnt_total) as “新用户7日留存率”,

retention_u_cnt_total_14 as “ 新 用 户 14 日 留 存 人 数 “, if(u_cnt_total=0,0,retention_u_cnt_total_14/u_cnt_total) as “新用户14日留存率”,

— 用户活跃用户活跃留存 内
用户活跃
用户活跃留存.dau as “DAU”, firstdau as “首次活跃人数”,
if(用户活跃
用户活跃留存.dau =0,0,firstdau/用户活跃用户活跃留存.dau) as “首次活跃用户占比”,

retentiondau_1 as “活跃用户次日留存人数”,
if(用户活跃
用户活跃留存.dau=0,0,retentiondau_1/用户活跃用户活跃留存.dau) as “活跃用户次日留存率”,

retentiondau_7 as “活跃用户7日留存人数”,
if(用户活跃
用户活跃留存.dau=0,0,retentiondau_7/用户活跃用户活跃留存.dau) as “活跃用户7日留存率”,

retentiondau_14 as “活跃用户14日留存人数”,
if(用户活跃
用户活跃留存.dau=0,0,retentiondau_14/用户活跃用户活跃留存.dau) as “活跃用户14日留存率”,

— 用户活跃用户在线时长
if(用户活跃
用户在线时长.dau=0,0,totalduration/用户活跃用户在线时长.dau/60) as “平均单日使用时长(分)”,

— 产品收入_总收入统计pay_user as “总付费人数”, total_amount as “总付费金额”,
if(pay_user=0,0,total_amount/pay_user) as “客单价”,

payuser_xly as “付费人数训练营”, totalamount_xly as “付费金额训练营”,
coalesce(totalamount_xly/pay_user_xly,0) as “客单价训练营”,

payuser_zl as “付费人数专栏”, totalamount_zl as “付费金额专栏”,
if(payuser_zl=0,0,coalesce(total_amount_zl/pay_user_zl,0)) as “客单价专栏”,

payuser_member as “付费人数会员”, totalamount_member as “付费金额会员”,
if(payuser_member <> 0,total_amount_member/pay_user_member,0) as “客单价会员”,

— 产品收入训练营收入统计exp_xly_cnt as “曝光人数训练营”, expxly_pv as “曝光次数训练营”,
if(expxly_cnt<>0, exp_xly_pv/exp_xly_cnt,0) as “平均曝光次数训练营”, clkxly_cnt as “点击人数训练营app”,
clk_xly_pv as “点击次数
训练营app”,
if(exp_xly_cnt<>0, clk_xly_cnt/exp_xly_cnt,0) as “点击率
训练营uv”, if(exp_xly_pv<>0, clk_xly_pv/exp_xly_pv,0) as “点击率训练营_pv”,

— 产品收入训练营收入统计
xly_sx_cnt as “浏览商详人数
训练营”, xlysx_cnt_app as “浏览商详人数训练营_app”,

xlyappointment_cnt as “ 预 约 人 数 训 练 营 “, if(xlysx_cnt<>0,xly_appointment_cnt/xly_sx_cnt,0) as “浏览商详-预约转化率训练营”, if(xlyappointment_cnt<>0,pay_user_xly_cnt/xly_appointment_cnt,0) as “预约-付费转化率训练营”,
if(xlysx_cnt<>0,pay_xly_order_amount/xly_sx_cnt,0) as “单商详uv价值训练营”,

— 产品收入_专栏收入统计

expzl_cnt as “曝光人数专栏”, expzl_pv as “曝光次数专栏”,
if(expzl_cnt<>0,exp_zl_pv/exp_zl_cnt,0) as “人均曝光次数专栏”, clkzl_cnt/7 as “ 点 击 人 数 专 栏 “ , if(expzl_cnt<>0,clk_zl_cnt/exp_zl_cnt,0) as “点击率专栏uv”, if(exp_zl_pv<>0,clk_zl_pv/exp_zl_pv,0) as “点击率专栏_pv”,

zlsx_uv as “浏览商详人数专栏” , zlsx_uv_app as “浏览商详人数专栏app”, zl_create_order_cnt as “下单人数专栏”,

if(zlsx_uv<>0,zl_create_order_cnt/zl_sx_uv,0) as “浏览商详-下单转化率专栏”, if(zlcreate_order_cnt<>0,pay_user_zl_cnt/zl_create_order_cnt,0) as “下单-付费转化率专栏”,
if(zlsx_uv<>0,pay_zl_order_amount/zl_sx_uv,0) as “单商详uv价值专栏”,

— 产品收入会员收入统计
member_sx_cnt as “浏览商详人数
会员”, receivevip_user_cnt as “领取会员人数”,
if(member_sx_cnt <> 0, receive_vip_user_cnt / member_sx_cnt,0) as “浏览商详-领取会员转化率
会员”,
memberclk_cnt as “点击支付人数会员” from 用户活跃用户活跃留存
left join (select DISTINCT install_date, u_cnt_total, — 日安装用户总数retention_u_cnt_total, — 次日留存用户总数retention_u_cnt_total_7, — 日留存用户总书
retention_u_cnt_total_14 — 日留存用户总书
from 用户拉新) as 新用户
on 用户活跃
用户活跃留存.Installdate = 新用户.install_date
left join 用户活跃
用户在线时长 on 用户活跃_用户在线时长.Install_date = 新用户.install_date

left join 产品收入总收入统计 on 产品收入总收入统计.Installdate = 新用户.install_date
left join 产品收入
训练营收入统计 on 产品收入训练营收入统计.date = 新用户.install_date left join 产品收入专栏收入统计 on 产品收入专栏收入统计.date = 新用户.install_date left join 产品收入会员收入统计 on 产品收入会员收入统计.date = 新用户.install_date where 用户活跃用户活跃留存.install_date = ‘2020-08-15’; — 待补时间,当天日期

周报:

create table 周报 as select
— 用户拉新表内
DATESUB(用户活跃用户活跃留存.installdate,INTERVAL WEEKDAY(用户活跃用户活跃留存.install_date) DAY) as ‘当周周一’,
sum(u_cnt_total)/7 as ‘ 激 活 人 数 ‘, sum(retention_u_cnt_total)/7 as ‘新用户次日留存人数’,
if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total)/sum(u_cnt_total)) as “新用户次日留存率”,

sum(retention_u_cnt_total_7)/7 as “ 新 用 户 7 日 留 存 人 数 “, if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total_7)/sum(u_cnt_total)) as “新用户7日留存率”,

sum(retention_u_cnt_total_14)/7 as “ 新 用 户 14 日 留 存 人 数 “, if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total_14)/sum(u_cnt_total)) as “新用户14日留存率”,

— 用户活跃用户活跃留存 内
sum(用户活跃
用户活跃留存.dau)/7 as “DAU”, sum(firstdau)/7 as “首次活跃人数”,
if(sum(用户活跃
用户活跃留存.dau)=0,0,sum(firstdau)/sum(用户活跃用户活跃留存.dau)) as “首次活跃用户占比”,

sum(retentiondau_1)/7 as “活跃用户次日留存人数”,
if(sum(用户活跃
用户活跃留存.dau) =0,0,sum(retentiondau_1)/sum(用户活跃用户活跃留存.dau)) as “活跃用户次日留存率”,

sum(retentiondau_7)/7 as “活跃用户7日留存人数”,
if(sum(用户活跃
用户活跃留存.dau)=0,0,sum(retentiondau_7)/sum(用户活跃用户活跃留 存.dau)) as “活跃用户7日留存率”,

sum(retentiondau_14)/7 as “活跃用户14日留存人数”,
if(sum(用户活跃
用户活跃留存.dau)=0,0,sum(retentiondau_14)/sum(用户活跃用户活跃留 存.dau)) as “活跃用户14日留存率”,

— 用户活跃用户在线时长
if(sum(用户活跃
用户在线时长.dau)=0,0,sum(totalduration)/sum(用户活跃用户在线时 长.dau)/60) as “平均单日使用时长(分)”,

— 产品收入_总收入统计sum(pay_user)/7 as “总付费人数”, sum(total_amount)/7 as “总付费金额”,
if(sum(pay_user)=0,0,sum(total_amount)/sum(pay_user)) as “客单价”,

sum(payuser_xly)/7 as “付费人数训练营”, sum(totalamount_xly) /7 as “付费金额训练营”,

if(sum(payuser_xly)=0,0,coalesce(sum(total_amount_xly)/sum(pay_user_xly),0)) as “客单价训练营”,

sum(payuser_zl)/7 as “付费人数专栏”, sum(totalamount_zl)/7 as “付费金额专栏”,
if(sum(payuser_zl)=0,0,coalesce(sum(total_amount_zl)/sum(pay_user_zl),0)) as “客单价专栏”,

sum(payuser_member)/7 as “付费人数会员”, sum(totalamount_member)/7 as “付费金额会员”,
if(sum(payuser_member) <> 0,sum(total_amount_member)/sum(pay_user_member),0) as “客单价会员”,

— 产品收入训练营收入统计sum(exp_xly_cnt)/7 as “曝光人数训练营”, sum(expxly_pv)/7 as “曝光次数训练营”,
if(sum(expxly_cnt)<>0, sum(exp_xly_pv)/sum(exp_xly_cnt),0) as “平均曝光次数训练营”,
sum(clkxly_cnt)/7 as “点击人数训练营app”, sum(clk_xly_pv)/7 as “点击次数训练营app”,
if(sum(exp_xly_cnt)<>0, sum(clk_xly_cnt)/sum(exp_xly_cnt),0) as “点击率
训练营
uv”,
if(sum(exp_xly_pv)<>0, sum(clk_xly_pv)/sum(exp_xly_pv),0) as “点击率
训练营_pv”,

— 产品收入训练营收入统计
sum(xly_sx_cnt)/7 as “浏览商详人数
训练营”, sum(xlysx_cnt_app)/7 as “浏览商详人数训练营_app”,

sum(xlyappointment_cnt)/7 as “ 预 约 人 数 训 练 营 “, if(sum(xlysx_cnt)<>0,sum(xly_appointment_cnt)/sum(xly_sx_cnt),0) as “浏览商详-预约 转 化 率 训 练 营 “, if(sum(xlyappointment_cnt)<>0,sum(pay_user_xly_cnt)/sum(xly_appointment_cnt),0) as “ 预 约 - 付 费 转 化 率 训 练 营 “, if(sum(xlysx_cnt)<>0,sum(pay_xly_order_amount)/sum(xly_sx_cnt),0) as “单商详uv价值训练营”,

— 产品收入_专栏收入统计

sum(expzl_cnt)/7 as “曝光人数专栏”, sum(expzl_pv)/7 as “曝光次数专栏”,
if(sum(expzl_cnt)<>0,sum(exp_zl_pv)/sum(exp_zl_cnt),0) as “人均曝光次数专栏”, sum(clkzl_cnt)/7 as “ 点 击 人 数 专 栏 “ , if(sum(expzl_cnt)<>0,sum(clk_zl_cnt)/sum(exp_zl_cnt),0) as “点击率专栏uv”, if(sum(exp_zl_pv)<>0,sum(clk_zl_pv)/sum(exp_zl_pv),0) as “点击率专栏_pv”,

sum(zlsx_uv)/7 as “浏览商详人数专栏” , sum(zlsx_uv_app)/7 as “浏览商详人数专栏app”, sum(zl_create_order_cnt)/7 as “下单人数专栏”,

if(sum(zlsx_uv)<>0,sum(zl_create_order_cnt)/sum(zl_sx_uv),0) as “浏览商详-下单转化率专栏”,

if(sum(zlcreate_order_cnt)<>0,sum(pay_user_zl_cnt)/sum(zl_create_order_cnt),0) as “下单-付费转化率专栏”,

if(sum(zlsx_uv)<>0,sum(pay_zl_order_amount)/sum(zl_sx_uv),0) as “单商详uv价值专栏”,

— 产品收入会员收入统计
sum(member_sx_cnt)/7 as “浏览商详人数
会员”, sum(receivevip_user_cnt)/7 as “领取会员人数”,
if(sum(member_sx_cnt) <> 0,sum(receive_vip_user_cnt)/sum(member_sx_cnt),0) as “浏览商详-领取会员转化率
会员”,
sum(memberclk_cnt)/7 as “点击支付人数会员”

from 用户活跃用户活跃留存
left join
(select DISTINCT install_date, u_cnt_total, — 日安装用户总数retention_u_cnt_total, — 次日留存用户总数retention_u_cnt_total_7, — 日留存用户总书retention_u_cnt_total_14 — 日留存用户总书
from 用户拉新) as 新用户 on 用户活跃
用户活跃留存.Installdate = 新用户.install_date left join 用户活跃用户在线时长 on 用户活跃用户在线时长.Install_date = 新用
户.install_date
left join 产品收入
总收入统计 on 产品收入总收入统计.Install_date = 新用户.install_date
left join 产品收入
训练营收入统计 on 产品收入训练营收入统计.date = 新用户.install_date left join 产品收入专栏收入统计 on 产品收入专栏收入统计.date = 新用户.install_date left join 产品收入会员收入统计 on 产品收入会员收入统计.date = 新用户.install_date where 用户活跃用户活跃留存.installdate >= ‘2020-08-09’ and 用户活跃用户活跃留
存.install_date <= ‘2020-08-15’ — 待补时间,补充时间段要为周一到周日
group by 1;

调度系统(见任务三讲义)

  1. 报表系统(见任务三讲义)