数据采集
埋点数据校验
- 对事件和属性的正确性、顺序性、完整性及数据类型进行校验
- 对用户关联情况进行校验
- 验证 App 与 H5 打通(做了打通的情况下)
对事件和属性的正确性:
是否有数据
内容和格式
对事件和属性的顺序性:
有顺序的行为,上发时间是否符合顺序
对事件和属性的完整性:
需要针对所有场景进行测试
反馈的类型是否是我们需要的
对用户的关联情况进行校验:
不同类型的用户,属性信息是否正常
验证app与H5打通:
为什么打通,因为H5是匿名的,主要是校验用户信息传输正确性
怎么做数据校验
抓包测试
埋点平台sql抽取
宽表测试
- 多个事件到宽表的过程
- 表数据是否完全收集
- 灰度数据验收
多个事件到宽表的过程
- 共有属性使用共有字段
- 特有属性使用特有字段
表数据是否完全收集
- 宽表字段是否包含所有事件属性
- 表数据是否符合正确性,顺序性,完整性
灰度数据验收
进入数据采集阶段之后,我们已经有了宽表,希望同学自己去导入
从宽表中梳理我们需要的表
数据已经完成,我们要从杂乱无章的数据中梳理我们想要的数据
导入备份数据库:
在已有的连接中新建一个数据库:
在弹出的窗口中填入数据库名称edu_company:
打开数据库edu_company,右键备份,点击还原备份从,选择课程提供的.nb3备份文件。
用户拉新
作用:存储留存
原数据:
主要数据源
补充数据源
连接条件
表字段计算
统计主体:决定我们的统计粒度
分渠道的每日用户数留存情况
每日整体新用户留存情况(使用窗口函数)
字段名 | 备注 |
---|---|
install_date | 安装APP日期 |
downloadchannel | 下载APP渠道 |
u_cnt | 安装用户数 |
retention_u_cnt | 次留用户数 |
rention_pp | 次留率=(次留用户数/安装用户数) |
retention_u_cnt_7 | 7日留存用户数 |
rention_pp_7 | 7日留存率 =(7日留存用户数/安装用户数) |
retention_u_cnt_14 | 14日留存用户数 |
rention_pp_14 | 14日留存率 = 14日留存用户数/安装用户数 |
u_cnt_total |
日安装用户总数(不分渠道,按照日为粒度的统计数据,用窗口函数计算) |
retention_u_cnt_total | 次日留存新用户总数(不分渠道) |
rention_pp_modi |
次日新用户整体留存率= 次日新用户整体留存总数/日安装用户总数 (不分渠道) |
retention_u_cnt_total_7 | 7日留存新用户总数(不分渠道) |
rention_pp_modi_7 |
7次日新用户整体留存率= 7次日留存总数/日安装用户总数(不分渠道) |
retention_u_cnt_total_14 | 14日留存新用户总书(不分渠道) |
rention_pp_modi_14 |
14次日新用户整体留存率= 14次日留存总数/日安装用户总数(不分渠道) |
导入数据:
SQL怎么看
先看主要数据源from部分
再看补充数据源join部分
再看计算select部分
create table 用户拉新 as select *,
sum(u_cnt) over( PARTITION by install_date) as u_cnt_total, sum(retention_u_cnt) over( PARTITION by install_date) as retention_u_cnt_total, if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi,
sum(retention_u_cnt_7) over( PARTITION by install_date) as retention_u_cnt_total_7,
if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_7) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_7,
sum(retention_u_cnt_14) over( PARTITION by install_date) as retention_u_cnt_total_14,
if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_14) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_14
from(
select install_date,downloadchannel,count(distinct user_id) as u_cnt, COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_u_cnt,
if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as rention_pp,
COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_u_cnt_7,
if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_u_cnt_14,
if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
from(
SELECT distinct T_Install.user_id,
T_Install.downloadchannel, T_Install.date as install_date, T_Start.Date action_date
from (
select * from event_ok where `event` = 'AppInstall'
-- 时间字段
and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
DAY)) ) as T_Install -- 待补日期
left join (
select * from event_ok where `event` = '$AppStart'
-- 时间字段
and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15') as T_Start -- 待补日期
on T_Install.user_id = T_Start.user_id and
( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Install.date)) as T_Install_Start
group by Install_date,downloadchannel) as add_date_sta;
- 主要数据源
- 补充数据源
- 连接条件
由于字段较多,我们不使用update,而是使用delete
注意
表名规则
- 指名层级 公司+数据层级
ec_(ods/mds/wt)
- 指名用途
retention
- 指名内容
new_user_statistic_day
完整表名:
ec_mds_retention_new_user_statistic_day
如果我们要做写入语句,我们则需要这么改:
DELETE FROM 用户拉新 where install_date in (DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14 DAY));
insert 用户拉新
select *,
sum(u_cnt) over( PARTITION by install_date) as u_cnt_total,
sum(retention_u_cnt) over( PARTITION by install_date) as retention_u_cnt_total, if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi,
sum(retention_u_cnt_7) over( PARTITION by install_date) as retention_u_cnt_total_7,
if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_7) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_7,
sum(retention_u_cnt_14) over( PARTITION by install_date) as retention_u_cnt_total_14,
if(sum(u_cnt) over( PARTITION by install_date)=0,0,sum(retention_u_cnt_14) over( PARTITION by install_date)/sum(u_cnt) over( PARTITION by install_date)) as rention_pp_modi_14
from(
select install_date,downloadchannel,count(distinct user_id) as u_cnt, COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_u_cnt,
if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as rention_pp,
COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_u_cnt_7,
if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_u_cnt_14,
if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
from(
SELECT distinct T_Install.user_id,
T_Install.downloadchannel, T_Install.date as install_date, T_Start.Date action_date
from (
select * from event_ok where `event` = 'AppInstall'
-- 时间字段
and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
DAY)) ) as T_Install -- 待补日期
left join (
select * from event_ok where `event` = '$AppStart'
-- 时间字段
and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15') as T_Start -- 待补日期
on T_Install.user_id = T_Start.user_id and
( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Install.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Install.date or
DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Install.date)) as T_Install_Start
group by Install_date,downloadchannel) as add_date_sta;
用户活跃
用户活跃留存
字段 | 备注 |
---|---|
install_date | 活跃日期 |
dau | 日活用户数 |
first_dau | 第一次启动APP用户数 |
first_PP | 第一次启动APP比率 = 第一次启动APP用户数/日活用户数 |
retention_dau_1 | 日活次日留存用户数 |
retention_pp_1 | 日活次日留存率=日活次日留存用户数/日活用户数 |
retention_dau_7 | 7日活次日留存用户数 |
rention_pp_7 | 7日活次日留存率=日活次日留存用户数/日活用户数 |
retention_dau_14 | 14日活次日留存用户数 |
rention_pp_14 | 14日活次日留存率=日活次日留存用户数/日活用户数 |
导入数据:
create table 用户活跃_用户活跃留存 as
select install_date,count(distinct user_id) as dau,
count(distinct case when first_flag = 1 then user_id else null end) as first_dau,
if(count(distinct user_id) =0,0,count(distinct case when first_flag = 1 then user_id else null end)/count(distinct user_id)) as first_PP,
COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null)) as retention_dau_1,
if(count(distinct user_id)=0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=1,user_id,Null))/ count(distinct user_id)) as retention_pp_1,
COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) as retention_dau_7,
if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=7,user_id,Null)) / count(distinct user_id)) as rention_pp_7,
COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null)) as retention_dau_14,
if(count(distinct user_id) =0,0,COUNT(distinct if(DATEDIFF(action_date,install_date)=14,user_id,Null))/count(distinct user_id)) as rention_pp_14
from(
SELECT distinct T_Action.user_id,T_Action.date as install_date, T_Action.$is_first_day as first_flag,T_Start.Date action_date from (
select user_id,$is_first_day,Date from event_ok where `event` = '$AppStart'
-- 时间字段
and date in ('2020-08-15',DATE_SUB('2020-08-15', Interval 1 DAY),DATE_SUB('2020-08-15', Interval 7 DAY),DATE_SUB('2020-08-15', Interval 14
DAY)) -- 待补日期
group by user_id,$is_first_day,Date
) as T_Action left join (
select * from event_ok where `event` = '$AppStart'
-- 时间字段
and date >= DATE_SUB('2020-08-15', Interval 14 DAY) and date<= '2020-08-15' ) as T_Start -- 待补日期 ,第二个日期为第一个日期加14天
on T_Action.user_id = T_Start.user_id and
( DATE_SUB(T_Start.Date, Interval 1 DAY) = T_Action.date or DATE_SUB(T_Start.Date, Interval 7 DAY) = T_Action.date or DATE_SUB(T_Start.Date, Interval 14 DAY) = T_Action.date)) as T_Action_Start
group by Install_date
用户在线时长
字段 | 备注 |
---|---|
Install_date | 统计日期 |
dau | 日活人数 |
total_duration | 活跃总时长 |
duration_mean | 人均活跃时长 |
导入数据:
create table 用户活跃_用户在线时长 as select
Install_date,
count(distinct T_Action.user_id) as dau, sum(duration) as total_duration,
if(count(distinct T_Action.user_id)=0,0,sum(duration)/count(distinct T_Action.user_id)) as duration_mean
from (
select user_id,Date as install_date from event_ok where `event` = '$AppStart'
-- 时间字段
and date = '2020-08-15' -- 待补日期,当天日期
group by user_id,Date
) as T_Action
left join (
select user_id, date as end_date, sum($event_duration) as duration from event_ok
where `event` = '$AppEnd'
-- 时间字段
and date = '2020-08-15' -- 待补日期,当天日期
group by user_id, date
) as T_end
on T_Action.user_id = T_end.user_id and install_date = end_date
group by Install_date;
收入
当groupby sh使用数字的时候,如果列名不是数字,那么对应的是select的列名
产品收入
字段 | 备注 |
---|---|
Install_date | 统计日期 |
pay_user | 付费用户数 |
total_amount | 付费总金额 |
amount_mean | 人均付费 |
pay_user_xly | 训练营付费人数 |
total_amount_xly | 训练营付费总金额 |
amount_mean_xly | 训练营人均付费 |
pay_user_zl | 专栏付费人数 |
total_amount_zl | 专栏付费总金额 |
amount_mean_zl | 专栏人均付费 |
pay_user_member | 会员付费人数 |
total_amount_member | 会员付费总金额 |
amount_mean_member | 会员人均付费 |
导入数据:
create table 产品收入_总收入统计 as select
Install_date,
count(distinct user_id) as pay_user, sum(actual_amount_total) as total_amount,
if(count(distinct user_id) =0,0,sum(actual_amount_total)/count(distinct user_id)) as amount_mean,
#训练营
count(distinct case when income_type =1 then user_id else null end) as pay_user_xly,
sum(case when income_type =1 then actual_amount_total else 0 end) as total_amount_xly,
if(count(distinct case when income_type =1 then user_id else null end)
=0,0,coalesce(sum(case when income_type =1 then actual_amount_total else 0 end)/count(distinct case when income_type =1 then user_id else null end),0)) as amount_mean_xly,
#专栏
count(distinct case when income_type =2 then user_id else null end) as pay_user_zl,
sum(case when income_type =2 then actual_amount_total else 0 end) as total_amount_zl,
if(count(distinct case when income_type =2 then user_id else null end)=0,0,coalesce(sum(case when income_type =2 then actual_amount_total else 0 end)/count(distinct case when income_type =2 then user_id else null end),0)) as amount_mean_zl,
#会员
count(distinct case when income_type =3 then user_id else null end) as pay_user_member,
sum(case when income_type =3 then actual_amount_total else 0 end) as total_amount_member,
if(count(distinct case when income_type =3 then user_id else null end) <> 0,sum(case when income_type =3 then actual_amount_total else 0 end)/count(distinct case when income_type =3 then user_id else null end),0) as amount_mean_member
from
(
select user_id,
date as Install_date,
case when actual_amount > 200 and (order_type = '课程' or order_type is null) then 1
when actual_amount >=0 and actual_amount <= 200 and (order_type = '课程' or order_type is null) then 2
when order_type = '会员' then 3 else -1 end as income_type,
sum(actual_amount) as actual_amount_total from event_ok
where event = 'PayOrderSucceed'
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by 1,3,2
) PayOrderSucceed
where actual_amount_total>0 group by
Install_date order by Install_date
训练营收入统计
梳理顺序步骤
- 确认页面
广告位
商详页
付费
- 找到埋点数据的元数据
曝光,点击以及各环节的转化率
- 计算
字段 | 备注 |
---|---|
date | 统计日期 |
exp_xly_cnt | 广告曝光人数 |
exp_xly_pv | 广告曝光次数 |
exp_xly_step_mean | 广告人均曝光数 |
clk_xly_cnt | 广告点击人数 |
clk_xly_pv | 广告点击次数 |
xly_ctr_uv | 广告点击率 |
xly_sx_cnt_app | APP端进入训练营详情页用户数 |
clk2sx_pp |
进入训练营详情页转化率 = APP端进入详情页用户数/广告点击人数 (本应该是进入训练营详情页的人数/广告点击人数,但是由于进入详 情页的路径并不唯一所以转化率大于1了,而埋点上报数据并没有收 集来源) |
xly_sx_cnt | 进入训练营详情页的人数 |
xly_appointment_cnt | 训练营预约用户数(点击了领取课程大纲) |
sx2appointment_pp | 预约转化率=训练营预约用户数/进入训练营详情页的人数 |
pay_user_xly_cnt | 训练营付费用户数 |
clk2pay_rate_xly | 训练营付费用户转化率 = 训练营付费用户/训练营预约用户数 |
pay_xly_order_amount | 训练营付费总金额 |
xly_arppu | 人均付费金额 =训练营付费总金额/训练营付费用户数 |
value_per_sx_uv_xly | 单商详UV价值 = 训练营付费总金额/进入训练营详情页的人数 |
导入数据:
create table 产品收入_训练营收入统计 as
-- 神策数据select exp.*,
if(exp.exp_xly_cnt =0,0,exp.exp_xly_pv/exp.exp_xly_cnt) as exp_xly_step_mean, clk.clk_xly_cnt,
clk.clk_xly_pv,
-- 点击率
if(exp.exp_xly_cnt=0,0,clk.clk_xly_cnt/exp.exp_xly_cnt) as xly_ctr_uv,
-- app 用户数,点击率
exp_1.xly_sx_cnt_app, if(clk.clk_xly_cnt=0,0,exp_1.xly_sx_cnt_app/clk.clk_xly_cnt) as clk2sx_pp,
exp_1.xly_sx_cnt, clk_1.xly_appointment_cnt,
if(exp_1.xly_sx_cnt=0,0,clk_1.xly_appointment_cnt/exp_1.xly_sx_cnt) as sx2appointment_pp,
pay.pay_user_xly_cnt, if(clk_1.xly_appointment_cnt=0,0,pay.pay_user_xly_cnt/clk_1.xly_appointment_cnt) as clk2pay_rate_xly,
pay.pay_xly_order_amount,
if(pay.pay_user_xly_cnt =0,0,pay.pay_xly_order_amount/pay.pay_user_xly_cnt) as xly_arppu,
if(exp_1.xly_sx_cnt =0,0,pay.pay_xly_order_amount/exp_1.xly_sx_cnt) as value_per_sx_uv_xly
from (
select date,count(distinct user_id) as exp_xly_cnt,count(1) as exp_xly_pv from event_ok
-- 运营位曝光 新增埋点
where event = 'ViewOperatingPosition' and operating_type='训练营'
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by date
) exp
left outer join (
select date, count(distinct user_id) as clk_xly_cnt, count(user_id) as clk_xly_pv
from event_ok
-- 运营位点击
where event = 'ClickOperatingPosition' and operating_type='训练营'
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by date
) clk
on exp.date = clk.date
-- 训练营商详用户数
left join (
select date, count(distinct user_id) as xly_sx_cnt,
count(distinct case when platform_type in ('Android','iOS') then user_id else null end) as xly_sx_cnt_app
from event_ok
where event = 'WebView' and page_id in
('1pb3','1pb4','1pu4','1pu5','1pxi','1qb3','1qea','1qeb','1qg1','1qhg','1qjk','1 qph','1qzq')
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by date
) exp_1
on exp.date = exp_1.date
-- 训练营预约用户数(报名)
left outer join (
select date,count(distinct user_id) as xly_appointment_cnt from event_ok
-- web点击
where event = 'WebButtonClick' and page_id in
('1pb3','1pb4','1pu4','1pu5','1pxi','1qb3','1qea','1qeb','1qg1','1qhg','1qjk','1 qph','1qzq')
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by date
) clk_1
on clk.date = clk_1.date
-- 训练营付费left outer join (
select date , count(distinct user_id) as pay_user_xly_cnt, sum(actual_amount) as pay_xly_order_amount
from event_ok
-- 支付成功
where event = 'PayOrderSucceed' and actual_amount > 200
and order_type != '会员'
-- 时间字段
and date = '2020-08-15' -- 待补时间,当天日期
group by date
) pay
on exp.date = pay.date order by
exp.date
专栏收入统计
作用 存储每日专栏付费用户数,广告位的啋光数,点击数及各环节转化率
找到所有埋点事件力ViewOperat ingPosition(广告位噤光事件)数据表vop
专栏收入表
表原数据
表字段计算
找到所有埋点字件力ClickOperalingPosi tion(广告位点击子件)数据 表cop 找到所有埋点工件为ViewCourseDetai (l 浏览课程详 t’fi)数据表vod
找到所有埋点平件为CreateOrde(r 创建订单行为)数据 表co
找到所有埋点于件为PayOrderSu ccee(d 付费行为)并且0「der_type 1=” 会员 的数据 表pay 5个表进行关联, 关联条件为日期相等(由于曝光力第—事件)所以以exp 力主表
广告位名光用户数 表vop 每日去王user_id 用户数广告位谒光次数 表vop 每日记录数
入均限光且 运宫位隄光次数/运芦位呕光用户数广告位点击人数 表cop 每日去重use「_id 用户数广告位点击次数 表cop 母日记录数
用户卢击转化率 广告位点击人数1广告位噤光用户数 商详浏览入数 表vod 每日去三user_心 用户数
APP 商洋浏览人数 表vod中 platform_type in (‘Android’,’iOS’) 每日去玉user_id 用户数
APP浏览商详用户转化率 APP 商详浏览人数1广告位点击人数创建订单用户数 表co 每日去巠use「_id 用户数
创建订单用户转化率 创建订单用户数商详浏览人数付费用户数 表pos 每日去豆user_id 用户数
付费用户转化率 付费用户数i预约用户数付费总金额 表pos sum(actual_amount) 入均付费金额 付费总金额/付费用户数
商详UV价值 入均付费金额席消浏览入数
字段 | 备注 |
---|---|
date | 统计日期 |
exp_zl_cnt | 专栏广告位曝光人数 |
exp_zl_pv | 专栏广告位曝光量 |
exp_zl_step_mean | 人均广告位曝光量 |
clk_zl_cnt | 专栏广告位点击人数 |
clk_zl_pv | 专栏广告位点击次数 |
zl_ctr_uv | 广告位点击率= 专栏广告位点击人数/专栏广告位曝光人数 |
zl_sx_uv | 浏览课程详情页人数 |
zl_sx_uv_app | APP端进入课程详情页人数 |
zl_create_order_cnt | 创建订单用户数 |
open2create_order_pp | 订单转化率 = 创建订单用户数/浏览课程详情页人数 |
pay_user_zl_cnt | 付费用户数 |
clk2pay_rate_zl | 付费用户数转化率=付费用户数/创建订单用户数 |
pay_zl_order_amount | 付费总金额 |
zl_arppu | 人均付费金额=付费总金额/付费用户数 |
value_per_sx_uv_zl | 单商详UV价值 = 付费总金额/浏览课程详情页人数 |
导入数据:
— 神策数据
create table 产品收入_专栏收入统计 as select
vop.date, exp_zl_cnt, exp_zl_pv,
if(exp_zl_cnt=0,0,exp_zl_pv/exp_zl_cnt) as exp_zl_step_mean,
clk_zl_cnt, clk_zl_pv,
if(exp_zl_cnt=0,0,clk_zl_cnt/exp_zl_cnt) as zl_ctr_uv,
zl_sx_uv, zl_sx_uv_app,
— t21.zl_sx_uv_app/t2.clk_zl_cnt as clk2open_pp, zl_create_order_cnt,
if(zl_sx_uv=0,0,zl_create_order_cnt/zl_sx_uv) as open2create_order_pp,
pay_user_zl_cnt, if(zl_create_order_cnt=0,0,pay_user_zl_cnt/zl_create_order_cnt) as clk2pay_rate_zl,
pay_zl_order_amount,
if(pay_user_zl_cnt=0,0,pay_zl_order_amount/pay_user_zl_cnt) as zl_arppu, if(zl_sx_uv=0,0,pay_zl_order_amount/zl_sx_uv) as value_per_sx_uv_zl
from (
select date, count(distinct user_id) as exp_zl_cnt, count(1) as exp_zl_pv from event_ok
— 运营位曝光
where event = ‘ViewOperatingPosition’ and operating_type=’课程列表页’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) vop
left join (
select date , count(distinct user_id) as clk_zl_cnt,count(user_id) as clk_zl_pv
from event_ok
where event = ‘ClickOperatingPosition’ and operating_type=’课程列表页’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) cop
on vop.date = cop.date
left join (
select
date,count(distinct user_id) as zl_sx_uv,
count(distinct case when platform_type in (‘Android’,’iOS’) then user_id else null end) as zl_sx_uv_app
from event_ok
— 浏览售前课程详情页
where event = ‘ViewCourseDetail’
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) vcd
on vop.date = vcd.date
left join (
select date , count(distinct user_id) as zl_create_order_cnt from event_ok
— 创建订单
where event = ‘CreateOrder’ and order_amount < 200
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) co
on vop.date = co.date
— 专栏付费用户
left outer join (
select date, count(distinct user_id) as pay_user_zl_cnt, sum(actual_amount) as pay_zl_order_amount
from event_ok
— 支付成功
where event = ‘PayOrderSucceed’ and actual_amount <= 200 and actual_amount>0
— 时间字段
and date = ‘2020-08-15’ — 待补日期,当天日期
group by date
) pos
on vop.date = pos.date
order by vop.date;
会员收入统计
字段 | 备注 |
---|---|
date | 统计日期 |
member_sx_cnt | 浏览会员商详用户数 |
receive_vip_user_cnt | 领取会员用户数(7日免费会员) |
receive_pp | 领取会员转化率 = 领取会员用户数/浏览会员商详用户数 |
member_clk_cnt | 会员点击支付用户数(¥688 开通365天会员) |
clk_try2pay_pp | 点击支付转化率 = 会员点击支付用户数/浏览会员商详用户数 |
pay_user_hy_cnt | 购买会员用户数 |
try2scucess_pp | 支付转化率 = 购买会员用户数/会员点击支付用户数 |
pay_hy_order_amount | 购买总金额 |
arppu_hy | 人均消费 =购买总金额/购买会员用户数 |
导入数据:
create table 产品收入_会员收入统计 as select
rvf.date,
pv_k.member_sx_cnt,— 会员商详用户数
rvf.receive_vip_user_cnt,— 领取会员用户数
if(pv_k.member_sx_cnt =0,0,rvf.receive_vip_user_cnt/pv_k.member_sx_cnt) as receive_pp, — 领取转化率
member_clk_cnt,— 会员点击支付用户数
if(member_sx_cnt=0,0,member_clk_cnt/member_sx_cnt) as clk_try2pay_pp, — 点击转化率
coalesce(pay_user_hy_cnt,0) as pay_user_hy_cnt, — 购买会员用户数if(member_clk_cnt=0,0,coalesce(pay_user_hy_cnt/member_clk_cnt,0) )as try2scucess_pp, — 支付转化率
coalesce(pay_hy_order_amount,0) as pay_hy_order_amount, — 购买总金额
if(pay_user_hy_cnt=0,0,coalesce(pay_hy_order_amount/pay_user_hy_cnt,0)) as arppu_hy — 人均消费
from (
select date ,count(distinct user_id) as receive_vip_user_cnt from event_ok
— 领取会员成功
where event = ‘ReceiveVipFreely’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) rvf
— 进入会员详细页
left join (
select date,count(distinct user_id) as member_sx_cnt from event_ok
where event = ‘$pageview’ and $url like ‘%kaiwu.lagou.com/member%’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
— https://kaiwu.lagou.com/member/index
) pv_k
on rvf.date = pv_k.date
— 会员点击支付按钮
left join (
select date, count(distinct user_id) as member_clk_cnt from event_ok
where event = ‘$WebClick’
— and $url in (‘https://kaiwu.lagou.com/member/index','https://kaiwu.lagou.com/member/index? appType=C&lagoufrom=ios&appVersion=7.0.1&randomize=704021827’)
and $url like ‘%kaiwu.lagou.com/member%’ and $element_content = ‘¥688 开通365天会员’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) wc_k
on rvf.date = wc_k.date
— 付费成功left join (
select date , count(distinct user_id) as pay_user_hy_cnt, sum(actual_amount) as pay_hy_order_amount
from event_ok
where event = ‘PayOrderSucceed’ and actual_amount>0 and order_type = ‘会员’
— 时间字段
and date = ‘2020-08-15’ — 待补时间,当天日期
group by date
) pay
on rvf.date = pay.date order by rvf.date;
转化分析
下载渠道付费转化率
字段 | 备注 |
---|---|
channel | 渠道 |
num | 安装人数 |
pay_user | 付费人数 |
pay_rate | 付费转化率= 付费人数/安装人数 |
total_amount | 总付费金额 |
amount_mean | 人均付费金额=总付费金额/付费人数 |
导入数据:
create table 下载渠道付费转化率 as select
start_date, end_date,
channel,num,pay_user,
if(num=0,0,pay_user/num) as pay_rate,total_amount,amount_mean from
(
select if(DownloadChannel is null ,’iOS’,DownloadChannel) as channel, count(distinct user_id) num
from event_ok
where event = ‘AppInstall’
— 时间字段
and date = DATE_SUB(‘2020-08-15’, Interval 14 DAY) — 待补日期,当天日期
group by 1
) appinstall
安装并付费的用户信息left join
(
select sub_AI.channel_2,
count(distinct sub_AI.user_id) as pay_user, sum(pay.actual_amount_total) as total_amount, if(count(distinct sub_AI.user_id) =0,0,
sum(pay.actual_amount_total)/count(distinct sub_AI.user_id)) as amount_mean, max(pay.Install_date) as end_date,
min(pay.Install_date) as start_date from
(
select user_id, if(DownloadChannel is null ,’iOS’,DownloadChannel) as channel_2,date as Install_date
from event_ok
where event = ‘AppInstall’
— 时间字段
and date = DATE_SUB(‘2020-08-15’, Interval 14 DAY) — 待补日期,当天日期
group by 1, 2,3
) sub_AI
left join (
select user_id, date as Install_date,
case when actual_amount > 200 and order_type = ‘课程’ then 1
when actual_amount >0 and actual_amount <= 200 and order_type =
‘课程’ then 2
when order_type = ‘会员’ then 3 else -1 end as income_type,
sum(actual_amount) as actual_amount_total from event_ok
where event = ‘PayOrderSucceed’
— 时间字段
and date >= DATE_SUB(‘2020-08-15’, Interval 14 DAY) and date <= ‘2020-08-15’
— 待补日期 两周转化周期
group by 1,3,2
) pay on sub_AI.user_id = pay.user_id
where pay.actual_amount_total>0 group by 1
order by 2 desc
)temp on temp.channel_2 = appinstall.channel order by total_amount desc;
统计日报与周报
作用 以每周一为日期, 统计当前周的拉新,活 跃留存, 产品收入信息
用户拉新
用户活跃留存用户在线时长
周报 表原数据
总收入统计
训练营收入统计专栏收入统计 会员收入统计
以上各表以日期进行关联
表字段计算 对所有上述表进行汇总/7, 计算—周内日均数据
字段 | 备注 |
---|---|
统计日期 | 日报为当日日期,周报为当前周周一 |
激活人数 | |
新用户次日留存人数 | |
新用户次日留存率 | |
新用户7日留存人数 | |
新用户7日留存率 | |
新用户14日留存人数 | |
新用户14日留存率 | |
DAU | |
首次活跃人数 | |
首次活跃用户占比 | |
活跃用户次日留存人数 | |
活跃用户次日留存率 | |
活跃用户7日留存人数 | |
活跃用户7日留存率 | |
活跃用户14日留存人数 | |
活跃用户14日留存率 | |
平均单日使用时长(分) | |
总付费人数 | |
总付费金额 | |
客单价 | |
付费人数_训练营 | |
付费金额_训练营 | |
客单价_训练营 | |
付费人数_专栏 | |
付费金额_专栏 | |
客单价_专栏 | |
付费人数_会员 | |
付费金额_会员 | |
客单价_会员 |
字段 | 备注 |
---|---|
曝光人数_训练营 | |
曝光次数_训练营 | |
平均曝光次数_训练营 | |
点击人数训练营app | |
点击次数训练营app | |
点击率训练营uv | |
点击率训练营pv | |
浏览商详人数_训练营 | |
浏览商详人数训练营app | |
预约人数_训练营 | |
浏览商详-预约转化率_训练营 | |
预约-付费转化率_训练营 | |
单商详uv价值_训练营 | |
曝光人数_专栏 | |
曝光次数_专栏 | |
人均曝光次数_专栏 | |
点击人数_专栏 | |
点击率专栏uv | |
点击率专栏pv | |
浏览商详人数_专栏 | |
浏览商详人数专栏app | |
下单人数_专栏 | |
浏览商详-下单转化率_专栏 | |
下单-付费转化率_专栏 | |
单商详uv价值_专栏 | |
浏览商详人数_会员 | |
领取会员人数 | |
浏览商详-领取会员转化率_会员 | |
点击支付人数_会员 |
导入数据:
日报:
create table 日报 as select
— 用户拉新表内
用户活跃_用户活跃留存.install_date as ‘日期’, u_cnt_total as ‘ 激 活 人 数 ‘, retention_u_cnt_total as ‘新用户次日留存人数’,
if(u_cnt_total =0,0,retention_u_cnt_total/u_cnt_total) as “新用户次日留存率”,
retention_u_cnt_total_7 as “ 新 用 户 7 日 留 存 人 数 “, if(u_cnt_total=0,0,retention_u_cnt_total_7/u_cnt_total) as “新用户7日留存率”,
retention_u_cnt_total_14 as “ 新 用 户 14 日 留 存 人 数 “, if(u_cnt_total=0,0,retention_u_cnt_total_14/u_cnt_total) as “新用户14日留存率”,
— 用户活跃用户活跃留存 内
用户活跃用户活跃留存.dau as “DAU”, firstdau as “首次活跃人数”,
if(用户活跃用户活跃留存.dau =0,0,firstdau/用户活跃用户活跃留存.dau) as “首次活跃用户占比”,
retentiondau_1 as “活跃用户次日留存人数”,
if(用户活跃用户活跃留存.dau=0,0,retentiondau_1/用户活跃用户活跃留存.dau) as “活跃用户次日留存率”,
retentiondau_7 as “活跃用户7日留存人数”,
if(用户活跃用户活跃留存.dau=0,0,retentiondau_7/用户活跃用户活跃留存.dau) as “活跃用户7日留存率”,
retentiondau_14 as “活跃用户14日留存人数”,
if(用户活跃用户活跃留存.dau=0,0,retentiondau_14/用户活跃用户活跃留存.dau) as “活跃用户14日留存率”,
— 用户活跃用户在线时长
if(用户活跃用户在线时长.dau=0,0,totalduration/用户活跃用户在线时长.dau/60) as “平均单日使用时长(分)”,
— 产品收入_总收入统计pay_user as “总付费人数”, total_amount as “总付费金额”,
if(pay_user=0,0,total_amount/pay_user) as “客单价”,
payuser_xly as “付费人数训练营”, totalamount_xly as “付费金额训练营”,
coalesce(totalamount_xly/pay_user_xly,0) as “客单价训练营”,
payuser_zl as “付费人数专栏”, totalamount_zl as “付费金额专栏”,
if(payuser_zl=0,0,coalesce(total_amount_zl/pay_user_zl,0)) as “客单价专栏”,
payuser_member as “付费人数会员”, totalamount_member as “付费金额会员”,
if(payuser_member <> 0,total_amount_member/pay_user_member,0) as “客单价会员”,
— 产品收入训练营收入统计exp_xly_cnt as “曝光人数训练营”, expxly_pv as “曝光次数训练营”,
if(expxly_cnt<>0, exp_xly_pv/exp_xly_cnt,0) as “平均曝光次数训练营”, clkxly_cnt as “点击人数训练营app”,
clk_xly_pv as “点击次数训练营app”,
if(exp_xly_cnt<>0, clk_xly_cnt/exp_xly_cnt,0) as “点击率训练营uv”, if(exp_xly_pv<>0, clk_xly_pv/exp_xly_pv,0) as “点击率训练营_pv”,
— 产品收入训练营收入统计
xly_sx_cnt as “浏览商详人数训练营”, xlysx_cnt_app as “浏览商详人数训练营_app”,
xlyappointment_cnt as “ 预 约 人 数 训 练 营 “, if(xlysx_cnt<>0,xly_appointment_cnt/xly_sx_cnt,0) as “浏览商详-预约转化率训练营”, if(xlyappointment_cnt<>0,pay_user_xly_cnt/xly_appointment_cnt,0) as “预约-付费转化率训练营”,
if(xlysx_cnt<>0,pay_xly_order_amount/xly_sx_cnt,0) as “单商详uv价值训练营”,
— 产品收入_专栏收入统计
expzl_cnt as “曝光人数专栏”, expzl_pv as “曝光次数专栏”,
if(expzl_cnt<>0,exp_zl_pv/exp_zl_cnt,0) as “人均曝光次数专栏”, clkzl_cnt/7 as “ 点 击 人 数 专 栏 “ , if(expzl_cnt<>0,clk_zl_cnt/exp_zl_cnt,0) as “点击率专栏uv”, if(exp_zl_pv<>0,clk_zl_pv/exp_zl_pv,0) as “点击率专栏_pv”,
zlsx_uv as “浏览商详人数专栏” , zlsx_uv_app as “浏览商详人数专栏app”, zl_create_order_cnt as “下单人数专栏”,
if(zlsx_uv<>0,zl_create_order_cnt/zl_sx_uv,0) as “浏览商详-下单转化率专栏”, if(zlcreate_order_cnt<>0,pay_user_zl_cnt/zl_create_order_cnt,0) as “下单-付费转化率专栏”,
if(zlsx_uv<>0,pay_zl_order_amount/zl_sx_uv,0) as “单商详uv价值专栏”,
— 产品收入会员收入统计
member_sx_cnt as “浏览商详人数会员”, receivevip_user_cnt as “领取会员人数”,
if(member_sx_cnt <> 0, receive_vip_user_cnt / member_sx_cnt,0) as “浏览商详-领取会员转化率会员”,
memberclk_cnt as “点击支付人数会员” from 用户活跃用户活跃留存
left join (select DISTINCT install_date, u_cnt_total, — 日安装用户总数retention_u_cnt_total, — 次日留存用户总数retention_u_cnt_total_7, — 日留存用户总书
retention_u_cnt_total_14 — 日留存用户总书
from 用户拉新) as 新用户
on 用户活跃用户活跃留存.Installdate = 新用户.install_date
left join 用户活跃用户在线时长 on 用户活跃_用户在线时长.Install_date = 新用户.install_date
left join 产品收入总收入统计 on 产品收入总收入统计.Installdate = 新用户.install_date
left join 产品收入训练营收入统计 on 产品收入训练营收入统计.date = 新用户.install_date left join 产品收入专栏收入统计 on 产品收入专栏收入统计.date = 新用户.install_date left join 产品收入会员收入统计 on 产品收入会员收入统计.date = 新用户.install_date where 用户活跃用户活跃留存.install_date = ‘2020-08-15’; — 待补时间,当天日期
周报:
create table 周报 as select
— 用户拉新表内
DATESUB(用户活跃用户活跃留存.installdate,INTERVAL WEEKDAY(用户活跃用户活跃留存.install_date) DAY) as ‘当周周一’,
sum(u_cnt_total)/7 as ‘ 激 活 人 数 ‘, sum(retention_u_cnt_total)/7 as ‘新用户次日留存人数’,
if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total)/sum(u_cnt_total)) as “新用户次日留存率”,
sum(retention_u_cnt_total_7)/7 as “ 新 用 户 7 日 留 存 人 数 “, if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total_7)/sum(u_cnt_total)) as “新用户7日留存率”,
sum(retention_u_cnt_total_14)/7 as “ 新 用 户 14 日 留 存 人 数 “, if(sum(u_cnt_total)=0,0,sum(retention_u_cnt_total_14)/sum(u_cnt_total)) as “新用户14日留存率”,
— 用户活跃用户活跃留存 内
sum(用户活跃用户活跃留存.dau)/7 as “DAU”, sum(firstdau)/7 as “首次活跃人数”,
if(sum(用户活跃用户活跃留存.dau)=0,0,sum(firstdau)/sum(用户活跃用户活跃留存.dau)) as “首次活跃用户占比”,
sum(retentiondau_1)/7 as “活跃用户次日留存人数”,
if(sum(用户活跃用户活跃留存.dau) =0,0,sum(retentiondau_1)/sum(用户活跃用户活跃留存.dau)) as “活跃用户次日留存率”,
sum(retentiondau_7)/7 as “活跃用户7日留存人数”,
if(sum(用户活跃用户活跃留存.dau)=0,0,sum(retentiondau_7)/sum(用户活跃用户活跃留 存.dau)) as “活跃用户7日留存率”,
sum(retentiondau_14)/7 as “活跃用户14日留存人数”,
if(sum(用户活跃用户活跃留存.dau)=0,0,sum(retentiondau_14)/sum(用户活跃用户活跃留 存.dau)) as “活跃用户14日留存率”,
— 用户活跃用户在线时长
if(sum(用户活跃用户在线时长.dau)=0,0,sum(totalduration)/sum(用户活跃用户在线时 长.dau)/60) as “平均单日使用时长(分)”,
— 产品收入_总收入统计sum(pay_user)/7 as “总付费人数”, sum(total_amount)/7 as “总付费金额”,
if(sum(pay_user)=0,0,sum(total_amount)/sum(pay_user)) as “客单价”,
sum(payuser_xly)/7 as “付费人数训练营”, sum(totalamount_xly) /7 as “付费金额训练营”,
if(sum(payuser_xly)=0,0,coalesce(sum(total_amount_xly)/sum(pay_user_xly),0)) as “客单价训练营”,
sum(payuser_zl)/7 as “付费人数专栏”, sum(totalamount_zl)/7 as “付费金额专栏”,
if(sum(payuser_zl)=0,0,coalesce(sum(total_amount_zl)/sum(pay_user_zl),0)) as “客单价专栏”,
sum(payuser_member)/7 as “付费人数会员”, sum(totalamount_member)/7 as “付费金额会员”,
if(sum(payuser_member) <> 0,sum(total_amount_member)/sum(pay_user_member),0) as “客单价会员”,
— 产品收入训练营收入统计sum(exp_xly_cnt)/7 as “曝光人数训练营”, sum(expxly_pv)/7 as “曝光次数训练营”,
if(sum(expxly_cnt)<>0, sum(exp_xly_pv)/sum(exp_xly_cnt),0) as “平均曝光次数训练营”,
sum(clkxly_cnt)/7 as “点击人数训练营app”, sum(clk_xly_pv)/7 as “点击次数训练营app”,
if(sum(exp_xly_cnt)<>0, sum(clk_xly_cnt)/sum(exp_xly_cnt),0) as “点击率训练营
uv”,
if(sum(exp_xly_pv)<>0, sum(clk_xly_pv)/sum(exp_xly_pv),0) as “点击率训练营_pv”,
— 产品收入训练营收入统计
sum(xly_sx_cnt)/7 as “浏览商详人数训练营”, sum(xlysx_cnt_app)/7 as “浏览商详人数训练营_app”,
sum(xlyappointment_cnt)/7 as “ 预 约 人 数 训 练 营 “, if(sum(xlysx_cnt)<>0,sum(xly_appointment_cnt)/sum(xly_sx_cnt),0) as “浏览商详-预约 转 化 率 训 练 营 “, if(sum(xlyappointment_cnt)<>0,sum(pay_user_xly_cnt)/sum(xly_appointment_cnt),0) as “ 预 约 - 付 费 转 化 率 训 练 营 “, if(sum(xlysx_cnt)<>0,sum(pay_xly_order_amount)/sum(xly_sx_cnt),0) as “单商详uv价值训练营”,
— 产品收入_专栏收入统计
sum(expzl_cnt)/7 as “曝光人数专栏”, sum(expzl_pv)/7 as “曝光次数专栏”,
if(sum(expzl_cnt)<>0,sum(exp_zl_pv)/sum(exp_zl_cnt),0) as “人均曝光次数专栏”, sum(clkzl_cnt)/7 as “ 点 击 人 数 专 栏 “ , if(sum(expzl_cnt)<>0,sum(clk_zl_cnt)/sum(exp_zl_cnt),0) as “点击率专栏uv”, if(sum(exp_zl_pv)<>0,sum(clk_zl_pv)/sum(exp_zl_pv),0) as “点击率专栏_pv”,
sum(zlsx_uv)/7 as “浏览商详人数专栏” , sum(zlsx_uv_app)/7 as “浏览商详人数专栏app”, sum(zl_create_order_cnt)/7 as “下单人数专栏”,
if(sum(zlsx_uv)<>0,sum(zl_create_order_cnt)/sum(zl_sx_uv),0) as “浏览商详-下单转化率专栏”,
if(sum(zlcreate_order_cnt)<>0,sum(pay_user_zl_cnt)/sum(zl_create_order_cnt),0) as “下单-付费转化率专栏”,
if(sum(zlsx_uv)<>0,sum(pay_zl_order_amount)/sum(zl_sx_uv),0) as “单商详uv价值专栏”,
— 产品收入会员收入统计
sum(member_sx_cnt)/7 as “浏览商详人数会员”, sum(receivevip_user_cnt)/7 as “领取会员人数”,
if(sum(member_sx_cnt) <> 0,sum(receive_vip_user_cnt)/sum(member_sx_cnt),0) as “浏览商详-领取会员转化率会员”,
sum(memberclk_cnt)/7 as “点击支付人数会员”
from 用户活跃用户活跃留存
left join
(select DISTINCT install_date, u_cnt_total, — 日安装用户总数retention_u_cnt_total, — 次日留存用户总数retention_u_cnt_total_7, — 日留存用户总书retention_u_cnt_total_14 — 日留存用户总书
from 用户拉新) as 新用户 on 用户活跃用户活跃留存.Installdate = 新用户.install_date left join 用户活跃用户在线时长 on 用户活跃用户在线时长.Install_date = 新用
户.install_date
left join 产品收入总收入统计 on 产品收入总收入统计.Install_date = 新用户.install_date
left join 产品收入训练营收入统计 on 产品收入训练营收入统计.date = 新用户.install_date left join 产品收入专栏收入统计 on 产品收入专栏收入统计.date = 新用户.install_date left join 产品收入会员收入统计 on 产品收入会员收入统计.date = 新用户.install_date where 用户活跃用户活跃留存.installdate >= ‘2020-08-09’ and 用户活跃用户活跃留
存.install_date <= ‘2020-08-15’ — 待补时间,补充时间段要为周一到周日
group by 1;
调度系统(见任务三讲义)
- 报表系统(见任务三讲义)