作业:(满分:100分)
一、建表、统计与程序异常分析(60分)
某公司地区业务有两张门店相关的表(数据见:homework.xlsx):
一张为门店信息表:
表名:area_table
字段内容 字段名 字段类型 长度
店铺id store_id varchar 10
店铺位置 area varchar 20
店长id leader_id int
第二张为门店销售信息表:
表名:store_table
字段内容 字段名 字段类型 长度
店铺id store_id varchar 10
订单id order_id varchar 12
销量 sales_volume int
销售日期 salesdate date
1.建表:(20分)
为以上两张表建立Mysql表格,并导入数据
2.统计(20分)
2.1.统计每日每个店铺的销量(10分)
2.2.统计每日每个区域的销量(10分)
3.程序异常(20分)
本公司有报表需要展示区域“alpha”的前一日所有店铺销量,示例如下:
店铺 昨日销量
store1 124
store2 532
store3 325
store4 665
某天由于“store3”停业,当日“store3”在store_table表中没有数据。
当第二天需要出昨日报表数据时。
报表原本预想展示的数据如下:
店铺 昨日销量
store1 168
store2 480
store3 0
store4 720
但是当日报表实际展示为:
店铺 昨日销量
store1 168
store2 480
store4 720
后台代码为:
select
a.store_id
,IFNULL(sum(sales_volume),0) as sales_volumes
from area_table a
left join
store_table b
on a.store_id=b.store_id
where a.area=’alpha’
and b.salesdate=DATE_SUB(curdate(),INTERVAL 1 DAY)
GROUP BY a.store_id
order by a.store_id;
注: DATE_SUB(curdate(),INTERVAL 1 DAY) 该函数意为:求出昨天的日期;
实现过程:1.使用curdate()求出当天日期;
2.使用DATE_SUB(当天日期,INTERVAL 1 DAY)实现当天日期减一天的日期,即昨日日期;
3.1.请分析代码为什么无法执行出期望结果?(10分)
3.2.请修改代码,产出报表的设想结果。(10分)
二、SQL语句阅读能力(40分)
能够读懂业务SQL是一个数据分析师的基本技能。掌握SQL的执行顺序,才能正确地读懂SQL语句。
执行顺序
With as —> FROM(JOIN 部分一般先左后右) —> WHERE —> GROUP BY —> HAVING —> SELECT —> ORDER BY
每个部分中,先执行子查询内部,再将子查询看做一个整体,按普通顺序执行。
多个子查询嵌套,最先执行最内部的子查询。
— 1.请按如下方式写出SQL语句的执行顺序
— 例子:
select uclass,sum(money) as uclass_money
from user_table a
join
(select
uid
,money
from user_earn_table
where money is not null
) b
on a.uid = b.uid
group by uclass
order by uclass_money desc
limit 100;
拆解示意:
1.from user_earn_table
2.where money is not null
3.select
uid,money
4.from user_table a
5.join
(select
uid,money
from user_earn_table
where money is not null
) b
on a.uid = b.uid
6.group by uclass
7.select uclass,sum(money) as uclass_money
8.order by uclass_money desc
9.limit 100;
— 请拆解如下语句:
— 本语句为某视频网站公司对于k-pop业务线的各渠道用户、新增用户数据抽取语句
with temp_active_hour_table_kps as
(
select
a0.dt
,product_id
,mkey
,substr(FROM_UNIXTIME(st_time),12,2) as hour
,a0.device_id
from
(select
dt
,product_id
,st_time
,device_id
from kps_dwd.kps_dwd_dd_view_user_active
where dt=’${dt_1}’
) a0
left join
(select
dt
,mkey
,device_id
from kps_dwd.kps_dwd_dd_user_channels
where dt=’${dt_1}’
) a1
on a0.device_id = a1.device_id
)
select
dt
,product
,product_id
,a1.mkey
,name_cn
,hour
,status
,dau
,new
from
(select
dt
,'K-pop' as product
,product_id
,mkey
,hour
,status
,count(distinct a.device_id) as dau
,count(distinct if(b.device_id is not null,a.device_id,null)) as new
from
(select
dt
,product_id
,mkey
,hour
,device_id
,'active' as status
from temp_active_hour_table_kps
group by dt,mkey,product_id,device_id,hour
union all
select
dt
,product_id
,mkey
,min(hour) as hour
,device_id
,'first' as status
from temp_active_hour_table_kps
group by dt,mkey,product_id,device_id
) a
left join
(
select
dt
,device_id
from kps_dwd.kps_dwd_dd_fact_view_new_user
where dt='${dt_1}'
group by dt,device_id
)b on a.dt=b.dt and a.device_id = b.device_id
group by dt
,product_id
,mkey
,hour
,status
) a1
left join
asian_channel.dict_lcmas_channel b1
on a1.mkey = b1.mkey;
— 拆解步骤如下,请将下列横线处拆解内容补充完整
— 以下题4个空,每空10分
1.from kps_dwd.kps_dwd_dd_view_user_active
2.where dt=’${dt_1}’
3.select
dt
,product_id
,st_time
,device_id
4.from
(……) a0
5.left join
(select
dt
,mkey
,device_id
from kps_dwd.kps_dwd_dd_user_channels
where dt='${dt_1}'
) a1
on a0.device_id = a1.device_id
6.__
7.select
dt
,product_id
,mkey
,hour
,device_id
,'active' as status
from temp_active_hour_table_kps
group by dt,mkey,product_id,device_id,hour
8.__
9.from
(……) a
10.__
11.from
(select
dt
,'K-pop' as product
,product_id
,mkey
,hour
,status
,count(distinct a.device_id) as dau
,count(distinct if(b.device_id is not null,a.device_id,null)) as new
……
group by dt
,product_id
,mkey
,hour
,status
) a1
12.__
13.select
dt
,product
,product_id
,a1.mkey
,name_cn
,hour
,status
,dau
,new