1. 组合两个表

需求:编写一个 SQL 查询,对两表进行关联,展示列为:
FirstName, LastName, City, State

展示效果:

FirstName LastName City State
Allen Wang New York City New York
  1. Create table If Not Exists 1_Person (PersonId int, FirstName varchar(255), LastName varchar(255));
  2. Create table If Not Exists 1_Address (AddressId int, PersonId int, City varchar(255), State varchar(255));
  3. Truncate table 1_Person;
  4. Truncate table 1_Address;
  5. insert into 1_Person (PersonId, LastName, FirstName) values (1, 'Wang', 'Allen');
  6. insert into 1_Address (AddressId, PersonId, City, State) values (1, 1, 'New York City', 'New York');

最终SQL:

  1. select
  2. p.FirstName,
  3. p.LastName,
  4. a.City,
  5. a.State
  6. from
  7. 1_Person as p
  8. left join
  9. 1_Address as a
  10. on
  11. p.PersonId = a.PersonId;

2. 第二高的薪水

需求一:编写一个 SQL 查询,获取 Employee 表中第二高的薪水(Salary)。如果不存在第二高的薪水,那么查询应返回 null。

展示效果:

SecondHighestSalary
200

建表语句:

  1. Create table If Not Exists 2_Employee (Id int, Salary int);
  2. Truncate table 2_Employee;
  3. insert into 2_Employee (Id, Salary) values (1, 100);
  4. insert into 2_Employee (Id, Salary) values (2, 200);
  5. insert into 2_Employee (Id, Salary) values (3, 300);

最终SQL:

  1. -- 方法一:
  2. select
  3. IFNULL((select
  4. DISTINCT Salary
  5. from
  6. 2_Employee
  7. order by
  8. Salary DESC
  9. limit 1,1
  10. ), NULL) as SecondHighestSalary;
  11. -- 方法二:
  12. select
  13. max(Salary) as SecondHighestSalary
  14. from
  15. 2_Employee
  16. where
  17. Salary < (select
  18. max(Salary)
  19. from
  20. 2_Employee
  21. );
  22. -- 方法三:
  23. select
  24. max(e1.salary) as SecondHighestSalary
  25. from
  26. 2_Employee e1,
  27. 2_Employee e2
  28. group by
  29. e1.id
  30. having
  31. sum(if(e1.salary > e2.salary,1,0)) = 1;

提示:LIMIT 子句可以被用于强制 SELECT 语句返回指定的记录数。LIMIT 接受一个或两个数字参数。参数必须是一个整数常量。如果给定两个参数,第一个参数指定第一个返回记录行的偏移量,第二个参数指定返回记录行的最大数目。

需求二:编写一个 SQL 查询,获取 Employee 表中第 n 高的薪水(Salary)。

  1. -- 方法一:
  2. CREATE FUNCTION getNthHighestSalary_1(N INT) RETURNS INT
  3. BEGIN
  4. SET n = N-1;
  5. RETURN (
  6. SELECT DISTINCT Salary FROM 2_Employee ORDER BY Salary DESC LIMIT n,1
  7. );
  8. END;
  9. select getNthHighestSalary_1(2) ;
  10. -- 方案二:
  11. CREATE FUNCTION getNthHighestSalary_2(N INT) RETURNS INT
  12. BEGIN
  13. RETURN (
  14. SELECT IF(count<N,NULL,min)
  15. FROM
  16. (SELECT
  17. MIN(Salary) AS min, COUNT(1) AS count
  18. FROM
  19. (SELECT
  20. DISTINCT Salary
  21. FROM
  22. 2_Employee
  23. ORDER BY
  24. Salary DESC
  25. LIMIT N) AS a
  26. ) as b
  27. );
  28. END;
  29. select getNthHighestSalary_2(2) ;

3. 分数排名

需求:编写一个 SQL 查询来实现分数排名。如果两个分数相同,则两个分数排名(Rank)相同。请注意,平分后的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”。

展示效果:

Score Rank
4.00 1
4.00 1
3.85 2
3.65 3
3.65 3
3.50 4
  1. Create table If Not Exists 3_Scores (Id int, Score DECIMAL(3,2));
  2. Truncate table 3_Scores;
  3. insert into 3_Scores (Id, Score) values (1, 3.5);
  4. insert into 3_Scores (Id, Score) values (2, 3.65);
  5. insert into 3_Scores (Id, Score) values (3, 4.0);
  6. insert into 3_Scores (Id, Score) values (4, 3.85);
  7. insert into 3_Scores (Id, Score) values (5, 4.0);
  8. insert into 3_Scores (Id, Score) values (6, 3.65);

最终SQL:

  1. -- 方法一:
  2. select
  3. a.Score as score ,
  4. (select
  5. count(distinct b.Score)
  6. from
  7. 3_Scores b
  8. where
  9. b.Score >=a.Score) as `rank`
  10. from
  11. 3_Scores a
  12. order by
  13. Score DESC;
  14. -- 方法二:
  15. select
  16. Score,
  17. dense_rank() over(order by Score desc) `rank`
  18. from
  19. 3_Scores;

4. 连续出现的数字

需求:编写一个 SQL 查询,查找所有至少连续出现三次的数字。

展示效果:

ConsecutiveNums
1
  1. Create table If Not Exists 4_Logs (Id int, Num int);
  2. Truncate table 4_Logs;
  3. insert into 4_Logs (Id, Num) values (1, 1);
  4. insert into 4_Logs (Id, Num) values (2, 1);
  5. insert into 4_Logs (Id, Num) values (3, 1);
  6. insert into 4_Logs (Id, Num) values (4, 2);
  7. insert into 4_Logs (Id, Num) values (5, 1);
  8. insert into 4_Logs (Id, Num) values (6, 2);
  9. insert into 4_Logs (Id, Num) values (7, 2);

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. l1.Num
  4. FROM
  5. 4_Logs l1,
  6. 4_Logs l2,
  7. 4_Logs l3
  8. WHERE
  9. l1.Id = l2.Id - 1 AND l1.Num = l2.Num
  10. AND l2.Id = l3.Id - 1 AND l2.Num = l3.Num;
  11. -- 方法二:
  12. SELECT
  13. l1.Num
  14. FROM
  15. 4_Logs l1
  16. left join
  17. 4_Logs l2
  18. on
  19. l1.Id = l2.Id - 1
  20. left join
  21. 4_Logs l3
  22. on
  23. l2.Id = l3.Id - 1
  24. where
  25. l1.num = l2.num and l2.num = l3.num;
  26. -- 方法三:
  27. select distinct Num ConsecutiveNums
  28. from
  29. (select
  30. Num,
  31. lead(Num,1,null) over(order by id) n2,
  32. lead(Num,2,null) over(order by id) n3
  33. from 4_Logs
  34. )t1
  35. where Num = n2 and Num = n3;

5. 超过经理收入的员工

需求:Employee 表包含所有员工,他们的经理也属于员工。每个员工都有一个 Id,此外还有一列对应员工的经理的 Id。编写一个 SQL 查询,该查询可以获取收入超过他们经理的员工的姓名。

数据样式:

Id Name Salary ManagerId
1 Joe 70000 3
2 Henry 80000 4
3 Sam 60000 null
4 Max 90000 null

展示效果:

Employee
Joe
  1. create table If Not Exists 5_Employee (Id int, Name varchar(255), Salary int, ManagerId int);
  2. truncate table 5_Employee;
  3. insert into 5_Employee (Id, Name, Salary, ManagerId) values (1, 'Joe', 70000, 3);
  4. insert into 5_Employee (Id, Name, Salary, ManagerId) values (2, 'Henry', 80000, 4);
  5. insert into 5_Employee (Id, Name, Salary, ManagerId) values (3, 'Sam', 60000, null);
  6. insert into 5_Employee (Id, Name, Salary, ManagerId) values (4, 'Max', 90000, null);

最终SQL:

  1. SELECT
  2. a.NAME AS Employee
  3. FROM
  4. 5_Employee AS a
  5. JOIN
  6. 5_Employee AS b
  7. ON
  8. a.ManagerId = b.Id
  9. AND
  10. a.Salary > b.Salary;

6. 查找重复的邮箱

需求:编写一个 SQL 查询,查找 Person 表中所有重复的电子邮箱。

展示效果:

Email
a@b.com
  1. Create table If Not Exists 6_Person (Id int, Email varchar(255));
  2. Truncate table 6_Person;
  3. insert into 6_Person (Id, Email) values (1, 'a@b.com');
  4. insert into 6_Person (Id, Email) values (2, 'c@d.com');
  5. insert into 6_Person (Id, Email) values (3, 'a@b.com');

最终SQL:

  1. select
  2. Email
  3. from
  4. 6_Person
  5. group by
  6. Email
  7. having
  8. count(Email) > 1;

7. 从不订购的客户

需求:某网站包含两个表,Customers 表和 Orders 表。编写一个 SQL 查询,找出所有从不订购任何东西的客户。

展示效果:

Customers
Henry
Max
  1. Create table If Not Exists 7_Customers (Id int, Name varchar(255));
  2. Create table If Not Exists 7_Orders (Id int, CustomerId int);
  3. Truncate table 7_Customers;
  4. insert into 7_Customers (Id, Name) values (1, 'Joe');
  5. insert into 7_Customers (Id, Name) values (2, 'Henry');
  6. insert into 7_Customers (Id, Name) values (3, 'Sam');
  7. insert into 7_Customers (Id, Name) values (4, 'Max');
  8. Truncate table 7_Orders;
  9. insert into 7_Orders (Id, CustomerId) values (1, 3);
  10. insert into 7_Orders (Id, CustomerId) values (2, 1);

最终SQL:

  1. -- 方法一:
  2. select
  3. c.name as 'Customers'
  4. from
  5. 7_Customers as c
  6. where
  7. c.id not in(select customerid from 7_Orders);
  8. -- 方法二:
  9. select
  10. c.Name Customers
  11. from
  12. 7_Customers c
  13. left join
  14. 7_Orders o
  15. on
  16. c.id = o.CustomerId
  17. where
  18. o.id is null;

8. 部门工资最高的员工

需求一:编写一个 SQL 查询,找出每个部门工资最高的员工。例如,根据上述给定的表格,Max 在 IT 部门有最高工资,Henry 在 Sales 部门有最高工资。

展示效果:

Department Employee Salary
IT Jim 90000
IT Max 90000
Sales Henry 80000
  1. Create table If Not Exists 8_Employee (Id int, Name varchar(255), Salary int, DepartmentId int);
  2. Create table If Not Exists 8_Department (Id int, Name varchar(255));
  3. Truncate table 8_Employee;
  4. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (1, 'Joe', 75000, 1);
  5. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (2, 'Jim', 90000, 1);
  6. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (3, 'Henry', 80000, 2);
  7. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (4, 'Sam', 60000, 2);
  8. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (5, 'Max', 90000, 1);
  9. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (6, 'Randy', 85000, 1);
  10. insert into 8_Employee (Id, Name, Salary, DepartmentId) values (7, 'Will', 70000, 1);
  11. Truncate table 8_Department;
  12. insert into 8_Department (Id, Name) values (1, 'IT');
  13. insert into 8_Department (Id, Name) values (2, 'Sales');

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. d.name AS 'Department',
  4. e.name AS 'Employee',
  5. Salary
  6. FROM
  7. 8_Employee as e
  8. JOIN
  9. 8_Department as d
  10. ON
  11. e.DepartmentId = d.Id
  12. WHERE
  13. (e.DepartmentId , Salary) IN
  14. ( SELECT
  15. DepartmentId, MAX(Salary)
  16. FROM
  17. 8_Employee
  18. GROUP BY DepartmentId
  19. );
  20. -- 方法二:
  21. select
  22. Department,Employee,Salary
  23. from
  24. (select
  25. d.Name Department,
  26. e.Name Employee,
  27. e.Salary,
  28. rank() over(partition by d.id order by Salary desc) rk
  29. from
  30. 8_Employee e
  31. join
  32. 8_Department d
  33. on e.DepartmentId=d.id
  34. )tmp
  35. where rk = 1

需求二:编写一个 SQL 查询,找出每个部门获得前三高工资的所有员工。

展示效果:

Department Employee Salary
IT Max 90000
IT Jim 90000
IT Randy 85000
IT Joe 75000
Sales Henry 80000
Sales Sam 60000

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. d.Name AS 'Department', e1.Name AS 'Employee', e1.Salary
  4. FROM
  5. 8_Employee e1
  6. JOIN
  7. 8_Department d
  8. ON
  9. e1.DepartmentId = d.Id
  10. WHERE
  11. (SELECT
  12. COUNT(DISTINCT e2.Salary)
  13. FROM
  14. 8_Employee e2
  15. WHERE
  16. e1.Salary < e2.Salary
  17. AND
  18. e1.DepartmentId = e2.DepartmentId) < 3
  19. order by Department, e1.salary desc;
  20. -- 方法二:
  21. select
  22. Department,
  23. Employee,
  24. Salary
  25. from
  26. (select
  27. d.Name Department,
  28. e.Name Employee,
  29. e.Salary,
  30. dense_rank() over(partition by d.id order by Salary desc) rk
  31. from
  32. 8_Employee e
  33. join
  34. 8_Department d
  35. on
  36. e.DepartmentId=d.id
  37. )tmp
  38. where rk <=3

9. 删除重复的电子邮箱

需求:编写一个 SQL 查询,来删除 Person 表中所有重复的电子邮箱,重复的邮箱里只保留 Id 最小 的那个。

展示效果:

Id Email
1 john@example.com
2 bob@example.com
  1. Create table If Not Exists 9_Person (Id int, email varchar(255));
  2. Truncate table 9_Person;
  3. insert into 9_Person (Id, email) values (1, 'john@example.com');
  4. insert into 9_Person (Id, email) values (2, 'bob@example.com');
  5. insert into 9_Person (Id, email) values (3, 'john@example.com');

最终SQL:

  1. DELETE
  2. p1
  3. FROM
  4. 9_Person p1,
  5. 9_Person p2
  6. WHERE
  7. p1.Email = p2.Email AND p1.Id > p2.Id;

10. 上升的温度

需求:编写一个 SQL 查询,来查找与之前(昨天的)日期相比温度更高的所有日期的 Id。

Id
2
4
  1. Create table If Not Exists 10_Weather (Id int, RecordDate date, Temperature int);
  2. Truncate table 10_Weather;
  3. insert into 10_Weather (Id, RecordDate, Temperature) values (1, '2015-01-01', 10);
  4. insert into 10_Weather (Id, RecordDate, Temperature) values (2, '2015-01-02', 25);
  5. insert into 10_Weather (Id, RecordDate, Temperature) values (3, '2015-01-03', 20);
  6. insert into 10_Weather (Id, RecordDate, Temperature) values (4, '2015-01-04', 30);

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. w1.id AS 'Id'
  4. FROM
  5. 10_Weather w1
  6. JOIN
  7. 10_Weather w2
  8. ON
  9. DATEDIFF(w1.RecordDate, w2.RecordDate) = 1
  10. AND w1.Temperature > w2.Temperature;
  11. -- 方法二:
  12. select
  13. Id
  14. from
  15. (select
  16. Id,
  17. RecordDate,
  18. Temperature,
  19. lag(RecordDate,1,9999-99-99) over (order by RecordDate) yd,
  20. lag(Temperature,1,999) over(order by RecordDate ) yt
  21. from
  22. 10_Weather
  23. )tmp
  24. where
  25. Temperature > yt and datediff(RecordDate,yd)=1;

11. 行程和用户

需求:写一段 SQL 语句查出 2019年10月1日 至 2019年10月3日 期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。取消率的计算方式如下:(被司机或乘客取消的非禁止用户生成的订单数量) / (非禁止用户生成的订单总数)

Trips表:所有出租车的行程信息。每段行程有唯一键 Id,Client_Id 和 Driver_Id 是 Users 表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。

Id Client_Id Driver_Id City_Id Status Request_at
1 1 10 1 completed 2019-10-01
2 2 11 1 cancelled_by_driver 2019-10-01
3 3 12 6 completed 2019-10-01
4 4 13 6 cancelled_by_client 2019-10-01
5 1 10 1 completed 2019-10-02
6 2 11 6 completed 2019-10-02
7 3 12 6 completed 2019-10-02
8 2 12 12 completed 2019-10-03
9 3 10 12 completed 2019-10-03
10 4 13 12 cancelled_by_driver 2019-10-03

Users 表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止,Role 则是一个表示(‘client’, ‘driver’, ‘partner’)的枚举类型。

Users_Id Banned Role
1 No client
2 Yes client
3 No client
4 No client
10 No driver
11 No driver
12 No driver
13 No driver

展示效果:

Day Cancellation Rate
2019-10-01 0.33
2019-10-02 0.00
2019-10-03 0.50
  1. Create table If Not Exists 11_Trips (Id int, Client_Id int, Driver_Id int, City_Id int, Status ENUM('completed', 'cancelled_by_driver', 'cancelled_by_client'), Request_at varchar(50));
  2. Create table If Not Exists 11_Users (Users_Id int, Banned varchar(50), Role ENUM('client', 'driver', 'partner'));
  3. Truncate table 11_Trips;
  4. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (1, 1, 10, 1, 'completed', '2019-10-01');
  5. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (2, 2, 11, 1, 'cancelled_by_driver', '2019-10-01');
  6. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (3, 3, 12, 6, 'completed', '2019-10-01');
  7. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (4, 4, 13, 6, 'cancelled_by_client', '2019-10-01');
  8. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (5, 1, 10, 1, 'completed', '2019-10-02');
  9. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (6, 2, 11, 6, 'completed', '2019-10-02');
  10. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (7, 3, 12, 6, 'completed', '2019-10-02');
  11. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (8, 2, 12, 12, 'completed', '2019-10-03');
  12. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (9, 3, 10, 12, 'completed', '2019-10-03');
  13. insert into 11_Trips (Id, Client_Id, Driver_Id, City_Id, Status, Request_at) values (10, 4, 13, 12, 'cancelled_by_driver', '2019-10-03');
  14. Truncate table 11_Users;
  15. insert into 11_Users (Users_Id, Banned, Role) values (1, 'No', 'client');
  16. insert into 11_Users (Users_Id, Banned, Role) values (2, 'Yes', 'client');
  17. insert into 11_Users (Users_Id, Banned, Role) values (3, 'No', 'client');
  18. insert into 11_Users (Users_Id, Banned, Role) values (4, 'No', 'client');
  19. insert into 11_Users (Users_Id, Banned, Role) values (10, 'No', 'driver');
  20. insert into 11_Users (Users_Id, Banned, Role) values (11, 'No', 'driver');
  21. insert into 11_Users (Users_Id, Banned, Role) values (12, 'No', 'driver');
  22. insert into 11_Users (Users_Id, Banned, Role) values (13, 'No', 'driver');

最终SQL:

  1. 方法一:
  2. SELECT
  3. T.request_at AS `Day`,
  4. ROUND(
  5. SUM(IF(T.STATUS = 'completed',0,1))/ COUNT(T.STATUS),
  6. 2
  7. ) AS `Cancellation Rate`
  8. FROM
  9. 11_Trips AS T
  10. JOIN
  11. 11_Users AS U1
  12. ON
  13. T.client_id = U1.users_id AND U1.banned ='No'
  14. WHERE
  15. T.request_at BETWEEN '2019-10-01' AND '2019-10-03'
  16. GROUP BY
  17. T.request_at;

12. 游戏玩法分析

需求一:写一条 SQL 查询语句获取每位玩家 第一次登陆平台的日期。

Activity表:显示了某些游戏的玩家的活动情况。

player_id device_id event_date games_played
1 2 2016-03-01 5
1 2 2016-03-02 6
2 3 2017-06-25 1
3 1 2016-03-01 0
3 4 2018-07-03 5

展示效果:

player_id first_login
1 2016-03-01
2 2017-06-25
3 2016-03-02
  1. Create table If Not Exists 12_Activity (player_id int, device_id int, event_date date, games_played int);
  2. Truncate table 12_Activity;
  3. insert into 12_Activity (player_id, device_id, event_date, games_played) values (1, 2, '2016-03-01', 5);
  4. insert into 12_Activity (player_id, device_id, event_date, games_played) values (1, 2, '2016-03-02', 6);
  5. insert into 12_Activity (player_id, device_id, event_date, games_played) values (2, 3, '2017-06-25', 1);
  6. insert into 12_Activity (player_id, device_id, event_date, games_played) values (3, 1, '2016-03-01', 0);
  7. insert into 12_Activity (player_id, device_id, event_date, games_played) values (3, 4, '2018-07-03', 5);

最终SQL:

  1. select
  2. player_id,
  3. min(event_date) as first_login
  4. from
  5. 12_Activity
  6. group by
  7. player_id;

需求二:描述每一个玩家首次登陆的设备名称

player_id device_id
1 2
2 3
3 1

最终SQL:

  1. -- 方法一:
  2. select
  3. player_id,
  4. device_id
  5. from
  6. 12_Activity
  7. where
  8. (player_id,event_date) in (select
  9. player_id,
  10. min(event_date)
  11. from
  12. 12_Activity
  13. group by
  14. player_id);
  15. -- 方法二:
  16. select
  17. player_id,
  18. device_id
  19. from
  20. (select
  21. player_id,
  22. event_date,
  23. device_id,
  24. rank() over(partition by player_id order by event_date) rk
  25. from
  26. 12_Activity
  27. ) tmp
  28. where rk = 1;

需求三:编写一个 SQL 查询,同时报告每组玩家和日期,以及玩家到目前为止玩了多少游戏。也就是说,在此日期之前玩家所玩的游戏总数。详细情况请查看示例。

player_id event_date games_played_so_far
1 2016-03-01 5
1 2016-05-02 11
2 2017-06-25 1
3 2016-03-02 0
3 2018-07-03 5

提示:提示:对于 ID 为 1 的玩家,2016-05-02 共玩了 5+6=11 个游戏.
对于 ID 为 3 的玩家,2018-07-03 共玩了 0+5=5 个游戏。

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. A.player_id,
  4. A.event_date,
  5. SUM(B.games_played) AS `games_played_so_far`
  6. FROM
  7. 12_Activity AS A
  8. left JOIN
  9. 12_Activity AS B
  10. ON
  11. A.player_id = B.player_id
  12. AND A.event_date >= B.event_date
  13. GROUP BY
  14. A.player_id,A.event_date;
  15. -- 方法二:
  16. select
  17. player_id,
  18. event_date ,
  19. sum(games_played) over(partition by player_id order by event_date )games_played_so_far
  20. from 12_Activity;

需求四:编写一个 SQL 查询,报告在首次登录的第二天再次登录的玩家的百分比,四舍五入到小数点后两位。换句话说,您需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。

fraction
0.33

最终SQL:

  1. -- 方法一:
  2. select
  3. round(
  4. sum(case when datediff(a.event_date,b.first_date)=1 then 1 else 0 end)
  5. /
  6. (select count(distinct(player_id)) from 12_Activity)
  7. ,2 ) as fraction
  8. from
  9. 12_Activity a,
  10. (select
  11. player_id,
  12. min(event_date) first_date
  13. from
  14. 12_Activity
  15. group by
  16. player_id
  17. ) b
  18. where
  19. a.player_id=b.player_id;
  20. -- 方法二:
  21. select
  22. round(avg(a.event_date is not null), 2) fraction
  23. from
  24. (select
  25. player_id,
  26. min(event_date) as first_date
  27. from
  28. 12_Activity
  29. group by
  30. player_id) b
  31. left join
  32. 12_Activity a
  33. on
  34. b.player_id=a.player_id and datediff(a.event_date, b.first_date)=1

需求五:编写一个 SQL 查询,报告每个安装日期、当天安装游戏的玩家数量和第一天的保留时间。

install_dt installs Day1_retention
2016-03-01 2 0.50
2017-06-25 1 0.00

提示:玩家 1 和 3 在 2016-03-01 安装了游戏,但只有玩家 1 在 2016-03-02 重新登录,所以 2016-03-01 的第一天保留时间是 1/2=0.50
玩家 2 在 2017-06-25 安装了游戏,但在 2017-06-26 没有重新登录,因此 2017-06-25 的第一天保留为 0/1=0.00

最终SQL:

  1. -- 方法一
  2. SELECT
  3. A.event_date AS install_dt,
  4. COUNT(A.player_id) AS installs,
  5. round(COUNT(C.player_id)/COUNT(A.player_id),2) AS Day1_retention
  6. FROM
  7. 12_Activity AS A
  8. left JOIN
  9. 12_Activity AS B
  10. ON
  11. A.player_id = B.player_id AND A.event_date > B.event_date
  12. left JOIN
  13. 12_Activity AS C
  14. ON
  15. A.player_id = C.player_id AND C.event_date = DATE_ADD(A.event_date,INTERVAL 1 DAY)
  16. WHERE
  17. B.event_date IS NULL
  18. GROUP BY
  19. A.event_date;

13. 员工薪水中位数

需求:请编写SQL查询来查找每个公司的薪水中位数。挑战点:你是否可以在不使用任何内置的SQL函数的情况下解决此问题。

展示效果:

Id Company Salary
5 A 451
6 A 513
12 B 234
9 B 1154
15 C 2645
  1. Create table If Not Exists 13_Employee (Id int, Company varchar(255), Salary int);
  2. Truncate table 13_Employee;
  3. insert into 13_Employee (Id, Company, Salary) values (1, 'A', 2341);
  4. insert into 13_Employee (Id, Company, Salary) values (2, 'A', 341);
  5. insert into 13_Employee (Id, Company, Salary) values (3, 'A', 15);
  6. insert into 13_Employee (Id, Company, Salary) values (4, 'A', 15314);
  7. insert into 13_Employee (Id, Company, Salary) values (5, 'A', 451);
  8. insert into 13_Employee (Id, Company, Salary) values (6, 'A', 513);
  9. insert into 13_Employee (Id, Company, Salary) values (7, 'B', 15);
  10. insert into 13_Employee (Id, Company, Salary) values (8, 'B', 13);
  11. insert into 13_Employee (Id, Company, Salary) values (9, 'B', 1154);
  12. insert into 13_Employee (Id, Company, Salary) values (10, 'B', 1345);
  13. insert into 13_Employee (Id, Company, Salary) values (11, 'B', 1221);
  14. insert into 13_Employee (Id, Company, Salary) values (12, 'B', 234);
  15. insert into 13_Employee (Id, Company, Salary) values (13, 'C', 2345);
  16. insert into 13_Employee (Id, Company, Salary) values (14, 'C', 2865);
  17. insert into 13_Employee (Id, Company, Salary) values (15, 'C', 2645);
  18. insert into 13_Employee (Id, Company, Salary) values (16, 'C', 2652);
  19. insert into 13_Employee (Id, Company, Salary) values (17, 'C', 65);

最终SQL:

  1. -- 方法一:
  2. select
  3. b.id,
  4. b.company,
  5. b.salary
  6. from
  7. (select
  8. id,
  9. company,
  10. salary,
  11. case @com when company then @rk:=@rk+1 else @rk:=1 end rk,
  12. @com:=company
  13. from
  14. 13_Employee,
  15. (select @rk:=0, @com:='') a
  16. order by
  17. company,salary
  18. ) b
  19. left join
  20. (select
  21. company,
  22. count(1)/2 cnt
  23. from
  24. 13_Employee
  25. group by company
  26. ) c
  27. on
  28. b.company=c.company
  29. where
  30. b.rk in (cnt+0.5,cnt+1,cnt);
  31. -- 方法二:
  32. select
  33. Id,
  34. Company,
  35. Salary
  36. from
  37. (select
  38. Id,
  39. Company,
  40. Salary,
  41. ROW_NUMBER() over(partition by Company order by Salary) rk,
  42. count(*) over(partition by Company) cnt
  43. from 13_Employee
  44. )t1
  45. where rk IN (FLOOR((cnt + 1)/2), FLOOR((cnt + 2)/2))

14. 至少有5名直接下属的经理

需求:Employee 表,请编写一个SQL查询来查找至少有5名直接下属的经理。

展示效果:

Name
John
  1. Create table If Not Exists 14_Employee (Id int, Name varchar(255), Department varchar(255), ManagerId int);
  2. Truncate table 14_Employee;
  3. insert into 14_Employee (Id, Name, Department, ManagerId) values (101, 'John', 'A', null);
  4. insert into 14_Employee (Id, Name, Department, ManagerId) values (102, 'Dan', 'A', 101);
  5. insert into 14_Employee (Id, Name, Department, ManagerId) values (103, 'James', 'A', 101);
  6. insert into 14_Employee (Id, Name, Department, ManagerId) values (104, 'Amy', 'A', 101);
  7. insert into 14_Employee (Id, Name, Department, ManagerId) values (105, 'Anne', 'A', 101);
  8. insert into 14_Employee (Id, Name, Department, ManagerId) values (106, 'Ron', 'B', 101);

最终SQL:

  1. -- 方法一:
  2. SELECT
  3. Name
  4. FROM
  5. 14_Employee AS t1
  6. JOIN
  7. (SELECT
  8. ManagerId
  9. FROM
  10. 14_Employee
  11. GROUP BY
  12. ManagerId
  13. HAVING
  14. COUNT(ManagerId) >= 5
  15. ) AS t2
  16. ON
  17. t1.Id = t2.ManagerId;
  18. -- 方法二:
  19. select
  20. Name
  21. from
  22. 14_Employee
  23. where Id in (
  24. select
  25. ManagerId
  26. from
  27. 14_Employee
  28. group by
  29. ManagerId
  30. having
  31. count(*)>=5 );

15. 给定数字的频率查询中位数

需求:请编写一个查询来查找所有数字的中位数并将结果命名为 median 。

根据下表数据可以看出,原始数据为:0,0,1,2,2,2,3 中位数为2。

展示效果:

median
2.0000
  1. Create table If Not Exists 15_Numbers (Number int, Frequency int);
  2. Truncate table 15_Numbers;
  3. insert into 15_Numbers (Number, Frequency) values (0, 2);
  4. insert into 15_Numbers (Number, Frequency) values (1, 1);
  5. insert into 15_Numbers (Number, Frequency) values (2, 3);
  6. insert into 15_Numbers (Number, Frequency) values (3, 1);

提示:如果 n1.Number 为中位数,n1.Number(包含本身)前累计的数字应大于等于总数/2 ,同时n1.Number(不包含本身)前累计数字应小于等于总数/2。

最终SQL:

  1. select
  2. avg(t.number) as median
  3. from
  4. (select
  5. n1.number,
  6. n1.frequency,
  7. (select
  8. sum(frequency)
  9. from
  10. 15_Numbers n2
  11. where
  12. n2.number<=n1.number
  13. ) as asc_frequency,
  14. (select
  15. sum(frequency)
  16. from
  17. 15_Numbers n3
  18. where
  19. n3.number>=n1.number
  20. ) as desc_frequency
  21. from
  22. 15_Numbers n1
  23. ) t
  24. where
  25. t.asc_frequency>= (select sum(frequency) from 15_Numbers)/2
  26. and t.desc_frequency>= (select sum(frequency) from 15_Numbers)/2;

16. 当选者

需求:请编写 sql 语句来找到当选者(CandidateId)的名字,

展示效果:

Name
B
  1. Create table If Not Exists 16_Candidate (id int, Name varchar(255));
  2. Create table If Not Exists 16_Vote (id int, CandidateId int);
  3. Truncate table 16_Candidate;
  4. insert into 16_Candidate (id, Name) values (1, 'A');
  5. insert into 16_Candidate (id, Name) values (2, 'B');
  6. insert into 16_Candidate (id, Name) values (3, 'C');
  7. insert into 16_Candidate (id, Name) values (4, 'D');
  8. insert into 16_Candidate (id, Name) values (5, 'E');
  9. Truncate table 16_Vote;
  10. insert into 16_Vote (id, CandidateId) values (1, 2);
  11. insert into 16_Vote (id, CandidateId) values (2, 4);
  12. insert into 16_Vote (id, CandidateId) values (3, 3);
  13. insert into 16_Vote (id, CandidateId) values (4, 2);
  14. insert into 16_Vote (id, CandidateId) values (5, 5);

最终SQL:

  1. SELECT
  2. name AS 'Name'
  3. FROM
  4. 16_Candidate a
  5. JOIN
  6. (SELECT
  7. CandidateId
  8. FROM
  9. 16_Vote
  10. GROUP BY
  11. CandidateId
  12. ORDER BY
  13. COUNT(*) DESC
  14. LIMIT 1
  15. ) AS winner
  16. WHERE
  17. a.id = winner.CandidateId;

17. 员工奖金

需求:选出所有 bonus < 1000 的员工的 name 及其 bonus。

展示效果:

name bonus
John null
Dan 500
Brad null
  1. Create table If Not Exists 17_Employee (EmpId int, Name varchar(255), Supervisor int, Salary int);
  2. Create table If Not Exists 17_Bonus (EmpId int, Bonus int);
  3. Truncate table 17_Employee;
  4. insert into 17_Employee (EmpId, Name, Supervisor, Salary) values (3, 'Brad', null, 4000);
  5. insert into 17_Employee (EmpId, Name, Supervisor, Salary) values (1, 'John', 3, 1000);
  6. insert into 17_Employee (EmpId, Name, Supervisor, Salary) values (2, 'Dan', 3, 2000);
  7. insert into 17_Employee (EmpId, Name, Supervisor, Salary) values (4, 'Thomas', 3, 4000);
  8. Truncate table 17_Bonus;
  9. insert into 17_Bonus (EmpId, Bonus) values (2, 500);
  10. insert into 17_Bonus (EmpId, Bonus) values (4, 2000);

最终SQL:

  1. SELECT
  2. e.name,
  3. b.bonus
  4. FROM
  5. 17_Employee e
  6. LEFT JOIN
  7. 17_Bonus b
  8. ON
  9. e.empid = b.empid
  10. WHERE
  11. bonus < 1000 OR bonus IS NULL;

18. 最高回答率

需求:请编写SQL查询来找到具有最高回答率的问题。

展示效果:

survey_log
285
  1. `survey_log` 表中获得回答率最高的问题,`survey_log` 表包含这些列**:id**, **action**, **question_id**, **answer_id**, **q_num**, **timestamp**。id 表示用户 idaction 有以下几种值:"show""answer""skip";当 action 值为 "answer" answer_id 非空,而 action 值为 "show" 或者 "skip" answer_id 为空;q_num 表示当前会话中问题的编号。
  1. Create table If Not Exists 18_survey_log (uid int, action varchar(255), question_id int, answer_id int, q_num int, timestamp int);
  2. Truncate table 18_survey_log;
  3. insert into 18_survey_log (uid, action, question_id, answer_id, q_num, timestamp) values (5, 'show', 285, null, 1, 123);
  4. insert into 18_survey_log (uid, action, question_id, answer_id, q_num, timestamp) values (5, 'answer', 285, 124124, 1, 124);
  5. insert into 18_survey_log (uid, action, question_id, answer_id, q_num, timestamp) values (5, 'show', 369, null, 2, 125);
  6. insert into 18_survey_log (uid, action, question_id, answer_id, q_num, timestamp) values (5, 'skip', 369, null, 2, 126);

最终SQL:

  1. -- 方法一
  2. SELECT
  3. question_id as survey_log
  4. FROM
  5. (SELECT
  6. question_id,
  7. SUM(case when action="answer" THEN 1 ELSE 0 END) as num_answer,
  8. SUM(case when action="show" THEN 1 ELSE 0 END) as num_show
  9. FROM
  10. 18_survey_log
  11. GROUP BY
  12. question_id
  13. ) as tbl
  14. ORDER BY
  15. (num_answer / num_show) DESC
  16. LIMIT 1;
  17. -- 方法二
  18. SELECT
  19. question_id AS 'survey_log'
  20. FROM
  21. 18_survey_log
  22. GROUP BY
  23. question_id
  24. ORDER BY
  25. COUNT(answer_id) / COUNT(IF(action = 'show', 1, 0)) DESC
  26. LIMIT 1;

19. 员工累计薪水

需求:查询一个员工三个月内的累计薪水,但是不包括最近一个月的薪水。

展示效果:

Id Month Salary
1 3 90
1 2 50
1 1 20
2 1 20
3 3 100
3 2 40
  1. Create table If Not Exists 19_Employee (Id int, Month int, Salary int);
  2. Truncate table 19_Employee;
  3. insert into 19_Employee (Id, Month, Salary) values (1, 1, 20);
  4. insert into 19_Employee (Id, Month, Salary) values (2, 1, 20);
  5. insert into 19_Employee (Id, Month, Salary) values (1, 2, 30);
  6. insert into 19_Employee (Id, Month, Salary) values (2, 2, 30);
  7. insert into 19_Employee (Id, Month, Salary) values (3, 2, 40);
  8. insert into 19_Employee (Id, Month, Salary) values (1, 3, 40);
  9. insert into 19_Employee (Id, Month, Salary) values (3, 3, 60);
  10. insert into 19_Employee (Id, Month, Salary) values (1, 4, 60);
  11. insert into 19_Employee (Id, Month, Salary) values (3, 4, 70);

说明:员工 1 除去最近一个月(月份 4),有三个月的薪水记录:月份 3 薪水为 40,月份 2 薪水为 30,月份 1 薪水为 20。所以近 3 个月的薪水累计分别为 (40 + 30 + 20) = 90,(30 + 20) = 50 和 20。

最终SQL:

  1. SELECT
  2. E1.id,
  3. E1.month,
  4. (IFNULL(E1.salary, 0) + IFNULL(E2.salary, 0) + IFNULL(E3.salary, 0)) AS Salary
  5. FROM
  6. (SELECT
  7. id, MAX(month) AS month
  8. FROM
  9. 19_Employee
  10. GROUP BY
  11. id
  12. HAVING
  13. COUNT(*) > 1) AS maxmonth
  14. LEFT JOIN
  15. 19_Employee E1
  16. ON
  17. (maxmonth.id = E1.id AND maxmonth.month > E1.month)
  18. LEFT JOIN
  19. 19_Employee E2
  20. ON
  21. (E2.id = E1.id AND E2.month = E1.month - 1)
  22. LEFT JOIN
  23. 19_Employee E3
  24. ON
  25. (E3.id = E1.id AND E3.month = E1.month - 2)
  26. ORDER BY
  27. id ASC , month DESC;

20. 统计各专业人数

需求:查询 department 表中每个专业的学生人数 (即使没有学生的专业也需列出)。

展示效果:

dept_name student_number
Engineering 2
Science 1
Law 0
  1. CREATE TABLE IF NOT EXISTS 20_student (student_id INT,student_name VARCHAR(45), gender VARCHAR(6), dept_id INT);
  2. CREATE TABLE IF NOT EXISTS 20_department (dept_id INT, dept_name VARCHAR(255));
  3. Truncate table 20_student;
  4. insert into 20_student (student_id, student_name, gender, dept_id) values (1, 'Jack', 'M', 1);
  5. insert into 20_student (student_id, student_name, gender, dept_id) values (2, 'Jane', 'F', 1);
  6. insert into 20_student (student_id, student_name, gender, dept_id) values (3, 'Mark', 'M', 2);
  7. Truncate table 20_department;
  8. insert into 20_department (dept_id, dept_name) values (1, 'Engineering');
  9. insert into 20_department (dept_id, dept_name) values (2, 'Science');
  10. insert into 20_department (dept_id, dept_name) values (3, 'Law');

最终SQL:

  1. SELECT
  2. dept_name,
  3. COUNT(student_id) AS student_number
  4. FROM
  5. 20_department d
  6. LEFT OUTER JOIN
  7. 20_student s
  8. ON
  9. d.dept_id = s.dept_id
  10. GROUP BY
  11. d.dept_name
  12. ORDER BY
  13. student_number DESC,
  14. d.dept_name;