练习一: 各部门工资最高的员工(难度:中等)

创建Employee 表,包含所有员工信息,每个员工有其对应的 Id, salary 和 department Id。

Id Name Salary DepartmentId
1 Joe 70000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1

创建Department 表,包含公司所有部门的信息。

Id Name
1 IT
2 Sales

编写一个 SQL 查询,找出每个部门工资最高的员工。例如,根据上述给定的表格,Max 在 IT 部门有最高工资,Henry 在 Sales 部门有最高工资。

Department Employee Salary
IT Max 90000
Sales Henry 80000

回答:
①创表和更新数据

  1. -- 员工信息表
  2. -- USE shop;
  3. CREATE TABLE Employee
  4. (Id INTEGER NOT NULL,
  5. Name VARCHAR(64) NOT NULL,
  6. Salary INTEGER NOT NULL,
  7. DepartmentId INTEGER NOT NULL,
  8. PRIMARY KEY (Id));
  9. INSERT INTO Employee VALUES
  10. ('1', 'Joe', '70000', '1'), ('2', 'Henry', '80000', '2'),
  11. ('3', 'Sam', '60000', '2'), ('4', 'Max', '90000', '1');
  12. -- SELECT * FROM Employee;
  13. -- 部门信息表
  14. CREATE TABLE Department
  15. (Id INTEGER NOT NULL,
  16. Name VARCHAR(64) NOT NULL,
  17. PRIMARY KEY (Id));
  18. INSERT INTO Department VALUES
  19. ('1', 'IT'), ('2', 'Sales');
  20. -- SELECT * FROM Department;

image.png
image.png

②查询每个部门工资最高的员工

SELECT DP.Name AS Department
       ,E.Employee
       ,E.Salary
FROM Department AS DP
LEFT JOIN
(SELECT DepartmentId
       ,E1.Name AS Employee
       ,Salary
FROM Employee AS E1
WHERE Salary = (SELECT MAX(Salary)
                FROM Employee AS E2
                WHERE E1.DepartmentId = E2.DepartmentId
                GROUP BY DepartmentId)) AS E
ON DP.Id = E.DepartmentId ;

image.png

练习二: 换座位(难度:中等)

小美是一所中学的信息科技老师,她有一张 seat 座位表,平时用来储存学生名字和与他们相对应的座位 id。其中纵列的id是连续递增的。

小美想改变相邻俩学生的座位。你能不能帮她写一个 SQL query 来输出小美想要的结果呢?

请创建如下所示seat表:

示例:

id student
1 Abbot
2 Doris
3 Emerson
4 Green
5 Jeames

假如数据输入的是上表,则输出结果如下:

id student
1 Doris
2 Abbot
3 Green
4 Emerson
5 Jeames

注意:
如果学生人数是奇数,则不需要改变最后一个同学的座位。

回答:
①数据准备
image.png

②编写SQL语句:

SELECT CASE WHEN s2.id IS NULL THEN s1.id
            ELSE s2.id
       END AS id,
       CASE WHEN s2.student IS NULL THEN s1.student
            ELSE s2.student
       END AS student
FROM seat AS s1
LEFT JOIN
(SELECT id+1 AS id, student
 FROM seat
 WHERE id%2=1
 UNION 
 SELECT id-1 AS id, student
 FROM seat
 WHERE id%2=0) AS s2
ON s1.id = s2.id;

image.png


练习三: 分数排名(难度:中等)

编写一个 SQL 查询来实现分数排名。如果两个分数相同,则两个分数排名(Rank)相同。请注意,平分后的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”。

创建以下Scores表:

Id Score
1 3.50
2 3.65
3 4.00
4 3.85
5 4.00
6 3.65

例如,根据上述给定的 Scores 表,你的查询应该返回(按分数从高到低排列):

Score Rank
4.00 1
4.00 1
3.85 2
3.65 3
3.65 3
3.50 4

回答:
①数据准备:
image.png

②编写SQL语句:

SELECT Score, 
       DENSE_RANK() OVER (ORDER BY Score DESC) AS Rank_
FROM Scores;

image.png

练习四:连续出现的数字(难度:中等)

编写一个 SQL 查询,查找所有至少连续出现三次的数字。

Id Num
1 1
2 1
3 1
4 2
5 1
6 2
7 2

例如,给定上面的 Logs 表, 1 是唯一连续出现至少三次的数字。

ConsecutiveNums
1

回答:
①数据准备:
Logs好像会和MySQL关键字冲突。这里避免冲突,改了一下表名。

CREATE TABLE LogsTemp
(Id INTEGER NOT NULL,
 Num INTEGER NOT NULL,
 PRIMARY KEY (Id));

 INSERT INTO LogsTemp VALUES
 (1, 1), (2, 1), (3, 1), (4, 2),
 (5, 1), (6, 2), (7, 2);

 -- SELECT * FROM LogsTemp;

image.png

②编写SQL语句:

SELECT CONVERT(t.mv_avg_num, SIGNED) AS ConsecutiveNums
FROM
(SELECT Id,
        AVG(Num) OVER (ROWS BETWEEN 1 PRECEDING
                    AND 1 FOLLOWING) AS mv_avg_num
 FROM LogsTemp
 LIMIT 1, 9999) AS t
WHERE t.mv_avg_num = (SELECT Num
                     FROM LogsTemp AS t1
                    WHERE t1.Id = t.Id)
ORDER BY t.Id DESC
LIMIT 1, 9999;

image.png

练习五:树节点 (难度:中等)

对于tree表,id是树节点的标识,p_id是其父节点的id。

id p_id
1 null
2 1
3 1
4 2
5 2

每个节点都是以下三种类型中的一种:

  • Root: 如果节点是根节点。
  • Leaf: 如果节点是叶子节点。
  • Inner: 如果节点既不是根节点也不是叶子节点。

写一条查询语句打印节点id及对应的节点类型。按照节点id排序。上面例子的对应结果为:

id Type
1 Root
2 Inner
3 Leaf
4 Leaf
5 Leaf

说明

  • 节点‘1’是根节点,因为它的父节点为NULL,有‘2’和‘3’两个子节点。
  • 节点‘2’是内部节点,因为它的父节点是‘1’,有子节点‘4’和‘5’。
  • 节点‘3’,‘4’,’5’是叶子节点,因为它们有父节点但没有子节点。

下面是树的图形:

 1            

/ \
2 3
/ \ 4 5

注意:
如果一个树只有一个节点,只需要输出根节点属性。

回答:
①数据准备:

CREATE TABLE tree
(id INTEGER NOT NULL,
 p_id INTEGER ,
 PRIMARY KEY (id));

INSERT INTO tree VALUES
(1, NULL), (2, 1), (3, 1),
(4, 2), (5, 2);

-- SELECT * FROM tree;

image.png

②编写SQL语句:

SELECT id,
       CASE WHEN ISNULL(p_id) THEN 'Roo'
            WHEN id IN (SELECT p_id FROM tree) THEN 'Inner'
            ELSE 'Leaf'
       END AS Type_
FROM tree;

image.png

练习六:至少有五名直接下属的经理 (难度:中等)

Employee表包含所有员工及其上级的信息。每位员工都有一个Id,并且还有一个对应主管的Id(ManagerId)。

Id Name Department ManagerId
101 John A null
102 Dan A 101
103 James A 101
104 Amy A 101
105 Anne A 101
106 Ron B 101

针对Employee表,写一条SQL语句找出有5个下属的主管。对于上面的表,结果应输出:

Name
John

注意:
没有人向自己汇报。

回答:
①数据准备:

CREATE TABLE Employee2
(Id INTEGER NOT NULL,
 Name VARCHAR(64) NOT NULL,
 Department CHAR(1) NOT NULL,
 ManagerId INTEGER ,
 PRIMARY KEY (id));

INSERT INTO Employee2 VALUES
(101, 'John', 'A', NULL), (102, 'Dan', 'A', 101),
(103, 'James', 'A', 101), (104, 'Amy', 'A', 101),
(105, 'Anne', 'A', 101), (106, 'on', 'B', 101);

-- SELECT * FROM Employee2;

image.png

②编写SQL语句:

SELECT t2.Name
FROM (SELECT ManagerId, COUNT(*) AS count_id
FROM Employee2 
GROUP BY ManagerId) AS t1
LEFT JOIN
(SELECT Id, Name
 FROM Employee2
 WHERE Id <> ManagerId
      OR ISNULL(ManagerId)) AS t2
ON t1.ManagerId = t2.Id
WHERE t1.count_id >= 5;

image.png

练习七: 分数排名 (难度:中等)

练习三的分数表,实现排名功能,但是排名需要是非连续的,如下:

Score Rank
4.00 1
4.00 1
3.85 3
3.65 4
3.65 4
3.50 6

回答:

SELECT Score, 
       RANK() OVER (ORDER BY Score DESC) AS Rank_
FROM Scores;

image.png

练习八:查询回答率最高的问题 (难度:中等)

求出survey_log表中回答率最高的问题,表格的字段有:uid, action, question_id, answer_id, q_num, timestamp。

uid是用户id;action的值为:”show”, “answer”, “skip”;当action是”answer”时,answer_id不为空,相反,当action是”show”和”skip”时为空(null);q_num是问题的数字序号。

写一条sql语句找出回答率最高的问题。

举例:
输入

uid action question_id answer_id q_num timestamp
5 show 285 null 1 123
5 answer 285 124124 1 124
5 show 369 null 2 125
5 skip 369 null 2 126

输出

survey_log
285

说明:
问题285的回答率为1/1,然而问题369的回答率是0/1,所以输出是285。

注意:最高回答率的意思是:同一个问题出现的次数中回答的比例。

回答:
①数据准备:

CREATE TABLE survey_log
(uid INTEGER NOT NULL,
 action_ VARCHAR(24) NOT NULL,
 question_id INTEGER NOT NULL,
 answer_id INTEGER ,
 q_num INTEGER NOT NULL,
 timestamp_ INTEGER NOT NULL);

INSERT INTO survey_log VALUES
(5, 'show', 285, null, 1, 123),
(5, 'answer', 285, 124124, 1, 124),
(5, 'show', 369, null, 2, 125),
(5, 'skip', 369, null, 2, 126);

-- SELECT * FROM survey_log;

image.png

②编写SQL语句:

SELECT question_id AS survey_log
FROM survey_log
GROUP BY question_id
ORDER BY COUNT(answer_id) / COUNT(*) DESC  -- 根据回答率进行排名
LIMIT 1;  -- 取排名最高的一个

image.png

练习九:各部门前3高工资的员工(难度:中等)

项目7练习1中的employee表清空,重新插入以下数据(其实是多插入5,6两行):

Id Name Salary DepartmentId
1 Joe 70000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1
5 Janet 69000 1
6 Randy 85000 1

编写一个 SQL 查询,找出每个部门工资前三高的员工。例如,根据上述给定的表格,查询结果应返回:

Department Employee Salary
IT Max 90000
IT Randy 85000
IT Joe 70000
Sales Henry 80000
Sales Sam 60000

此外,请考虑实现各部门前N高工资的员工功能。

回答:
①数据准备

INSERT INTO Employee VALUES
(5, 'Janet', 69000, 1),
(6, 'Randy', 85000, 1);

-- SELECT * FROM Employee;

image.png
②编写SQL语句:

SELECT DT.Name AS Department,
       E.Name,
       E.Salary
FROM Department DT
LEFT JOIN
(SELECT DepartmentId,
        Name,
        Salary,
        RANK() OVER (PARTITION BY DepartmentId
                    ORDER BY Salary DESC) AS Rank_
   FROM Employee) AS E
ON DT.Id = E.DepartmentId
WHERE E.Rank_ <=3;

image.png

练习十:平面上最近距离 (难度: 困难)

point_2d表包含一个平面内一些点(超过两个)的坐标值(x,y)。
写一条查询语句求出这些点中的最短距离并保留2位小数。

x y
-1 -1
0 0
-1 -2

最短距离是1,从点(-1,-1)到点(-1,-2)。所以输出结果为:

shortest
1.00

注意: 所有点的最大距离小于10000。

回答:
数据准备

CREATE TABLE point_2d
(x INTEGER NOT NULL,
 y INTEGER NOT NULL);

INSERT INTO point_2d VALUES
(-1, -1), (0, 0),(-1, -2);

-- SELECT * FROM point_2d;

image.png

编写SQL语句:

SELECT CONVERT(ABS((p1.x-p2.x)+(p1.y-p2.y)), 
               DECIMAL(3,2)) AS shortest
FROM point_2d AS p1
CROSS JOIN point_2d AS p2 
WHERE p1.x<>p2.x OR p1.y<>p2.y
ORDER BY shortest
LIMIT 1;

image.png

练习十一:行程和用户(难度:困难)

Trips 表中存所有出租车的行程信息。每段行程有唯一键 Id,Client_Id 和 Driver_Id 是 Users 表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。

Id Client_Id Driver_Id City_Id Status Request_at
1 1 10 1 completed 2013-10-1
2 2 11 1 cancelled_by_driver 2013-10-1
3 3 12 6 completed 2013-10-1
4 4 13 6 cancelled_by_client 2013-10-1
5 1 10 1 completed 2013-10-2
6 2 11 6 completed 2013-10-2
7 3 12 6 completed 2013-10-2
8 2 12 12 completed 2013-10-3
9 3 10 12 completed 2013-10-3
10 4 13 12 cancelled_by_driver 2013-10-3

Users 表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止,Role 则是一个表示(’client’, ‘driver’, ‘partner’)的枚举类型。

Users_Id Banned Role
1 No client
2 Yes client
3 No client
4 No client
10 No driver
11 No driver
12 No driver
13 No driver

写一段 SQL 语句查出2013年10月1日至2013年10月3日期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。

Day Cancellation Rate
2013-10-01 0.33
2013-10-02 0.00
2013-10-03 0.50

回答:
创表:

CREATE TABLE Trips
(Id SMALLINT UNSIGNED NOT NULL REFERENCES Users(Users_Id),
 Client_Id SMALLINT UNSIGNED NOT NULL REFERENCES Users(Users_Id),
 Driver_Id SMALLINT UNSIGNED NOT NULL,
 City_Id SMALLINT UNSIGNED NOT NULL,
 Status_ ENUM('completed',
             'cancelled_by_driver',
             'cancelled_by_client') NOT NULL,
Request_at DATE NOT NULL );

CREATE TABLE Users
(Users_Id SMALLINT UNSIGNED NOT NULL,
 Banned VARCHAR(12) NOT NULL,
 Role_ ENUM('client', 'driver', 'partner') NOT NULL);

更新数据:

INSERT INTO Trips VALUES
(1, 1, 10, 1, 'completed', '2013-10-1'),
(2, 2, 11, 1, 'cancelled_by_driver', '2013-10-1'),
(3, 3, 12, 6, 'completed', '2013-10-1'),
(4, 4, 13, 6, 'cancelled_by_client', '2013-10-1'),
(5, 1, 10, 1, 'completed', '2013-10-2'),
(6, 2, 11, 6, 'completed', '2013-10-2'),
(7, 3, 12, 6, 'completed', '2013-10-2'),
(8, 2, 12, 12, 'completed', '2013-10-3'),
(9, 3, 10, 12, 'completed', '2013-10-3'),
(10, 4, 13, 12, 'cancelled_by_driver', '2013-10-3');

-- SELECT * FROM Trips;

image.png

INSERT INTO Users VALUES
(1, 'No', 'client'), (2, 'Yes', 'client'),
(3, 'No', 'client'), (4, 'No', 'client'),
(10, 'No', 'driver'), (11, 'No', 'driver'),
(12, 'No', 'driver'), (13, 'No', 'driver');

-- SELECT * FROM Users;

image.png

问题拆解:

  1. 查询2013年10月1日至2013年10月3日期间的记录,为集合A;
  2. 查询非禁止用户的记录,为集合B;
  3. 求集合A与集合B的交集,为集合C;
  4. 汇总集合C的取消人数和总人数,计算取消率,并保留2位小数。
  5. 按日期分组汇总集合C的取消人数,为字段cancel
  6. 按日期分组汇总集合C的总人数,为字段total
  7. cancel/total 就是所求的取消率了。

编写SQL语句:
拆解的部分

-- 查询2013年10月1日至2013年10月3日期间的记录,为集合A;
SELECT *
FROM Trips
WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3';

-- 查询非禁止用户的记录,为集合B;
SELECT Users_id
FROM Users
WHERE Banned = 'No';

-- 查询取消的记录;
SELECT *
FROM Trips
WHERE Status_ <> 'completed';

-- 求集合A与集合B的交集,为集合C;
-- 并按日期分组汇总集合C的取消人数
SELECT Request_at, COUNT(*)
FROM Trips
WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'
  AND Status_ <> 'completed'
  AND Client_Id in (SELECT Users_id  /* 注意这里要用Client_Id */
               FROM Users
              WHERE Banned = 'NO')
GROUP BY Request_at;

-- 按日期分组汇总集合C的总人数
SELECT Request_at, COUNT(*) AS total
 FROM Trips
 WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'
   AND Client_Id in (SELECT Users_id
              FROM Users
                    WHERE Banned = 'NO')
 GROUP BY Request_at;

合并拆解的内容

SELECT t2.Request_at AS Day,
       CASE WHEN t1.cancel IS NULL THEN 0
            ELSE ROUND(t1.cancel/t2.total, 2)
         END AS `Cancellation Rate`
FROM
(SELECT Request_at, COUNT(*) AS cancel
 FROM Trips
 WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'
   AND Status_ <> 'completed'
   AND Client_Id in (SELECT Users_id
               FROM Users
              WHERE Banned = 'NO')
 GROUP BY Request_at) AS t1
RIGHT JOIN
(SELECT Request_at, COUNT(*) AS total
 FROM Trips
 WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'
   AND Client_Id in (SELECT Users_id
                FROM Users
               WHERE Banned = 'NO')
 GROUP BY Request_at) AS t2
ON t2.Request_at = t1.Request_at;

task11.png

优化后的SQL语句:

SELECT t.Request_at AS Day,
       ROUND(SUM(t.status_)/COUNT(*), 2) AS `Cancellation Rate`
FROM
(SELECT Request_at,
       CASE WHEN Status_='completed' THEN 0 ELSE 1 END AS status_
 FROM Trips
 WHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'
   AND Client_Id in (SELECT Users_id
                FROM Users
               WHERE Banned = 'NO')) AS t
GROUP BY t.Request_at;

tt02.PNG