练习一: 各部门工资最高的员工(难度:中等)
创建Employee 表,包含所有员工信息,每个员工有其对应的 Id, salary 和 department Id。
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
创建Department 表,包含公司所有部门的信息。
| Id | Name |
|---|---|
| 1 | IT |
| 2 | Sales |
编写一个 SQL 查询,找出每个部门工资最高的员工。例如,根据上述给定的表格,Max 在 IT 部门有最高工资,Henry 在 Sales 部门有最高工资。
| Department | Employee | Salary |
|---|---|---|
| IT | Max | 90000 |
| Sales | Henry | 80000 |
回答:
①创表和更新数据
-- 员工信息表-- USE shop;CREATE TABLE Employee(Id INTEGER NOT NULL,Name VARCHAR(64) NOT NULL,Salary INTEGER NOT NULL,DepartmentId INTEGER NOT NULL,PRIMARY KEY (Id));INSERT INTO Employee VALUES('1', 'Joe', '70000', '1'), ('2', 'Henry', '80000', '2'),('3', 'Sam', '60000', '2'), ('4', 'Max', '90000', '1');-- SELECT * FROM Employee;-- 部门信息表CREATE TABLE Department(Id INTEGER NOT NULL,Name VARCHAR(64) NOT NULL,PRIMARY KEY (Id));INSERT INTO Department VALUES('1', 'IT'), ('2', 'Sales');-- SELECT * FROM Department;


②查询每个部门工资最高的员工
SELECT DP.Name AS Department,E.Employee,E.SalaryFROM Department AS DPLEFT JOIN(SELECT DepartmentId,E1.Name AS Employee,SalaryFROM Employee AS E1WHERE Salary = (SELECT MAX(Salary)FROM Employee AS E2WHERE E1.DepartmentId = E2.DepartmentIdGROUP BY DepartmentId)) AS EON DP.Id = E.DepartmentId ;

练习二: 换座位(难度:中等)
小美是一所中学的信息科技老师,她有一张 seat 座位表,平时用来储存学生名字和与他们相对应的座位 id。其中纵列的id是连续递增的。
小美想改变相邻俩学生的座位。你能不能帮她写一个 SQL query 来输出小美想要的结果呢?
请创建如下所示seat表:
示例:
| id | student |
|---|---|
| 1 | Abbot |
| 2 | Doris |
| 3 | Emerson |
| 4 | Green |
| 5 | Jeames |
假如数据输入的是上表,则输出结果如下:
| id | student |
|---|---|
| 1 | Doris |
| 2 | Abbot |
| 3 | Green |
| 4 | Emerson |
| 5 | Jeames |
注意:
如果学生人数是奇数,则不需要改变最后一个同学的座位。
回答:
①数据准备
②编写SQL语句:
SELECT CASE WHEN s2.id IS NULL THEN s1.idELSE s2.idEND AS id,CASE WHEN s2.student IS NULL THEN s1.studentELSE s2.studentEND AS studentFROM seat AS s1LEFT JOIN(SELECT id+1 AS id, studentFROM seatWHERE id%2=1UNIONSELECT id-1 AS id, studentFROM seatWHERE id%2=0) AS s2ON s1.id = s2.id;

练习三: 分数排名(难度:中等)
编写一个 SQL 查询来实现分数排名。如果两个分数相同,则两个分数排名(Rank)相同。请注意,平分后的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”。
创建以下Scores表:
| Id | Score |
|---|---|
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
例如,根据上述给定的 Scores 表,你的查询应该返回(按分数从高到低排列):
| Score | Rank |
|---|---|
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
回答:
①数据准备:
②编写SQL语句:
SELECT Score,DENSE_RANK() OVER (ORDER BY Score DESC) AS Rank_FROM Scores;

练习四:连续出现的数字(难度:中等)
编写一个 SQL 查询,查找所有至少连续出现三次的数字。
| Id | Num |
|---|---|
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
例如,给定上面的 Logs 表, 1 是唯一连续出现至少三次的数字。
| ConsecutiveNums |
|---|
| 1 |
回答:
①数据准备:Logs好像会和MySQL关键字冲突。这里避免冲突,改了一下表名。
CREATE TABLE LogsTemp(Id INTEGER NOT NULL,Num INTEGER NOT NULL,PRIMARY KEY (Id));INSERT INTO LogsTemp VALUES(1, 1), (2, 1), (3, 1), (4, 2),(5, 1), (6, 2), (7, 2);-- SELECT * FROM LogsTemp;

②编写SQL语句:
SELECT CONVERT(t.mv_avg_num, SIGNED) AS ConsecutiveNumsFROM(SELECT Id,AVG(Num) OVER (ROWS BETWEEN 1 PRECEDINGAND 1 FOLLOWING) AS mv_avg_numFROM LogsTempLIMIT 1, 9999) AS tWHERE t.mv_avg_num = (SELECT NumFROM LogsTemp AS t1WHERE t1.Id = t.Id)ORDER BY t.Id DESCLIMIT 1, 9999;

练习五:树节点 (难度:中等)
对于tree表,id是树节点的标识,p_id是其父节点的id。
| id | p_id |
|---|---|
| 1 | null |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
每个节点都是以下三种类型中的一种:
- Root: 如果节点是根节点。
- Leaf: 如果节点是叶子节点。
- Inner: 如果节点既不是根节点也不是叶子节点。
写一条查询语句打印节点id及对应的节点类型。按照节点id排序。上面例子的对应结果为:
| id | Type |
|---|---|
| 1 | Root |
| 2 | Inner |
| 3 | Leaf |
| 4 | Leaf |
| 5 | Leaf |
说明
- 节点‘1’是根节点,因为它的父节点为NULL,有‘2’和‘3’两个子节点。
- 节点‘2’是内部节点,因为它的父节点是‘1’,有子节点‘4’和‘5’。
- 节点‘3’,‘4’,’5’是叶子节点,因为它们有父节点但没有子节点。
下面是树的图形:
1/ \
2 3
/ \ 4 5
注意:
如果一个树只有一个节点,只需要输出根节点属性。
回答:
①数据准备:
CREATE TABLE tree(id INTEGER NOT NULL,p_id INTEGER ,PRIMARY KEY (id));INSERT INTO tree VALUES(1, NULL), (2, 1), (3, 1),(4, 2), (5, 2);-- SELECT * FROM tree;

②编写SQL语句:
SELECT id,CASE WHEN ISNULL(p_id) THEN 'Roo'WHEN id IN (SELECT p_id FROM tree) THEN 'Inner'ELSE 'Leaf'END AS Type_FROM tree;

练习六:至少有五名直接下属的经理 (难度:中等)
Employee表包含所有员工及其上级的信息。每位员工都有一个Id,并且还有一个对应主管的Id(ManagerId)。
| Id | Name | Department | ManagerId |
|---|---|---|---|
| 101 | John | A | null |
| 102 | Dan | A | 101 |
| 103 | James | A | 101 |
| 104 | Amy | A | 101 |
| 105 | Anne | A | 101 |
| 106 | Ron | B | 101 |
针对Employee表,写一条SQL语句找出有5个下属的主管。对于上面的表,结果应输出:
| Name |
|---|
| John |
注意:
没有人向自己汇报。
回答:
①数据准备:
CREATE TABLE Employee2(Id INTEGER NOT NULL,Name VARCHAR(64) NOT NULL,Department CHAR(1) NOT NULL,ManagerId INTEGER ,PRIMARY KEY (id));INSERT INTO Employee2 VALUES(101, 'John', 'A', NULL), (102, 'Dan', 'A', 101),(103, 'James', 'A', 101), (104, 'Amy', 'A', 101),(105, 'Anne', 'A', 101), (106, 'on', 'B', 101);-- SELECT * FROM Employee2;

②编写SQL语句:
SELECT t2.NameFROM (SELECT ManagerId, COUNT(*) AS count_idFROM Employee2GROUP BY ManagerId) AS t1LEFT JOIN(SELECT Id, NameFROM Employee2WHERE Id <> ManagerIdOR ISNULL(ManagerId)) AS t2ON t1.ManagerId = t2.IdWHERE t1.count_id >= 5;

练习七: 分数排名 (难度:中等)
练习三的分数表,实现排名功能,但是排名需要是非连续的,如下:
| Score | Rank |
|---|---|
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 3 |
| 3.65 | 4 |
| 3.65 | 4 |
| 3.50 | 6 |
回答:
SELECT Score,RANK() OVER (ORDER BY Score DESC) AS Rank_FROM Scores;

练习八:查询回答率最高的问题 (难度:中等)
求出survey_log表中回答率最高的问题,表格的字段有:uid, action, question_id, answer_id, q_num, timestamp。
uid是用户id;action的值为:”show”, “answer”, “skip”;当action是”answer”时,answer_id不为空,相反,当action是”show”和”skip”时为空(null);q_num是问题的数字序号。
写一条sql语句找出回答率最高的问题。
举例:
输入
| uid | action | question_id | answer_id | q_num | timestamp |
|---|---|---|---|---|---|
| 5 | show | 285 | null | 1 | 123 |
| 5 | answer | 285 | 124124 | 1 | 124 |
| 5 | show | 369 | null | 2 | 125 |
| 5 | skip | 369 | null | 2 | 126 |
输出
| survey_log |
|---|
| 285 |
说明:
问题285的回答率为1/1,然而问题369的回答率是0/1,所以输出是285。
注意:最高回答率的意思是:同一个问题出现的次数中回答的比例。
回答:
①数据准备:
CREATE TABLE survey_log(uid INTEGER NOT NULL,action_ VARCHAR(24) NOT NULL,question_id INTEGER NOT NULL,answer_id INTEGER ,q_num INTEGER NOT NULL,timestamp_ INTEGER NOT NULL);INSERT INTO survey_log VALUES(5, 'show', 285, null, 1, 123),(5, 'answer', 285, 124124, 1, 124),(5, 'show', 369, null, 2, 125),(5, 'skip', 369, null, 2, 126);-- SELECT * FROM survey_log;

②编写SQL语句:
SELECT question_id AS survey_logFROM survey_logGROUP BY question_idORDER BY COUNT(answer_id) / COUNT(*) DESC -- 根据回答率进行排名LIMIT 1; -- 取排名最高的一个

练习九:各部门前3高工资的员工(难度:中等)
将项目7练习1中的employee表清空,重新插入以下数据(其实是多插入5,6两行):
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
编写一个 SQL 查询,找出每个部门工资前三高的员工。例如,根据上述给定的表格,查询结果应返回:
| Department | Employee | Salary |
|---|---|---|
| IT | Max | 90000 |
| IT | Randy | 85000 |
| IT | Joe | 70000 |
| Sales | Henry | 80000 |
| Sales | Sam | 60000 |
此外,请考虑实现各部门前N高工资的员工功能。
回答:
①数据准备
INSERT INTO Employee VALUES(5, 'Janet', 69000, 1),(6, 'Randy', 85000, 1);-- SELECT * FROM Employee;

②编写SQL语句:
SELECT DT.Name AS Department,E.Name,E.SalaryFROM Department DTLEFT JOIN(SELECT DepartmentId,Name,Salary,RANK() OVER (PARTITION BY DepartmentIdORDER BY Salary DESC) AS Rank_FROM Employee) AS EON DT.Id = E.DepartmentIdWHERE E.Rank_ <=3;

练习十:平面上最近距离 (难度: 困难)
point_2d表包含一个平面内一些点(超过两个)的坐标值(x,y)。
写一条查询语句求出这些点中的最短距离并保留2位小数。
| x | y |
|---|---|
| -1 | -1 |
| 0 | 0 |
| -1 | -2 |
最短距离是1,从点(-1,-1)到点(-1,-2)。所以输出结果为:
| shortest |
|---|
| 1.00 |
注意: 所有点的最大距离小于10000。
回答:
数据准备
CREATE TABLE point_2d(x INTEGER NOT NULL,y INTEGER NOT NULL);INSERT INTO point_2d VALUES(-1, -1), (0, 0),(-1, -2);-- SELECT * FROM point_2d;

编写SQL语句:
SELECT CONVERT(ABS((p1.x-p2.x)+(p1.y-p2.y)),DECIMAL(3,2)) AS shortestFROM point_2d AS p1CROSS JOIN point_2d AS p2WHERE p1.x<>p2.x OR p1.y<>p2.yORDER BY shortestLIMIT 1;

练习十一:行程和用户(难度:困难)
Trips 表中存所有出租车的行程信息。每段行程有唯一键 Id,Client_Id 和 Driver_Id 是 Users 表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。
| Id | Client_Id | Driver_Id | City_Id | Status | Request_at |
|---|---|---|---|---|---|
| 1 | 1 | 10 | 1 | completed | 2013-10-1 |
| 2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-1 |
| 3 | 3 | 12 | 6 | completed | 2013-10-1 |
| 4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-1 |
| 5 | 1 | 10 | 1 | completed | 2013-10-2 |
| 6 | 2 | 11 | 6 | completed | 2013-10-2 |
| 7 | 3 | 12 | 6 | completed | 2013-10-2 |
| 8 | 2 | 12 | 12 | completed | 2013-10-3 |
| 9 | 3 | 10 | 12 | completed | 2013-10-3 |
| 10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-3 |
Users 表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止,Role 则是一个表示(’client’, ‘driver’, ‘partner’)的枚举类型。
| Users_Id | Banned | Role |
|---|---|---|
| 1 | No | client |
| 2 | Yes | client |
| 3 | No | client |
| 4 | No | client |
| 10 | No | driver |
| 11 | No | driver |
| 12 | No | driver |
| 13 | No | driver |
写一段 SQL 语句查出2013年10月1日至2013年10月3日期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。
| Day | Cancellation Rate |
|---|---|
| 2013-10-01 | 0.33 |
| 2013-10-02 | 0.00 |
| 2013-10-03 | 0.50 |
回答:
创表:
CREATE TABLE Trips(Id SMALLINT UNSIGNED NOT NULL REFERENCES Users(Users_Id),Client_Id SMALLINT UNSIGNED NOT NULL REFERENCES Users(Users_Id),Driver_Id SMALLINT UNSIGNED NOT NULL,City_Id SMALLINT UNSIGNED NOT NULL,Status_ ENUM('completed','cancelled_by_driver','cancelled_by_client') NOT NULL,Request_at DATE NOT NULL );CREATE TABLE Users(Users_Id SMALLINT UNSIGNED NOT NULL,Banned VARCHAR(12) NOT NULL,Role_ ENUM('client', 'driver', 'partner') NOT NULL);
更新数据:
INSERT INTO Trips VALUES(1, 1, 10, 1, 'completed', '2013-10-1'),(2, 2, 11, 1, 'cancelled_by_driver', '2013-10-1'),(3, 3, 12, 6, 'completed', '2013-10-1'),(4, 4, 13, 6, 'cancelled_by_client', '2013-10-1'),(5, 1, 10, 1, 'completed', '2013-10-2'),(6, 2, 11, 6, 'completed', '2013-10-2'),(7, 3, 12, 6, 'completed', '2013-10-2'),(8, 2, 12, 12, 'completed', '2013-10-3'),(9, 3, 10, 12, 'completed', '2013-10-3'),(10, 4, 13, 12, 'cancelled_by_driver', '2013-10-3');-- SELECT * FROM Trips;

INSERT INTO Users VALUES(1, 'No', 'client'), (2, 'Yes', 'client'),(3, 'No', 'client'), (4, 'No', 'client'),(10, 'No', 'driver'), (11, 'No', 'driver'),(12, 'No', 'driver'), (13, 'No', 'driver');-- SELECT * FROM Users;

问题拆解:
- 查询2013年10月1日至2013年10月3日期间的记录,为集合A;
- 查询非禁止用户的记录,为集合B;
- 求集合A与集合B的交集,为集合C;
汇总集合C的取消人数和总人数,计算取消率,并保留2位小数。- 按日期分组汇总集合C的取消人数,为字段cancel
- 按日期分组汇总集合C的总人数,为字段total
- cancel/total 就是所求的取消率了。
编写SQL语句:
拆解的部分
-- 查询2013年10月1日至2013年10月3日期间的记录,为集合A;SELECT *FROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3';-- 查询非禁止用户的记录,为集合B;SELECT Users_idFROM UsersWHERE Banned = 'No';-- 查询取消的记录;SELECT *FROM TripsWHERE Status_ <> 'completed';-- 求集合A与集合B的交集,为集合C;-- 并按日期分组汇总集合C的取消人数SELECT Request_at, COUNT(*)FROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'AND Status_ <> 'completed'AND Client_Id in (SELECT Users_id /* 注意这里要用Client_Id */FROM UsersWHERE Banned = 'NO')GROUP BY Request_at;-- 按日期分组汇总集合C的总人数SELECT Request_at, COUNT(*) AS totalFROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'AND Client_Id in (SELECT Users_idFROM UsersWHERE Banned = 'NO')GROUP BY Request_at;
合并拆解的内容
SELECT t2.Request_at AS Day,CASE WHEN t1.cancel IS NULL THEN 0ELSE ROUND(t1.cancel/t2.total, 2)END AS `Cancellation Rate`FROM(SELECT Request_at, COUNT(*) AS cancelFROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'AND Status_ <> 'completed'AND Client_Id in (SELECT Users_idFROM UsersWHERE Banned = 'NO')GROUP BY Request_at) AS t1RIGHT JOIN(SELECT Request_at, COUNT(*) AS totalFROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'AND Client_Id in (SELECT Users_idFROM UsersWHERE Banned = 'NO')GROUP BY Request_at) AS t2ON t2.Request_at = t1.Request_at;

优化后的SQL语句:
SELECT t.Request_at AS Day,ROUND(SUM(t.status_)/COUNT(*), 2) AS `Cancellation Rate`FROM(SELECT Request_at,CASE WHEN Status_='completed' THEN 0 ELSE 1 END AS status_FROM TripsWHERE Request_at BETWEEN '2013-10-1' AND '2013-10-3'AND Client_Id in (SELECT Users_idFROM UsersWHERE Banned = 'NO')) AS tGROUP BY t.Request_at;

