• 在 T-SQL 的 WITH 语句文档中,CTE 被认为是 a temporary named result set。
    • This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or MERGE statement.
    • This clause can also be used in a CREATE VIEW statement as part of its defining SELECT statement.
    • A common table expression can include references to itself.
    • This is referred to as a recursive common table expression.
  • 前述文档也给出了较为丰富的使用案例。这里仅谈谈我认为最大的两个作用:

    作用一:作为临时表

  • 使用 WITH 语句可以定义一个临时表,而不需要实际执行该表。

  • 可以在 WITH 语句中同时定义多个临时表。
  • 通过 WITH 语句,可以避免过多层的嵌套,减少重复 SQL,让复杂 SQL 易于理解。
  • 坏处是不便于 debug,在逻辑十分复杂时,通过临时表存储中间数据,反而更易 debug。
    1. -- 本案例来自 T-SQL 文档
    2. -- Define the CTE expression name and column list.
    3. WITH Sales_CTE (SalesPersonID, SalesOrderID, SalesYear)
    4. AS
    5. -- Define the CTE query.
    6. (
    7. SELECT SalesPersonID, SalesOrderID, YEAR(OrderDate) AS SalesYear
    8. FROM Sales.SalesOrderHeader
    9. WHERE SalesPersonID IS NOT NULL
    10. )
    11. -- Define the outer query referencing the CTE name.
    12. SELECT SalesPersonID, COUNT(SalesOrderID) AS TotalSales, SalesYear
    13. FROM Sales_CTE
    14. GROUP BY SalesYear, SalesPersonID
    15. ORDER BY SalesPersonID, SalesYear;

作用二:实现迭代/循环

  • 之前在使用 HiveQL 的时候,苦于 SQL 无法循环处理数据,都得让数据工程师帮忙开发函数。后来遇到一位DBA,很自信地告诉我,SQL是可以循环的,给我展示了他的 Oracle SQL。我仔细一查,HiveQL 还是不支持的,Apache Hive 的文档里写到:Recursive Queries are not supported
  • WITH 的循环,需要 UNION ALL 的配合,甚至可以通过数据库的变量功能来作为循环控制变量:
-- 本案例来自 T-SQL 文档
-- Create an Employee table.  
CREATE TABLE dbo.MyEmployees  
(  
EmployeeID SMALLINT NOT NULL,  
FirstName NVARCHAR(30)  NOT NULL,  
LastName  NVARCHAR(40) NOT NULL,  
Title NVARCHAR(50) NOT NULL,  
DeptID SMALLINT NOT NULL,  
ManagerID SMALLINT NULL,  
 CONSTRAINT PK_EmployeeID PRIMARY KEY CLUSTERED (EmployeeID ASC),
 CONSTRAINT FK_MyEmployees_ManagerID_EmployeeID FOREIGN KEY (ManagerID) REFERENCES dbo.MyEmployees (EmployeeID)
);  
-- Populate the table with values.  
INSERT INTO dbo.MyEmployees VALUES   
 (1, N'Ken', N'Sánchez', N'Chief Executive Officer',16,NULL)  
,(273, N'Brian', N'Welcker', N'Vice President of Sales',3,1)  
,(274, N'Stephen', N'Jiang', N'North American Sales Manager',3,273)  
,(275, N'Michael', N'Blythe', N'Sales Representative',3,274)  
,(276, N'Linda', N'Mitchell', N'Sales Representative',3,274)  
,(285, N'Syed', N'Abbas', N'Pacific Sales Manager',3,273)  
,(286, N'Lynn', N'Tsoflias', N'Sales Representative',3,285)  
,(16,  N'David',N'Bradley', N'Marketing Manager', 4, 273)  
,(23,  N'Mary', N'Gibson', N'Marketing Specialist', 4, 16);

-- 循环示例
WITH DirectReports(ManagerID, EmployeeID, Title, EmployeeLevel) AS   
(  
    SELECT ManagerID, EmployeeID, Title, 0 AS EmployeeLevel  
    FROM dbo.MyEmployees   
    WHERE ManagerID IS NULL  
    UNION ALL  
    SELECT e.ManagerID, e.EmployeeID, e.Title, EmployeeLevel + 1  
    FROM dbo.MyEmployees AS e  
        INNER JOIN DirectReports AS d  
        ON e.ManagerID = d.EmployeeID   
)  
SELECT ManagerID, EmployeeID, Title, EmployeeLevel   
FROM DirectReports  
ORDER BY ManagerID;

参考材料