SQL

译者:flink.sojb.cn

SQL查询是使用sqlQuery()方法指定的TableEnvironment。该方法返回SQL查询的结果为Table。A Table可以在后续的SQL和 Table API查询中使用,可以转换为DataSet或DataStream,也可以写入TableSink)。SQL和 Table API查询可以无缝混合,并进行整体优化并转换为单个程序。

要访问SQL查询中的表,必须在TableEnvironment中注册它。可以从TableSourceTableDataStream或DataSet 注册表。或者,用户还可以在TableEnvironment中注册外部目录以指定数据源的位置。

为方便起见,Table.toString()自动在其中以唯一名称注册表TableEnvironment并返回名称。因此,Table对象可以直接内联到SQL查询中(通过字符串连接),如下面的示例所示。

注意: Flink的SQL支持尚未完成。包含不受支持的SQL函数的查询会导致a TableException。以下部分列出了批处理和流表上SQL的受支持函数。

指定查询

以下示例显示如何在已注册和内联表中指定SQL查询。

  1. StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  2. StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
  3. // ingest a DataStream from an external source
  4. DataStream<Tuple3<Long, String, Integer>> ds = env.addSource(...);
  5. // SQL query with an inlined (unregistered) table
  6. Table table = tableEnv.toTable(ds, "user, product, amount");
  7. Table result = tableEnv.sqlQuery(
  8. "SELECT SUM(amount) FROM " + table + " WHERE product LIKE '%Rubber%'");
  9. // SQL query with a registered table
  10. // register the DataStream as table "Orders"
  11. tableEnv.registerDataStream("Orders", ds, "user, product, amount");
  12. // run a SQL query on the Table and retrieve the result as a new Table
  13. Table result2 = tableEnv.sqlQuery(
  14. "SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'");
  15. // SQL update with a registered table
  16. // create and register a TableSink
  17. TableSink csvSink = new CsvTableSink("/path/to/file", ...);
  18. String[] fieldNames = {"product", "amount"};
  19. TypeInformation[] fieldTypes = {Types.STRING, Types.INT};
  20. tableEnv.registerTableSink("RubberOrders", fieldNames, fieldTypes, csvSink);
  21. // run a SQL update query on the Table and emit the result to the TableSink
  22. tableEnv.sqlUpdate(
  23. "INSERT INTO RubberOrders SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'");
  1. val env = StreamExecutionEnvironment.getExecutionEnvironment
  2. val tableEnv = TableEnvironment.getTableEnvironment(env)
  3. // read a DataStream from an external source val ds: DataStream[(Long, String, Integer)] = env.addSource(...)
  4. // SQL query with an inlined (unregistered) table val table = ds.toTable(tableEnv, 'user, 'product, 'amount)
  5. val result = tableEnv.sqlQuery(
  6. s"SELECT SUM(amount) FROM $table WHERE product LIKE '%Rubber%'")
  7. // SQL query with a registered table
  8. // register the DataStream under the name "Orders" tableEnv.registerDataStream("Orders", ds, 'user, 'product, 'amount)
  9. // run a SQL query on the Table and retrieve the result as a new Table val result2 = tableEnv.sqlQuery(
  10. "SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'")
  11. // SQL update with a registered table
  12. // create and register a TableSink TableSink csvSink = new CsvTableSink("/path/to/file", ...)
  13. val fieldNames: Array[String] = Array("product", "amount")
  14. val fieldTypes: Array[TypeInformation[_]] = Array(Types.STRING, Types.INT)
  15. tableEnv.registerTableSink("RubberOrders", fieldNames, fieldTypes, csvSink)
  16. // run a SQL update query on the Table and emit the result to the TableSink tableEnv.sqlUpdate(
  17. "INSERT INTO RubberOrders SELECT product, amount FROM Orders WHERE product LIKE '%Rubber%'")

支持的语法

Flink使用Apache Calcite解析SQL ,它支持标准的ANSI SQL。Flink不支持DDL语句。

以下BNF语法描述了批处理和流式查询中支持的SQL函数的超集。“ 算子”部分显示支持的函数的示例,并指示仅批处理或流式查询支持哪些函数。

  1. insert:
  2. INSERT INTO tableReference
  3. query
  4. query:
  5. values
  6. | {
  7. select
  8. | selectWithoutFrom
  9. | query UNION [ ALL ] query
  10. | query EXCEPT query
  11. | query INTERSECT query
  12. }
  13. [ ORDER BY orderItem [, orderItem ]* ]
  14. [ LIMIT { count | ALL } ]
  15. [ OFFSET start { ROW | ROWS } ]
  16. [ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY]
  17. orderItem:
  18. expression [ ASC | DESC ]
  19. select:
  20. SELECT [ ALL | DISTINCT ]
  21. { * | projectItem [, projectItem ]* }
  22. FROM tableExpression
  23. [ WHERE booleanExpression ]
  24. [ GROUP BY { groupItem [, groupItem ]* } ]
  25. [ HAVING booleanExpression ]
  26. [ WINDOW windowName AS windowSpec [, windowName AS windowSpec ]* ]
  27. selectWithoutFrom:
  28. SELECT [ ALL | DISTINCT ]
  29. { * | projectItem [, projectItem ]* }
  30. projectItem:
  31. expression [ [ AS ] columnAlias ]
  32. | tableAlias . *
  33. tableExpression:
  34. tableReference [, tableReference ]*
  35. | tableExpression [ NATURAL ] [ LEFT | RIGHT | FULL ] JOIN tableExpression [ joinCondition ]
  36. joinCondition:
  37. ON booleanExpression
  38. | USING '(' column [, column ]* ')'
  39. tableReference:
  40. tablePrimary
  41. [ [ AS ] alias [ '(' columnAlias [, columnAlias ]* ')' ] ]
  42. tablePrimary:
  43. [ TABLE ] [ [ catalogName . ] schemaName . ] tableName
  44. | LATERAL TABLE '(' functionName '(' expression [, expression ]* ')' ')'
  45. | UNNEST '(' expression ')'
  46. values:
  47. VALUES expression [, expression ]*
  48. groupItem:
  49. expression
  50. | '(' ')'
  51. | '(' expression [, expression ]* ')'
  52. | CUBE '(' expression [, expression ]* ')'
  53. | ROLLUP '(' expression [, expression ]* ')'
  54. | GROUPING SETS '(' groupItem [, groupItem ]* ')'
  55. windowRef:
  56. windowName
  57. | windowSpec
  58. windowSpec:
  59. [ windowName ]
  60. '('
  61. [ ORDER BY orderItem [, orderItem ]* ]
  62. [ PARTITION BY expression [, expression ]* ]
  63. [
  64. RANGE numericOrIntervalExpression {PRECEDING}
  65. | ROWS numericExpression {PRECEDING}
  66. ]
  67. ')'

Flink SQL对类似于Java标识符(表,属性,函数名)使用词法策略:

  • 无论是否引用标识符,都会保存标识符的大小写。
  • 之后,标识符区分大小写。
  • 与Java不同,反向标记允许标识符包含非字母数字字符(例如"SELECT a ASmy fieldFROM t")。

算子

Scan,Projection和过滤


操作:Scan/Select/As 批量 流

描述:

  1. SELECT * FROM Orders
  2. SELECT a, c AS d FROM Orders

操作:Where / Filter Batch Streaming

描述:

  1. SELECT * FROM Orders WHERE b = 'red'
  2. SELECT * FROM Orders WHERE a % 2 = 0

操作:User-defined Scalar Functions (Scalar UDF) 批量 流

描述:UDF必须在TableEnvironment中注册。有关如何指定和注册标量UDF的详细信息,请参阅UDF文档

  1. SELECT PRETTY_PRINT(user) FROM Orders

聚合

操作:GroupBy聚合 批处理 流 结果更新

注意:流表上的GroupBy会生成更新结果。有关详细信息,请参阅Streaming Concepts页面。

  1. SELECT a, SUM(b) as d
  2. FROM Orders
  3. GROUP BY a

操作:GroupBy窗口聚合 批量 流

描述:使用组窗口计算每个组的单个结果行。有关详细信息,请参阅GroupWindows部分。

  1. SELECT user, SUM(amount)
  2. FROM Orders
  3. GROUP BY TUMBLE(rowtime, INTERVAL '1' DAY), user

操作:Over Window聚合

描述:注意:必须在同一窗口中定义所有聚合,即相同的分区,排序和范围。目前,仅支持具有PRREDING(UNBOUNDED和有界)到CURRENT ROW范围的窗口。尚不支持使用FOLLOWING的范围。必须在单个时间属性上指定ORDER BY

  1. SELECT COUNT(amount) OVER (
  2. PARTITION BY user
  3. ORDER BY proctime
  4. ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
  5. FROM Orders
  6. SELECT COUNT(amount) OVER w, SUM(amount) OVER w
  7. FROM Orders
  8. WINDOW w AS (
  9. PARTITION BY user
  10. ORDER BY proctime
  11. ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

操作:Distinct 批量 流 结果更新

  1. SELECT DISTINCT users FROM Orders

注意:对于流式查询,计算查询结果所需的状态可能会无限增长,具体取决于不同字段的数量。请提供具有有效保存间隔的查询配置,以防止过大的状态。有关详细信息,请参阅Streaming Concepts


操作:分组集,汇总,多维数据集 批量

描述:

  1. SELECT SUM(amount)
  2. FROM Orders
  3. GROUP BY GROUPING SETS ((user), (product))

操作:Having 批量 流

描述:

  1. SELECT SUM(amount)
  2. FROM Orders
  3. GROUP BY users
  4. HAVING SUM(amount) &gt; 50

操作:用户定义的聚合函数(UDAGG) 批量 流

描述:UDAGG必须在TableEnvironment中注册。有关如何指定和注册UDAGG的详细信息,请参阅UDF文档

  1. SELECT MyAggregate(amount)
  2. FROM Orders
  3. GROUP BY users

Join


操作:内部Equi-join 批量 流

描述:目前,仅支持等连接,即具有至少一个带有等式谓词的连接条件的连接。不支持任意交叉或theta连接。注意:连接顺序未优化。表按照FROM子句中指定的顺序连接。确保以不产生交叉连接(笛卡尔积)的顺序指定表,这些表不受支持并且会导致查询失败。

  1. SELECT *
  2. FROM Orders INNER JOIN Product ON Orders.productId = Product.id

注意:对于流式查询,计算查询结果所需的状态可能会无限增长,具体取决于不同输入行的数量。请提供具有有效保存间隔的查询配置,以防止过大的状态。有关详细信息,请参阅Streaming Concepts


操作:外部Equi-join 批量 流 结果更新

描述:目前,仅支持等连接,即具有至少一个带有等式谓词的连接条件的连接。不支持任意交叉或theta连接。注意:连接顺序未优化。表按照FROM子句中指定的顺序连接。确保以不产生交叉连接(笛卡尔积)的顺序指定表,这些表不受支持并且会导致查询失败。

  1. SELECT *
  2. FROM Orders LEFT JOIN Product ON Orders.productId = Product.id
  3. SELECT *
  4. FROM Orders RIGHT JOIN Product ON Orders.productId = Product.id
  5. SELECT *
  6. FROM Orders FULL OUTER JOIN Product ON Orders.productId = Product.id

注意:对于流式查询,计算查询结果所需的状态可能会无限增长,具体取决于不同输入行的数量。请提供具有有效保存间隔的查询配置,以防止过大的状态。有关详细信息,请参阅Streaming Concepts


操作:Time-windowed Join 批量 流

描述:注意:时间窗口连接是可以以流方式处理的常规连接的子集。时间窗口连接需要至少一个等连接谓词和一个限制双方时间的连接条件。这样的条件可以由两个适当的范围谓词(&lt;, &lt;=, &gt;=, &gt;),BETWEEN谓词或单个等式谓词来定义,其比较两个输入表的相同类型的时间属性(即,处理时间或事件时间)。例如,以下谓词是有效的窗口连接条件:

  • ltime = rtime
  • ltime &gt;= rtime AND ltime &lt; rtime + INTERVAL '10' MINUTE
  • ltime BETWEEN rtime - INTERVAL '10' SECOND AND rtime + INTERVAL '5' SECOND
  1. SELECT *
  2. FROM Orders o, Shipments s
  3. WHERE o.id = s.orderId AND
  4. o.ordertime BETWEEN s.shiptime - INTERVAL '4' HOUR AND s.shiptime

如果订单在收到订单后四小时发货,上面的示例将关联所有订单及其相应的货件。


操作:将数组扩展为关系 Batch Streaming

描述:尚未支持UnANDing WITH ORDINALITY。

  1. SELECT users, tag
  2. FROM Orders CROSS JOIN UNNEST(tags) AS t (tag)

操作:关联用户定义的表函数(UDTF) 批量 流

描述:UDTF必须在TableEnvironment中注册。有关如何指定和注册UDTF的详细信息,请参阅UDF文档。内部联接

  1. SELECT users, tag
  2. FROM Orders, LATERAL TABLE(unnest_udtf(tags)) t AS tag

左外连接

  1. SELECT users, tag
  2. FROM Orders LEFT JOIN LATERAL TABLE(unnest_udtf(tags)) t AS tag ON TRUE

注意:目前,TRUE对于横向表,只支持左外连接的谓词作为谓词。

设置 算子操作


操作:Union 批次

描述:

  1. SELECT *
  2. FROM (
  3. (SELECT user FROM Orders WHERE a % 2 = 0)
  4. UNION
  5. (SELECT user FROM Orders WHERE b = 0)
  6. )

操作:UnionAll Batch Streaming

描述:

  1. SELECT *
  2. FROM (
  3. (SELECT user FROM Orders WHERE a % 2 = 0)
  4. UNION ALL
  5. (SELECT user FROM Orders WHERE b = 0)
  6. )

操作:Intersect/ Except批量

  1. SELECT *
  2. FROM (
  3. (SELECT user FROM Orders WHERE a % 2 = 0)
  4. INTERSECT
  5. (SELECT user FROM Orders WHERE b = 0)
  6. )
  1. SELECT *
  2. FROM (
  3. (SELECT user FROM Orders WHERE a % 2 = 0)
  4. EXCEPT
  5. (SELECT user FROM Orders WHERE b = 0)
  6. )

操作:IN 批量 流中

描述:如果表达式存在于给定的表子查询中,则返回true。子查询表必须包含一列。此列必须与表达式具有相同的数据类型。

  1. SELECT user, amount
  2. FROM Orders
  3. WHERE product IN (
  4. SELECT product FROM NewProducts
  5. )

注意:对于流式查询, 算子操作将在连接和组 算子操作中重写。计算查询结果所需的状态可能会无限增长,具体取决于不同输入行的数量。请提供具有有效保存间隔的查询配置,以防止过大的状态。有关详细信息,请参阅Streaming Concepts


操作:Exists 批量 流

描述:如果子查询至少返回一行,则返回true。仅在可以在连接和组 算子操作中重写 算子操作时才支持。

  1. SELECT user, amount
  2. FROM Orders
  3. WHERE product EXISTS (
  4. SELECT product FROM NewProducts
  5. )

注意:对于流式查询, 算子操作将在连接和组 算子操作中重写。计算查询结果所需的状态可能会无限增长,具体取决于不同输入行的数量。请提供具有有效保存间隔的查询配置,以防止过大的状态。有关详细信息,请参阅Streaming Concepts

OrderBy&Limit


操作:Order By 批量 流

描述:注意:流式查询的结果必须主要按升序时间属性排序。支持其他排序属性。

  1. SELECT *
  2. FROM Orders
  3. ORDER BY orderTime

操作:Limit 批次

描述:

  1. SELECT *
  2. FROM Orders
  3. LIMIT 3

Insert


操作:Insert 批量 流处理

描述:输出表必须在TableEnvironment中注册(请参阅注册TableSink)。此外,已注册表的模式必须与查询的模式匹配。

  1. INSERT INTO OutputTable
  2. SELECT users, tag
  3. FROM Orders

GroupWindows

组窗口在GROUP BYSQL查询的子句中定义。就像具有常规GROUP BY子句的查询一样,带有GROUP BY包含组窗口函数的子句的查询会为每个组计算单个结果行。批处理和流表上的SQL支持以下组窗口函数。

组窗口函数 描述
TUMBLE(time_attr, interval) 定义翻滚时间窗口。翻滚时间窗口将行分配给具有固定持续时间(interval)的非重叠连续窗口。例如,5分钟的翻滚窗口以5分钟为间隔对行进行分组。可以在事件时间(流+批处理)或处理时间(流)上定义翻滚窗口。
HOP(time_attr, interval, interval) 定义跳跃时间窗口(在 Table API中称为滑动窗口)。跳跃时间窗口具有固定的持续时间(第二interval参数)并且按指定的跳跃间隔(第一interval参数)跳跃。如果跳跃间隔小于窗口大小,则跳跃窗口重叠。因此,可以将行分配给多个窗口。例如,15分钟大小和5分钟跳跃间隔的跳跃窗口将每行分配给3个不同的15分钟大小的窗口,这些窗口以5分钟的间隔进行评估。可以在事件时间(流+批处理)或处理时间(流)上定义跳跃窗口。
SESSION(time_attr, interval) 定义会话时间窗口。会话时间窗口没有固定的持续时间,但它们的界限由interval不活动时间定义,即如果在定义的间隙期间没有出现事件,则会话窗口关闭。例如,如果在30分钟不活动后观察到一行,则会开始一个30分钟间隙的会话窗口(否则该行将被添加到现有窗口中),如果在30分钟内未添加任何行,则会关闭。会话窗口可以在事件时间(流+批处理)或处理时间(流)上工作。

时间属性

对于流表的SQL查询,time_attr组窗口函数的参数必须引用指定行的处理时间或事件时间的有效时间属性。请参阅时间属性文档以了解如何定义时间属性。

对于批处理表上的SQL,time_attr组窗口函数的参数必须是类型的属性TIMESTAMP

选择组窗口开始和结束时间戳

可以使用以下辅助函数选择组窗口的开始和结束时间戳以及时间属性:

辅助函数 描述
TUMBLE_START(time_attr, interval) 返回相应的翻滚,跳跃或会话窗口的包含下限的时间戳。
HOP_START(time_attr, interval, interval)
SESSION_START(time_attr, interval)
TUMBLE_END(time_attr, interval) 返回相应的翻滚,跳跃或会话窗口的独占上限的时间戳。注意:独占上限时间戳不能在后续基于时间的 算子操作中用作行时属性,例如时间窗口连接组窗口或窗口聚合
HOP_END(time_attr, interval, interval)
SESSION_END(time_attr, interval)
TUMBLE_ROWTIME(time_attr, interval) 返回相应的翻滚,跳跃或会话窗口的包含上限的时间戳。结果属性是rowtime属性,可用于后续基于时间的 算子操作,例如时间窗口连接组窗口或窗口聚合
HOP_ROWTIME(time_attr, interval, interval)
SESSION_ROWTIME(time_attr, interval)
TUMBLE_PROCTIME(time_attr, interval) 返回proctime属性,该属性可用于后续基于时间的 算子操作,例如时间窗口连接组窗口或窗口聚合
HOP_PROCTIME(time_attr, interval, interval)
SESSION_PROCTIME(time_attr, interval)

注意:必须使用与GROUP BY子句中的组窗口函数完全相同的参数调用辅助函数。

以下示例显示如何在流表上指定具有组窗口的SQL查询。

  1. StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  2. StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
  3. // ingest a DataStream from an external source
  4. DataStream<Tuple3<Long, String, Integer>> ds = env.addSource(...);
  5. // register the DataStream as table "Orders"
  6. tableEnv.registerDataStream("Orders", ds, "user, product, amount, proctime.proctime, rowtime.rowtime");
  7. // compute SUM(amount) per day (in event-time)
  8. Table result1 = tableEnv.sqlQuery(
  9. "SELECT user, " +
  10. " TUMBLE_START(rowtime, INTERVAL '1' DAY) as wStart, " +
  11. " SUM(amount) FROM Orders " +
  12. "GROUP BY TUMBLE(rowtime, INTERVAL '1' DAY), user");
  13. // compute SUM(amount) per day (in processing-time)
  14. Table result2 = tableEnv.sqlQuery(
  15. "SELECT user, SUM(amount) FROM Orders GROUP BY TUMBLE(proctime, INTERVAL '1' DAY), user");
  16. // compute every hour the SUM(amount) of the last 24 hours in event-time
  17. Table result3 = tableEnv.sqlQuery(
  18. "SELECT product, SUM(amount) FROM Orders GROUP BY HOP(rowtime, INTERVAL '1' HOUR, INTERVAL '1' DAY), product");
  19. // compute SUM(amount) per session with 12 hour inactivity gap (in event-time)
  20. Table result4 = tableEnv.sqlQuery(
  21. "SELECT user, " +
  22. " SESSION_START(rowtime, INTERVAL '12' HOUR) AS sStart, " +
  23. " SESSION_ROWTIME(rowtime, INTERVAL '12' HOUR) AS snd, " +
  24. " SUM(amount) " +
  25. "FROM Orders " +
  26. "GROUP BY SESSION(rowtime, INTERVAL '12' HOUR), user");
  1. val env = StreamExecutionEnvironment.getExecutionEnvironment
  2. val tableEnv = TableEnvironment.getTableEnvironment(env)
  3. // read a DataStream from an external source val ds: DataStream[(Long, String, Int)] = env.addSource(...)
  4. // register the DataStream under the name "Orders" tableEnv.registerDataStream("Orders", ds, 'user, 'product, 'amount, 'proctime.proctime, 'rowtime.rowtime)
  5. // compute SUM(amount) per day (in event-time) val result1 = tableEnv.sqlQuery(
  6. """
  7. |SELECT
  8. | user,
  9. | TUMBLE_START(rowtime, INTERVAL '1' DAY) as wStart,
  10. | SUM(amount)
  11. | FROM Orders
  12. | GROUP BY TUMBLE(rowtime, INTERVAL '1' DAY), user
  13. """.stripMargin)
  14. // compute SUM(amount) per day (in processing-time) val result2 = tableEnv.sqlQuery(
  15. "SELECT user, SUM(amount) FROM Orders GROUP BY TUMBLE(proctime, INTERVAL '1' DAY), user")
  16. // compute every hour the SUM(amount) of the last 24 hours in event-time val result3 = tableEnv.sqlQuery(
  17. "SELECT product, SUM(amount) FROM Orders GROUP BY HOP(rowtime, INTERVAL '1' HOUR, INTERVAL '1' DAY), product")
  18. // compute SUM(amount) per session with 12 hour inactivity gap (in event-time) val result4 = tableEnv.sqlQuery(
  19. """
  20. |SELECT
  21. | user,
  22. | SESSION_START(rowtime, INTERVAL '12' HOUR) AS sStart,
  23. | SESSION_END(rowtime, INTERVAL '12' HOUR) AS sEnd,
  24. | SUM(amount)
  25. | FROM Orders
  26. | GROUP BY SESSION(rowtime(), INTERVAL '12' HOUR), user
  27. """.stripMargin)

数据类型

SQL运行时构建于Flink的DataSet和DataStream API之上。在内部,它还使用Flink TypeInformation来定义数据类型。完全支持的类型列在org.apache.flink.table.api.Types。下表总结了SQL类型, Table API类型和生成的Java类之间的关系。

Table API SQL Java类型
Types.STRING VARCHAR java.lang.String
Types.BOOLEAN BOOLEAN java.lang.Boolean
Types.BYTE TINYINT java.lang.Byte
Types.SHORT SMALLINT java.lang.Short
Types.INT INTEGER, INT java.lang.Integer
Types.LONG BIGINT java.lang.Long
Types.FLOAT REAL, FLOAT java.lang.Float
Types.DOUBLE DOUBLE java.lang.Double
Types.DECIMAL DECIMAL java.math.BigDecimal
Types.SQL_DATE DATE java.sql.Date
Types.SQL_TIME TIME java.sql.Time
Types.SQL_TIMESTAMP TIMESTAMP(3) java.sql.Timestamp
Types.INTERVAL_MONTHS INTERVAL YEAR TO MONTH java.lang.Integer
Types.INTERVAL_MILLIS INTERVAL DAY TO SECOND(3) java.lang.Long
Types.PRIMITIVE_ARRAY ARRAY 例如 int[]
Types.OBJECT_ARRAY ARRAY 例如 java.lang.Byte[]
Types.MAP MAP java.util.HashMap
Types.MULTISET MULTISET 例如,java.util.HashMap&lt;String, Integer&gt;对于多重集合String
Types.ROW ROW org.apache.flink.types.Row

通用类型和(嵌套)复合类型(例如,POJO,元组,Row,Scala案例类)也可以是行的字段。

可以使用值访问函数访问具有任意嵌套的复合类型的字段。

通用类型被视为黑盒子,可以由用户定义的函数传递或处理。

保存关键字

虽然并非每个SQL函数都已实现,但某些字符串组合已被保存为关键字以供将来使用。如果要将以下字符串之一用作字段名称,请确保使用反引号将其包围(例如valuecount)。

  1. A, ABS, ABSOLUTE, ACTION, ADA, ADD, ADMIN, AFTER, ALL, ALLOCATE, ALLOW, ALTER, ALWAYS, AND, ANY, ARE, ARRAY, AS, ASC, ASENSITIVE, ASSERTION, ASSIGNMENT, ASYMMETRIC, AT, ATOMIC, ATTRIBUTE, ATTRIBUTES, AUTHORIZATION, AVG, BEFORE, BEGIN, BERNOULLI, BETWEEN, BIGINT, BINARY, BIT, BLOB, BOOLEAN, BOTH, BREADTH, BY, C, CALL, CALLED, CARDINALITY, CASCADE, CASCADED, CASE, CAST, CATALOG, CATALOG_NAME, CEIL, CEILING, CENTURY, CHAIN, CHAR, CHARACTER, CHARACTERISTICS, CHARACTERS, CHARACTER_LENGTH, CHARACTER_SET_CATALOG, CHARACTER_SET_NAME, CHARACTER_SET_SCHEMA, CHAR_LENGTH, CHECK, CLASS_ORIGIN, CLOB, CLOSE, COALESCE, COBOL, COLLATE, COLLATION, COLLATION_CATALOG, COLLATION_NAME, COLLATION_SCHEMA, COLLECT, COLUMN, COLUMN_NAME, COMMAND_FUNCTION, COMMAND_FUNCTION_CODE, COMMIT, COMMITTED, CONDITION, CONDITION_NUMBER, CONNECT, CONNECTION, CONNECTION_NAME, CONSTRAINT, CONSTRAINTS, CONSTRAINT_CATALOG, CONSTRAINT_NAME, CONSTRAINT_SCHEMA, CONSTRUCTOR, CONTAINS, CONTINUE, CONVERT, CORR, CORRESPONDING, COUNT, COVAR_POP, COVAR_SAMP, CREATE, CROSS, CUBE, CUME_DIST, CURRENT, CURRENT_CATALOG, CURRENT_DATE, CURRENT_DEFAULT_TRANSFORM_GROUP, CURRENT_PATH, CURRENT_ROLE, CURRENT_SCHEMA, CURRENT_TIME, CURRENT_TIMESTAMP, CURRENT_TRANSFORM_GROUP_FOR_TYPE, CURRENT_USER, CURSOR, CURSOR_NAME, CYCLE, DATA, DATABASE, DATE, DATETIME_INTERVAL_CODE, DATETIME_INTERVAL_PRECISION, DAY, DEALLOCATE, DEC, DECADE, DECIMAL, DECLARE, DEFAULT, DEFAULTS, DEFERRABLE, DEFERRED, DEFINED, DEFINER, DEGREE, DELETE, DENSE_RANK, DEPTH, DEREF, DERIVED, DESC, DESCRIBE, DESCRIPTION, DESCRIPTOR, DETERMINISTIC, DIAGNOSTICS, DISALLOW, DISCONNECT, DISPATCH, DISTINCT, DOMAIN, DOUBLE, DOW, DOY, DROP, DYNAMIC, DYNAMIC_FUNCTION, DYNAMIC_FUNCTION_CODE, EACH, ELEMENT, ELSE, END, END-EXEC, EPOCH, EQUALS, ESCAPE, EVERY, EXCEPT, EXCEPTION, EXCLUDE, EXCLUDING, EXEC, EXECUTE, EXISTS, EXP, EXPLAIN, EXTEND, EXTERNAL, EXTRACT, FALSE, FETCH, FILTER, FINAL, FIRST, FIRST_VALUE, FLOAT, FLOOR, FOLLOWING, FOR, FOREIGN, FORTRAN, FOUND, FRAC_SECOND, FREE, FROM, FULL, FUNCTION, FUSION, G, GENERAL, GENERATED, GET, GLOBAL, GO, GOTO, GRANT, GRANTED, GROUP, GROUPING, HAVING, HIERARCHY, HOLD, HOUR, IDENTITY, IMMEDIATE, IMPLEMENTATION, IMPORT, IN, INCLUDING, INCREMENT, INDICATOR, INITIALLY, INNER, INOUT, INPUT, INSENSITIVE, INSERT, INSTANCE, INSTANTIABLE, INT, INTEGER, INTERSECT, INTERSECTION, INTERVAL, INTO, INVOKER, IS, ISOLATION, JAVA, JOIN, K, KEY, KEY_MEMBER, KEY_TYPE, LABEL, LANGUAGE, LARGE, LAST, LAST_VALUE, LATERAL, LEADING, LEFT, LENGTH, LEVEL, LIBRARY, LIKE, LIMIT, LN, LOCAL, LOCALTIME, LOCALTIMESTAMP, LOCATOR, LOWER, M, MAP, MATCH, MATCHED, MAX, MAXVALUE, MEMBER, MERGE, MESSAGE_LENGTH, MESSAGE_OCTET_LENGTH, MESSAGE_TEXT, METHOD, MICROSECOND, MILLENNIUM, MIN, MINUTE, MINVALUE, MOD, MODIFIES, MODULE, MONTH, MORE, MULTISET, MUMPS, NAME, NAMES, NATIONAL, NATURAL, NCHAR, NCLOB, NESTING, NEW, NEXT, NO, NONE, NORMALIZE, NORMALIZED, NOT, NULL, NULLABLE, NULLIF, NULLS, NUMBER, NUMERIC, OBJECT, OCTETS, OCTET_LENGTH, OF, OFFSET, OLD, ON, ONLY, OPEN, OPTION, OPTIONS, OR, ORDER, ORDERING, ORDINALITY, OTHERS, OUT, OUTER, OUTPUT, OVER, OVERLAPS, OVERLAY, OVERRIDING, PAD, PARAMETER, PARAMETER_MODE, PARAMETER_NAME, PARAMETER_ORDINAL_POSITION, PARAMETER_SPECIFIC_CATALOG, PARAMETER_SPECIFIC_NAME, PARAMETER_SPECIFIC_SCHEMA, PARTIAL, PARTITION, PASCAL, PASSTHROUGH, PATH, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, PLACING, PLAN, PLI, POSITION, POWER, PRECEDING, PRECISION, PREPARE, PRESERVE, PRIMARY, PRIOR, PRIVILEGES, PROCEDURE, PUBLIC, QUARTER, RANGE, RANK, READ, READS, REAL, RECURSIVE, REF, REFERENCES, REFERENCING, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY, REGR_SYY, RELATIVE, RELEASE, REPEATABLE, RESET, RESTART, RESTRICT, RESULT, RETURN, RETURNED_CARDINALITY, RETURNED_LENGTH, RETURNED_OCTET_LENGTH, RETURNED_SQLSTATE, RETURNS, REVOKE, RIGHT, ROLE, ROLLBACK, ROLLUP, ROUTINE, ROUTINE_CATALOG, ROUTINE_NAME, ROUTINE_SCHEMA, ROW, ROWS, ROW_COUNT, ROW_NUMBER, SAVEPOINT, SCALE, SCHEMA, SCHEMA_NAME, SCOPE, SCOPE_CATALOGS, SCOPE_NAME, SCOPE_SCHEMA, SCROLL, SEARCH, SECOND, SECTION, SECURITY, SELECT, SELF, SENSITIVE, SEQUENCE, SERIALIZABLE, SERVER, SERVER_NAME, SESSION, SESSION_USER, SET, SETS, SIMILAR, SIMPLE, SIZE, SMALLINT, SOME, SOURCE, SPACE, SPECIFIC, SPECIFICTYPE, SPECIFIC_NAME, SQL, SQLEXCEPTION, SQLSTATE, SQLWARNING, SQL_TSI_DAY, SQL_TSI_FRAC_SECOND, SQL_TSI_HOUR, SQL_TSI_MICROSECOND, SQL_TSI_MINUTE, SQL_TSI_MONTH, SQL_TSI_QUARTER, SQL_TSI_SECOND, SQL_TSI_WEEK, SQL_TSI_YEAR, SQRT, START, STATE, STATEMENT, STATIC, STDDEV_POP, STDDEV_SAMP, STREAM, STRUCTURE, STYLE, SUBCLASS_ORIGIN, SUBMULTISET, SUBSTITUTE, SUBSTRING, SUM, SYMMETRIC, SYSTEM, SYSTEM_USER, TABLE, TABLESAMPLE, TABLE_NAME, TEMPORARY, THEN, TIES, TIME, TIMESTAMP, TIMESTAMPADD, TIMESTAMPDIFF, TIMEZONE_HOUR, TIMEZONE_MINUTE, TINYINT, TO, TOP_LEVEL_COUNT, TRAILING, TRANSACTION, TRANSACTIONS_ACTIVE, TRANSACTIONS_COMMITTED, TRANSACTIONS_ROLLED_BACK, TRANSFORM, TRANSFORMS, TRANSLATE, TRANSLATION, TREAT, TRIGGER, TRIGGER_CATALOG, TRIGGER_NAME, TRIGGER_SCHEMA, TRIM, TRUE, TYPE, UESCAPE, UNBOUNDED, UNCOMMITTED, UNDER, UNION, UNIQUE, UNKNOWN, UNNAMED, UNNEST, UPDATE, UPPER, UPSERT, USAGE, USER, USER_DEFINED_TYPE_CATALOG, USER_DEFINED_TYPE_CODE, USER_DEFINED_TYPE_NAME, USER_DEFINED_TYPE_SCHEMA, USING, VALUE, VALUES, VARBINARY, VARCHAR, VARYING, VAR_POP, VAR_SAMP, VERSION, VIEW, WEEK, WHEN, WHENEVER, WHERE, WIDTH_BUCKET, WINDOW, WITH, WITHIN, WITHOUT, WORK, WRAPPER, WRITE, XML, YEAR, ZONE