Spark

| Feature | Items | Iceberg | Hudi |
|---|---|---|---|
| DDL | SQL create table | ☑️ | ☑️ |
| | SQL create table … as select | ☑️ | ☑️ |
| | SQL replace table … as select | ☑️ | ✖️ |
| | SQL drop table | ☑️ | ✖️ |
| | SQL alter table | ☑️ | ☑️ partially supported |
| Write | SQL insert into | ☑️ | ☑️ |
| | SQL insert overwrite | ☑️ | ☑️ |
| Read | SQL select (latest snapshot) | ☑️ | ☑️ |
| | DataFrame time travel query (snapshot as of a point in time) | ☑️ `as-of-timestamp` | ☑️ `as.of.instant` |
| | DataFrame version travel query (snapshot at a given version) | ☑️ `snapshot-id` | ✖️ |
| Incremental | DataFrame incremental query (changes between two snapshots) | ☑️ `start-snapshot-id`, `end-snapshot-id` (optional; defaults to the current snapshot) | ☑️ `beginInstantTime`, `endInstantTime` |
| Streaming | DataFrame incremental query (changes after a point in time) | ☑️ `stream-from-timestamp` | ☑️ `beginInstantTime` |
| | DataFrame write | ☑️ | ? |
| Update | SQL update | ☑️ | ☑️ |
| | SQL merge into | ☑️ | ☑️ |
| Delete | SQL delete from | ☑️ | ☑️ |
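
The read options in the Spark table can be illustrated with a small PySpark-oriented sketch. It only builds the option dictionaries and applies them to a `DataFrameReader`; the helper names are hypothetical. The full Hudi keys (`hoodie.datasource.query.type`, `hoodie.datasource.read.begin.instanttime`, `hoodie.datasource.read.end.instanttime`) are assumed to be the keys behind the `beginInstantTime` / `endInstantTime` shorthand in the table, and may differ between Hudi versions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iceberg_time_travel(as_of: datetime) -> Dict[str, str]:
    """Iceberg `as-of-timestamp`: milliseconds since the Unix epoch."""
    # `as_of` must be timezone-aware; integer division avoids float rounding.
    millis = (as_of - EPOCH) // timedelta(milliseconds=1)
    return {"as-of-timestamp": str(millis)}


def iceberg_incremental(start_snapshot_id: int,
                        end_snapshot_id: Optional[int] = None) -> Dict[str, str]:
    """Iceberg incremental read; the end snapshot is optional."""
    options = {"start-snapshot-id": str(start_snapshot_id)}
    if end_snapshot_id is not None:
        options["end-snapshot-id"] = str(end_snapshot_id)
    return options


def hudi_time_travel(as_of: datetime) -> Dict[str, str]:
    """Hudi `as.of.instant`, using the 'yyyy-MM-dd HH:mm:ss.SSS' form."""
    # Assumption: the datetime is already in the time zone the table's
    # commit timeline uses; no conversion is done here.
    text = as_of.strftime("%Y-%m-%d %H:%M:%S.") + f"{as_of.microsecond // 1000:03d}"
    return {"as.of.instant": text}


def hudi_incremental(begin_instant: str,
                     end_instant: Optional[str] = None) -> Dict[str, str]:
    """Hudi incremental query between two commit instants."""
    options = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        options["hoodie.datasource.read.end.instanttime"] = end_instant
    return options


def read_table(spark, fmt: str, location: str, options: Dict[str, str]):
    """Apply the options to spark.read and load the table (needs a SparkSession)."""
    reader = spark.read.format(fmt)
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load(location)
```

With a live session, a call such as `read_table(spark, "iceberg", "db.events", iceberg_incremental(10, 20))` would return a DataFrame; `db.events` and the snapshot ids are placeholders, not values from this comparison.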
Flink

| Feature | Items | Iceberg | Hudi |
|---|---|---|---|
| DDL | SQL create table | ☑️ | ☑️ |
| | SQL create table like | ☑️ | ✖️ |
| | SQL drop table | ☑️ | ✖️ |
| | SQL alter table | ☑️ partially supported | ☑️ partially supported |
| Write | SQL insert into | ☑️ | ☑️ |
| | SQL insert overwrite | ☑️ | ✖️ planned |
| Read | SQL select (latest snapshot) | ☑️ | ☑️ |
| | (snapshot as of a point in time) | ✖️ | ✖️ |
| | (snapshot at a given version) | ✖️ | ✖️ |
| Incremental | SQL select | ✖️ | ☑️ `read.start-commit`, `read.end-commit` |
| Streaming | SQL select | ☑️ `streaming`, `monitor-interval`, `start-snapshot-id` | ☑️ `read.streaming.enabled`, `read.streaming.check-interval`, `read.start-commit` |
| | SQL insert into | ☑️ | ? |
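
In Flink SQL, the read options listed above are typically passed per query through an `OPTIONS` hint. The sketch below only assembles the SQL text in Python; the helper name, table names, snapshot id, and commit timestamps are hypothetical. Depending on the Flink version, dynamic table options may need to be enabled first (`table.dynamic-table-options.enabled`).

```python
from typing import Dict


def select_with_options(table: str, options: Dict[str, str]) -> str:
    """Build `SELECT * FROM <table> /*+ OPTIONS(...) */` from an options dict."""
    # Single quotes inside values are doubled, as in standard SQL literals.
    pairs = ", ".join(
        "'{}'='{}'".format(key, str(value).replace("'", "''"))
        for key, value in options.items()
    )
    return f"SELECT * FROM {table} /*+ OPTIONS({pairs}) */"


# Iceberg streaming read, starting from a given snapshot (placeholder id).
iceberg_streaming_sql = select_with_options("iceberg_tbl", {
    "streaming": "true",
    "monitor-interval": "10s",
    "start-snapshot-id": "3821550127947089987",
})

# Hudi incremental (batch) read between two commits (placeholder instants).
hudi_incremental_sql = select_with_options("hudi_tbl", {
    "read.start-commit": "20240101000000",
    "read.end-commit": "20240102000000",
})

# Hudi streaming read; the check interval is assumed to be in seconds.
hudi_streaming_sql = select_with_options("hudi_tbl", {
    "read.streaming.enabled": "true",
    "read.streaming.check-interval": "4",
    "read.start-commit": "20240101000000",
})
```

The resulting strings could then be submitted from the Flink SQL client or, in PyFlink, through `TableEnvironment.execute_sql`. The same Hudi options can also be set in the `WITH` clause of the table definition instead of a hint.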
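:::info Time Travel use cases
- Rollback: restore the table to an earlier version
- Debugging: inspect earlier versions of the data to see how it changed over time
- Audit history: follow the commit trail to review the record of data changes
:::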