Fault Tolerance Guarantees of Data Sources and Sinks 数据源和接收器的容错保证
Flink’s fault tolerance mechanism recovers programs in the presence of failures and continues to execute them. Such failures include machine hardware failures, network failures, transient program failures, etc. Flink的容错机制在出现故障时恢复程序并继续执行它们。这些故障包括机器硬件故障,网络故障,瞬时程序故障等。
Flink can guarantee exactly-once state updates to user-defined state only when the source participates in the snapshotting mechanism. The following table lists the state update guarantees of Flink coupled with the bundled connectors. 仅当源参与快照机制时,Flink才能保证将一次准确的状态更新为用户定义的状态。下表列出了Flink和捆绑的连接器的状态更新保证。
Please read the documentation of each connector to understand the details of the fault tolerance guarantees. 请阅读每个连接器的文档以了解容错保证的详细信息。
Source | Guarantees | Notes |
---|---|---|
Apache Kafka | exactly once | Use the appropriate Kafka connector for your version |
AWS Kinesis Streams | exactly once | |
RabbitMQ | at most once (v 0.10) / exactly once (v 1.0) | |
Twitter Streaming API | at most once | |
Collections | exactly once | |
Files | exactly once | |
Sockets | at most once |
To guarantee end-to-end exactly-once record delivery (in addition to exactly-once state semantics), the data sink needs to take part in the checkpointing mechanism. The following table lists the delivery guarantees (assuming exactly-once state updates) of Flink coupled with bundled sinks: 为了保证端到端的一次精确记录传递(除了一次精确的状态语义),数据接收器需要参与检查点机制。下表列出了Flink和捆绑的接收器的交付保证(假设一次状态更新):
Sink | Guarantees | Notes |
---|---|---|
HDFS rolling sink | exactly once | Implementation depends on Hadoop version |
Elasticsearch | at least once | |
Kafka producer | at least once | |
Cassandra sink | at least once / exactly once | exactly once only for idempotent updates |
AWS Kinesis Streams | at least once | |
File sinks | at least once | |
Socket sinks | at least once | |
Standard output | at least once | |
Redis sink | at least once |