- Drizzle: Fast and Adaptable Stream Processing at Scale
- [TODO] Pathways: Asynchronous Distributed Dataflow for ML
- Scaling Distributed Machine Learning with the Parameter Server
- Exoshuffle: Large-Scale Shuffle at the Application Level
- Dataframe Systems: Theory, Architecture, and Implementation
- [Doing] OneFlow: Redesign the Distributed Deep Learning Framework from Scratch
- [Book] Streaming Systems
- State Management in Apache Flink
- Apache Flink: Stream and Batch Processing in a Single Engine
- 0. Important Conferences
- The Dataflow Model
- Lightweight Asynchronous Snapshots for Distributed Dataflows
- Distributed Snapshots: Determining Global States of Distributed Systems
- Bigtable: A Distributed Storage System for Structured Data
- GFS: The Google File System
- MapReduce: Simplified Data Processing on Large Clusters
- A Distributed Systems Reading List
- 1. Big Data Technology Stack