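TinyKV is an open-source, hands-on course on distributed KV storage from PingCAP: https://github.com/tidb-incubator/tinykv
Its goal is to build a simple distributed key-value store.
The course consists of four projects:
- Project 1 asks participants to build a standalone KV server on their own
- Project 2 asks for a distributed key-value database server based on the Raft algorithm
- Project 3 builds on Project 2 and adds support for multiple Raft groups
- Project 4 builds on Project 3 and adds support for distributed transactions
The projects get progressively harder, and the course is comparable to MIT's 6.824 course.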
https://github.com/watchpoints/tinykv/blob/course/doc/project1-StandaloneKV.md
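Task: Standalone KV
The first project integrates Badger to implement a simple standalone KV store.
Badger is an excellent open-source standalone KV storage engine built on an LSM tree, with good read and write performance. You need to get briefly familiar with how Badger is used; the official examples are a good reference: https://github.com/dgraph-io/badger
Step 1. Getting started: read the documentation
Running make project1 produces a long list of failures. They appear because the unit tests written by the course authors do not pass yet. All you need to do is implement the functionality behind the APIs called by the unit tests in /tinykv/kv/server/server_test.go.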
https://github.com/tidb-incubator/tinykv/blob/course/doc/project1-StandaloneKV.md
https://pkg.go.dev/github.com/dgraph-io/badger#Txn
The implementation itself lives in kv/storage/standalone_storage/standalone_storage.go: you wrap Badger and then implement the methods defined in the storage interface.
- CF definition
A column family (abbreviated to CF below) is a term similar to a key namespace: the values of the same key in different column families are not the same. You can simply regard multiple column families as separate mini databases.

    const (
    	CfDefault string = "default"
    	CfWrite   string = "write"
    	CfLock    string = "lock"
    )

- Get fetches the current value for a key for the specified CF
- Put replaces the value for a particular key for the specified CF in the database
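
To make the column-family idea concrete, here is a minimal in-memory sketch. It is not TinyKV or Badger code: cfStore, newCFStore, Put and Get are hypothetical names, and a nested map stands in for the real engine. It only illustrates that the same key holds independent values in different CFs.

```go
package main

// Column family names, as listed in the course documentation.
const (
	CfDefault string = "default"
	CfWrite   string = "write"
	CfLock    string = "lock"
)

// cfStore is a hypothetical in-memory stand-in for a storage engine:
// one map per column family, so each CF behaves like a mini database.
type cfStore struct {
	data map[string]map[string][]byte
}

func newCFStore() *cfStore {
	return &cfStore{data: make(map[string]map[string][]byte)}
}

// Put replaces the value for key in the given CF.
func (s *cfStore) Put(cf string, key, value []byte) {
	if s.data[cf] == nil {
		s.data[cf] = make(map[string][]byte)
	}
	s.data[cf][string(key)] = value
}

// Get fetches the current value for key in the given CF.
// The boolean is false when the key does not exist in that CF.
func (s *cfStore) Get(cf string, key []byte) ([]byte, bool) {
	v, ok := s.data[cf][string(key)]
	return v, ok
}
```

Putting the key "k" into CfDefault and then into CfLock leaves two separate values, and a Get on CfWrite for the same key reports that nothing is stored there.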
The project can be broken down into 2 steps, including:
- Implement a standalone storage engine.
- Implement raw key/value service handlers.
The first mission is implementing a wrapper of badger key/value API.
The gRPC server's service depends on a Storage, which is defined in kv/storage/storage.go.
In this context, the standalone storage engine is just a wrapper around the badger key/value API, exposed through two methods:

    type Storage interface {
    	// Other stuffs
    	Write(ctx *kvrpcpb.Context, batch []Modify) error
    	Reader(ctx *kvrpcpb.Context) (StorageReader, error)
    }

https://github.com/Connor1996/badger
In addition, Server depends on a Storage, an interface you need to implement for the standalone storage engine located in kv/storage/standalone_storage/standalone_storage.go.
Once the interface Storage is implemented in StandaloneStorage, you could implement the raw key/value service for the Server with it.
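
The shape of such a wrapper can be sketched without Badger at all. The code below is a simplified stand-in under stated assumptions: the *kvrpcpb.Context parameter is dropped, StorageReader is reduced to one method, the Put, Delete and Modify field layouts are assumed for illustration, and memStorage is a hypothetical map-backed engine. It shows only how a batch of modifications flows into Write and how reads go through a Reader.

```go
package main

import "sync"

// Put and Delete are the two kinds of modification in this sketch.
// Their field names are assumptions, not copied from the course code.
type Put struct {
	Cf         string
	Key, Value []byte
}

type Delete struct {
	Cf  string
	Key []byte
}

// Modify wraps either a Put or a Delete.
type Modify struct {
	Data interface{}
}

// StorageReader is reduced to a single lookup method here.
type StorageReader interface {
	GetCF(cf string, key []byte) ([]byte, error)
}

// Storage is a simplified version of the interface quoted above,
// with the context parameter omitted to stay stdlib-only.
type Storage interface {
	Write(batch []Modify) error
	Reader() (StorageReader, error)
}

type cfKey struct{ cf, key string }

// memStorage is a hypothetical in-memory engine guarded by a mutex.
type memStorage struct {
	mu   sync.RWMutex
	data map[cfKey][]byte
}

func newMemStorage() *memStorage {
	return &memStorage{data: make(map[cfKey][]byte)}
}

// Write applies every modification in the batch under one lock.
// Unknown modification types are ignored in this sketch.
func (s *memStorage) Write(batch []Modify) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, m := range batch {
		switch d := m.Data.(type) {
		case Put:
			s.data[cfKey{d.Cf, string(d.Key)}] = d.Value
		case Delete:
			delete(s.data, cfKey{d.Cf, string(d.Key)})
		}
	}
	return nil
}

func (s *memStorage) Reader() (StorageReader, error) {
	return memReader{s}, nil
}

type memReader struct{ s *memStorage }

// GetCF returns (nil, nil) when the key is absent, leaving it to the
// caller to turn that into a "not found" response.
func (r memReader) GetCF(cf string, key []byte) ([]byte, error) {
	r.s.mu.RLock()
	defer r.s.mu.RUnlock()
	return r.s.data[cfKey{cf, string(key)}], nil
}

// Compile-time check that memStorage satisfies Storage.
var _ Storage = (*memStorage)(nil)
```

In the real project the map would be replaced by a Badger database and the reader would typically hold a read transaction, but that part depends on the Badger fork used by the course and is left out here.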
- Badger (a high-performance LSM K/V store) usage guide
https://pkg.go.dev/github.com/dgraph-io/badger#section-readme
https://pkg.go.dev/github.com/dgraph-io/badger#Txn
func TestBlob(t *testing.T), func TestGet(t *testing.T)
- All operations are performed inside transactions; Badger's transactions are implemented on top of MVCC.
- https://github.com/Connor1996/badger/blob/master/db_test.go
- Unit test: func TestRawGet1(t *testing.T)
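
What "transactions on top of MVCC" buys a reader can be shown with a toy multi-version store. This is an illustration only, not how Badger is implemented: mvccStore, Commit, Begin and readTxn are hypothetical names, and timestamps are a plain counter. The point is that a read transaction keeps seeing the snapshot that existed when it began, even if newer versions are committed afterwards.

```go
package main

// version is one committed value of a key, tagged with its commit timestamp.
type version struct {
	ts    uint64
	value string
}

// mvccStore is a toy multi-version store used only for illustration.
type mvccStore struct {
	ts       uint64               // last commit timestamp
	versions map[string][]version // per key, in ascending ts order
}

func newMVCCStore() *mvccStore {
	return &mvccStore{versions: make(map[string][]version)}
}

// Commit appends a new version of key and returns its commit timestamp.
func (s *mvccStore) Commit(key, value string) uint64 {
	s.ts++
	s.versions[key] = append(s.versions[key], version{ts: s.ts, value: value})
	return s.ts
}

// readTxn is a read-only transaction pinned to a snapshot timestamp.
type readTxn struct {
	store  *mvccStore
	readTs uint64
}

// Begin starts a read transaction that sees everything committed so far.
func (s *mvccStore) Begin() *readTxn {
	return &readTxn{store: s, readTs: s.ts}
}

// Get returns the newest version whose timestamp is <= the snapshot timestamp.
func (t *readTxn) Get(key string) (string, bool) {
	vs := t.store.versions[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].ts <= t.readTs {
			return vs[i].value, true
		}
	}
	return "", false
}
```

A transaction started between two commits of the same key returns the older value, while a transaction started after both returns the newer one.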
Third read-through: the documentation

Step 2. Implementation (not published)
- The first file you need to implement: standalone_storage.go

    type Storage interface {
    	// Other stuffs
    	Write(ctx *kvrpcpb.Context, batch []Modify) error
    	Reader(ctx *kvrpcpb.Context) (StorageReader, error)
    }

    // StandAloneStorage is an implementation of `Storage` for a single-node TinyKV instance. It does not
    // communicate with other nodes and all data is stored locally.
    type StandAloneStorage struct {
    	// Your Data Here (1).
    }

    func NewStandAloneStorage(conf *config.Config) *StandAloneStorage {
    	// Your Code Here (1).
    	return nil
    }

    func (s *StandAloneStorage) Start() error {
    	// Your Code Here (1).
    	return nil
    }

    func (s *StandAloneStorage) Stop() error {
    	// Your Code Here (1).
    	return nil
    }

    func (s *StandAloneStorage) Reader(ctx *kvrpcpb.Context) (storage.StorageReader, error) {
    	// Your Code Here (1).
    	return nil, nil
    }

    func (s *StandAloneStorage) Write(ctx *kvrpcpb.Context, batch []storage.Modify) error {
    	// Your Code Here (1).
    	return nil
    }

- The second file you need to implement: raw_api.go
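
As a rough picture of what a raw handler does, the sketch below implements a get on top of a reader. Everything in it is a hypothetical stand-in: rawGetRequest and rawGetResponse are local types rather than the course's protobuf messages, the NotFound field is an assumption to check against kvrpcpb, and mapReader exists only to exercise the function. It is not the course solution.

```go
package main

// reader is a minimal stand-in for the storage reader used by handlers.
type reader interface {
	GetCF(cf string, key []byte) ([]byte, error)
}

// rawGetRequest and rawGetResponse are hypothetical local types;
// the real messages are generated from the course's protobuf files.
type rawGetRequest struct {
	Cf  string
	Key []byte
}

type rawGetResponse struct {
	Value    []byte
	NotFound bool
}

// rawGet looks the key up in the requested CF and reports a missing
// key through NotFound instead of returning an error.
func rawGet(r reader, req *rawGetRequest) (*rawGetResponse, error) {
	val, err := r.GetCF(req.Cf, req.Key)
	if err != nil {
		return nil, err
	}
	if val == nil {
		return &rawGetResponse{NotFound: true}, nil
	}
	return &rawGetResponse{Value: val}, nil
}

// mapReader is a toy reader backed by a map, keyed as cf + "_" + key.
type mapReader map[string][]byte

func (m mapReader) GetCF(cf string, key []byte) ([]byte, error) {
	return m[cf+"_"+string(key)], nil
}
```

The design choice worth noting is that "key does not exist" is treated as a normal outcome of the request, while a storage failure is propagated as an error.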
Step 3. Testing: server_test.go
Remember to run make project1 and check that the test suite passes.
The relevant Makefile lines:

    GOTEST := $(GO) test -v --count=1 --parallel=1 -p=1

    project1:
    	$(GOTEST) ./kv/server -run 1

Running the target and the command it expands to:

    make project1
    GO111MODULE=on go test -v --count=1 --parallel=1 -p=1 ./kv/server -run 1

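
The -run 1 part is easy to misread: -run takes a regular expression that is matched against test names without anchoring, so 1 selects every top-level test whose name contains the digit 1. The sketch below mimics that selection for patterns without a slash; selectTests is a hypothetical helper and is not part of the go tool.

```go
package main

import "regexp"

// selectTests returns the names that an unanchored regular-expression
// match would keep, which is how `go test -run` filters top-level tests
// when the pattern contains no slash.
func selectTests(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out, nil
}
```

With the pattern 1, names such as TestRawGet1 and TestRawScan1 are kept, while a name without the digit 1 (for example a made-up TestOther2) is filtered out.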
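Step 4. My questions
- Q: In Project 1's TestRawGetAfterRawPut1, different records are inserted under the same key, but the query then returns the wrong result.
Do the query and the failing check specify a CF? The original Badger has no concept of CFs.
Answer: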

Badger doesn’t give support for column families. engine_util package (kv/util/engine_util) simulates column families by adding a prefix to keys. For example, a key key that belongs to a specific column family cf is stored as ${cf}_${key}. It wraps badger to provide operations with CFs, and also offers many useful helper functions. So you should do all read/write operations through engine_util provided methods. Please read util/engine_util/doc.go to learn more.
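
The prefixing scheme described above can be sketched in a few lines. keyWithCF and splitCF here are hypothetical helpers written for illustration; in the project itself the helpers provided by engine_util should be used instead.

```go
package main

import "bytes"

// keyWithCF builds the stored key as ${cf}_${key}, following the
// description quoted above.
func keyWithCF(cf string, key []byte) []byte {
	out := make([]byte, 0, len(cf)+1+len(key))
	out = append(out, cf...)
	out = append(out, '_')
	return append(out, key...)
}

// splitCF strips the prefix for a known CF. The boolean is false when
// the stored key does not belong to that CF.
func splitCF(cf string, stored []byte) ([]byte, bool) {
	prefix := append([]byte(cf), '_')
	if !bytes.HasPrefix(stored, prefix) {
		return nil, false
	}
	return stored[len(prefix):], true
}
```

This is why the same user key written under two different CFs does not collide: the stored keys differ in their prefix, so a read has to name the same CF that the write used.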
Q: How does the test case func TestRawScan1(t *testing.T) iterate over the data?
Answer:
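
These notes leave the question open. One plausible reading, stated here as an assumption rather than taken from the test source, is that a raw scan seeks to a start key and then walks forward in ascending key order until a limit is reached. The sketch below shows that pattern over a plain map; scan and kvPair are hypothetical names.

```go
package main

import "sort"

// kvPair is one key/value result of a scan.
type kvPair struct {
	Key, Value string
}

// scan returns up to limit pairs whose keys are >= startKey,
// in ascending key order.
func scan(data map[string]string, startKey string, limit int) []kvPair {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	// Seek: index of the first key that is >= startKey.
	i := sort.SearchStrings(keys, startKey)

	var out []kvPair
	for ; i < len(keys) && len(out) < limit; i++ {
		out = append(out, kvPair{Key: keys[i], Value: data[keys[i]]})
	}
	return out
}
```

With an iterator-based engine the sort step disappears, because an LSM store already keeps keys ordered; the seek, the forward walk and the limit check stay the same.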
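Related discussions
Questions that the group chat above could not resolve can be taken to asktug for help.
If an existing answer solves your problem, remember to mark it as "useful to me".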
Recent question threads: https://asktug.com/t/topic/273196
https://asktug.com/t/topic/273154/5
https://asktug.com/t/topic/273269/2
https://asktug.com/t/topic/273388/3
https://asktug.com/t/topic/273391?u=tidber_ybwcfwut
https://asktug.com/t/topic/273388/2
https://asktug.com/t/topic/273387
https://asktug.com/t/topic/273355
Other people's approaches
- Talent Plan KV training camp: Project 1 solution write-up
- https://mp.weixin.qq.com/s/ulK9hZDlukbo4X3AaRByJw
- https://learnku.com/articles/63094
- https://learnku.com/articles/61598
