A key-value store engine optimized for writing a stream of random items, using a log-structured approach and write-ahead logging, based on Bitcask.
Build:
make all
Run: set, get, del
$ ./engine set foo bar
$ ./engine get foo
bar
$ ./engine set foo baz
$ ./engine get foo
baz
$ ./engine del foo
$ ./engine get foo
$
Testing:
make all_test
make clean
Disk Design:
- The disk design uses write-ahead logging (WAL) to support fast writes, especially when writing an incoming stream of random items.
Data Encoding:
| key size (4 bytes) | value size (4 bytes) | status (1 byte) | key (variable) | value (variable) | timestamp (23 bytes) |
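For illustration only, a Go sketch of this record layout might look like the following (assuming the engine is written in Go; the big-endian sizes, status codes, and timestamp format are assumptions, only the field order comes from the table above):

```go
package engine

import (
	"bytes"
	"encoding/binary"
	"time"
)

// encodeRecord lays out one log record following the table above:
// | key size (4B) | value size (4B) | status (1B) | key | value | timestamp (23B) |
// Status codes are assumed: 0 = put, 1 = tombstone.
func encodeRecord(key, value []byte, status byte) []byte {
	buf := new(bytes.Buffer)
	binary.Write(buf, binary.BigEndian, uint32(len(key)))
	binary.Write(buf, binary.BigEndian, uint32(len(value)))
	buf.WriteByte(status)
	buf.Write(key)
	buf.Write(value)
	// 23-byte wall-clock timestamp; the exact format is an assumption.
	buf.WriteString(time.Now().UTC().Format("2006-01-02T15:04:05.000"))
	return buf.Bytes()
}
```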
Index Design:
- Implements a Skip List as the in-memory index to support fast insert/get/delete operations.
- Single seek to retrieve data: the index stores, for each key, the exact on-disk position of its value (see the sketch below).
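Continuing the sketch, the position an index entry points at, and the operations the skip list needs to support, could be described like this (names are hypothetical, and the skip list itself is omitted):

```go
package engine

// Position locates a record on disk; a single ReadAt at (FileID, Offset)
// is enough to retrieve it.
type Position struct {
	FileID uint32 // which data file holds the record
	Offset int64  // byte offset of the record within that file
}

// Indexer is the contract the in-memory skip list fulfills:
// insert/get/delete of keys, each mapped to a disk Position.
type Indexer interface {
	Put(key []byte, pos Position)
	Get(key []byte) (Position, bool)
	Delete(key []byte)
}
```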
Limitation:
- Although all keys and values are stored on disk, every key must also be kept in memory, so the full key set has to fit in RAM.
Write Process:
- During a write, any update operation (set/del) creates a log record that is appended to the end of the active data file.
- Then the key's index entry is updated with the new file ID and offset. If the key was deleted, it is removed from the index instead. A sketch of this path follows below.
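A minimal sketch of this write path, building on the hypothetical types above (`DB` and its fields are assumptions, not the engine's real API):

```go
package engine

import "os"

// DB ties the append-only active data file to the in-memory index.
// Offset tracking and file rotation are simplified away in this sketch.
type DB struct {
	active   *os.File            // current append-only data file
	activeID uint32              // file ID of the active file
	offset   int64               // current end-of-file offset
	files    map[uint32]*os.File // all open data files by ID
	index    Indexer             // in-memory skip list index
}

// Set appends a put record to the active file, then points the key's
// index entry at the new record's file ID and offset.
func (db *DB) Set(key, value []byte) error {
	rec := encodeRecord(key, value, 0) // 0 = put (assumed status code)
	if _, err := db.active.Write(rec); err != nil {
		return err
	}
	db.index.Put(key, Position{FileID: db.activeID, Offset: db.offset})
	db.offset += int64(len(rec))
	return nil
}
```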
Read Process:
- A get key first looks up the index (skip list) to find the position (data file ID and offset) of the key. If the key is not in the index, it does not exist in the data store.
- Otherwise, the value is read from disk at that position, as sketched below.
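The read path, continuing the same sketch; the 9-byte header size follows the encoding table above, the rest is assumed:

```go
package engine

import (
	"encoding/binary"
	"errors"
)

// Get looks the key up in the index; a miss means the key does not exist.
// On a hit, the value is read directly from the stored (FileID, Offset).
func (db *DB) Get(key []byte) ([]byte, error) {
	pos, ok := db.index.Get(key)
	if !ok {
		return nil, errors.New("key not found")
	}
	f := db.files[pos.FileID]
	header := make([]byte, 9) // key size (4) + value size (4) + status (1)
	if _, err := f.ReadAt(header, pos.Offset); err != nil {
		return nil, err
	}
	keySize := binary.BigEndian.Uint32(header[0:4])
	valSize := binary.BigEndian.Uint32(header[4:8])
	value := make([]byte, valSize)
	_, err := f.ReadAt(value, pos.Offset+9+int64(keySize))
	return value, err
}
```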
Delete Process:
- As mentioned in the write process, a del key operation writes a tombstone to disk without actually removing the key's earlier log records.
- The key's position is then removed from the index. The stale records are cleaned up later during the merge process (see the sketch below).
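A sketch of the delete path under the same assumptions (status 1 as the tombstone marker is hypothetical):

```go
package engine

// Del appends a tombstone record (empty value, status 1) and drops the key
// from the index; earlier records for the key stay on disk until merge.
func (db *DB) Del(key []byte) error {
	rec := encodeRecord(key, nil, 1) // 1 = tombstone (assumed status code)
	if _, err := db.active.Write(rec); err != nil {
		return err
	}
	db.offset += int64(len(rec))
	db.index.Delete(key)
	return nil
}
```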
Boot Process:
- Load all data files in increasing order of file ID.
- Replay their records to rebuild the in-memory index.
- While replaying, any key whose latest record is marked as deleted is removed from the index. A sketch of this replay loop follows.
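A rough sketch of this boot-time replay, using the hypothetical types above; the fixed 23-byte timestamp skip comes from the encoding table, everything else is assumed:

```go
package engine

import "encoding/binary"

// loadIndex replays every data file in increasing file-ID order and rebuilds
// the in-memory index. Later records win; a tombstone removes the key.
func (db *DB) loadIndex(sortedFileIDs []uint32) error {
	for _, id := range sortedFileIDs {
		f := db.files[id]
		var offset int64
		header := make([]byte, 9)
		for {
			if _, err := f.ReadAt(header, offset); err != nil {
				break // assume the error is EOF, i.e. the end of this file
			}
			keySize := binary.BigEndian.Uint32(header[0:4])
			valSize := binary.BigEndian.Uint32(header[4:8])
			key := make([]byte, keySize)
			if _, err := f.ReadAt(key, offset+9); err != nil {
				return err
			}
			if header[8] == 1 { // tombstone: key was deleted, drop it
				db.index.Delete(key)
			} else {
				db.index.Put(key, Position{FileID: id, Offset: offset})
			}
			// Advance past header, key, value and the 23-byte timestamp.
			offset += 9 + int64(keySize) + int64(valSize) + 23
		}
	}
	return nil
}
```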
TODO:
- [ ]: Support a merge operation to compact the data files and clean up outdated logs
- [ ]: Add a CRC to the encoded record to support crash recovery and data consistency
- [ ]: Support a DB iterator
- [ ]: Add a task-processing layer for incoming client requests