LSM Tree
Motivations
Doing lots of disk writes in expensive (disk i/o). We batch writes by using in memory buffers to accumulate, and write in one-shot.
Has two components:
- C1: In memory buffers (small)
- C2: On disk (larger)
We merge in memory segments into disk.
Design
Make inserts and deletes cheap (disk writes).
LSM Trees cascade data from incoming stream into smaller, higher performing stores.
Merging algorithm a.k.a rolling merge
This is heart of LSM algorithm. It migrate entries from C0 to C1, C1 to C2, …, to free buffer space.
Removes a contiguous segment of entries from c0 to c1.
C0 - Memtable C1 Tree - SSTable.
Helps with compaction.
Recovery
Store instructions in log, can use for recovery.
Tradeoffs
Disk i/o not cheap.
Further questions
Will B+Tree augmented with memory be able to support the same thing?
References
Condensed ver: https://blog.acolyer.org/2014/11/26/the-log-structured-merge-tree-lsm-tree/ In-depth: https://www.cs.umb.edu/~poneil/lsmtree.pdf
See also
B-Tree vs LSM Tree: https://tikv.org/deep-dive/key-value-engine/b-tree-vs-lsm/ LSM Tree to PostgreSQL: https://www.youtube.com/watch?v=llbjTBgTPO4&ab_channel=PGCon SSTable