How does blocking increase compression?
Without blocking, we have to store a term pointer for every dictionary entry in a table:
|Freq||Postings ptr||Term ptr|
Given a dictionary string of 3.2 MB, there are 3.2M positions to address. That needs log2(3.2M) ≈ 22 bits, so each term pointer takes 3 bytes.
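The 3-byte figure can be checked with a small sketch. The helper name `pointer_bytes` is made up for illustration, and it assumes pointers are rounded up to whole bytes:

```python
def pointer_bytes(num_positions: int) -> int:
    """Whole bytes needed to address any position in 0..num_positions-1."""
    # Bits needed to represent the largest position (at least 1 bit).
    bits = max(1, (num_positions - 1).bit_length())
    # Round up to whole bytes.
    return (bits + 7) // 8


# 3.2M positions need 22 bits, which rounds up to 3 bytes.
print(pointer_bytes(3_200_000))
```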
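If we do blocking, we store a term pointer only for the first term of every block of k terms.
The other terms are located through relative offsets encoded in the string itself: each term is preceded by its length, which tells us how far to skip to reach the next term.
Since terms are short (well under 256 characters), each of these lengths fits in 1 byte.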
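A minimal sketch of this layout, assuming a sorted term list, UTF-8 encoded terms, and a 1-byte length in front of every term. The names `build_blocked` and `term_at` are hypothetical, not taken from any library:

```python
def build_blocked(terms, k):
    """Concatenate terms with 1-byte length prefixes; keep one pointer per block of k."""
    data = bytearray()
    block_ptrs = []
    for i, term in enumerate(terms):
        encoded = term.encode("utf-8")
        if len(encoded) > 255:
            raise ValueError("term too long for a 1-byte length")
        if i % k == 0:
            # Only the first term of each block gets a stored pointer.
            block_ptrs.append(len(data))
        data.append(len(encoded))  # 1-byte length acts as the relative offset
        data += encoded
    return bytes(data), block_ptrs


def term_at(data, block_ptrs, k, i):
    """Return the i-th term: jump to its block, then skip i % k terms."""
    pos = block_ptrs[i // k]
    for _ in range(i % k):
        pos += 1 + data[pos]  # skip the length byte plus the term itself
    length = data[pos]
    return data[pos + 1 : pos + 1 + length].decode("utf-8")


terms = ["auto", "automata", "automate", "automatic", "automation"]
data, block_ptrs = build_blocked(terms, k=4)
print(block_ptrs)                       # one pointer per block of 4 terms
print(term_at(data, block_ptrs, 4, 3))  # last term of the first block
```

Only one entry of `block_ptrs` is kept per k terms, and the remaining terms in a block are reached through the 1-byte lengths.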
This allows us to save 9 bytes on 3 pointers,