Positional indexes
[Term]
doc_id | Freq | Postings |
---|---|---|
1 | 3 | 0, 5, 10 |
1 | 2 | 1, 2 |
Can do proximity queries
Disadvantages
Many positions e.g. SEC filings. -> 2-4x large than a non-positional index.
A positional index size is 35-50% of the volume of the original text.
Holds for all english-like languages.
Takes a longer time to merge, (a Bigram can perform the lookup instantaneously)
Need to look for the positions from the first word, merge all the documents in which both exist. Then, need to merge the 2 lists O(min(m, n))