# Zipf’s law

$$cf_i \propto \frac{1}{i}, \quad \text{i.e.} \quad cf_i = \frac{K}{i}$$

where

$$cf_i$$ is the collection frequency, the number of occurrences of the term $$t_i$$ in the collection; $$t_i$$ is the $$i$$-th most frequent term, and $$K$$ is a constant.

Zipf's law describes how often terms occur relative to one another in a collection; it says nothing about the size of the vocabulary.
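Taking logarithms of the relation above gives an equivalent form that is often easier to check against data. This is only a restatement of $$cf_i = K/i$$, with no new assumptions: on a log-log plot of frequency against rank, the law predicts a straight line with slope $$-1$$.

$$\log cf_i = \log K - \log i$$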

Implications

The most frequent term occurs $$cf_1$$ times, so $$K = cf_1$$.

The second most frequent term occurs $$cf_1/2$$ times.
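The same pattern continues down the ranking: the term at rank $$i$$ is predicted to occur $$cf_1/i$$ times. A minimal Python sketch of this, comparing observed ranked frequencies with the Zipf prediction, might look as follows. The function names `zipf_predicted` and `ranked_frequencies` and the toy token list are assumptions made for illustration, not part of the original notes.

```python
from collections import Counter


def zipf_predicted(cf_1, max_rank):
    """Predicted collection frequency K / i for ranks 1..max_rank, with K = cf_1."""
    return [cf_1 / rank for rank in range(1, max_rank + 1)]


def ranked_frequencies(tokens):
    """Observed collection frequencies, sorted from most to least frequent."""
    return [count for _, count in Counter(tokens).most_common()]


# Toy collection, made up for illustration; real collections are far larger.
tokens = "the cat and the dog and the bird".split()

observed = ranked_frequencies(tokens)
predicted = zipf_predicted(observed[0], len(observed))

# Print observed and predicted frequencies side by side, rank by rank.
for rank, (obs, pred) in enumerate(zip(observed, predicted), start=1):
    print(f"rank {rank}: observed {obs}, predicted {pred:.2f}")
```

On a collection this small the fit is only rough; the law is an empirical approximation that is usually examined on large text collections.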