How are documents represented?
They are vectors.
Terms are the axes of the vector space.
Each component of a document vector is the tf-idf weight of the corresponding term.
The vectors are high-dimensional, with one dimension per term in the vocabulary.
A drawback is that this representation cannot capture the order of words in a document, since the axes of a vector space have no inherent ordering.
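
A minimal sketch of this representation, using only the Python standard library. The function name `tfidf_vectors`, the whitespace tokenizer, and the example documents are hypothetical, and the weighting used here (raw term count times log(N / df)) is only one common tf-idf variant, assumed for illustration. It shows one axis per vocabulary term, and that two documents containing the same words in a different order get the same vector.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) for a list of documents.

    Assumes whitespace tokenization and the weighting
    tf * log(N / df); other tf-idf variants exist.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Each distinct term becomes one axis of the vector space.
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term counts within this document
        vectors.append([tf[term] * math.log(n_docs / df[term]) for term in vocab])
    return vocab, vectors


docs = ["dog bites man", "man bites dog", "cat sleeps"]
vocab, vectors = tfidf_vectors(docs)

print(vocab)                     # the axes, one per term
print(vectors[0] == vectors[1])  # same words, different order
```

The first two documents mean different things but map to the same point in the space, because each weight depends only on how often a term occurs and never on where it occurs.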