<:> N-gram hierarchy



To obtain n-gram probabilities in a way that exploits the chain rule, work through the sequence, building up an n-gram tree that records the observed frequency of each n-gram.

Once the whole sequence has been processed, convert the observed frequencies into the corresponding conditional probabilities, for example by dividing each node's count by the count of its parent.
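
The two steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names `Node`, `build_ngram_tree` and `conditional_probabilities` are hypothetical, the tree is assumed to be a simple count trie, and the probabilities are assumed to be plain relative frequencies with no smoothing.

```python
from collections import defaultdict


class Node:
    """One node of the n-gram tree: a count plus children keyed by next symbol."""

    def __init__(self):
        self.count = 0
        self.children = defaultdict(Node)


def build_ngram_tree(sequence, n):
    """Work through the sequence, inserting every window of up to n symbols."""
    root = Node()
    for start in range(len(sequence)):
        root.count += 1  # the root counts every starting position
        node = root
        for symbol in sequence[start:start + n]:
            node = node.children[symbol]
            node.count += 1
    return root


def conditional_probabilities(node, context=()):
    """Map (context, symbol) to count(context + symbol) / count(context)."""
    probs = {}
    for symbol, child in node.children.items():
        # Assumption: relative frequency of the child given its parent.
        probs[(context, symbol)] = child.count / node.count
        probs.update(conditional_probabilities(child, context + (symbol,)))
    return probs


tree = build_ngram_tree("abab", n=2)
probs = conditional_probabilities(tree)

# Chain rule: P(a, b) = P(a) * P(b | a)
p_ab = probs[((), "a")] * probs[(("a",), "b")]
```

Because windows near the end of the sequence are truncated, the child counts under a node can sum to less than the node's own count. In this sketch the final "b" has no successor, so the probabilities conditioned on "b" do not sum to one. Whether to pad the sequence or renormalise is a design choice left open here.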