
Perplexity is commonly used to evaluate language models.

Definition and Calculation

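First, we need to know the probability $p(w_n)$ that the model assigns to each observed data point $w_n$.

Perplexity is the geometric mean of the number of choices:

$$\mathrm{perplexity} = \sqrt[N]{\prod_{n=1}^{N} \frac{1}{p(w_n)}} = \exp\left(-\frac{1}{N}\sum_{n=1}^{N} \log p(w_n)\right)$$

$N$ is the number of data points, and $\frac{1}{p(w_n)}$ is the number of choices for the $n$-th data point.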
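As a minimal sketch of the geometric-mean definition, the following Python computes perplexity from a list of per-item probabilities. The function name `perplexity` and the sample probabilities are illustrative assumptions, not part of any particular library.

```python
import math

def perplexity(probs):
    """Geometric mean of 1/p over the data, computed in log space."""
    # Hypothetical helper: probs holds the model probability of each observed item.
    n = len(probs)
    log_likelihood = sum(math.log(p) for p in probs)
    return math.exp(-log_likelihood / n)

# A model that always chooses among 4 equally likely options.
uniform = perplexity([0.25] * 10)

# Two items with 2 and 8 effective choices; the geometric mean of 2 and 8.
mixed = perplexity([0.5, 0.125])

print(uniform, mixed)
```

Working in log space avoids the underflow that multiplying many small probabilities directly would cause.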

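If the log of the perplexity becomes negative (that is, the perplexity falls below 1), you might need to take normalization constants into account: perplexity computed from properly normalized probabilities is never below 1. If you calculate the perplexity right after you initialize the model (that is, with randomly filled parameters), it can be greater than the number of unique words in the corpus, because a uniform distribution over the vocabulary already achieves a perplexity equal to the vocabulary size and a random model can do worse.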
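Both sanity checks can be reproduced with a toy example. This is only a sketch under stated assumptions: the vocabulary size, the corpus in which every word appears equally often, and the random weights are all made up for illustration.

```python
import math
import random

def perplexity(probs):
    # Hypothetical helper: exp of the negative mean log-probability.
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

vocab_size = 50
corpus = list(range(vocab_size)) * 4  # toy corpus: each word id appears 4 times

# Uniform model: perplexity equals the vocabulary size.
uniform_ppl = perplexity([1.0 / vocab_size for _ in corpus])

# "Randomly initialized" model: random positive weights, normalized to sum to 1.
rng = random.Random(0)
weights = [rng.random() + 1e-3 for _ in range(vocab_size)]
total = sum(weights)
random_model = [w / total for w in weights]
random_ppl = perplexity([random_model[w] for w in corpus])

# Unnormalized scores larger than 1 push the perplexity below 1,
# which shows up as a negative log-perplexity.
unnormalized_ppl = perplexity([3.0 for _ in corpus])

print(uniform_ppl, random_ppl, unnormalized_ppl)
```

On this balanced corpus any non-uniform model scores worse than the uniform one, so the random model's perplexity exceeds the vocabulary size.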


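Ideally, we want to know $p(\mathbf{w})$, but we need to consider the complete-data log-likelihood $\log p(\mathbf{w}, \mathbf{z})$, which depends on the latent variables $\mathbf{z}$. So, we take

$$\frac{1}{S}\sum_{s=1}^{S} \exp\left(-\frac{1}{N} \log p(\mathbf{w}, \mathbf{z}^{(s)})\right)$$

where $N$ is the number of data points. $S$ is the number of simulations (samples) kept after a sufficient number of iterations, and $\mathbf{z}^{(s)}$ is the value of the latent variables in simulation $s$. Taking the mean of the perplexity over these simulations approximates an average over all possible $\mathbf{z}$.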
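Averaging the perplexity over simulations might look like the sketch below. It assumes each retained simulation yields one complete-data log-likelihood; the function name `mean_perplexity` and the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def mean_perplexity(log_likelihoods, n_words):
    """Average the per-simulation perplexities exp(-loglik / N)."""
    perplexities = [math.exp(-ll / n_words) for ll in log_likelihoods]
    return sum(perplexities) / len(perplexities)

# Made-up log-likelihoods from three retained simulations of a
# 100-word corpus, corresponding to perplexities of 10, 20 and 30.
n_words = 100
sample_logliks = [-n_words * math.log(v) for v in (10, 20, 30)]

avg = mean_perplexity(sample_logliks, n_words)
print(avg)
```

In practice the simulations would be taken only after burn-in, so that early, unconverged states do not dominate the mean.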

Test Perplexity

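The probability of a test word is a weighted average of the topic-word probabilities, weighted by the trained parameters. In the following example, we consider three topics.

$$p(w \mid d) = \theta_{d1}\phi_{1w} + \theta_{d2}\phi_{2w} + \theta_{d3}\phi_{3w}$$

Here $\theta_{dk}$ is the proportion of topic $k$ in document $d$, and $\phi_{kw}$ is the probability of word $w$ under topic $k$. We sum up $\log p(w \mid d)$ over all words in the test set.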
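The three-topic weighted average can be sketched in Python as follows. The parameter values, the four-word vocabulary, and the names `theta`, `phi`, and `word_prob` are assumptions made for illustration, not output from a real trained model.

```python
import math

# Hypothetical trained parameters: 3 topics, 4-word vocabulary.
theta = [0.5, 0.3, 0.2]              # topic proportions for one test document
phi = [
    [0.70, 0.10, 0.10, 0.10],        # word distribution of topic 1
    [0.10, 0.70, 0.10, 0.10],        # word distribution of topic 2
    [0.25, 0.25, 0.25, 0.25],        # word distribution of topic 3
]

def word_prob(word, theta, phi):
    """Weighted average of the topic-word probabilities for one word id."""
    return sum(t * topic[word] for t, topic in zip(theta, phi))

test_doc = [0, 0, 1, 3]              # word ids of the held-out document

# Sum the log-probabilities over all test words, then exponentiate.
log_lik = sum(math.log(word_prob(w, theta, phi)) for w in test_doc)
test_ppl = math.exp(-log_lik / len(test_doc))
print(test_ppl)
```

Because each row of `phi` and the vector `theta` sum to 1, the mixture `word_prob` is itself a normalized distribution over the vocabulary, which keeps the resulting perplexity at or above 1.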