Log Likelihood of LDA in CGS

1 minute read

Log likelihood of Latent Dirichlet Allocation in Collapsed Gibbs Sampling.

We want to calculate the joint probability of the words and the topic assignments, with the multinomial parameters integrated out:

$$p(\mathbf{w}, \mathbf{z} \mid \alpha, \beta) = p(\mathbf{w} \mid \mathbf{z}, \beta)\, p(\mathbf{z} \mid \alpha).$$

Recall (49) and (50) in Minka, T. (2000). *Estimating a Dirichlet distribution*: $\boldsymbol{\alpha}$ is a Dirichlet parameter and $\mathbf{p} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$ is drawn. Then each observation is drawn from a multinomial with probability vector $\mathbf{p}$, and $n_k$ is the number of times the outcome is $k$. Integrating out $\mathbf{p}$ gives the Polya (Dirichlet-multinomial) distribution of the observed sequence:

$$p(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{\Gamma\!\left(\sum_k \alpha_k\right)}{\Gamma\!\left(\sum_k \alpha_k + n\right)} \prod_k \frac{\Gamma(n_k + \alpha_k)}{\Gamma(\alpha_k)}, \qquad n = \sum_k n_k.$$

Applying this to LDA with symmetric priors gives one Polya term per topic and one per document:

$$p(\mathbf{w} \mid \mathbf{z}, \beta) = \prod_{k=1}^{K} \frac{\Gamma(V\beta)}{\Gamma(V\beta + n_k)} \prod_{v=1}^{V} \frac{\Gamma(n_{kv} + \beta)}{\Gamma(\beta)}, \qquad p(\mathbf{z} \mid \alpha) = \prod_{d=1}^{M} \frac{\Gamma(K\alpha)}{\Gamma(K\alpha + n_d)} \prod_{k=1}^{K} \frac{\Gamma(n_{dk} + \alpha)}{\Gamma(\alpha)}.$$

Now we take the log so that we get the log likelihood:

$$\log p(\mathbf{w}, \mathbf{z} \mid \alpha, \beta) = \sum_{k=1}^{K}\Big[\log\Gamma(V\beta) - \log\Gamma(V\beta + n_k) + \sum_{v=1}^{V}\big(\log\Gamma(n_{kv}+\beta) - \log\Gamma(\beta)\big)\Big] + \sum_{d=1}^{M}\Big[\log\Gamma(K\alpha) - \log\Gamma(K\alpha + n_d) + \sum_{k=1}^{K}\big(\log\Gamma(n_{dk}+\alpha) - \log\Gamma(\alpha)\big)\Big].$$

Code in C++:

```cpp
double llik(DATA_STRUCT *data, Parameters *parameters){
  int V = parameters -> V;         // number of unique words
  int M = parameters -> M;         // number of documents
  int K = parameters -> K;         // number of topics
  double alpha = parameters -> alpha;  // symmetric document-topic prior
  double beta  = parameters -> beta;   // symmetric topic-word prior

  // log p(w | z, beta): one Polya term per topic
  double polyaw = 0.0;
  for(int k=0; k<K; k++){
    double nw = parameters -> Nkv.row(k).sum(); // total tokens assigned to topic k
    polyaw += lgamma(V*beta) - lgamma(V*beta + nw);

    for(int v=0; v<V; v++){
      polyaw += lgamma( (parameters -> Nkv(k,v)) + beta ) - lgamma(beta);
    }
  }

  // log p(z | alpha): one Polya term per document
  double polyad = 0.0;
  for(int d=0; d<M; d++){
    double nd = parameters -> Ndk.row(d).sum(); // total tokens in document d
    polyad += lgamma( K*alpha ) - lgamma(K*alpha + nd);

    for(int k=0; k<K; k++){
      polyad += lgamma( (parameters -> Ndk(d,k)) + alpha ) - lgamma(alpha);
    }
  }

  return polyad + polyaw;
}
```

We need to use the Polya distribution because, in the LDA model, the multinomial parameters are integrated out:

$$p(\mathbf{w}, \mathbf{z} \mid \alpha, \beta) = \int p(\mathbf{z} \mid \theta)\, p(\theta \mid \alpha)\, d\theta \int p(\mathbf{w} \mid \mathbf{z}, \phi)\, p(\phi \mid \beta)\, d\phi.$$

More specifically, once the parameters are integrated out, two or more draws are no longer independent of each other.

In terms of language: if we observe a certain topic (or word) in a document, we are more likely to observe the same topic (word) again in that document (Polya's urn).
