Evaluating the likelihood of Boltzmann machines and other undirected graphical models is known to be intractable in general. Fortunately, contrastive divergence (CD) learning has allowed these models to be applied successfully in several domains. However, recent results suggest that CD is not always an accurate surrogate for maximum likelihood in practice.
We will present new alternatives to CD for learning the parameters of a Boltzmann machine. We will also discuss how best to compare the resulting models, given that their likelihood cannot be evaluated directly.
