The EM algorithm is a generic framework that can be used to optimize many generative models. It was originally proposed in Maximum likelihood from incomplete data via the EM algorithm, Dempster A. P., Laird N. M., Rubin D. B., Journal of the Royal Statistical Society, B, 39(1):1–38, 11/1977, where the authors also proved its convergence at different levels of generality.
For our purposes, we are going to consider a dataset, X, and a set of latent variables, Z, that we cannot observe. The latent variables can be part of the original model or introduced artificially as a trick to simplify the problem. A generative model parameterized by the vector θ has a log-likelihood equal to the following:
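As a hedged illustration, when the latent variables are discrete, the log-likelihood of the observed data is commonly written by summing Z out of the joint probability: log p(X|θ) = log Σ_Z p(X, Z|θ). This notation is an assumption and may differ from the exact form intended here. The following minimal sketch evaluates that quantity for a hypothetical one-dimensional Gaussian mixture, where the latent variable is the (unobserved) component that produced each sample. The function names, the toy dataset, and the two parameter sets are all illustrative choices, not part of the original text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_likelihood(X, theta):
    """Observed-data log-likelihood of a Gaussian mixture.

    theta is a list of (weight, mean, std) tuples, one per component.
    The latent variable (which component generated each sample) is
    summed out, so only the observed samples X are needed.
    """
    total = 0.0
    for x in X:
        # p(x | theta) = sum over z of p(z | theta) * p(x | z, theta)
        p_x = sum(w * gaussian_pdf(x, m, s) for w, m, s in theta)
        total += math.log(p_x)
    return total


# Hypothetical toy dataset with two visible clusters (illustrative values)
X = [-2.1, -1.9, -2.0, 2.0, 2.2, 1.8]

# Components centred on the clusters versus components far from the data
theta_good = [(0.5, -2.0, 0.5), (0.5, 2.0, 0.5)]
theta_bad = [(0.5, -0.5, 0.5), (0.5, 0.5, 0.5)]

ll_good = log_likelihood(X, theta_good)
ll_bad = log_likelihood(X, theta_bad)
print(ll_good > ll_bad)
```

With these assumed values, the parameter set whose components sit on the two clusters receives the higher log-likelihood, which is the sense in which a larger value indicates a better description of the data.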
Of course, a large log-likelihood implies that the model is able to reproduce the original data distribution with a small error. Therefore, our goal is...