An Online Variant of EM Algorithm based on the Hierarchical Mixture Model Learning

K. Maebashi, N. Suematsu, and A. Hayashi (Japan)


Machine Learning, Data Mining, EM algorithm, mixture models, large databases


The EM algorithm has been widely used in many learning and statistical tasks. However, because it requires multiple database scans, applying it to large databases is impractical. In this paper we propose a variant of the EM algorithm that requires only one database scan and works within a confined memory space. The algorithm is based on a generalization of the EM algorithm proposed for learning hierarchical mixture models. A notable advantage of our algorithm over existing variants of the EM algorithm for large databases lies in its simplicity. Our algorithm preserves the theoretical clarity of the EM algorithm.
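
The abstract does not give the update rules, so the following is only a minimal sketch of what a single-scan, fixed-memory EM can look like. It is not the authors' hierarchical-mixture-based method: it assumes a generic stepwise (stochastic-approximation) online EM for a one-dimensional Gaussian mixture. The function name `online_em_gmm_1d` and the parameters `step_power` and `min_var` are hypothetical and chosen for illustration.

```python
import math


def online_em_gmm_1d(stream, means, variances, weights,
                     step_power=0.6, min_var=1e-6):
    """Single-pass stepwise EM for a 1-D Gaussian mixture (illustrative only).

    Each record is read once. Memory holds three running sufficient
    statistics per component, so it does not grow with the data size.
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)

    # Running statistics, initialized to agree with the starting parameters:
    # s0 ~ E[r], s1 ~ E[r * x], s2 ~ E[r * x^2], where r is the responsibility.
    s0 = list(weights)
    s1 = [w * m for w, m in zip(weights, means)]
    s2 = [w * (v + m * m) for w, m, v in zip(weights, means, variances)]

    for t, x in enumerate(stream, start=1):
        # E-step on one record: posterior responsibility of each component.
        dens = [
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        total = sum(dens)
        if total <= 0.0:
            continue  # density underflowed for every component; skip record
        resp = [d / total for d in dens]

        # Decaying step size (assumed schedule): blend new record into stats.
        eta = (t + 1) ** (-step_power)
        for j in range(k):
            s0[j] += eta * (resp[j] - s0[j])
            s1[j] += eta * (resp[j] * x - s1[j])
            s2[j] += eta * (resp[j] * x * x - s2[j])

        # M-step: re-estimate parameters from the running statistics.
        for j in range(k):
            if s0[j] > 1e-12:
                weights[j] = s0[j]
                means[j] = s1[j] / s0[j]
                variances[j] = max(s2[j] / s0[j] - means[j] ** 2, min_var)

    return means, variances, weights
```

Passing any iterator of floats as `stream` (for example, a generator that reads rows from a database cursor) keeps the scan to one pass. The step-size schedule and the variance floor are design choices of this sketch, not details taken from the paper.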
