Distinguishing between Overlapping Components in Mixture Models

H. Sun and S. Wang (Canada)


Keywords: Mixture model, Ridge curve, Overlap rate, Cluster analysis, Theory and Foundations.


The mixture of Gaussians is a fundamental data distribution model for many clustering algorithms. The ability of an algorithm to distinguish overlapping clusters is one of the major criteria for evaluating its efficiency. However, the phenomenon of component overlapping is still not well understood, especially in multivariate cases. In this paper, we introduce the concept of the ridge curve and establish a theory to measure the rate of overlap between two components. We investigate factors that affect the value of the overlap rate, and show how the theory can be used to generate truthed data as well as to measure the overlap rate of a given data set.
