Identification of Homogeneous Blocks in Large Binary Data Sets

F.-X. Jollois and M. Nadif (France)


Block clustering, Classification Maximum Likelihood, Block CEM algorithm, HBCM algorithm, Data Mining.


When the data consist of a large number of attributes taking values over a large number of cases, as in a data mining context, block clustering algorithms can be an interesting approach. By clustering cases and attributes simultaneously, they reveal patterns in the form of homogeneous blocks, which can be viewed as a summary of the data. We recently cast this aim in a mixture model framework and proposed an algorithm under the classification likelihood approach. Unfortunately, this algorithm determines a pair of partitions whose numbers of clusters must be fixed a priori. In this paper, we propose to overcome this difficulty with a hybrid method that combines this algorithm with a new hierarchical block clustering method based on our mixture model. The performance of this method is tested on real and simulated data, with encouraging results.
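
To make the idea of clustering cases and attributes simultaneously concrete, the following is a minimal sketch of a generic alternating block clustering scheme for a binary matrix, with the numbers of row and column clusters fixed in advance. It is not the Block CEM or HBCM algorithm of the paper: the disagreement-count criterion, the function names `block_centers` and `block_cluster_binary`, and the parameters `g`, `m`, `n_iter` and `seed` are all assumptions made for illustration.

```python
import numpy as np

def block_centers(X, z, w, g, m):
    """Majority value (0 or 1) of each (row cluster, column cluster) block."""
    a = np.zeros((g, m), dtype=int)
    for k in range(g):
        for l in range(m):
            block = X[np.ix_(z == k, w == l)]
            if block.size:  # empty blocks keep the default value 0
                a[k, l] = int(block.mean() >= 0.5)
    return a

def block_cluster_binary(X, g, m, n_iter=20, seed=0):
    """Alternate row and column reassignment for fixed g and m (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = rng.integers(g, size=n)  # random initial row partition
    w = rng.integers(m, size=d)  # random initial column partition
    for _ in range(n_iter):
        # Row step: assign each case to the row cluster whose block
        # majority values disagree least with the case's entries.
        a = block_centers(X, z, w, g, m)
        row_cost = np.abs(X[:, None, :] - a[:, w][None, :, :]).sum(axis=2)
        z_new = row_cost.argmin(axis=1)
        # Column step: same idea for attributes, given the new row partition.
        a = block_centers(X, z_new, w, g, m)
        col_cost = np.abs(X.T[:, None, :] - a[z_new, :].T[None, :, :]).sum(axis=2)
        w_new = col_cost.argmin(axis=1)
        if np.array_equal(z_new, z) and np.array_equal(w_new, w):
            break  # both partitions are stable
        z, w = z_new, w_new
    return z, w, block_centers(X, z, w, g, m)
```

The returned `g` by `m` matrix of majority values plays the role of the data summary mentioned above. Because `g` and `m` must be supplied by the caller, the sketch also shows the limitation that the hybrid method is meant to address.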
