K-Means VQ Algorithm using a Low-Cost Parallel Cluster Computing

P.S.L. de Souza, A.S. Britto, Jr. (Brazil), R. Sabourin (Canada), S.R.S. de Souza, and D.L. Borges (


Keywords: Vector quantization, K-means algorithm, Parallel computing


It is well known that the time and memory needed to create a codebook from large training databases have hindered the use of vector-quantization-based systems in real applications. To overcome this problem, we present a parallel approach to the K-means Vector Quantization (VQ) algorithm based on the master/slave paradigm and low-cost parallel cluster computing. Distributing the training samples over the slaves' local disks reduces the overhead associated with communication. In addition, models that predict computation and communication time have been developed. These models can be used to predict the optimal number of slaves given the number of training samples and the codebook size. The experiments have shown the efficiency of the proposed models, as well as a linear speedup of the vector quantization process used in a two-stage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings.
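The master/slave decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`slave_partial_sums`, `master_update`, `parallel_kmeans`) are hypothetical, the slaves are simulated sequentially inside one process, and each partition stands in for the samples held on one slave's local disk. The point it shows is that only the codebook and the per-codeword sums and counts cross the master/slave boundary, never the training samples themselves.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]
Partial = Tuple[List[List[float]], List[int]]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)),
    )


def slave_partial_sums(local_samples: Sequence[Vector],
                       codebook: Sequence[Vector]) -> Partial:
    """Hypothetical slave step: accumulate per-codeword sums and counts
    over the samples stored locally on this slave."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x in local_samples:
        k = nearest(codebook, x)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += x[d]
    return sums, counts


def master_update(codebook: Sequence[Vector],
                  partials: Sequence[Partial]) -> List[Vector]:
    """Hypothetical master step: merge the slaves' partial results and
    recompute each centroid."""
    dim = len(codebook[0])
    new_codebook: List[Vector] = []
    for k, old in enumerate(codebook):
        total = sum(counts[k] for _, counts in partials)
        if total == 0:
            # Assumption: a codeword with no assigned samples is kept as-is.
            new_codebook.append(tuple(old))
        else:
            new_codebook.append(tuple(
                sum(sums[k][d] for sums, _ in partials) / total
                for d in range(dim)
            ))
    return new_codebook


def parallel_kmeans(partitions: Sequence[Sequence[Vector]],
                    codebook: Sequence[Vector],
                    iterations: int = 10) -> List[Vector]:
    """Run K-means with one partition per (simulated) slave."""
    current = [tuple(c) for c in codebook]
    for _ in range(iterations):
        # On a real cluster this loop would run concurrently, one slave each.
        partials = [slave_partial_sums(p, current) for p in partitions]
        updated = master_update(current, partials)
        if updated == current:  # codebook stopped changing
            break
        current = updated
    return current


partitions = [
    [(0.0, 0.0), (0.0, 2.0)],      # samples on slave 1
    [(10.0, 10.0), (10.0, 12.0)],  # samples on slave 2
]
codebook = parallel_kmeans(partitions, [(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the per-iteration traffic depends on the codebook size and the number of slaves rather than on the number of training samples, which is consistent with the abstract's remark that keeping samples on the slaves' local disks reduces communication overhead.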
