Parallelizing Neural Network Training for Cluster Systems

G. Dahl, A. McAvinney, and T. Newhall (USA)


Keywords: Parallel Neural Network Training, Cluster


We present a technique for parallelizing the training of neural networks on a cluster of workstations. To benefit from parallelization on a cluster, a solution must account for the higher network latencies and lower bandwidths of clusters compared to custom parallel architectures. Approaches that may work well on special-purpose parallel hardware, such as distributing the neurons of the neural network across processors, are unlikely to work well on cluster systems because the communication cost of processing a single training pattern is prohibitive. Our solution, Pattern Parallel Training, duplicates the full neural network at each cluster node. In each epoch, each cooperating process in the cluster trains its copy of the neural network on a subset of the training set. We demonstrate the effectiveness of our approach by implementing and testing an MPI version of Pattern Parallel Training on the eight-bit parity problem. Our results show a significant speed-up in training time compared to sequential training. In addition, we analyze the communication costs of our technique and discuss which types of common neural network problems would benefit most from our approach.
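
The idea of Pattern Parallel Training can be illustrated with a minimal single-process sketch in Python. It is not the MPI implementation described above. It assumes a small sigmoid network with one hidden layer, and it assumes that the per-process gradients are summed once per epoch before a single weight update; the abstract does not state how the copies are combined. All function names are hypothetical. In an MPI version, each loop iteration over `shares` would run in its own process, and the summation would be a collective reduction such as `MPI_Allreduce`.

```python
import itertools
import math
import random


def make_parity_patterns(n_bits=8):
    """All 2**n_bits inputs paired with their parity (1 if the number of ones is odd)."""
    return [(bits, sum(bits) % 2) for bits in itertools.product((0, 1), repeat=n_bits)]


def init_weights(n_in, n_hidden, seed=0):
    """Small random weights; the last entry of each row is the bias weight."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gradient(weights, patterns):
    """Squared-error gradient summed over one worker's share of the patterns."""
    w_hidden, w_out = weights
    g_hidden = [[0.0] * len(row) for row in w_hidden]
    g_out = [0.0] * len(w_out)
    for bits, target in patterns:
        x = list(bits) + [1.0]  # append bias input
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden] + [1.0]
        y = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
        d_out = (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            g_out[j] += d_out * hj
        for j, row in enumerate(g_hidden):
            d_h = d_out * w_out[j] * h[j] * (1.0 - h[j])
            for i, xi in enumerate(x):
                row[i] += d_h * xi
    return g_hidden, g_out


def add_gradients(a, b):
    """Element-wise sum; stands in for a sum reduction across processes."""
    hidden = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    out = [p + q for p, q in zip(a[1], b[1])]
    return hidden, out


def pattern_parallel_epoch(weights, patterns, n_workers, lr=0.1):
    """One epoch: every worker holds the full network and sees a disjoint pattern subset."""
    shares = [patterns[rank::n_workers] for rank in range(n_workers)]
    total = None
    for share in shares:  # each iteration would be a separate process on a cluster
        g = gradient(weights, share)
        total = g if total is None else add_gradients(total, g)
    w_hidden, w_out = weights
    new_hidden = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w_hidden, total[0])]
    new_out = [w - lr * g for w, g in zip(w_out, total[1])]
    return new_hidden, new_out
```

Under these assumptions, one epoch with several workers yields the same weights as one sequential batch epoch, up to floating-point rounding. Each process computes only its share of the patterns, and the only communication is one exchange of gradients per epoch rather than per training pattern.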

