Task Level Pipelining on Multiple Accelerators via FPGA Switch

Takaaki Miyajima, Takuya Kuhara, Toshihiro Hanawa, Hideharu Amano, and Taisuke Boku


Interconnect for accelerators, GPU cluster, Accelerator computing


We demonstrate task-level pipelining on multiple accelerators using PEACH2. PEACH2, which is implemented on an FPGA, enables ultra-low-latency direct communication among multiple accelerators across computation nodes. Installing PEACH2 tightly couples typical high-performance computation nodes. In this environment, an application can be accelerated by exploiting not only data-level parallelism but also task-level parallelism. Furthermore, multiple tasks can be processed on multiple accelerators in a pipelined manner. In our evaluation, an application implemented in a task-level pipelined manner achieves a 52% speedup compared to execution on a single GPU.
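
To make the idea of task-level pipelining concrete, the following is a minimal sketch of an idealized timing model, not the authors' implementation or measurement method. It assumes each task (stage) runs on its own accelerator, every data item passes through the stages in order, and communication between accelerators costs nothing. The function names and the stage times are hypothetical and are not taken from the paper.

```python
def sequential_time(stage_times, num_items):
    """Time when one accelerator runs every stage for every item in turn."""
    return num_items * sum(stage_times)


def pipelined_time(stage_times, num_items):
    """Idealized time when each stage runs on its own accelerator.

    The first item must pass through every stage (pipeline fill); after
    that, one item completes per interval of the slowest stage.
    Inter-accelerator communication cost is assumed to be zero.
    """
    return sum(stage_times) + (num_items - 1) * max(stage_times)


def speedup(stage_times, num_items):
    """Ratio of sequential time to idealized pipelined time."""
    return sequential_time(stage_times, num_items) / pipelined_time(
        stage_times, num_items
    )


if __name__ == "__main__":
    # Hypothetical stage times in milliseconds, chosen only for illustration.
    stages = [2.0, 3.0]
    print(sequential_time(stages, 4))
    print(pipelined_time(stages, 4))
    print(speedup(stages, 4))
```

In this model the slowest stage bounds the throughput, so balancing the tasks across accelerators and keeping transfer latency low both matter for the achievable speedup.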
