A Highly Parallel GPU-based Hash Accelerator for a Data Deduplication System

X. Li and D.J. Lilja (USA)


GPU Computing, Deduplication system, Hash computing, CUDA


Data storage systems with data deduplication have recently been introduced as a way to reduce storage space by eliminating redundant data. In a deduplication storage system, a collision-resistant fingerprint of each data segment must be calculated using a hash algorithm. This paper presents a GPU-based accelerator, called g-Dedu, that performs the hash computation for a deduplication system. The g-Dedu algorithm is designed specifically to handle the small, variable-size data segments used in deduplication, which a GPU cannot process efficiently in a straightforward way. Our data organization approach uses a hierarchical data structure to organize the data to be processed, and a scheduler manages these data for optimal GPU processing. Our patterned data segment approach overcomes noticeable performance drops caused by the GPU memory model. Furthermore, unlike some previous GPU hash accelerators, our approach strictly follows the hash processing standard. With this approach, g-Dedu achieves a 6 times speedup on the SHA-1 computation and a 7.4 times speedup on the SHA-2 computation compared with a CPU-based implementation.
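
As a rough illustration of the fingerprinting step described in the abstract, the following CPU-only Python sketch computes a SHA-1 fingerprint per segment, stores each unique segment once, and groups segments by the number of 64-byte blocks SHA-1 would process after standard padding. It is not the g-Dedu implementation: the function names (`sha1_block_count`, `group_by_block_count`, `deduplicate`) are hypothetical, and grouping by block count is only an assumed stand-in for the kind of batching a GPU scheduler might want so that threads in one batch do similar amounts of work.

```python
import hashlib
from collections import defaultdict


def sha1_block_count(length: int) -> int:
    """Number of 64-byte blocks SHA-1 processes for a message of `length` bytes.

    Standard padding appends one 0x80 byte, zero bytes, and an 8-byte length
    field, so the padded size is the smallest multiple of 64 >= length + 9.
    """
    return (length + 8) // 64 + 1


def group_by_block_count(segments):
    """Group segment indices by padded block count.

    Assumption for illustration only: segments in the same group need the
    same number of compression rounds, so they could share one batch.
    """
    groups = defaultdict(list)
    for index, segment in enumerate(segments):
        groups[sha1_block_count(len(segment))].append(index)
    return dict(groups)


def deduplicate(segments):
    """Fingerprint every segment and keep one copy per unique fingerprint.

    Returns (store, recipe): `store` maps fingerprint -> segment bytes, and
    `recipe` lists the fingerprints in original order so the input stream
    can be rebuilt from the store.
    """
    store = {}
    recipe = []
    for segment in segments:
        fingerprint = hashlib.sha1(segment).hexdigest()
        store.setdefault(fingerprint, segment)  # keep the first copy only
        recipe.append(fingerprint)
    return store, recipe
```

Calling `deduplicate` on a list of `bytes` segments gives a store with one entry per distinct segment, and `b"".join(store[fp] for fp in recipe)` rebuilds the original stream. The hashing here uses the standard library's `hashlib.sha1`, which follows the SHA-1 standard, so the fingerprints are what any compliant accelerator would also have to produce.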
