An Efficient Vectorization of FIR Filter for Vector Processor

Zhong Liu


FIR filter, Vectorization, Vector Processor, YHFT-Matrix


Mapping algorithms onto a vector processor through vectorization is a critical issue. This paper proposes an efficient vectorization of the FIR filter for vector processors. The FIR filter computation is divided into N-step vector multiplications; each vector multiplication is executed in parallel by sixteen vector processing elements, so sixteen outputs are computed at once. Compared with existing methods, this method fully exploits the instruction-level and data-level parallelism of the vector processor, applies to FIR filters with different coefficient lengths, and does not depend on whether the vector processor supports reduction addition. It supports 8-bit and 16-bit fixed-point real and complex data types and 32-bit floating-point real and complex data types. Experimental results show that computing 1024 points with a 50-tap fixed-point real FIR filter on YHFT-Matrix takes only 7.4 µs. The vectorized floating-point complex FIR filter achieves nearly 8x speedup over the sequential algorithm on the TMS320C67x, and the vectorized fixed-point complex FIR filter achieves nearly 16x speedup over the sequential algorithm on the TMS320C64x.
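The scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' YHFT-Matrix implementation: it assumes that N is the number of filter taps, that each step broadcasts one coefficient across sixteen lanes and multiply-accumulates it with a shifted window of the input, and that the input history before the first sample is zero. The names `fir_scalar`, `fir_vectorized`, and `LANES` are hypothetical, and plain Python lists stand in for vector registers.

```python
LANES = 16  # sixteen vector processing elements, as stated in the abstract


def fir_scalar(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n - k], with x[m] = 0 for m < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


def fir_vectorized(x, h, lanes=LANES):
    """Lane-parallel FIR sketch: each block of `lanes` outputs is built in
    len(h) multiply-accumulate steps, one step per coefficient (assumed scheme)."""
    n_taps = len(h)
    # Zero history in front, and zero padding behind so that the number of
    # output positions is a whole number of lane-wide blocks.
    pad_back = (-len(x)) % lanes
    padded = [0] * (n_taps - 1) + list(x) + [0] * pad_back

    y = []
    for base in range(0, len(x) + pad_back, lanes):
        acc = [0] * lanes  # one accumulator per lane, i.e. per output sample
        for k in range(n_taps):
            # Window holding x[n - k] for the outputs n = base .. base + lanes - 1.
            start = base + (n_taps - 1) - k
            window = padded[start:start + lanes]
            # Broadcast h[k] to every lane and accumulate lane-wise.
            acc = [a + h[k] * w for a, w in zip(acc, window)]
        y.extend(acc)
    return y[:len(x)]  # drop outputs that belong only to the back padding
```

Each lane accumulates its own output sample, so no cross-lane reduction addition is required in this sketch, which is consistent with the abstract's remark that the method does not depend on reduction support. For example, `fir_vectorized(list(range(1, 41)), [1, 2, 3])` gives the same result as `fir_scalar` on the same arguments.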
