PARALLEL IMPLEMENTATIONS OF 1-D FAST FOURIER TRANSFORM WITHOUT INTERPROCESSOR COMMUNICATION

R.A. Namneh, W.D. Pan, and S.-M. Yoo

References

  1. [1] J.W. Cooley & J.W. Tukey, An algorithm for the machine computation of the complex Fourier series, Math. Computation, 19, 1965, 297–301. doi:10.2307/2003354
  2. [2] M. Frigo & S.G. Johnson, The design and implementation of FFTW3, Proc. IEEE, 93 (2), 2005, 216–231. doi:10.1109/JPROC.2004.840301
  3. [3] M. Frigo, A fast Fourier transform compiler, Proc. 1999 ACM SIGPLAN Conf. on Programming Language Design and Implementation, Atlanta, GA, 1999, 180–169.
  4. [4] M. Frigo & S.G. Johnson, The fastest Fourier Transform in the west, Technical report MIT-LCS-TR-728, Cambridge, MA, September 1997.
  5. [5] A. Gupta & V. Kumar, The scalability of FFT on parallel computers, IEEE Trans. on Parallel and Distributed Systems, 4 (8), 1993, 922–932. doi:10.1109/71.238626
  6. [6] Z. Cvetanovic, Performance analysis of the FFT algorithm on shared memory parallel architecture, IBM Journal of Research and Development, 31 (4), 1987, 435–451.
  7. [7] P.N. Swarztrauber, Multiprocessors FFTs, Parallel Computing, 5 (1–2), 1987, 197–210. doi:10.1016/0167-8191(87)90018-4
  8. [8] P.N. Swarztrauber & S.W. Hammond, A comparison of optimal FFTs on torus and hypercube multi-computers, Parallel Computing, 27 (6), 2001, 847–859. doi:10.1016/S0167-8191(00)00107-1
  9. [9] D. Takahashi, A parallel 1-D FFT algorithm for the Hitachi SR8000, Parallel Computing, 29 (6), 2003, 679–690.
  10. [10] D.H. Bailey, FFTs in external or hierarchical memory, Journal of Supercomputing, 4 (1), 1990, 23–35. doi:10.1007/BF00162341
  11. [11] S.L. Johnsson, M. Jacquemin, & R.L. Krawitz, Communication efficient multi–processor FFT, Journal of Computational Physics, 102 (2), 1992, 381–397. doi:10.1016/0021-9991(92)90380-H
  12. [12] C. Calvin, Implementation of parallel FFT algorithms on distributed memory machines with a minimum overhead of communication, Parallel Computing, 22 (9), 1996, 1255–1279. doi:10.1016/S0167-8191(96)00039-7
  13. [13] I. Gertner & M. Rofheart, A Parallel algorithm for 2-D DFT computation with no inter-processor communication, IEEE Trans on Parallel and Distributed Systems, 1 (3), 1990, 377–382. doi:10.1109/71.80164
  14. [14] I. Gertner, A New efficient algorithm to compute the two-dimensional discrete Fourier Transform, IEEE Trans. on Acoustics, Speech and Signal Processing, 36 (7), 1988, 1036–1050. doi:10.1109/29.1627
  15. [15] C. Lu, M. An, Z. Qian, & R. Tolimieri, A hybrid parallel M-D FFT algorithm without interprocessor communication, Proc. IEEE Int. Conf. on ASSP, Minneapolis, MN, 1993, 281–284.
  16. [16] G. Kechriotis, M. An, M. Bietsas, R. Tolimieri, & E.S. Manolakos, A new approach for computing multidimensional DFT’s on parallel machines and its implementation on the iPSC/860 hypercube, IEEE Trans. on Signal Processing, 43 (1), 1995, 272–285. doi:10.1109/78.365307
  17. [17] H. Kwan, R.L. Nelson, E.J. Powers, & E.E. Swartzlander, Three-dimensional FFTs on a digital-signal parallel processor, with no interprocessor communication, Proc. 30th Asilomar Conf. on Signals, Systems, and Computers, Pacific Grove, CA,1996, 440–444.
  18. [18] S. Winograd, On computing the discrete-Fourier transform, Math. Computation, 32, 1978, 175–199. doi:10.2307/2006266
  19. [19] V. Milutinovic, J. Fortes, & L. Jamieson, A multiprocessor architecture for real-time computation of a class of DFT algorithms, IEEE Trans. on Acoustics, Speech, and Signal Processing, 34 (5), 1986, 1301–1309. doi:10.1109/TASSP.1986.1164924
  20. [20] F. Marino & E. Swartzlander, Parallel implementation of multidimensional transforms without interprocessor communication, IEEE Trans. on Computers, 48 (8), 1999, 951–961. doi:10.1109/12.795223
  21. [21] F. Marino, V. Piuri, & E. Swartzlander, A parallel implementation of the 2-D discrete wavelet transform without interprocessor communications, IEEE Trans. on Signal Processing, 47 (11), 1999, 3179–3183. doi:10.1109/78.796458
  22. [22] H. Shan, J.P. Singh, L. Oliker, & R. Biswas, Message passing vs. shared address space on a cluster of SMPs, Proc. 15th Int. Parallel and Distributed Processing Symp., San Francisco, CA, April 2001, 1–8.
  23. [23] W.H. Press, S.A. Teukolsky, W.T. Vetterling, & B.P. Flannery, Numerical recipes in C book on-line, 2nd ed., http://www.library.cornell.edu/nr/, 504–508.
  24. [24] S.L. Johnsson & C.-T. Ho, Optimal all-to-all personalized communication with minimum span on boolean cubes, Proc. Distributed Memory Computing Conf., Portland, OR, April–May 1991, 299–304.
  25. [25] D.S. Scott, Efficient all-to-all communication patterns in hypercube and mesh topologies, Proc. Distributed Memory Computing Conf., Portland, OR, April–May 1991, 398–403.

Important Links:

Go Back