Enhancing the Quality of Audio Transformations using the Multi-Scale Short-Time Fourier Transform

N. Juillerat (Switzerland), S.M. Arisona (USA), and S. Schubiger-Banz (Switzerland)


Keywords: Adaptive Tiling, Audio Effect, Audio Transformation, STFT, Time-frequency, Transient.


This paper presents a new adaptive tiling technique of the time-frequency plane that is suitable for a wide range of audio transformations. The proposed algorithm separates components of the audio signal into different categories according to their degree of transience. Each category is then processed with an adequate time-frequency resolution: higher time resolution is used for transient components and higher frequency resolution for steady sounds. The algorithm allows the audio signal to be modified and synthesized back with minimal interference between the different components. An implementation of the proposed approach is presented and compared with other related approaches. The signal representation used by the presented algorithm is similar to that of a multi-channel short-time Fourier transform. Therefore, existing audio transformations based on the short-time Fourier transform can be adapted to the proposed approach with minimal modification, and automatically benefit from significantly improved quality: transient smearing artifacts, for instance, are mitigated without sacrificing quality on steady sounds. This includes a wide range of audio transformations such as pitch shifting, time stretching, chorusing, harmonizing, noise reduction, whisperization and various other audio effects.
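The general idea of processing transient and steady components at different resolutions can be illustrated with a minimal sketch. This is not the algorithm of the paper: the separation rule used here (a crude frame-to-frame magnitude rise used as a soft mask), the window sizes, and the function names `stft`, `istft`, `split_by_transience` and `multi_scale_process` are all assumptions chosen for illustration. The sketch only shows that a signal can be split additively, that each part can be analyzed with its own STFT window length, and that an unmodified spectrum resynthesizes the input.

```python
import numpy as np


def stft(x, n_fft, hop):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    w = np.hanning(n_fft + 1)[:-1]          # periodic Hann window
    xp = np.pad(x, (n_fft, n_fft))          # zero-pad so the edges are fully covered
    n_frames = 1 + (len(xp) - n_fft) // hop
    frames = np.stack([xp[i * hop:i * hop + n_fft] * w for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft, hop, length):
    """Weighted overlap-add resynthesis matching stft() above."""
    w = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * w
    out = np.zeros(n_fft + hop * (spec.shape[0] - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += w ** 2
    out /= np.maximum(norm, 1e-12)          # undo the squared-window weighting
    return out[n_fft:n_fft + length]        # drop the padding added in stft()


def split_by_transience(x, n_fft=1024):
    """Illustrative split (an assumption, not the paper's method):
    bins whose magnitude rises sharply between frames count as transient."""
    hop = n_fft // 4
    spec = stft(x, n_fft, hop)
    mag = np.abs(spec)
    prev = np.vstack([mag[:1], mag[:-1]])   # magnitude of the previous frame
    mask = np.maximum(mag - prev, 0.0) / (mag + 1e-12)  # soft mask in [0, 1]
    transient = istft(spec * mask, n_fft, hop, len(x))
    steady = x - transient                  # additive split by construction
    return steady, transient


def multi_scale_process(x, transform=lambda spec: spec,
                        steady_fft=4096, transient_fft=256):
    """Apply the same STFT-domain transform to each component at its own scale:
    long windows for the steady part, short windows for the transient part."""
    steady, transient = split_by_transience(x)
    out = np.zeros(len(x))
    for part, n_fft in ((steady, steady_fft), (transient, transient_fft)):
        hop = n_fft // 4
        out += istft(transform(stft(part, n_fft, hop)), n_fft, hop, len(x))
    return out
```

In this sketch, `transform` stands for any existing STFT-based effect that maps a spectrogram to a spectrogram of the same shape. With the default identity transform, `multi_scale_process(x)` returns the input up to floating-point error, which is a useful sanity check before plugging in an actual effect.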
