Environmental Sound Recognition by the Instantaneous Spectrum Combined with the Time Pattern of Power

Y. Toyoda, J. Huang, S. Ding, and Y. Liu (Japan)


Environmental sound recognition; Combination of spectrum and power pattern; Robotic audition


Environmental sound recognition is an important function of robotic audition. Although HMM- or TDNN-based methods can also be used for environmental sound recognition, unlike in speech recognition it is not possible to create a perfect database covering all kinds of environmental sounds. Environmental sound recognition therefore depends more on the task of the robot or computer system. From this point of view, methods for environmental sound recognition must also be task dependent and should be evaluated by their accuracy, speed, and simplicity. In this research, we tried a multi-layered perceptron type neural network (NN) system for environmental sound recognition. The input data is the one-dimensional combination of the instantaneous spectrum at the power peak and the power pattern in the time domain. Since the spectrum of most environmental sounds changes less markedly over time than that of speech or voice, the combination of power and frequency patterns preserves the major features of environmental sounds while drastically reducing the amount of data. Two experiments were conducted, using an original database and a database created by the RWCP. The recognition rate for about 45 kinds of environmental sound was about 90%. The new method is fast and simple, and is suitable for an on-board system of a robot for home use, e.g. a security monitoring robot or a home helper robot.
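
As a rough illustration of the input representation described in the abstract, the following sketch builds a one-dimensional feature vector from the spectrum of the frame with the highest short-time power, followed by a coarse power envelope over time. The abstract does not give the details, so everything specific here is an assumption made for illustration: the function name `extract_features`, the frame length, the hop size, the Hann window, the envelope length `n_power`, and the peak normalization. It is not the authors' implementation.

```python
import numpy as np

def extract_features(signal, frame_len=256, hop=128, n_power=16):
    """Concatenate the spectrum at the power peak with a power envelope.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames of equal length.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Short-time power of each frame (mean squared amplitude).
    power = np.mean(frames ** 2, axis=1)
    # Index of the frame where the power peaks.
    peak = int(np.argmax(power))
    # "Instantaneous" spectrum: magnitude spectrum of the peak frame
    # (Hann-windowed here, which is an assumption).
    spectrum = np.abs(np.fft.rfft(frames[peak] * np.hanning(frame_len)))
    spectrum /= spectrum.max() + 1e-12
    # Time pattern of power: resample the envelope to a fixed length
    # so that every sound yields a vector of the same size.
    idx = np.linspace(0, n_frames - 1, n_power)
    envelope = np.interp(idx, np.arange(n_frames), power)
    envelope /= envelope.max() + 1e-12
    # One-dimensional combination fed to the classifier.
    return np.concatenate([spectrum, envelope])
```

With these assumed settings, each sound is reduced to 129 spectral values plus 16 power values, which could then serve as the input layer of a multi-layered perceptron. The small, fixed size of this vector is what makes the approach attractive for an on-board system.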

