Speech Document Retrieval based on Dual Roles of Text Classification

C. Zhong, Z. Miao, L. Du, and J. Zhang (PRC)


Speech document retrieval, text classification, keyword spotting, retrieval model, SVM


Drawing on conventional text classification, this paper applies text classification to speech document retrieval built on lattice-based keyword spotting, and implements a prototype speech document retrieval system. Text classification establishes two content correlations in this system: one between the keywords within a single speech document, and one between documents that share the same topic. These correspond to its dual roles in the speech document retrieval system. First, the former correlation makes it possible to construct a two-level retrieval model, with a keyword level and a topic level, that offers users two retrieval modes. Second, the topic obtained from text classification is used in turn to remove some of the false alarms among the keywords produced by keyword spotting; this is called feedback on keywords from topic. The feedback is discussed in detail, including mutual information, N-best topics, and partial N-best topics. The paper describes the principles of the main techniques and the implementation status of the prototype system, and then presents the experimental results and conclusions.
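
The abstract does not state the formulas used for the feedback on keywords from topic. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) estimated from document counts in a topic-labelled training set, and keeps a spotted keyword only if its PMI with at least one of the N-best topics exceeds a threshold. The function names `pmi_table` and `filter_keywords`, the data layout, and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_table(docs):
    """Estimate PMI(keyword, topic) from labelled documents.

    docs: list of (topic, keywords) pairs, where keywords is an iterable
    of the keywords found in that document (assumed data layout).
    Returns a dict mapping (keyword, topic) to log(P(w, t) / (P(w) P(t))).
    Pairs that never co-occur are simply absent from the dict.
    """
    n = len(docs)
    topic_count = Counter(topic for topic, _ in docs)
    word_count = Counter(w for _, words in docs for w in set(words))
    joint = Counter((w, topic) for topic, words in docs for w in set(words))
    return {
        (w, t): math.log((c * n) / (word_count[w] * topic_count[t]))
        for (w, t), c in joint.items()
    }


def filter_keywords(spotted, n_best_topics, pmi, threshold=0.0):
    """Keep spotted keywords that are associated with any N-best topic.

    A keyword is treated as a likely false alarm (and dropped) when its
    best PMI over the N-best topics does not exceed the threshold.
    The threshold is an illustrative assumption.
    """
    kept = []
    for w in spotted:
        best = max(
            (pmi.get((w, t), float("-inf")) for t in n_best_topics),
            default=float("-inf"),
        )
        if best > threshold:
            kept.append(w)
    return kept
```

With `n_best_topics` holding a single topic this reduces to feedback from the top-ranked topic only; passing several topics loosens the filter, which is one plausible reading of why N-best topics are considered.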
