Introduction of Logic in Language Modelling: The Minimum Perplexity Criterion

D. Bouchaffra


  1. [1] S.M. Katz, Estimation of probabilities from sparse data for the language model component of a speech recognizer, IEEE Trans. on Acoustics, Speech and Signal Processing, 35(3), 1987, 400–401. doi:10.1109/TASSP.1987.1165125
  2. [2] P.F. Brown, V.J. Della Pietra, P.V. deSouza, J.C. Lai, & R.L. Mercer, Class-based n-gram models of natural language, Computational Linguistics, 18(4), 1992, 467–479.
  3. [3] T. Jaakkola, M. Meila, & T. Jebara, Maximum entropy discrimination, in S.A. Solla, T.K. Leen, & K.R. Muller (Eds.), Advances in Neural Information Processing Systems, Vol. 12 (Cambridge, MA: MIT Press, 1999).
  4. [4] P. Clarkson, Adaptation of statistical language models for automatic speech recognition, doctoral diss., Cambridge University, 1999.
  5. [5] R. Rosenfeld, Adaptive statistical language modeling: A maximum entropy approach, doctoral diss., Carnegie-Mellon University, Pittsburgh, 1994.
  6. [6] J. Bellegarda, Exploiting latent semantic information in statistical language modelling, Proc. IEEE, 88(8), 2000, 1279–1296. doi:10.1109/5.880084
  7. [7] Y. Bengio, R. Ducharme, & P. Vincent, A neural probabilistic language model, in T.K. Leen, T. Dietterich, & V. Tresp (Eds.), Advances in Neural Information Processing Systems, Vol. 13 (Cambridge, MA: MIT Press, 2001).
  8. [8] K. Church & W. Gale, Enhanced Good Turing and Cat-Cal: Two new methods for estimating probabilities of English bigrams, in E. Briscoe (Ed.), Computer Speech and Language (Amsterdam: Elsevier, 1990).
  9. [9] I. Dagan, F. Pereira, & L. Lee, Similarity-based estimation of word co-occurrence probabilities, Proc. 32nd Annual Meeting of the Association for Computational Linguistics. New Mexico State University, June 1994, Las Cruces, 272–278.
  10. [10] F. Jelinek, R. Mercer, & S. Roukos, Principles of lexical language modelling for speech recognition, in S. Furui & M. Mohan Sondhi (Eds.), Advances in speech signal processing (New York: Marcel Dekker, 1992), 651–699.
  11. [11] A. Nadas, Estimation of probabilities in the language model of the IBM speech recognition system, IEEE Trans. Acoustic, Speech and Signal Processing, 32, 1981, 819–861.
  12. [12] D. Bouchaffra, V. Govindaraju, & S.N. Srihari, Postprocessing of recognized strings using nonstationary Markovian models, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 21(10), 1999, 990–999. doi:10.1109/34.799906
  13. [13] D. Bouchaffra, E. Koontz, V. Kripasundar, & R.K. Srihari, Integrating signal and language context to improve handwritten phrase recognition: Alternative approaches, in Preliminary Papers of the Sixth International Workshop on Artificial Intelligence and Statistics, Fort Lauderdale, FL, 1997, 47.
  14. [14] D. Bouchaffra, E. Koontz, V. Kripasundar, & R.K. Srihari, Incorporating diverse information sources in handwriting recognition postprocessing, International Journal of Imaging Systems and Technology, 7, 1996, 320–329. doi:10.1002/(SICI)1098-1098(199624)7:4<320::AID-IMA7>3.0.CO;2-A
  15. [15] I.J. Good, The population frequencies of species and the estimation of population parameters, Biometrika, 40, 1953, 237–264. doi:10.2307/2333344
  16. [16] S.F. Chen & R. Rosenfeld, A survey of smoothing techniques for ME models, IEEE Trans. on Speech and Audio Processing, 8(1), 2000, 37–50. doi:10.1109/89.817452
  17. [17] N.J. Nilsson, Probabilistic logic, Journal of Artificial Intelligence, 28(1), 1981, 71–81.
  18. [18] D. Bouchaffra, Theory and algorithms for analysing the consistent region in probabilistic logic, An International Journal of Computers & Mathematics, 25(3), 1993, 13–25.
  19. [19] S. Russell & P. Norvig, Artificial intelligence: A modern approach (New Jersey: Prentice Hall, 1995).
  20. [20] J. Dieudonné, Foundations of modern analysis (New York: Academic Press, 1960).
