Independent Component Analysis and Naive Bayes Classification

M. Bressan and J. Vitrià (Spain)

Keywords

Bayesian Learning, Naive Bayes, Independent Component Analysis, Feature Subset Selection.

Abstract

The Naive Bayes classifier rests on the not always realistic assumption that class-conditional distributions factorize into the product of their marginal densities. Independent Component Analysis (ICA) seeks a transformation that minimizes the statistical dependence among the components of the transformed data. One of the most common ways of estimating the ICA representation of a given random vector is to minimize the Kullback-Leibler divergence between the joint density and the product of the marginal densities, i.e., the mutual information. It follows that ICA provides a representation in which the independence assumption holds on stronger grounds. In this paper we propose class-conditional ICA as a method that provides an adequate representation when Naive Bayes is the classifier of choice. We also explore the consequences of this choice for the problem of feature subset selection for classification. Experiments comparing the performance of Naive Bayes over different representations are carried out on the UCI Letter database and on the MNIST handwritten digit database.
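
To make the proposed pipeline concrete, the following is a minimal sketch, not the authors' implementation: one ICA model is fitted per class, Naive Bayes then sums log marginal densities in each class's ICA basis, and the term log|det W_c| accounts for the change of variables z -> s = W_c(z - m_c). The digits dataset, the shared 16-component PCA (which keeps each class-conditional unmixing matrix square and invertible), and the Laplace marginals are illustrative assumptions; the paper's exact marginal estimator and datasets differ. scikit-learn >= 1.1 is assumed for the whiten="unit-variance" option of FastICA.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import train_test_split

# Shared PCA so that each class-conditional ICA is a square, invertible transform.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)
pca = PCA(n_components=16).fit(X_tr)
Z_tr, Z_te = pca.transform(X_tr), pca.transform(X_te)

classes = np.unique(y_tr)
models = {}
for c in classes:
    Zc = Z_tr[y_tr == c]
    ica = FastICA(n_components=Zc.shape[1], whiten="unit-variance",
                  max_iter=1000, random_state=0).fit(Zc)
    S = ica.transform(Zc)                       # sources s = W_c (z - m_c)
    mu = np.median(S, axis=0)                   # Laplace location per component
    b = np.abs(S - mu).mean(axis=0) + 1e-9      # Laplace scale per component
    # log|det W_c| corrects the density for the change of variables z -> s.
    _, logdet = np.linalg.slogdet(ica.components_)
    models[c] = (ica, mu, b, logdet, np.log(len(Zc) / len(Z_tr)))

def log_scores(Z):
    rows = []
    for c in classes:
        ica, mu, b, logdet, log_prior = models[c]
        S = ica.transform(Z)
        # Naive Bayes in the class's ICA basis: sum of log Laplace marginals.
        ll = (-np.abs(S - mu) / b - np.log(2.0 * b)).sum(axis=1)
        rows.append(ll + logdet + log_prior)
    return np.vstack(rows)

pred = classes[np.argmax(log_scores(Z_te), axis=0)]
print("test accuracy:", (pred == y_te).mean())

Laplace marginals are chosen here because FastICA typically recovers super-Gaussian sources, for which a Gaussian fit would discard most of the benefit of the ICA rotation; kernel density or mixture estimates of the marginals can be substituted per component without changing the rest of the pipeline.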
