> The classification problem that I am working on has 37 input
> variables: 15 of them are categorical and the rest are continuous.
Is the structure you are working with a Naive Bayes classifier
          C
        / | \
      X1  X2  X3
or a form of logistic regression
      X1  X2  X3
        \ | /
          C
where all arcs point down in both figures? (C is a discrete class label,
and the Xi are continuous or discrete features.)
The first model requires learning the prior distribution P(C) and the
conditional distributions P(Xi|C), which can be done using EM.
For example, if Xi is Gaussian, just compute the sample mean and
variance of feature i for each class label, and combine them using the
posterior on C as the mixing weights. For distributions that are not in
the exponential family, you will need an iterative M step.
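To make the M step concrete, here is a rough numpy sketch (my own
illustration, not from the paper; all names are invented) of one EM
iteration for the naive Bayes model when C is hidden and each Xi is
Gaussian:

    import numpy as np

    def em_step(X, prior, mu, var):
        # One EM iteration for naive Bayes with a hidden class C.
        # X: (n, d) data; prior: (k,) = P(C); mu, var: (k, d) mean and
        # variance of each feature Xi within each of the k classes.
        # E step: posterior P(C=c|x) by Bayes' rule, in the log domain.
        log_lik = -0.5 * (np.log(2 * np.pi * var)[None, :, :]
                          + (X[:, None, :] - mu) ** 2 / var).sum(axis=2)
        log_post = np.log(prior) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)  # (n, k) mixing weights
        # M step: sample means/variances of each feature, weighted by
        # the posterior on C.
        Nk = post.sum(axis=0)                    # effective class counts
        prior = Nk / X.shape[0]
        mu = (post.T @ X) / Nk[:, None]
        var = (post.T @ X ** 2) / Nk[:, None] - mu ** 2
        return prior, mu, var

If C is observed in the training set, post becomes a one-hot indicator
and the M step reduces to the per-class sample means and variances
described above, with no iteration needed.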
The second model only requires fitting the softmax function
P(C|X1,...,Xn), which can be done using iteratively reweighted least
squares.
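For the second model, here is a similarly rough sketch (again, names
invented for illustration) of IRLS for the two-class case, i.e. Newton's
method with the logistic sigmoid in place of the full softmax:

    import numpy as np

    def irls_logistic(X, y, n_iter=10):
        # Fit P(C=1|x) = 1/(1 + exp(-w'x)) by iteratively reweighted
        # least squares. X: (n, d), with a column of ones for the bias;
        # y: (n,) with entries in {0, 1}.
        w = np.zeros(X.shape[1])
        for _ in range(n_iter):
            p = 1.0 / (1.0 + np.exp(-X @ w))  # current predictions
            s = p * (1.0 - p)                 # per-example IRLS weights
            H = X.T @ (s[:, None] * X)        # Hessian X'SX
            g = X.T @ (y - p)                 # log-likelihood gradient
            w += np.linalg.solve(H, g)        # Newton / weighted LS step
        return w

Each Newton step solves a weighted least squares problem with weights
p(1-p), which is where the name comes from.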
I recommend the following article for a discussion of these two models:
@techreport{Jordan95,
  title       = "Why the logistic function? {A} tutorial discussion on
                 probabilities and neural networks",
  author      = "M. I. Jordan",
  institution = "MIT Computational Cognitive Science Report",
  number      = 9503,
  month       = "August",
  year        = 1995,
  annote      = "Discusses the relative merits of causal (BN) and
                 diagnostic (NN) models"
}
Kevin