Re: [UAI] mixed variables

From: Kevin Murphy (murphyk@cs.berkeley.edu)
Date: Sat Apr 08 2000 - 08:28:06 PDT

    >The classification problem that I am working on has 37 input
    > variables, 15 of them are categorical and the rest of them
    > are continuous.

    Is the structure you are working with a Naive Bayes classifier

       C
     / | \
    X1 X2 X3

    or a form of logistic regression

    X1 X2 X3
     \ | /
       C

    where all arcs point down in both figures? (C is a discrete class label,
    and Xi are continuous or discrete features.)

    The first model requires learning the prior distribution P(C) and the
    conditional distributions P(Xi|C). With fully labeled data the
    sufficient statistics can be computed directly; with hidden or
    partially observed class labels, you would use EM.
    e.g., if Xi is Gaussian, just compute the sample mean and variance of
    feature i for each class label, weighting each data point by the
    posterior on C (the mixing weights). For distributions which are not
    in the exponential family, you will need an iterative M step.
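
    For concreteness, here is a minimal sketch of the fully observed case,
    in Python with numpy (the function and variable names are my own
    illustrations, not from any particular library). It estimates P(C) by
    counting and P(Xi|C) by per-class sample statistics, with Gaussians
    for the continuous features and smoothed multinomial tables for the
    (integer-coded) categorical ones; with hidden class labels you would
    wrap these same updates in an EM loop, weighting each data point by
    its posterior on C:

    import numpy as np

    def fit_naive_bayes(X_cont, X_cat, y, n_classes):
        # P(C): class prior from the label counts
        prior = np.bincount(y, minlength=n_classes) / len(y)
        # Gaussian P(Xi|C): per-class sample mean and variance
        means = np.stack([X_cont[y == c].mean(axis=0)
                          for c in range(n_classes)])
        variances = np.stack([X_cont[y == c].var(axis=0)
                              for c in range(n_classes)]) + 1e-6
        # Multinomial P(Xi|C) for integer-coded categorical features,
        # with add-one smoothing so unseen values keep nonzero probability
        cat_tables = []
        for j in range(X_cat.shape[1]):
            n_vals = X_cat[:, j].max() + 1
            table = np.ones((n_classes, n_vals))
            for c in range(n_classes):
                table[c] += np.bincount(X_cat[y == c, j], minlength=n_vals)
            cat_tables.append(table / table.sum(axis=1, keepdims=True))
        return prior, means, variances, cat_tables

    def log_posterior(prior, means, variances, cat_tables, x_cont, x_cat):
        # log P(C|x) up to a constant: log P(C) + sum_i log P(xi|C)
        lp = np.log(prior)
        lp = lp - 0.5 * (np.log(2 * np.pi * variances)
                         + (x_cont - means) ** 2 / variances).sum(axis=1)
        for j, table in enumerate(cat_tables):
            lp = lp + np.log(table[:, x_cat[j]])
        return lp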

    The second model only requires fitting the softmax function
    P(C|X1,...,Xn), which can be done using iteratively reweighted least
    squares.
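
    As an illustration (mine, not code from this post), here is a minimal
    IRLS/Newton sketch for the binary case P(C=1|x) = sigmoid(w'x), again
    in Python with numpy; the multiclass softmax fit follows the same
    pattern with a block-structured Hessian:

    import numpy as np

    def fit_logistic_irls(X, y, n_iters=25, ridge=1e-8):
        # X: (n, d) design matrix (include a column of ones for the bias);
        # y: (n,) labels in {0, 1}. Returns the weight vector w.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            p = 1.0 / (1.0 + np.exp(-X @ w))       # current P(C=1|x)
            W = p * (1.0 - p)                      # IRLS weights (diagonal)
            g = X.T @ (y - p)                      # gradient of log-likelihood
            H = (X.T * W) @ X + ridge * np.eye(d)  # negative Hessian, regularized
            step = np.linalg.solve(H, g)
            w = w + step                           # Newton / IRLS update
            if np.max(np.abs(step)) < 1e-8:        # converged
                break
        return w

    The small ridge term just keeps the Hessian invertible when the
    classes are separable; each iteration is exactly a weighted
    least-squares solve, which is where the name comes from.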

    I recommend the following article for a discussion of these two models:

    @techreport{Jordan95,
      author      = "M. I. Jordan",
      title       = "Why the logistic function? {A} tutorial discussion on
                     probabilities and neural networks",
      institution = "MIT Computational Cognitive Science Report",
      number      = 9503,
      month       = "August",
      year        = 1995,
      annote      = "Discusses the relative merits of causal (BN) and
                     diagnostic (NN) models"
    }

    Kevin


