Re: [UAI] Definition of Bayesian network

From: Kevin S. Van Horn (Kevin_VanHorn@ndsu.nodak.edu)
Date: Mon Jul 30 2001 - 16:52:47 PDT


    On Mon, 30 Jul 2001, Michael Jordan wrote:

    > Back on the general discussion, I think that it's important that Milan
    > has pointed out some of the non-trivial issues that arise in making
    > the notion of "conditional probability" rigorous. But even without
    > getting into pathologies, one can see that some thought is required in
    > order to handle continuous variables with any degree of honesty.

    From my reading of his work, I believe Jaynes would have said that a
    conditional probability P(A | B = b, X), where B is a continuous variable,
    is not well-defined until you also specify the limiting process. That is,
    we take P(A | B in nbr(b, eps), X) as fundamental, where nbr(b, eps) is
    some neighborhood of b that shrinks to a point as eps -> 0, and simply
    use P(A | B = b, X) as a shorthand for

      lim_{eps -> 0} P(A | B in nbr(b, eps), X).

    Thus, the "correct" definition of P(A | B = b, X) is problem-dependent,
    depending on what limiting process is appropriate for the problem at hand.
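
    For concreteness, here is a small numerical sketch of that limiting
    process (Python; the toy model -- B uniform on [0,1] and A occurring
    with probability b when B = b -- is made up purely for illustration):

      import numpy as np

      rng = np.random.default_rng(0)
      n = 2_000_000
      B = rng.uniform(0.0, 1.0, n)         # continuous variable B
      A = rng.uniform(0.0, 1.0, n) < B     # A occurs with prob. b given B = b

      b = 0.3
      for eps in [0.1, 0.03, 0.01, 0.003]:
          in_nbr = np.abs(B - b) < eps     # conditioning region: B in nbr(b, eps)
          print(eps, A[in_nbr].mean())     # estimate of P(A | B in nbr(b, eps), X)

    As eps shrinks, the estimates settle down near 0.3, which is the value
    one would assign to P(A | B = b, X) in this model.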

    The standard example is to take a uniform distribution over a sphere
    and consider the conditional distribution when we restrict ourselves to a
    particular great circle. If the great circle is the equator, it
    seems obvious that the conditional distribution is uniform. On the other
    hand, if the great circle goes through the poles, controversy arises.
    The reason for this controversy is that there are TWO
    obvious limiting processes to define the conditional probability, and
    these give very different answers. One limiting process takes the
    neighborhood of a point on the sphere to be all points lying within a
    distance eps of it. This gives a conditioning region that is a band
    centered on the great circle, and we take the limit as the width of this
    band goes to zero. The second limiting process takes the neighborhood of
    a point on the sphere to be all points with the same latitude and a
    longitude that differs by at most eps. This gives a conditioning region
    that is a pair of "orange slices" connected at the tips (at the poles).
    The first limiting process gives a uniform conditional distribution,
    whereas the second does not (the probability density goes to zero at the
    poles).
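
    A quick Monte Carlo check makes the disagreement visible (Python; the
    choice of putting the great circle in the x-z plane and reporting the
    fraction of conditioned points with |latitude| > 1 radian is mine,
    just to get one number out of each limiting process):

      import numpy as np

      rng = np.random.default_rng(0)
      n = 4_000_000
      # Uniform points on the unit sphere via normalized Gaussians.
      p = rng.normal(size=(n, 3))
      p /= np.linalg.norm(p, axis=1, keepdims=True)
      lat = np.arcsin(p[:, 2])               # latitude in (-pi/2, pi/2)
      lon = np.arctan2(p[:, 1], p[:, 0])     # longitude in (-pi, pi]

      eps = 0.02
      # (1) Band of geodesic half-width eps around the great circle in the
      #     x-z plane (the meridians at longitude 0 and pi): |y| < sin(eps).
      band = np.abs(p[:, 1]) < np.sin(eps)
      # (2) "Orange slices": longitude within eps of 0 or of pi.
      slices = (np.abs(lon) < eps) | (np.pi - np.abs(lon) < eps)

      for name, mask in [("band", band), ("slices", slices)]:
          print(name, np.mean(np.abs(lat[mask]) > 1.0))

    The band gives roughly (pi/2 - 1)/(pi/2) ~ 0.36, as it should if the
    conditional distribution over the circle is uniform; the slices give
    roughly 1 - sin(1) ~ 0.16, reflecting a limiting density proportional
    to cos(latitude), which vanishes at the poles.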

    Jaynes has a discussion of this issue in the (unfinished) book previously
    mentioned (http://bayes.wustl.edu).

    > As a general remark on some of the discussions on probability theory
    > that recur on the UAI list, I think that it's important to emphasize
    > that probability theory is best viewed as a special case of measure
    > theory,

    Let me present another view, again based on Jaynes's ideas. The title of
    his book is "Probability Theory: The Logic of Science." Jaynes viewed
    probability theory primarily as a logic of plausible inference. So let's
    take a look at this from the perspective of mathematical logic. (This is
    my own elaboration of the Jaynesian view.) The product and sum rules of
    probability theory give us the proof theory of our logic.
    Set-theoretic probability theory gives us the model theory for our logic.
    That is, it allows us to construct sets of axioms (e.g., a set of
    conditional probabilities defining a joint probability distribution over
    the variables of interest) that are consistent, so that we may avoid
    reasoning from inconsistent premises.
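
    As a toy illustration of this division of labor (Python; the numbers
    are arbitrary), the "axioms" below are P(B | X) and P(A | B, X); the
    joint table built from them by the product rule is a model showing
    that the axioms are consistent, and the sum rule then grinds out any
    marginal or conditional we care about:

      # Axioms: a small set of conditional probabilities.
      p_b = {True: 0.3, False: 0.7}             # P(B | X)
      p_a_given_b = {True: 0.9, False: 0.2}     # P(A = true | B, X)

      # Product rule: P(A, B | X) = P(A | B, X) P(B | X).
      joint = {(a, b): (p_a_given_b[b] if a else 1.0 - p_a_given_b[b]) * p_b[b]
               for a in (True, False) for b in (True, False)}

      # Sum rule: the joint sums to 1, and marginalizing out B gives P(A | X).
      assert abs(sum(joint.values()) - 1.0) < 1e-12
      p_a = sum(joint[(True, b)] for b in (True, False))
      print(p_a)                                # 0.9*0.3 + 0.2*0.7 = 0.41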

    This distinction, I believe, cleans up the conceptual landscape quite a
    bit. For example, there was some discussion on this list recently about
    the definition of a random variable, and the fact that a random variable's
    definition changes if we enlarge the sample space. The Jaynesian
    viewpoint is that there are no "random" variables -- there are only
    variables whose values may not be known with certainty, and there is no
    logical distinction between these and any other variable. Only at the
    model theory level, when we concern ourselves with proving consistency,
    do we have to define the notion of a random variable, sample space, etc.
    Thus, measure theory helps us build consistent probabilistic models
    involving continuous variables, but once these are defined, we may ignore
    its subtleties and crank through the simple logical rules of probability
    theory to carry out our inferences (assuming that we follow Jaynes's
    policy with regard to infinite sets).
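
    To make that last point concrete with a small sketch (Python; the
    model -- mu ~ Normal(0, 1), y | mu ~ Normal(mu, 0.5) -- and the
    observed value y = 1.2 are made up): measure theory is what licenses
    writing down the densities in the first place, and after that the
    inference is nothing but the product and sum rules, here applied on a
    crude grid:

      import numpy as np

      mu = np.linspace(-5.0, 5.0, 2001)
      dmu = mu[1] - mu[0]
      prior = np.exp(-0.5 * mu**2)                     # Normal(0, 1) density, up to a constant
      y = 1.2
      likelihood = np.exp(-0.5 * ((y - mu) / 0.5)**2)  # Normal(mu, 0.5) density at y

      posterior = prior * likelihood                   # product rule
      posterior /= posterior.sum() * dmu               # sum rule: normalize over mu
      print((mu * posterior).sum() * dmu)              # posterior mean, about 0.96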


