Re: [UAI] Definition of Bayesian network

From: Bob Welch (indianpeaks@home.com)
Date: Sat Jul 21 2001 - 11:14:04 PDT

  • Next message: Marco Valtorta: "Re: [UAI] Definition of Bayesian network"

    Rich:

    I find that an intuitive explanation of a Bayesian network often becomes
    needlessly difficult when approached from local primitives, because the BN
    is not unique to a given joint distribution. Given that the joint exists,
    there is a Bayesian network for every possible ordering of the variables.
    The chain rule promises that the product of conditionals, each conditioned
    on its antecedents in the ordering, equals the joint. It is then only a
    matter of refining the parents of xj to be the smallest set in x1 ... xj-1
    needed to make xj independent of the other antecedents.
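    A small sketch of that refinement (my own toy example, not from any
    particular text): for a joint over three binary variables built so that
    x3 is independent of x1 given x2, searching the antecedents of each
    variable for the smallest conditioning set recovers the parent sets of
    the chain x1 -> x2 -> x3.

```python
# Sketch (hypothetical example): prune each variable's antecedents in the
# ordering x1, x2, x3 down to the minimal parent set.
from itertools import product, combinations

# Joint built as P(x1) P(x2 | x1) P(x3 | x2), so x3 _|_ x1 given x2.
P1 = {0: 0.6, 1: 0.4}
P2g1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P3g2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}
joint = {(a, b, c): P1[a] * P2g1[a][b] * P3g2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def marginal(idx):
    """Marginal distribution over the given variable indices (0-based)."""
    out = {}
    for x, p in joint.items():
        k = tuple(x[i] for i in idx)
        out[k] = out.get(k, 0.0) + p
    return out

def independent(j, parents, antecedents, tol=1e-9):
    """Is x_j independent of the remaining antecedents given `parents`?"""
    full = marginal(antecedents + (j,))
    m_par = marginal(parents + (j,))
    m_par_only = marginal(parents)
    m_ante = marginal(antecedents)
    for x, p in full.items():
        ante_key = x[:-1]
        par_key = tuple(x[list(antecedents).index(i)] for i in parents)
        # P(xj | antecedents) should equal P(xj | parents)
        lhs = p / m_ante[ante_key]
        rhs = m_par[par_key + (x[-1],)] / m_par_only[par_key]
        if abs(lhs - rhs) > tol:
            return False
    return True

def minimal_parents(j):
    """Smallest subset of x1..x_{j-1} making x_j independent of the rest."""
    antecedents = tuple(range(j))
    for size in range(len(antecedents) + 1):
        for cand in combinations(antecedents, size):
            if independent(j, cand, antecedents):
                return cand
    return antecedents

print([minimal_parents(j) for j in range(3)])  # -> [(), (0,), (1,)]
```

    With 0-based indices, x3 (index 2) keeps only x2 (index 1) as its
    parent, exactly the refinement described above.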

    Now form the entire set of conditionals you can define in this manner and
    randomly select one for each variable x of the form P{x | p(x)}. Clearly you
    can't expect that random selection to conform to a DAG or the Markov
    condition. Nor should you expect the knowledge engineer to blithely go out
    into the world to discover conditional relationships and expect to find a
    Bayesian network (even though they are all over the place).
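    To make the first point concrete, a small sketch (my own illustration):
    the ordering (x1, x2) yields the conditional P(x2 | x1), while the
    ordering (x2, x1) yields P(x1 | x2). Select both and the implied parent
    relation is cyclic, so no DAG results.

```python
# Sketch (hypothetical example): conditionals chosen from different
# orderings can imply a cyclic parent relation.
def is_dag(parents):
    """Check acyclicity by repeatedly removing nodes with no remaining
    parents (Kahn's algorithm on the parent relation)."""
    remaining = {v: set(ps) for v, ps in parents.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False  # every remaining node has a parent: a cycle
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True

# P(x1 | x2) and P(x2 | x1) selected together: x2 -> x1 and x1 -> x2.
print(is_dag({"x1": {"x2"}, "x2": {"x1"}}))       # -> False
# A consistent selection from a single ordering: x1 -> x2 only.
print(is_dag({"x1": set(), "x2": {"x1"}}))        # -> True
```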

    The point is that if the reader of your book sees the definition, then he
    may be led to believe that it gives him a guide as to how to begin
    searching for a Bayesian network. But starting with a set of primitive
    local relationships, such as conditional independence or conditional
    probability, does not necessarily result in discovery of the global DAG
    or Markov property.

    A guiding principle like causality may help but is no guarantee. My
    non-believer friends immediately want to interpret the arcs in a Bayes net
    as causal arcs and then argue that the counter to a DAG is feedback.
    They then dismiss the Bayes net as a construct even though existence of a
    Bayes net is as tautological as existence of a joint probability
    distribution which they are willing to accept, even when there is feedback.

    So why not start with a definition of a Bayesian network as a multiplicative
    decomposition of a joint distribution guaranteed by the chain rule? Then
    prove the iff theorems that the Markov property is satisfied and that the
    implicit network is a DAG. It's not that hard to believe that joints exist
    for almost any set of variables -- and so Bayesian networks are always
    around. And now we see that the DAGs must always be there, as well as the
    Markov properties, even though they seem like harder conditions to satisfy.
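    A numerical sketch of one direction of that theorem (my own toy
    example): start from a chain-rule factorization for x1 -> x2 -> x3 and
    check that the Markov condition holds, here that x3 is independent of
    its nondescendant x1 given its parent x2.

```python
# Sketch (hypothetical example): verify the Markov condition numerically
# for a joint defined by its chain-rule factors.
from itertools import product

P1 = {0: 0.6, 1: 0.4}
P2g1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P3g2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}
joint = {(a, b, c): P1[a] * P2g1[a][b] * P3g2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def marg(idx):
    """Marginal over the given variable indices (0-based)."""
    out = {}
    for x, p in joint.items():
        k = tuple(x[i] for i in idx)
        out[k] = out.get(k, 0.0) + p
    return out

# Markov condition for x3: P(x3 | x1, x2) == P(x3 | x2) for all values.
p01 = marg((0, 1))
p12 = marg((1, 2))
p1 = marg((1,))
ok = all(abs(joint[(a, b, c)] / p01[(a, b)] - p12[(b, c)] / p1[(b,)]) < 1e-9
         for a, b, c in product((0, 1), repeat=3))
print(ok)  # -> True
```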

    I too like the idea of building systems from primitive local
    concepts. But the power of the Bayesian network approach over rule-based
    expert systems is in recognizing that there are global constraints these
    primitives must satisfy in order to use them consistently. As an abstract
    concept, I for one find the additivity axiom for existence of a joint
    easier to swallow than a DAG or Markov property.

    Bob Welch

    ----- Original Message -----
    From: <profrich@megsinet.net>
    To: <uai@cs.orst.edu>
    Sent: Wednesday, July 18, 2001 1:29 PM
    Subject: [UAI] Definition of Bayesian network

    > Dear Colleagues,
    >
    > In my 1990 book I defined a Bayesian network approximately as follows:
    >
    > Definition of Markov Condition: Suppose we have a joint probability
    > distribution P of the random variables in some set V and a DAG G=(V,E). We
    > say that (G,P) satisfies the Markov condition if for each variable X in V,
    > {X} is conditionally independent of the set of all its nondescendants
    > given the set of all its parents.
    >
    > Definition of Bayesian Network: Let P be a joint probability distribution
    > of the random variables in some set V, and G=(V,E) be a DAG. We call (G,P)
    > a Bayesian network if (G,P) satisfies the Markov condition.
    >
    > The fact that the joint is the product of the conditionals is then an iff
    > theorem.
    >
    > I used the same definition in my current book. However, a reviewer
    > commented that this was nonstandard and unintuitive. The reviewer
    > suggested
    > I define it as a DAG along with specified conditional distributions (which
    > I realize is more often done). My definition would then be an iff theorem.
    >
    > My reason for defining it the way I did is that I feel `causal' networks
    > exist in nature without anyone specifying conditional probability
    > distributions. We identify them by noting that the conditional
    > independencies exist, not by seeing if the joint is the product of the
    > conditionals. So to me the conditional independencies are the more basic
    > concept.
    >
    > However, a researcher, with whom I discussed this, noted that telling a
    > person what numbers you plan to store at each node is not provable from my
    > definition, yet it should be part of the definition as Bayes Nets are not
    > only statistical objects, they are computational objects.
    >
    > I am left undecided about which definition seems more appropriate. I would
    > appreciate comments from the general community.
    >
    > Sincerely,
    >
    > Rich Neapolitan
    >



    This archive was generated by hypermail 2b29 : Sat Jul 21 2001 - 11:18:50 PDT