Re: [UAI] Definition of Bayesian network

From: Kathryn Blackmond Laskey (klaskey@gmu.edu)
Date: Tue Jul 31 2001 - 13:06:45 PDT

    Kevin,

    >> As a general remark on some of the discussions on probability theory
    >> that recur on the UAI list, I think that it's important to emphasize
    >> that probability theory is best viewed as a special case of measure
    >> theory,
    >
    >Let me present another view, again based on Jaynes's ideas. The title of
    >his book is "Probability Theory: The Logic of Science." Jaynes viewed
    >probability theory primarily as a logic of plausible inference.

    So did Laplace and many other Enlightenment thinkers.

    BTW, I *highly* recommend Jaynes's book. It is fantastic.

    >So let's
    >take a look at this from the perspective of mathematical logic. (This is
    >my own elaboration of the Jaynesian view.) The product and sum rules of
    >probability theory give us the proof theory of our logic.
    >Set-theoretic probability theory gives us the model theory for our logic.
    >That is, it allows us to construct sets of axioms (e.g., a set of
    >conditional probabilities defining a joint probability distribution over
    >the variables of interest) that are consistent, so that we may avoid
    >reasoning from inconsistent premises.

    Yes.
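
    To make that concrete, here is a minimal sketch (my own
    illustration in Python; the variables Rain and WetGrass and all
    the numbers are made up) of a set of conditional probabilities
    serving as axioms, the model-theoretic consistency check, and the
    product and sum rules carrying out an inference:

      import itertools

      # Two conditional probability tables are the "axioms."
      p_rain = {True: 0.2, False: 0.8}                    # P(Rain)
      p_wet_given_rain = {True: {True: 0.9, False: 0.1},  # P(WetGrass | Rain)
                          False: {True: 0.2, False: 0.8}}

      # Product rule: P(Rain, WetGrass) = P(WetGrass | Rain) P(Rain)
      joint = {(r, w): p_rain[r] * p_wet_given_rain[r][w]
               for r, w in itertools.product([True, False], repeat=2)}

      # Model theory: the axioms define a genuine probability
      # distribution (nonnegative, sums to one), so they are consistent.
      assert all(p >= 0 for p in joint.values())
      assert abs(sum(joint.values()) - 1.0) < 1e-12

      # Proof theory: sum rule to marginalize, product rule to condition.
      p_wet = sum(joint[(r, True)] for r in [True, False])
      print(joint[(True, True)] / p_wet)   # P(Rain | WetGrass) ~ 0.529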

    >This distinction, I believe, cleans up the conceptual landscape quite a
    >bit. For example, there was some discussion on this list recently about
    >the definition of a random variable, and the fact that a random variable's
    >definition changes if we enlarge the sample space. The Jaynesian
    >viewpoint is that there are no "random" variables -- there are only
    >variables whose values may not be known with certainty, and there is no
    >logical distinction between these and any other variable.

    At the object level perhaps you're right (although you'll find some
    people arguing vehemently that, for example, radioactive decay is
    "really" random). But things get *really* squishy when we try
    quantifying over models. As long as we allow quantification only
    over what some people call "objective" (I prefer "intersubjectively
    verifiable") properties of the world, then all our probability models
    will satisfy what Ron Howard calls the "clarity test." That is, a
    clairvoyant who knows the outcome of all intersubjectively verifiable
    past, present and future events could assign a definite value to each
    random variable in our model. When all our random variables satisfy
    the clarity test, then the paragraph above is a fair description of a
    philosophically satisfying (to many people) ontology for probability.

    But just try applying the clarity test to questions such as:

      What did John really mean when he threatened to quit?
      Do you think Emily is in love with Joe?
      Do you agree that Fred is in denial over his anger toward Julio?

    What is the ontological status of "that which John really meant," or
    "Emily's true feelings toward Joe," or "Fred's true disposition
    toward Julio?" (Or even worse, "what another person really thinks
    about Emily's true feelings toward Joe"!!!) Do the referents of these
    sentences count as legitimate random variables? Do you regard them as
    variables just like other variables (such as the sum of 2 and 2)?
    It's pretty clear their values cannot be known with certainty. It's
    not at all clear whether they have "real values" at all. Perhaps you
    are sure they are variables just like any other variable, but
    plenty of people will vehemently disagree with you. Some argue
    they should not be included in probability models because they
    aren't "real" variables at all. Students in my classes frequently
    tell me that Professor So-and-So has told them that probability
    CANNOT BE APPLIED to anything except "really random" phenomena.
    Others argue we must use some other formalism, such as fuzzy
    logic, for variables that don't satisfy the clarity test. Some go
    so far as to say engineers are WRONG to try
    to "squeeze the life out of" such inherently subjective phenomena by
    stuffing them into mathematical and logical representations. But
    some people happily include such hypotheses in Bayesian networks and
    give them conditional probabilities just like other random variables.

    A person can easily get sucked into arguments that last for weeks
    or months with no resolution in sight. This has happened to me
    plenty of times. Eventually I give up and filter the emails from
    the most passionate and prolific into boxes I peruse when I have
    nothing better to do (which is seldom). I have concluded that
    one's ontology for probability is a matter of religion, where by
    religion I mean something that cannot be resolved one way or the
    other by either logical argumentation or experiment, but about
    which there are diametrically opposed, passionately held points
    of view, and about which people are sure those with differing
    views are either stupid or malevolent, and decidedly WRONG. If
    you like spending energy shouting angrily at others over things
    you believe passionately but cannot prove by either logic or
    experiment, by all means go ahead. Eventually I'll put a filter
    on your emails, though. :-)

    The above does not stop me from defining random variables in a
    Bayesian network that refer to what an agent means or what an agent's
    feelings are toward another agent, or some other non-clarity-test
    phenomenon. It also doesn't stop people from having conversations
    about such phenomena, which not only are meaningful to the
    participants, but actually have observable effects in the world. For
    example, if Huang tells Joe he thinks Emily is in love with him, that
    might give Joe the courage to send her an email. This is independent
    of whether Emily's "true feelings" really are or really are not
    just like any other variable.

    Some day, we might have robots that Joe can
    ask to give him advice on his love life. They might have Bayesian
    networks with random variables referring to women's feelings about
    him. I bet at that time philosophers will *still* be arguing over
    whether Emily "really has" true feelings toward Joe. Joe won't care
    about that. He will care about whether the robot's inferences about
    Emily's feelings are accurate enough to keep him from making a fool
    out of himself. Philosophers can argue till kingdom come, and
    knowledge engineers are going to build Bayesian networks with random
    variables whose values don't satisfy the clarity test, and by gosh,
    those Bayesian networks will turn out to be quite USEFUL!!! Other
    knowledge engineers are going to say it's silly to accept the Law of
    the Excluded Middle when we are talking about whether Emily is in
    love with Joe, and they are going to build fuzzy graphical models,
    and by gosh, THOSE are going to be useful too!!!

    In fact, if you put
    really *good* knowledge engineers from the opposing camps into a
    bake-off, I'll lay odds they'll both come up with pretty
    high-performing systems. The philosophers will still be arguing over
    the true ontological status of what they are doing and whether it's
    philosophically coherent, and in the meantime we'll get better and
    better natural language understanding and commonsense reasoning
    systems, and eventually the philosophers will find themselves being
    dragged into using the systems themselves, and perhaps the writings
    of the more philosophically inclined engineers who participate in
    building them, as a reality check on their theories.
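
    For concreteness, here is a toy sketch (entirely my own; the
    variable names and every number are invented) of a Bayesian
    network whose latent variable fails the clarity test. The
    inference machinery neither knows nor cares:

      # A latent variable that fails the clarity test, treated
      # exactly like any other random variable.
      p_love = {True: 0.3, False: 0.7}      # prior P(EmilyInLove)
      p_smiles = {True: 0.8, False: 0.4}    # P(SmilesAtJoe | EmilyInLove)
      p_replies = {True: 0.9, False: 0.5}   # P(RepliesQuickly | EmilyInLove)

      def posterior(smiles, replies):
          """P(EmilyInLove | evidence), with the observations assumed
          conditionally independent given the latent state."""
          def likelihood(love):
              ps = p_smiles[love] if smiles else 1 - p_smiles[love]
              pr = p_replies[love] if replies else 1 - p_replies[love]
              return ps * pr
          unnorm = {v: p_love[v] * likelihood(v) for v in (True, False)}
          return unnorm[True] / sum(unnorm.values())

      print(posterior(smiles=True, replies=True))   # ~ 0.607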

    For myself, I regard the value of the "Emily's true feelings" random
    variable as (approximately, where the approximation is good enough
    for my modeling purposes) a sufficient statistic for a highly complex
    brain state that it's simply not worth modeling in detail. By "brain
    state" I mean a much higher-dimensional (approximately) sufficient
    statistic that (approximately) d-separates Emily's cognitive-mental
    apparatus from the rest of the world. That is as close to a formal
    definition as I think it is useful to make for a discussion such as
    this. (Formal definitions are *very* useful, though, when designing
    a software spec.)
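
    Numerically, the d-separation claim amounts to this (a sketch
    with a hypothetical chain Past -> BrainState -> Behavior and
    invented numbers): once the brain state is given, behavior
    carries no further information about the past.

      import itertools

      p_past = {True: 0.5, False: 0.5}
      p_state = {True: {True: 0.7, False: 0.3},   # P(BrainState | Past)
                 False: {True: 0.2, False: 0.8}}
      p_beh = {True: {True: 0.9, False: 0.1},     # P(Behavior | BrainState)
               False: {True: 0.3, False: 0.7}}

      # Joint distribution of the chain Past -> BrainState -> Behavior.
      joint = {(p, s, b): p_past[p] * p_state[p][s] * p_beh[s][b]
               for p, s, b in itertools.product([True, False], repeat=3)}

      def cond(b, s, p=None):
          """P(Behavior=b | BrainState=s [, Past=p]) by enumeration."""
          num = sum(pr for (pp, ss, bb), pr in joint.items()
                    if ss == s and bb == b and (p is None or pp == p))
          den = sum(pr for (pp, ss, bb), pr in joint.items()
                    if ss == s and (p is None or pp == p))
          return num / den

      # Given BrainState, Behavior is independent of Past.
      for s, p in itertools.product([True, False], repeat=2):
          assert abs(cond(True, s, p) - cond(True, s)) < 1e-12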

    >Only at the
    >model theory level, when we concern ourselves with proving consistency,
    >do we have to define the notion of a random variable, sample space, etc.
    >Thus, measure theory helps us build consistent probabilistic models
    >involving continuous variables, but once these are defined, we may ignore
    >its subtleties and crank through the simple logical rules of probability
    >theory to carry out our inferences (assuming that we follow Jaynes's
    >policy with regard to infinite sets.)

    OK, but again, one may not need measure theory. What I like about
    the prequential approach is its explicit agent-based ontology. There
    are agents, and the agents have beliefs and preferences and make
    choices about what bets to engage in regarding intersubjectively
    verifiable phenomena. By observing agents' bets, one can obtain
    evidence about their beliefs and preferences. When the rules of the
    market permit agents to exploit "arbitrage opportunities" (i.e.,
    opportunities to make a profit on agents willing to accept sure-loss
    gambles), one expects agents willing to make sure-loss bets to be
    squeezed out of the market, either by getting smarter about what bets
    they will make or by going bankrupt. This provides evolutionary
    pressure toward coherence (at least in the bets one will accept, if
    not in one's internal thoughts). (See the work of Nau and
    McCardle, downloadable from the Duke Fuqua School of Business web
    site, for more on a no-arbitrage ontology for probability.) As I
    said the other day, one
    can develop a game theoretic ontology for probability that doesn't
    use measure theory directly, but that evaluates probabilistic
    statements by their internal consistency AND their fit to observable
    events. Not all the details of this approach have been worked out
    as fully as those of measure theory, but many standard
    measure-theoretic results have been re-proven, and the prospects look
    very promising to me. I don't want to get into any debates over
    whether Dawid/Shafer/Vovk (alphabetical order) are "really right" and
    the measure theorists "wrong." I don't think that's going to go
    anywhere. But I've been watching this line of work for a number of
    years, and I'm going to keep watching.
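
    For readers new to the no-arbitrage idea, here is a minimal
    sketch of the sure-loss mechanism (toy numbers of my own, not
    anything from the Nau/McCardle or Dawid/Shafer/Vovk papers):

      # A hypothetical agent quotes 0.6 as the fair price of a $1
      # ticket on an event A, and 0.6 again for a ticket on not-A.
      # The prices sum to 1.2, so a bookie who sells the agent both
      # tickets collects 1.2 and pays out exactly 1 either way.
      price_a, price_not_a = 0.6, 0.6   # incoherent: prices sum to 1.2

      def bookie_profit(a_occurs: bool) -> float:
          income = price_a + price_not_a   # agent buys both tickets
          payout = 1.0                     # exactly one ticket pays $1
          return income - payout

      for outcome in (True, False):
          assert bookie_profit(outcome) > 0   # sure profit either way

      # Coherent prices (summing to one) close the arbitrage; that is
      # the evolutionary pressure toward coherence described above.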

    At the risk of being one of those people whose emails get shunted off
    to the "never bother looking at this junk" box, those were some
    things I wanted to say about the "problem of interpreting
    probabilities." There! I've said them!

    (I have a book about this in my mind, but whether it'll ever get
    written is rather iffy, given all the other things I need to spend my
    time on. But I *will* take an hour I "shouldn't" to write an email
    like this. It's a lot less time-consuming than a book. And maybe
    some people will find it useful. But back to work.)

    Cheers,

    Kathy


