Re: [UAI] Fuzzy sets vs. Bayesian Network

From: Scott Ferson (scott@ramas.com)
Date: Fri Feb 25 2000 - 12:10:28 PST


    Professor Bruyninckx' email address bounces my messages,
    so I've presumptuously posted this to the list. I promise to be
    quiet after this.
    - -Scott

    Herman Bruyninckx wrote:

    > On Thu, 24 Feb 2000, Scott Ferson wrote:
    >
    > > Herman Bruyninckx wrote:
    > >
    > > > Bayesian theory is fully consistent: there is only _one_ way to do
    > > > your algebra.
    > >
    > > It is perhaps a bit misleading to say that because Bayesian theory
    > > is consistent that there's only one way to do a Bayesian analysis
    > > for a problem. Come on! In fact, if the use of subjective priors
    > > is involved, then there are as many different analyses as there
    > > are analysts.
    > I have two comments on this:
    > 1) if you are able to describe the problem exactly then also the
    > `subjective' priors are objectivised, through the principle of maximum
    > entropy (which results from the fact that you don't use any information
    > that is not provided in your system; `information' in the real meaning
    > used in Information Theory). I admit that it is often not possible to
    > _exactly_ describe your system. But two independent people starting
    > from the _same_ data will end up with the _same_ result.
    > So, don't blame probability theory (i.e., the calculus of inference)
    > for the problem of arbitrary choice of priors.

    I believe this approach is not as useful as you suggest. Let me
    pose a question to you involving inference under incomplete
    information: all I tell you about a quantity A is that its value
    (or values) lies somewhere between 2 and 4, and all I tell you
    about B is that it lies between 1 and 3. Now I ask what you can
    say about the sum A+B.

    We can compute an answer based on a maximum entropy
    argument that we could attribute to Jaynes (or, for that matter,
    to Laplace himself). It models A and B with uniform distributions
    because we know only their ranges. It models the dependency
    between the quantities A and B with the independence copula
    because that is the most entropic copula. We then just compute
    the convolution, either analytically or by a brute-force method
    such as Monte Carlo, and find the answer to be a symmetric
    triangular distribution between 3 and 7.
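    This construction is easy to check numerically. The following is a rough
    Monte Carlo sketch of my own (not anyone's canonical code): uniform A and
    B, combined independently, with the shape of the sum read off the samples.

```python
import random

# A rough Monte Carlo illustration of the maximum-entropy construction:
# model A ~ Uniform(2, 4) and B ~ Uniform(1, 3), combine them
# independently, and examine the distribution of A + B.
N = 100_000
sums = [random.uniform(2, 4) + random.uniform(1, 3) for _ in range(N)]

low, high = min(sums), max(sums)

# The sum of two independent uniforms of equal width is triangular,
# peaked at 5 on the support [3, 7]; compare the mass in a slice at
# the peak with the mass in two same-width slices at the extremes.
peak_mass = sum(1 for s in sums if 4.5 <= s <= 5.5) / N
tail_mass = sum(1 for s in sums if s <= 3.5 or s >= 6.5) / N
print(f"support observed: [{low:.2f}, {high:.2f}]")  # close to [3, 7]
print(f"mass in [4.5, 5.5]: {peak_mass:.2f}")        # about 0.44
print(f"mass in the tails:  {tail_mass:.2f}")        # about 0.06
```

    The peaked shape is the point: the maximum entropy answer claims far more
    mass near 5 than near 3 or 7, even though nothing in the stated
    information distinguishes those regions.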

    It is clear that this answer is a free lunch, information-theoretically
    speaking. All we really *know* about the sum is that its range
    is between 3 and 7. A different maximum entropy argument would
    then suggest we should model the sum as a uniform distribution.
    The inconsistency between the triangular and the uniform answers
    constitutes a reductio ad absurdum. I think that maximum entropy
    arguments really only amount to mathematical wishful thinking.
    Their limitations are especially serious in disciplines like risk
    analysis, where information is usually extremely sparse and wishful
    thinking can lead to very unpleasant outcomes like people dying
    and bridges collapsing.

    I agree that we shouldn't blame a mathematical theory for its
    misapplications. However, I think that there is some blame
    that should go to overzealous proponents of a theory [not you,
    of course] who've tried to convince risk analysts that probability
    theory is the only consistent calculus for propagating uncertainty.
    This is clearly erroneous. Interval analysis is certainly consistent
    (albeit less powerful when information is abundant).
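    For comparison, the interval answer to the same question is entirely
    mechanical. Here is a toy sketch of interval addition (my own
    illustration, not a production interval library):

```python
# A minimal interval type: addition propagates exactly what is
# known and nothing more, with no distributional shape assumed.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # [a, b] + [c, d] = [a + c, b + d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

A = Interval(2, 4)  # all we know about A
B = Interval(1, 3)  # all we know about B
print(A + B)        # [3, 7]: the honest answer to the question above
```

    The result [3, 7] matches what we actually know about the sum, and no
    more: no peak at 5 appears unless an independence and shape assumption
    is smuggled in.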

    > 2) For one reason or another, people don't want to understand the
    > difference between (i) inference (= _calculating_ probability
    > distributions from given inputs, using bayes rule as the _unique_
    > and _consistent_ calculus for doing this), and (ii) decision making
    > (= looking at the resulting inference, and deciding what to do next).
    > This decision making is _not_ unique and consistent.
    > Most other uncertainty calculi mix both problems together.

    I generally agree that there's an important difference
    between formal inference and modeling on a broader
    scale. One can be pure while the other is quite messy.

    > > In principle, the different t-(co)norms correspond to different
    > > underlying mechanisms that govern how the inputs should
    > > be combined. This should rightly be considered flexibility,
    > > rather than merely loose definition.
    > What I (and others) say is: Bayesian probability has _first principles_ to
    > derive the single rule with which to process the inputs; other uncertainty
    > calculi don't. So, this is not flexibility, this is putting your head in
    > the ground, not willing to see which first principles you violate (or
    > adhere to).

    I suspect that the real problem is that the fuzzy types
    have not yet given adequate interpretations for any
    of these norms or co-norms. You may be right that
    there aren't any coherent interpretations that they're
    going to come up with.
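    For the record, the ambiguity is concrete. Here are three of the standard
    t-norms (a toy illustration with made-up membership degrees): each is an
    admissible rule for fuzzy "and", yet they disagree on the same inputs.

```python
# Three common t-norms: all satisfy the t-norm axioms (commutativity,
# associativity, monotonicity, identity 1), yet they give different
# conjunction values for the same membership degrees.
def t_min(a, b):          # Goedel / minimum t-norm
    return min(a, b)

def t_product(a, b):      # product t-norm
    return a * b

def t_lukasiewicz(a, b):  # Lukasiewicz t-norm
    return max(0.0, a + b - 1.0)

a, b = 0.7, 0.6  # hypothetical membership degrees
print(t_min(a, b), t_product(a, b), t_lukasiewicz(a, b))
# roughly 0.6, 0.42 and 0.3: three answers from three "valid" rules
```

    Without an interpretation saying which underlying mechanism each norm
    models, nothing in the formalism tells an analyst which of the three
    answers to report.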

    > > > Hence, fuzzy logic is indeed more general; in fact it is too general to
    > > > be still called a scientific paradigm (because of the above-mentioned
    > > > indefiniteness of its calculus).
    > >
    > > It does seem clear that a fuzzy approach is more general because
    > > it is axiomatically weaker, but I don't think it's quite fair to conclude
    > > it's uselessly general.
    > I agree with this: fuzzy logic has been used for half a century already,
    > but, as an old controls guy told me some time ago, ``we used to call this
    > common sense, now they think they need university professors to explain the
    > same thing'' :-)

    Hmmm.

    > > Perhaps a more generous interpretation is
    > > that it has *wider applicability* than a strict probabilistic approach.
    > > Concomitantly, we'd expect it to be a less powerful theory in
    > > situations where the axioms of probability theory are satisfied.
    > I agree with this completely: I have encountered enough situations in my
    > own research (sensor based robotics) where it is indeed impossible to do a
    > clean probabilistic analysis. (Or I should rather say: ``I am too stupid to
    > do it''...) And we hired a PhD in fuzzy theory and possibility theory to
    > help us out; the result being that I now have experienced that these
    > theories are incredibly flexible but at the same time incredibly arbitrary.

    I find the arbitrariness disturbing too. I'm actually less convinced
    of the practical utility of fuzzy methods than you seem to be.

    > > First, remember that it's not the *probability* of being
    > > tall or small. It's not a probability at all. It's something
    > > else, sometimes called "possibility", which measures the
    > > degree something is true (not its frequency, or even one's
    > > belief that it's true).
    > This is true, but allow me to remark that I haven't seen any better
    > `definition' than ``it's something else''. No axiomatic foundations, such
    > that you can never be sure whether it's your calculus or your algorithm
    > that leads to bad results....

    Well, they do have a clear axiomatic foundation. I agree however
    that the fuzzy types have not given a clear interpretation of what
    possibility really *is*. What is this measure really measuring?
    To be fair, in an axiomatic treatment, they're not really required to
    give interpretations, but doing so would certainly help both doubters
    and supporters understand what they're talking about. Actually,
    I suspect they're still shopping around for some good interpretations.
    Maybe we should give them a bit more time. Probability theory had
    a head start of a few centuries, and the current personalist-belief
    definition of probability only emerged as the dominant interpretation
    in this century. Newton never had a clear definition of continuity.
    Indeed, if he'd known about the possibility of Weierstrass functions,
    he probably wouldn't have bothered to invent the Calculus. Maybe
    something will eventually come of this fuzzy stuff, even if it's not what
    the originators thought it'd be.

    > > Second, you should be careful about characterizing the
    > > sets as "mutually exclusive" which is a phrase that recalls
    > > Boolean logic. In a fuzzy set theory, the set of tall people
    > > and the set of small people could well be not mutually
    > > exclusive. I'm tall for a jockey, but pretty small for a
    > > basketball player. It makes a difference what the sets
    > > were constructed to represent.
    > This is one of the basic examples where the abovementioned difference
    > between (i) inference and (ii) decision making shows up: in fuzzy logic one
    > _starts_ by making a decision (e.g., this fuzzy set represents ``tall''
    > people), and you carry this decision through all your subsequent
    > operations, often resulting in strange things where the label ``tall''
    > doesn't fit anymore. Probability theory, on the other hand, starts from a
    > parameterized description of the same system: a PDF describing the length
    > of people; only at the end one has to make a decision about whether someone
    > is to be called ``tall'' or not. And, as I said before, the criteria of
    > this decision making are rather flexible.

    Maybe the division is between the pure calculus by which we propagate
    uncertainty and make formal inferences, and the messy process of
    modeling which includes selecting the calculus, marshalling the relevant
    data, positing assumptions and drawing conclusions. If the modeling
    process is so strained that the meaning of "tall" by the end "doesn't fit
    anymore", well, then that's just a bad application. Maybe we shouldn't
    blame that on fuzzy set theory, any more than we blame probability
    theory for its misapplications. I don't find convincing your argument
    that the "first principles" of probability theory will prevent nonsense in
    its applications. It's really a question of where and how you introduce
    assumptions. We all know that you can make a good model or a bad
    one, no matter how good the mathematical underpinnings. Probability
    theory can't help you pick your assumptions.

    > [...]
    > > Vagueness is just one of several domains where probability
    > > has limited usefulness.
    > I don't agree. ``Vagueness'' is nothing more than incomplete information.
    > And this is handled nicely in probability theory (The Jaynes-Cox-Polya-
    > Amari variant that is, not Fisher's).
    > Maybe not everybody was as lucky as myself and discovered ``Probability
    > Theory As Extended Logic'' on <http://bayes.wustl.edu/> :-) A must read
    > page!

    So you think vagueness is "nothing more than incomplete
    information"? It's easy to show that it has nothing to do with
    incomplete information. I could have all the heights of every
    single individual in the population down to the nanometer,
    yet still not be sure whether someone deserves the appellation
    of "tall". There are still borderline cases. Or did you mean to
    say it is nothing more than incomplete *specification*? That's
    the more common argument.
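    To make the borderline point concrete, here is a toy membership function
    for "tall" (the 170 cm and 190 cm anchors and the linear ramp are my own
    arbitrary choices, purely for illustration):

```python
# A toy fuzzy membership function for "tall": even with every height
# known exactly, heights between the anchors get intermediate degrees
# of membership rather than a yes/no answer.
def tall(height_cm):
    if height_cm <= 170:
        return 0.0   # clearly not tall
    if height_cm >= 190:
        return 1.0   # clearly tall
    return (height_cm - 170) / 20  # linear ramp between the anchors

for h in (160, 180, 195):
    print(h, tall(h))
# 160 -> 0.0, 180 -> 0.5 (borderline), 195 -> 1.0
```

    Notice that perfect measurement (nanometer precision, if you like) does
    nothing to settle the 0.5 case; the residual uncertainty is about the
    predicate "tall", not about the data.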

    I'm familiar with the web page you cite, and with Jaynes' book.
    Although the book seems like a thorough treatment, I found it
    ultimately unconvincing from both mathematical and practical
    perspectives. I know that must sound like heresy to you, but
    I think that the Jaynesian answer to the interval question is patently
    absurd. I'd written Jaynes about a year before his death about
    the problem, but he never responded. I imagine that his health
    may have been failing by that time.

    By the way, are you familiar with any published discussion of
    the axioms of maximum entropy? I was looking for them in
    Jaynes' book, but couldn't find them there. Only incomplete
    (yet insistent) recollections by Russian émigré mathematicians
    made me even aware that they existed. I've never seen them
    discussed anywhere. Has anyone?

    Scott Ferson <scott@ramas.com>
    Applied Biomathematics
    631-751-4350, fax -3435

    P.S.
    Email directed to Bruyninckx' personal address bounces back.
    Perhaps my ".com" domain makes me look like a spammer.
    Or maybe it's the *content* of the message?

       ----- Transcript of session follows -----
    .. while talking to krimson.cc.kuleuven.ac.be.:
    >>> MAIL From:<scott@ramas.com>
    <<< 550 Access denied (see http://www.kuleuven.ac.be/ludit/nospam.htm)
    554 <Herman.Bruyninckx@mech.kuleuven.ac.be>... Service unavailable

    ------- End of Forwarded Message



    This archive was generated by hypermail 2b29 : Fri Feb 25 2000 - 12:11:54 PST