Professor Bruyninckx' email address bounces my messages,
so I've presumptuously posted this to the list. I promise to be
quiet after this.
-Scott
Herman Bruyninckx wrote:
> On Thu, 24 Feb 2000, Scott Ferson wrote:
>
> > Herman Bruyninckx wrote:
> >
> > > Bayesian theory is fully consistent: there is only _one_ way to do
> > > your algebra.
> >
> > It is perhaps a bit misleading to say that, because Bayesian theory
> > is consistent, there's only one way to do a Bayesian analysis
> > for a problem. Come on! In fact, if the use of subjective priors
> > is involved, then there are as many different analyses as there
> > are analysts.
> I have two comments on this:
> 1) if you are able to describe the problem exactly then also the
> `subjective' priors are objectivised, through the principle of maximum
> entropy (which results from the fact that you don't use any information
> that is not provided in your system; `information' in the real meaning
> used in Information Theory). I admit that it is often not possible to
> _exactly_ describe your system. But two independent people starting
> from the _same_ data will end up with the _same_ result.
> So, don't blame probability theory (i.e., the calculus of inference)
> for the problem of the arbitrary choice of priors.
I believe this approach is not as useful as you suggest. Let me
pose a question to you involving inference under incomplete
information: All I tell you about a quantity A is that its value
(or values) are somewhere between 2 and 4, and all we know
about B is that it's between 1 and 3. Now I ask what you can
say about the sum A+B.
We can compute an answer based on a maximum entropy
argument that we could attribute to Jaynes (or, for that matter,
to Laplace himself). It models A and B with uniform distributions
because we know only their ranges. It models the dependency
between the quantities A and B with the independence copula
because that is the most entropic copula. We then just compute
the convolution, either analytically or by a brute-force method
such as Monte Carlo, and find the answer to be a symmetric
triangular distribution between 3 and 7.
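Here, in case it helps, is a brute-force sketch of that calculation
(in Python; the uniform marginals and the independence copula are
exactly the maximum entropy model just described, not anything we
actually *know* about A and B):

import numpy as np

# Maximum entropy model of the problem: A and B uniform on their
# stated ranges, combined with the independence copula.
rng = np.random.default_rng(0)
n = 1_000_000
a = rng.uniform(2.0, 4.0, n)   # all we were told: A is somewhere in [2, 4]
b = rng.uniform(1.0, 3.0, n)   # all we were told: B is somewhere in [1, 3]
s = a + b                      # brute-force convolution by simulation

# The histogram of s approximates the symmetric triangular
# distribution on [3, 7], peaked at 5.
hist, edges = np.histogram(s, bins=40, range=(3.0, 7.0), density=True)
print(s.min(), s.max())          # close to 3 and 7
print(edges[np.argmax(hist)])    # mode near 5
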
It is clear that this answer is a free lunch, information-theoretically
speaking: all we really *know* about the sum is that it lies
between 3 and 7. A different maximum entropy argument would
then suggest we should model the sum as a uniform distribution
on that range. The inconsistency between the triangular and the
uniform answers constitutes a reductio ad absurdum.
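The inconsistency is not subtle, either. Using the same sort of
simulation (again just a sketch), the two maximum entropy answers
give different probabilities for the same event:

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
# Maximum entropy applied to the inputs: the triangular sum from before.
s_tri = rng.uniform(2.0, 4.0, n) + rng.uniform(1.0, 3.0, n)
# Maximum entropy applied directly to what we know about the sum itself.
s_uni = rng.uniform(3.0, 7.0, n)

# Same state of knowledge, two different answers to the same question.
print(np.mean(s_tri <= 4.0))   # about 0.125 (triangular CDF at 4)
print(np.mean(s_uni <= 4.0))   # about 0.25  (uniform CDF at 4)
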
I think that maximum entropy arguments really only amount
to mathematical wishful thinking. Their limitations are especially
serious in disciplines like risk analysis, where information is
usually extremely sparse and wishful thinking can lead to
very unpleasant outcomes like people dying and bridges
collapsing.
I agree that we shouldn't blame a mathematical theory for its
misapplications. However, I think that there is some blame
that should go to overzealous proponents of a theory [not you,
of course] who've tried to convince risk analysts that probability
theory is the only consistent calculus for propagating uncertainty.
This is clearly erroneous. Interval analysis is certainly consistent
(albeit less powerful when information is abundant).
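For instance, the interval computation for the little problem above
carries forward only what was actually given. A minimal sketch (the
Interval class here is just for illustration, not any particular
library):

from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Interval addition: the sum can be anywhere in [lo1+lo2, hi1+hi2].
        return Interval(self.lo + other.lo, self.hi + other.hi)

A = Interval(2.0, 4.0)   # all we know about A
B = Interval(1.0, 3.0)   # all we know about B
print(A + B)             # Interval(lo=3.0, hi=7.0): just the range, no shape
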
> 2) For one reason or another, people don't want to understand the
> difference between (i) inference (= _calculating_ probability
> distributions from given inputs, using bayes rule as the _unique_
> and _consistent_ calculus for doing this), and (ii) decision making
> (= looking at the resulting inference, and deciding what to do next).
> This decision making is _not_ unique and consistent.
> Most other uncertainty calculi mixed both problems together.
I generally agree that there's an important difference
between formal inference and modeling on a broader
scale. One can be pure while the other is quite messy.
> > In principle, the different t-(co)norms correspond to different
> > underlying mechanisms that govern how the inputs should
> > be combined. This should rightly be considered flexibility,
> > rather than merely loose definition.
> What I (and others) say is: Bayesian probability has _first principles_ to
> derive the single rule with which to process the inputs; other uncertainty
> calculi don't. So, this is not flexibility, this is putting your head in
> the ground, not willing to see which first principles you violate (or
> adhere to).
I suspect that the real problem is that the fuzzy types
have not yet given adequate interpretations for any
of these norms or co-norms. You may be right that
there simply aren't any coherent interpretations for
them to find.
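To make the arbitrariness concrete, here are three of the standard
t-norms applied to the same pair of membership grades (a toy sketch;
the numbers are made up):

# Three common t-norms for conjoining membership grades; the choice
# among them is exactly the "flexibility" (or arbitrariness) at issue.
def t_min(x, y):          # Zadeh's minimum t-norm
    return min(x, y)

def t_product(x, y):      # product t-norm
    return x * y

def t_lukasiewicz(x, y):  # Lukasiewicz t-norm
    return max(0.0, x + y - 1.0)

x, y = 0.7, 0.6
for tnorm in (t_min, t_product, t_lukasiewicz):
    print(tnorm.__name__, tnorm(x, y))
# min gives 0.6, product gives 0.42, Lukasiewicz gives 0.3:
# same inputs, three different answers, and no first principle to choose.
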
> > > Hence, fuzzy logic is indeed more general; in fact it is too general to
> > > be still called a scientific paradigm (because of the above-mentioned
> > > indefiniteness of its calculus).
> >
> > It does seem clear that a fuzzy approach is more general because
> > it is axiomatically weaker, but I don't think it's quite fair to conclude
> > it's uselessly general.
> I agree with this: fuzzy logic has been used for half a century already,
> but, as an old controls guy told me some time ago, ``we used to call this
> common sense, now they think they need university professors to explain the
> same thing'' :-)
Hmmm.
> > Perhaps a more generous interpretation is
> > that it has *wider applicability* than a strict probabilistic approach.
> > Concomitantly, we'd expect it to be a less powerful theory in
> > situations where the axioms of probability theory are satisfied.
> I agree with this completely: I have encountered enough situations in my
> own research (sensor based robotics) where it is indeed impossible to do a
> clean probabilistic analysis. (Or I should rather say: ``I am too stupid to
> do it''...) And we hired a PhD in fuzzy theory and possibility theory to
> help us out; the result being that I now have experienced that these
> theories are incredibly flexible but at the same time incredibly arbitrary.
I find the arbitrariness disturbing too. I'm actually less convinced
of the practical utility of fuzzy methods than you seem to be.
> > First, remember that it's not the *probability* of being
> > tall or small. It's not a probability at all. It's something
> > else, sometimes called "possibility", which measures the
> > degree something is true (not its frequency, or even one's
> > belief that it's true).
> This is true, but allow me to remark that I haven't seen any better
> `definition' than ``it's something else''. No axiomatic foundations, such
> that you can never be sure whether it's your calculus or your algorithm
> that leads to bad results....
Well, they do have a clear axiomatic foundation. I agree however
that the fuzzy types have not given a clear interpretation of what
possibility really *is*. What is this measure really measuring?
To be fair, in an axiomatic treatment, they're not really required to
give interpretations, but doing so would certainly help both doubters
and supporters understand what they're talking about. Actually,
I suspect they're still shopping around for some good interpretations.
Maybe we should give them a bit more time. Probability theory had
a head start of a few centuries, and the current personalist-belief
definition of probability only emerged as the dominant interpretation
in this century. Newton never had a clear definition of continuity.
Indeed, if he'd known about the possibility of Weierstrass functions,
he probably wouldn't have bothered to invent the Calculus. Maybe
something will eventually come of this fuzzy stuff, even if it's not what
the originators thought it'd be.
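Incidentally, by ``clear axiomatic foundation'' above I mean, for
instance, maxitivity: a possibility measure assigns 0 to the empty
event, 1 to the sure event, and to a union the maximum of the two
possibilities. Here is a toy sketch (the little possibility
distribution is invented):

def possibility(event, distribution):
    # Possibility measure induced by a possibility distribution:
    # Pos(E) is the supremum of the distribution over E, so unions go by max.
    return max((distribution[x] for x in event), default=0.0)

pi = {"short": 0.2, "medium": 1.0, "tall": 0.7}   # invented distribution
A, B = {"short"}, {"tall"}
assert possibility(A | B, pi) == max(possibility(A, pi), possibility(B, pi))
print(possibility({"short", "medium", "tall"}, pi))  # normalization: 1.0

The axioms are clear enough; it's the interpretation of the numbers
coming out that is still up for grabs.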
> > Second, you should be careful about characterizing the
> > sets as "mutually exclusive" which is a phrase that recalls
> > Boolean logic. In a fuzzy set theory, the set of tall people
> > and the set of small people could well be not mutually
> > exclusive. I'm tall for a jockey, but pretty small for a
> > basketball player. It makes a difference what the sets
> > were constructed to represent.
> This is one of the basic examples where the abovementioned difference
> between (i) inference and (ii) decision making shows up: in fuzzy logic one
> _starts_ by making a decision (e.g., this fuzzy set represents ``tall''
> people), and you carry this decision through all your subsequent
> operations, often resulting in strange things where the label ``tall''
> doesn't fit anymore. Probability theory, on the other hand, starts from a
> parameterized description of the same system: a PDF describing the length
> of people; only at the end one has to make a decision about whether someone
> is to be called ``tall'' or not. And, as I said before, the criteria of
> this decision making are rather flexible.
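If I restate your contrast in a toy sketch (the membership ramp for
``tall'' and the normal model of heights are both invented for
illustration), it looks something like this:

import math

def tall_membership(height_cm):
    # Fuzzy description: a membership grade for "tall" is fixed up front
    # (this ramp from 170 cm to 190 cm is invented for illustration).
    return min(1.0, max(0.0, (height_cm - 170.0) / 20.0))

def prob_taller_than(cutoff_cm, mean=175.0, sd=8.0):
    # Probabilistic description: a parameterized PDF over heights (here an
    # assumed normal); "tall" appears only at the end, as a decision cutoff.
    z = (cutoff_cm - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

print(tall_membership(182.0))    # the degree to which 182 cm "is" tall
print(prob_taller_than(182.0))   # the chance a random person exceeds 182 cm

The first number is a property of an individual height; the second is
a property of a population plus a cutoff. They answer different
questions.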
Maybe the division is between the pure calculus by which we propagate
uncertainty and make formal inferences, and the messy process of
modeling which includes selecting the calculus, marshalling the relevant
data, positing assumptions and drawing conclusions. If the modeling
process is so strained that the meaning of "tall" by the end "doesn't fit
anymore", well, then that's just a bad application. Maybe we shouldn't
blame that on fuzzy set theory, any more than we blame probability
theory for its misapplications. I don't find convincing your argument
that the "first principles" of probability theory will prevent nonsense in
its applications. It's really a question of where and how you introduce
assumptions. We all know that you can make a good model or a bad
one, no matter how good the mathematical underpinnings. Probability
theory can't help you pick your assumptions.
> [...]
> > Vagueness is just one of several domains where probability
> > has limited usefulness.
> I don't agree. ``Vagueness'' is nothing more than incomplete information.
> And this is handled nicely in probability theory (The Jaynes-Cox-Polya-
> Amari variant that is, not Fisher's).
> Maybe not everybody was as lucky as myself and discovered ``Probability
> Theory As Extended Logic'' on <http://bayes.wustl.edu/> :-) A must read
> page!
So you think vagueness is "nothing more than incomplete
information"? It's easy to show that it has nothing to do with
incomplete information. I could have all the heights of every
single individual in the population down to the nanometer,
yet still not be sure whether someone deserves the appellation
of "tall". There are still borderline cases. Or did you mean to
say it is nothing more than incomplete *specification*? That's
the more common argument.
I'm familiar with the web page you cite, and with Jaynes' book.
Although the book seems like a thorough treatment, I found it
ultimately unconvincing from both mathematical and practical
perspectives. I know that must sound like heresy to you, but
I think that the Jaynesian answer to the interval question is patently
absurd. I'd written to Jaynes about the problem roughly a year before
his death, but he never responded. I imagine that his health
may have been failing by that time.
By the way, are you familiar with any published discussion of
the axioms of maximum entropy? I was looking for them in
Jaynes' book, but couldn't find them there. Only incomplete
(yet insistent) recollections by Russian emigre mathematicians
made me even aware that they existed. I've never seen them
discussed anywhere. Has anyone?
Scott Ferson <scott@ramas.com>
Applied Biomathematics
631-751-4350, fax -3435
P.S.
Email directed to Bruyninckx' personal address bounces back.
Perhaps my ".com" domain makes me look like a spammer.
Or maybe it's the *content* of the message?
----- Transcript of session follows -----
.. while talking to krimson.cc.kuleuven.ac.be.:
>>> MAIL From:<scott@ramas.com>
<<< 550 Access denied (see http://www.kuleuven.ac.be/ludit/nospam.htm)
554 <Herman.Bruyninckx@mech.kuleuven.ac.be>... Service unavailable
------- End of Forwarded Message