First of all, I would like to commend both Lotfi and Kathy for the
reasoned, respectful, and open-minded dialog, a nice contrast to the
all-too-many dogmatic and antagonistic "dialogs" I have seen in the past
within the
fuzzy/Bayesian/AI community.
Now, a few quick observations:
1) At least as a starting point, I have always found it most useful to
view "fuzzy sets" as a shorthand for "membership functions defined over
[ordinary or "crisp"] sets". In other words, rather than treating the
class of fuzzy sets as a super-class of the class of crisp sets, I view it
as a class of constructs that are tied to crisp sets. I believe this
distinction may clarify some of the discussion, and remove some of the
unnecessarily competitive treatment the "crisp vs. fuzzy" issue has
received. The issue then becomes whether someone can construct a class of
membership functions that has some value (conceptual, computational,
pedagogical, whatever) to its users.
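To make this reading concrete, here is a minimal sketch (in Python, with
illustrative names of my own choosing, not anyone's standard library): a
"fuzzy set" is nothing more than an ordinary crisp universe paired with a
membership function on it, and a crisp set is just the special case whose
membership function is {0, 1}-valued.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass(frozen=True)
    class FuzzySet:
        """A membership function tied to an ordinary (crisp) universe."""
        universe: frozenset
        membership: Callable[[object], float]  # grades in the interval [0, 1]

        def grade(self, x) -> float:
            if x not in self.universe:
                raise ValueError("element outside the underlying crisp set")
            return self.membership(x)

    # A crisp set is the special case with a {0, 1}-valued membership function.
    hours = frozenset(range(24))
    evening = FuzzySet(hours, lambda h: 1.0 if 18 <= h <= 22 else 0.0)
    print(evening.grade(20), evening.grade(3))   # 1.0 0.0

Nothing here generalizes the notion of a set itself; the construct simply
hangs extra structure on a crisp set, which is exactly the point.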
2) Typically, the membership functions are restricted to the ordered
interval [0,1] with specific properties to define the 0 and 1 endpoints and
at least an ordinal scale. It is conceivable to permit a more general
class of functions onto a partially-ordered set (e.g. intervals), but I
have doubts about the practical value of such a weak construct. On the
other hand, attempts to interpret the function's range as cardinal run the
risk of reducibility to second-order probabilities.
3) There may be some confusion between "imprecise probabilities" and
"probabilities of imprecisely-defined events".
a) In the former, your space of events can pass a rigorous binary clarity
test, but you admit imperfections in the process of assessing
probabilities. ("I can't say for sure what the probability is that Jane
will arrive home before 6:30, but I'd say it's about 75%.") This may be a
way of capturing something about the myriad possible influencing factors
that are not modeled explicitly (e.g. all the possible events that might
have delayed Jane's train), or about the method of generating the numbers
(show a subject an exact probability distribution for a long time, then
hide it and ask the subject the probabilities of various intervals, and
you'll get variability). The value of admitting such variability is that
for decision purposes (e.g. when should I start cooking dinner) it may be
unnecessary to determine the precise probability.
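To illustrate that last point with a toy sketch (Python; the 0.6 threshold
and the function name are illustrative choices of mine, not anyone's
proposal): if every probability in the admitted interval recommends the
same action, the imprecision costs the decision maker nothing.

    def decide_dinner(p_low: float, p_high: float, threshold: float = 0.6) -> str:
        # Decide whether to start cooking, given only interval bounds on
        # P(Jane arrives home before 6:30). The threshold is illustrative.
        assert 0.0 <= p_low <= p_high <= 1.0
        if p_low >= threshold:
            return "start cooking"       # every admissible probability says cook
        if p_high < threshold:
            return "wait"                # every admissible probability says wait
        return "refine the estimate"     # interval straddles the decision boundary

    print(decide_dinner(0.70, 0.80))     # "about 75%" is precise enough here

Only when the interval straddles the decision boundary does it pay to
invest in a sharper probability assessment.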
b) In the latter case, you may have a perfectly well-defined crisp
probability distribution over the (crisp) event domain, but still wish to
define fuzzy events over that crisp domain and derive or reason about their
probabilities. ("Here is my exact probability distribution over the time
Jane arrives home. Here is my fuzzy membership function for 'early'. Now,
what's the probability that she arrives home early?"). Here, using the
"level sets" interpretation of the membership functions, we can use a fuzzy
description as a compact (at least approximate) representation of a nested
set of interval-valued problems, indexed by the membership value. (E.g.
use the alpha-level set on all your membership functions to compute the
interval values on probabilities, utilities, etc., and derive the interval
for your answer, then "stack" these alpha-level sets to assemble the fuzzy set
for the answer. Then, if desired, apply a set of predetermined linguistic
filters to translate the answer into the appropriate language.) In
decision situations, the value is that you obtain an answer with a
degree-of-confidence measure attached.
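Here is a minimal numeric sketch of that stacking procedure (Python; the
discretized arrival-time distribution and the membership function for
"early" are invented for illustration). Because the distribution here is
precise, each alpha-level slice yields a point probability rather than an
interval; with imprecise inputs the same loop would produce interval
bounds at each level.

    import numpy as np

    # Illustrative crisp distribution over arrival time (minutes after 5:00 PM).
    times = np.arange(0, 121, 5)
    probs = np.exp(-0.5 * ((times - 60.0) / 20.0) ** 2)
    probs /= probs.sum()                 # exact (crisp) probability distribution

    # Illustrative membership function for "early": 1 up to 5:30,
    # 0 after 6:00, linear in between.
    mu = np.clip((60.0 - times) / 30.0, 0.0, 1.0)

    # One alpha-level slice: the crisp event {t : mu(t) >= alpha}.
    def level_probability(alpha: float) -> float:
        return float(probs[mu >= alpha].sum())

    # "Stack" the slices: the cuts are nested, so the profile is
    # nonincreasing in alpha.
    stacked = [(a, level_probability(a)) for a in np.linspace(0.05, 1.0, 20)]
    for a, p in stacked[::6]:
        print(f"alpha={a:.2f}  P(alpha-cut of 'early') = {p:.3f}")

    # Zadeh's scalar probability of a fuzzy event is the expected membership,
    # which equals the average of the level probabilities over alpha in [0, 1].
    p_early = float((mu * probs).sum())
    print(f"P(early) = {p_early:.3f}")

A predetermined linguistic filter could then translate the final number
into a verbal descriptor; that step is omitted here.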
Sure, it's possible to include both aspects of fuzziness in the same
problem, but I'd prefer to start with the simpler cases where there is more
to agree on, then add complexity. And if you want to go even further (what
about fuzzy-valued membership functions?), don't go there, yet. :-)
4) Natural language, even when restricted to static aspects of word
meaning, is a separate issue from fuzziness. You could address the
problems in 3a and 3b above using explicit membership functions rather than
words. Words, as input data or output descriptors, simply add another
layer of complexity with possibly different (meta-)semantics. The
fundamental assumption for linguistic descriptors, in my humble opinion, is
that the community of speakers and listeners share some common agreement on
approximately the same interpretations for verbal descriptors, including
underlying assumptions about context. So in the context of
coming-home-from-work, people might share the view that arrival time has an
inherent noise level on the order of about 30 minutes (again, in a
particular context), which is essential to interpreting the word
"early". It is a separate and highly worthwhile challenge to model this
social-convergence phenomenon, which may require the combined arsenals of
fuzzy logic and probability theory. But I'm afraid that its complexity can
only obscure the more fundamental non-linguistic issues involved in
"imprecise probability" theory.
5) With respect to the use of biological or artificial neural nets as
models for how we treat probability, some caution is in order. As the
"normative systems" school has been preaching for decades now, the behavior
of humans or animals should be the minimal criterion for success, not the
optimal goal to be emulated. Certainly organisms in their natural
environments can deal effectively with chance and fuzziness within a
comfortable range, and our engineered systems ought to do at least as well
there, but those very strengths may turn into weaknesses when extended to
extreme or anomalous situations. Psychologists have documented many
instances of suboptimal and even inconsistent behavior in probability
judgment and decision making. Hybrid or dual-mode approaches may be the
best practical way to capture benefits of both the perceptual and the
analytic approaches, but that still leaves the analytic approach modelers
with the same fundamental issues to address (plus the problem of how to
meld the components into a single system).
6) One final thing -- let's not over-invoke Occam's razor here. Logically
equivalent concepts may have very different semantics that lead to a
valuable diversity in conceptual approaches. Diversity of viewpoint and
even insularity have their benefits, as long as there is enough
understanding and communication to transfer lessons learned and to ensure
consistency. Human society has benefited from the diversity of natural
languages, and although some might argue, so has computer science. So
those who want to study t-norms, please do so, and those who like
second-order probabilities, please do so too, and likewise those who want
to study computational semantics. I hope we can share and benefit from one
another's results.
Jonathan Weiss
At 12:13 PM 2001-09-30, zadeh wrote:
>Dear Kathy:
>
> Thanks for the insightful comments. Here is what I have to say.
>
> (a) Please note that my comment regarding imprecise
>probabilities relates to standard axiomatics of standard probability
>theory, PT, and not to what may be found in research monographs.
>However, construction of an axiomatic system for probability theory with
>imprecise probabilities is complicated by the fact that there are many
>different ways in which probabilities may be imprecise. Can you point me
>to a comprehensive theory which goes beyond what may be found in
>Walley's treatise on imprecise probabilities? Is there a general
>definition of conditional probability when the underlying probabilities
>are imprecise?
>
> (b) When we describe an imprecise probability by a second-order
>probability distribution, we assume that the latter is known precisely.
>Is this realistic? Furthermore, if at the end of analysis we compute
>expectations, as we usually do, then the use of second-order
>probabilities is equivalent to equating the imprecise probability to the
>expected value of the second-order probability. For these and other
>reasons, second-order probabilities are not in favor within the
>probability community.
>
> (c) When an imprecise probability is assumed to be
>interval-valued, what is likely to happen is that after a few stages of
>computation the bounding interval will be close to [0,1].
>
> (d) With regard to your comment on perceptions, see my paper,"
>A New Direction in AI--Toward a Computational Theory of Perceptions," in
>the Spring issue of the AI Magazine. In my approach, the point of
>departure is not a collection of raw perceptions, but their description
>in a natural language, e.g., "it is very unlikely that Jane is very
>rich." Standard probability theory cannot deal with perception-based
>information because there is no mechanism in the theory for
>understanding natural language.
>
> (e) Your points regarding novel modes of computation are well
>taken. No disagreement.
>
> With my warm regards.
>
>
>Lotfi
>
>
>--
>Professor in the Graduate School, Computer Science Division
>Department of Electrical Engineering and Computer Sciences
>University of California
>Berkeley, CA 94720-1776
>Director, Berkeley Initiative in Soft Computing (BISC)
>
>Address:
>Computer Science Division
>University of California
>Berkeley, CA 94720-1776
>Tel(office): (510) 642-4959 Fax(office): (510) 642-1712
>Tel(home): (510) 526-2569
>Fax(home): (510) 526-2433, (510) 526-5181
>zadeh@cs.berkeley.edu
>http://www.cs.berkeley.edu/People/Faculty/Homepages/zadeh.html