Re: Total Ignorance

Jonathan Weiss (jjweiss@ix.netcom.com)
Thu, 10 Jun 1999 00:35:43 -0400

OK, let's try something a little more concrete that doesn't require Bayesian
priors:

At a dance, there was a special contest with many prizes awarded. The rules
were quite simple:

All men who wear a hat get a prize.
All men who don't wear a hat don't get a prize.
All women who wear glasses get a prize.
All women who don't wear glasses don't get a prize.

Exactly 90 percent of the men wore hats (and therefore got prizes).
Exactly 80 percent of the women wore glasses (and therefore got prizes).

(Just in case there are any lawyers out there, everyone at the dance was either
a man or a woman, and nobody was both a man and a woman. Sheesh!)

With no further information, what can you say about the overall percentage of
people at the dance who got prizes?

The answer isn't completely determined because it depends on the ratio of men
to women at the dance. However, if p is the fraction of attendees that were
men (0 <= p <= 1), the answer is (.9p + .8(1-p)), which takes on its minimum
of 80% when the dance has all women and its maximum of 90% when the dance has
all men. If you can show me how as few as 72% or as many as 98% of the
attendees received prizes, I'll take off my hat (and glasses) to you.
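
For anyone who wants to check the arithmetic, here is a minimal Python sketch
of that claim (the function name and the sweep over p are my own illustration,
not part of the puzzle):

  # Fraction of attendees who got prizes, given the (unknown) fraction p of men.
  def prize_fraction(p):
      return 0.9 * p + 0.8 * (1 - p)

  # Sweeping p over [0, 1] shows the fraction never leaves [0.8, 0.9].
  values = [prize_fraction(k / 100.0) for k in range(101)]
  print(min(values), max(values))    # -> 0.8 0.9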

What changes when we replace "Man" by Q, "Woman" by ~Q, "Hat" by A1, "Glasses"
by A2, and "Prize" by R?

To continue the parallel with Rolf's example (quoted below), note that in the
dance example we don't know anything about Q: not its "prior", not the
proportion of hat-wearers who are men, not the proportion of
hat-and-not-glasses-wearers who are men, etc.

>For example, if Q depends on A1 and A2 in the following way:
>
>  1) (A1 and ¬A2) --> ¬Q,
>  2) (¬A1 and A2) --> Q,
>
>then we get Bel(R)=Pl(R)=P(R)=0.72 and Bel(¬R)=Pl(¬R)=P(¬R)=0.28

(1) corresponds to "No man wears a hat but no glasses", which is the same
thing as "All of the hat-wearing men also wear glasses". (2) corresponds to
"All of the glasses-wearing women also wear hats". If this is the case, then
A1 and A2 are definitely not independent, so there is no way Pr(A1,A2) can be
.72. In fact, under these constraints Pr(A1,A2) works out to (.9p + .8(1-p)),
the prize fraction itself, so its values are restricted to [.8,.9].
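
A quick simulation bears this out (a sketch only: the fraction p of men and
the coin flips for the attributes left unconstrained are my own assumptions,
and the latter turn out not to matter):

  import random

  def pr_hat_and_glasses(p, n=200_000):
      """Estimate Pr(A1 and A2) under constraints (1) and (2) above."""
      hits = 0
      for _ in range(n):
          man = random.random() < p
          if man:
              hat = random.random() < 0.9
              # Constraint (1): a hat-wearing man must also wear glasses.
              glasses = True if hat else (random.random() < 0.5)   # 0.5 is arbitrary
          else:
              glasses = random.random() < 0.8
              # Constraint (2): a glasses-wearing woman must also wear a hat.
              hat = True if glasses else (random.random() < 0.5)   # 0.5 is arbitrary
          hits += hat and glasses
      return hits / n

  for p in (0.0, 0.3, 0.7, 1.0):
      print(p, round(pr_hat_and_glasses(p), 2))   # tracks .9p + .8(1-p), never .72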

Once again, the problem is not the use of belief functions, but the failure to
represent constraints adequately. Another simple example: There are three
numbers A, B, and C that sum to 10. A is in the interval [2,5], B is in the
interval [3,6], and C is in the interval [1,4]. Now define D = A+B+C. With no
further information, what can we say about the possible values of D? If your
answer is that it can range over the interval [6,15], you ignored the "sum to
10" constraint.

Belief functions are fine (for those who like them), but they can't ignore
constraints such as nonnegativity or summing-to-1 without logical problems
down the road. Yes, this takes away some of the simplicity of representation
and some of the freedom that make belief functions attractive, but the
alternative is inconsistency and/or incoherency. (Note: I remember fighting
the same battle, with only limited success, in the fuzzy logic community in
the late 1970s.)

Jonathan Weiss

At 6/9/99 08:35 AM, Rolf Haenni wrote:
>Hi all,
>
>thanks to Judea for his help in clarifying the situation by identifying the notion
>of BELIEF as the "PROBABILITY OF NECESSITY" (or probability of
>provability). I agree with this point of view. In fact, in probabilistic
>argumentation systems, instead of BELIEF we prefer to say DEGREE OF
>SUPPORT, which is defined as the probability of the supporting arguments (=
>possible proofs). More precisely, it's a conditional probability given no
>contradiction.
>
>Unfortunately, I don't have the time to reply to every individual point
>discussed in the emails I received today. However, I see that proponents of
>the Bayesian approach have great difficulty accepting a value Bel(R)=0.72
>lower than 0.8 and a value Pl(R)=0.98 higher than 0.9, as Kathryn B. Laskey
>said:
>
>>One can "explain" the phenomenon by saying that there is only a 0.72 chance
>>that "the evidence would prove R," but I was never able to come up with a
>>way to argue this convincingly to a subject matter expert. I guess that's
>>because I can't argue it convincingly to myself.  I can follow the
>>mathematics, but I don't have a handle on what it means.
>
>Let me try to clarify this further. As I already said, the crucial point is
>the total ignorance about Q. Total ignorance means YOU DON'T KNOW ANYTHING
>ABOUT Q, i.e. first of all, you don't know a prior probability (see my
>example about the existence of god), but secondly, it also means that you
>don't even know whether such an (independent) prior probability exists.
>Note that the truth of Q could possibly depend on R, or on A1 or on A2. YOU
>DON'T KNOW IT.
>
>For example, if Q depends on A1 and A2 in the following way:
>
>  1) (A1 and ¬A2) --> ¬Q,
>  2) (¬A1 and A2) --> Q,
>
>then we get Bel(R)=Pl(R)=P(R)=0.72 and Bel(¬R)=Pl(¬R)=P(¬R)=0.28. In
>contrast, if Q depends on A1 and A2 by
>
>  1) (A1 and ¬A2) --> Q,
>  2) (¬A1 and A2) --> ¬Q,
>
>then Bel(R)=Pl(R)=P(R)=0.98 and Bel(¬R)=Pl(¬R)=P(¬R)=0.02. This shows how
>the probabilities for the cases 2) and 3) can simultaneously jump either to
>R or to ¬R (see my last email):
>
>  1) A1 and A2     --> R is automatically true       (0.72)
>  2) A1 and ¬A2    --> nothing can be said about R   (0.18)
>  3) ¬A1 and A2    --> nothing can be said about R   (0.08)
>  4) ¬A1 and ¬A2   --> ¬R is automatically true      (0.02)
>
>To summarize, if nothing is known about Q (not even whether an independent
>prior probability exists), then it makes perfect sense to say that the
>Belief (or the probability of the provability) is 0.72. The intuition that
>the value must be between 0.8 and 0.9 comes from the assumption that a
>prior probability exists.
>This is finally the main point producing all the confusion. It's clear that
>it is perhaps difficult for a proponent of the Bayesian approach to give
>up the assumption that prior probabilities exist. However, I think it's
>necessary in order to capture the nature of total ignorance properly.
>
>The message of K. S. Van Horn underlines all this:
>
>KEVIN S. VAN HORN wrote:
>>...regardless of the value of P(Q), we know from 0 <= P(Q) <= 1 that 0.8 <=
>>P(R) <= 0.9.
>
>==> as I said, it may be difficult to give it up!!! :-)
>
>KEVIN S. VAN HORN wrote:
>>Again, Haenni's theory is losing information by giving unnecessarily
>>loose bounds.
>
>==> or should we say, YOU are ADDING information??? :-)
>
>To conclude, I think it should be clear now that the main difference
>between the Bayesian and the Belief Function approach is just given by the
>way in which total ignorance is handled. For me, the "existence of
>God"-example is a strong indication that total ignorance is handled more
>accurately by belief functions (and also by probabilistic argumentation
>systems), that's all.
>
>Enjoy your day,
>
>Rolf Haenni
>
>************************************************************************
>*                                                                      *
>*  Dr. Rolf Haenni                        __/  __/  __/ __/  _______/  *
>*  Institute of Informatics (IIUF)       __/  __/  __/ __/  __/        *
>*  University of Fribourg, Switzerland  __/  __/  __/ __/  _____/      *
>*  Phone: ++41 26 300 83 31            __/  __/  __/ __/  __/          *
>*  Email: rolf.haenni@unifr.ch        __/  __/  ______/  __/           *
>*                                                                      *
>************************************************************************
>*  World Wide Web: http://www2-iiuf.unifr.ch/tcs/rolf.haenni           *
>************************************************************************
>