Re: Bayesian priors representing ignorance

Jonathan Weiss (jjweiss@ix.netcom.com)
Thu, 10 Jun 1999 02:41:52 -0400

At 6/9/99 11:14 PM, Kevin S. Van Horn wrote:
>In the context of assigning Bayesian priors that represent complete
>ignorance, Jonathan Weiss asks:
>
>  1) Someone presents you with a huge deck of cards (not standard playing
>     cards -- each card has a spot of a given color on it).  Before even
>     one card is seen, what is the probability that the first card dealt
>     is red?
>
>The problem as stated is ill-posed until we know what set of alternatives we
>are considering. 

If that is the case, then every real-world problem is ill-posed.

>Suppose that the only alternatives I know of are red and
>not-red, and I am otherwise completely ignorant -- in particular, I don't
>even know that red is a color or I haven't the vaguest idea what colors are.
>Call this state of information X0.  Then, by the permutation invariance
>argument given below, this state of ignorance must be represented by P(red |
>X0) = 1/2.

Only if you first accept the possibility of being completely ignorant. My
point throughout is that complete ignorance is itself ill-defined. Besides,
the only ignorance we should be addressing is ignorance about the colors of
the particular cards in this example, not ignorance about what a color is or
what the names of colors are.

Anyhow, if someone came up to you, presented a deck of cards, and asked you
to predict "Red" or "Not-Red" with a prize of $5000 if you call it correctly
(and no penalty for being wrong), would you really be indifferent between
calling "Red" and "Not-Red"? Or might you reason like this: "If all the cards
in this deck are one color, there is no evidence that that one color is red;
if the deck contains exactly two colors, even if one of them is red my
'ignorant' prior would be 50-50; and if there are three or more colors in the
deck, even if one of them is red my 'ignorant' prior for Red would be less
than 50-50. So no matter how many colors of cards are in the deck, I will be
no worse off and may be much better off to call 'Not-Red'." In your "informed
ignorance" you actually have a great deal of information -- about decks of
cards in general, about the possible motivation of people who offer you crazy
propositions like this, about the semantics of color naming (does crimson
count as red?), etc.
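
The bet above can be sketched in a few lines of Python. This is only an
illustration under an assumption that is not stated in the thread: red is one
of k >= 2 colors in the deck and the "ignorant" prior is uniform over those k
colors. The function names are made up for the example.

```python
from fractions import Fraction


def uniform_red_prior(num_colors: int) -> Fraction:
    """'Ignorant' prior for Red when red is one of num_colors colors."""
    if num_colors < 2:
        raise ValueError("this sketch assumes at least two colors")
    return Fraction(1, num_colors)


def best_call(num_colors: int) -> str:
    """Which call has the higher chance of winning the prize."""
    p_red = uniform_red_prior(num_colors)
    p_not_red = 1 - p_red
    if p_red == p_not_red:
        return "either"
    return "Red" if p_red > p_not_red else "Not-Red"


if __name__ == "__main__":
    for k in range(2, 6):
        print(k, uniform_red_prior(k), best_call(k))
```

Under that assumption "Not-Red" is never the worse call: it ties at two
colors and wins for three or more.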

>
>  2) Assuming you assigned some finite probability P(red), now for the same
>     card that you still haven't seen, what is the probability that it is
>     blue?
>
>Let's continue to assume that I know nothing beyond what is stated in the
>problem, and am still ignorant of the concept of color.  Assuming that I
>can't rule out the possibility that the card is neither blue nor red, I
>am now aware of three possibilities: red, blue, and not-red-not-blue.  Call
>this state of knowledge X1.  Then P(red | X1) = 1/3.
>
>This might seem to contradict my previous assessment of P(red | X0) = 1/2.
>But X0 and X1 are not identical states of information.  We are talking about
>two qualitatively different conditional probabilities, one conditioned on
>X0, the other conditioned on X1.  It should surprise nobody that my
>assignment of probabilities changes when I have access to more information.

What is different about your state of information? All I did was ask another
question. I didn't tell you anything more about the world. The colors were
there whether you were aware of them or not. If you admit to changing
probabilities just based on the fact that someone has asked you a question, I
have some betting propositions for you... ;-)
>
>  3) Now, what is the probability that it is yellow?  Black?  Purple?
>     Orange? White?  Fuchsia?  etc.?  Has your P(red) assessment changed?
>     How many colors can you name?  Are you willing to assign them equal
>     probabilities just based on ignorance?
>
>Yes: again assuming a complete ignorance of the concept of color, P(red | X)
>changes as X -- my set of mutually exclusive and exhaustive possibilities
>(sample space) -- changes.  And yes, if I am truly ignorant, and cannot
>attach any semantic content to these labels, then the only sensible thing I
>can do is assign equal probabilities to the possibilities.

Again, my point is that this puts you in an untenable position: your
probabilities keep changing even though you have received no additional
substantive information about what color is on the card or what the
alternative possibilities are. (Nobody ever said initially that yellow,
fuchsia, etc. were not possible choices, so why assume they have
probability = 0?)
>
>  4) Now, suppose you are told reliably that every card in the deck is either
>     red, blue, or green.  Now what is your P(red)?
>
>Call this state of information X2.  Then P(red | X2) = 1/3.
>
>Here's the permutation-invariance argument.  Suppose I relabel the colors,
>for example, I relabel red as "blue", blue as "green", and green as "red".
>Call this state of information X2'.  If I am truly ignorant, I can't
>distinguish between this problem and the original, so the probability
>distributions conditional on X2 and X2' should be the same.  This holds for
>any permutation of the labels.  The only distribution that remains invariant
>under any permutation of the labels is the uniform distribution, that is,
>P(c | X2) = 1/3 for each label c.
>
>  5) One more bit of information now:  among the blue cards, there are light
>     blue and dark blue.  Does this change P(red)?
>
>The important phrase here is "one more bit of information": our
>probabilities are conditioned on different information than we had in
>problem (4).  So, of course, P(red | X3) != P(red | X2), where X3 is the
>state of information described in (5).  And, as a truly ignorant person who
>doesn't know what "light blue" and "dark blue" mean, this is no different
>from breaking up not-red into blue and not-red-not-blue, as in (2).

Ah, but again stipulating that ignorance applies only to what color is on the
card, you might want to keep p(red) and p(green) at 1/3 and divide the 1/3 for
p(blue) into 1/6 light blue and 1/6 dark blue. Since my question stipulated
that this new information affected only the sub-structure of the blue cards,
it shouldn't change p(red) or p(green).
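
To make the two answers to question (5) concrete, here is a small Python
sketch. It assumes only the numbers already given in this thread (a uniform
1/3 prior over red, blue, green); the helper names `uniform` and `refine` are
invented for the illustration.

```python
from fractions import Fraction


def uniform(labels):
    """Uniform prior over a list of labels."""
    return {label: Fraction(1, len(labels)) for label in labels}


def refine(prior, label, sublabels):
    """Split prior[label] equally among sublabels; other labels untouched."""
    share = prior[label] / len(sublabels)
    refined = {k: v for k, v in prior.items() if k != label}
    for sub in sublabels:
        refined[sub] = share
    return refined


coarse = uniform(["red", "blue", "green"])

# Treat the new information as sub-structure of blue only.
refined = refine(coarse, "blue", ["light blue", "dark blue"])

# Treat the four labels as a fresh set of alternatives instead.
relabeled = uniform(["red", "light blue", "dark blue", "green"])

if __name__ == "__main__":
    print("refined:  ", refined)
    print("relabeled:", relabeled)
```

Refining keeps p(red) at 1/3 and gives 1/6 to each shade of blue; starting
over with four labels moves p(red) to 1/4.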
>
>What's really going on here is that Weiss is playing bait and switch: he
>asks us to assign probabilities based on an assumption of total ignorance,
>then criticizes those assignments based on *additional* information that a
>person totally ignorant of the semantic content of the labels "red,"
>"green," et cetera would not have.  The fact that these labels are colors
>immediately makes relevant a great body of information we all have about
>colors.  We are not, in fact, in a state of complete ignorance.

Right!! And in the real world, we never are, about anything. Our knowledge
may be very incomplete and wrong and all kinds of things, but when push comes
to shove we always have enough information to form a consistent prior based on
that knowledge. It may be wrong, but it is at least consistent. There is no
total ignorance.

But in this case, I don't presuppose any knowledge of colors and their
properties. All the information I have presented, and all the questions
asked, have used nothing but set theoretic (or logical) constructs, such as
the partitioning of Blue into Light-Blue and Dark-Blue. It wouldn't make any
difference to any of my arguments if we were to replace the colors on the
cards by integers or animal pictures.
>
>However, there is a form of ignorance that is worth examining here.  Human
>color perception is such that three coordinates -- for example, hue, chroma,
>and lightness -- suffice to specify all perceivable colors.  The set of all
>colors then occupies a compact three-dimensional volume.  I haven't examined
>the problem (nor studied color theory) in sufficient detail to give a
>compelling argument that one particular prior over this volume represents a
>state of complete ignorance, but my intuition suggests that a uniform
>prior over, say, the color space of the Munsell Color System, should do the
>job.  (My reasoning is that equal volumes in this color space apparently
>represent equal volumes in human perceptual space.)

There are lots of ways to organize color space, and Munsell is only one. How
do you pick what space to assign a uniform density over? Anyway, linguistic
labels like "red" don't correspond to well-defined subsets of this space.
There's still a lot of information that must be filled in (subjectively) by
the person who is answering the questions.
>
>Weiss continues:
>
>  [...] what would be an uninformed prior over the set of real numbers?
>
>It depends on what kind of parameter you are talking about.  If you have a
>location parameter, translation invariance arguments give an (improper)
>uniform prior over the entire real line.  If you have a scale parameter,
>scale invariance arguments give an (improper) prior proportional to 1/x
>(uniform over log x).

Now who's assuming a lot of information? This subject doesn't know his colors
yet, but he can tell a location parameter from a scale parameter and apply
invariance arguments!?! :-)

Of what value is an improper uniform prior when you have to make a bet? The
probability of any bounded interval would be zero. There is always some
information, and there should always be a proper density that integrates to
1.0. (No, it doesn't have to be a Gaussian.)
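
One way to see the problem with the bounded interval is to treat the improper
uniform prior as the limit of proper uniform priors on [-L, L]. That limiting
construction is an assumption made for this sketch, not something either of
us spelled out, and the function name is made up.

```python
from fractions import Fraction


def interval_probability(a: int, b: int, half_width: int) -> Fraction:
    """P(a <= x <= b) under a proper uniform prior on [-half_width, half_width]."""
    lo = max(a, -half_width)
    hi = min(b, half_width)
    if hi <= lo:
        return Fraction(0)
    return Fraction(hi - lo, 2 * half_width)


if __name__ == "__main__":
    # The same bounded interval [0, 1] under ever wider priors.
    for half_width in (10, 1000, 100000):
        print(half_width, interval_probability(0, 1, half_width))
```

As L grows, the probability assigned to any fixed bounded interval shrinks
toward zero, which leaves nothing to bet with.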

=====

I would very much like to see a concrete example of a situation where an
"uninformed" prior is superior in any decision-relevant way to a possibly
arbitrary or erroneous, but at least consistent, "real" prior.

Jonathan Weiss