probability definitions

Henry Kyburg (kyburg@cs.rochester.edu)
Wed, 8 Jul 1998 14:09:20 -0400

Having spent 35 years largely focussed on the "interpretation" of
"probability", and having written a bunch of books on probability, ranging from
straight probability theory to pretty straight philosophy, I have nevertheless
found the recent discussion interesting and enlightening. This will be a bit
long-winded, but it may help us to avoid some of the misunderstandings that
have enlivened the interesting discussion of probability and random variables.
There is plenty to be lively about without depending for liveliness on
misunderstanding!

As a number of people have noted, the English term "probability" is ambiguous:
it has several meanings. It is also vague, in the sense that a number of these
meanings are not crisp. Nevertheless, we can say several things.

Like many words ("distance", for example) that are used in multiple and
ambiguous ways, "probability" has several quite precise meanings. First, a
probability (as any mathematician will tell you!) is a non-negative, additive
function, normalized to 1, defined on a field of sets. There's an end of the
matter, right?

Not quite. We can accept this, and start to worry about what the sets are,
about how probability measures get assigned to them, and so on. Or we can
accept simple answers: Any sets you choose, so long as they form a field;
assign measures to them by "convention" or "assumption".

Ah, what a mathematical trick! But a useful one, an invaluable one if we want
to get on with the business of developing probability theory. And while all
the fields we have ever encountered have been finite, we can avoid all
questions of size by taking probability to be sigma-additive.
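
A toy instance of that definition, for concreteness (the three-element set and
its weights below are my own invented example, not anything from the
discussion):

    from fractions import Fraction
    from itertools import chain, combinations

    S = frozenset(["a", "b", "c"])
    weight = {"a": Fraction(1, 2), "b": Fraction(3, 10), "c": Fraction(1, 5)}

    def field(s):
        # The power set of a finite set: the simplest field of sets.
        elems = list(s)
        return [frozenset(c) for c in chain.from_iterable(
            combinations(elems, r) for r in range(len(elems) + 1))]

    def P(A):
        return sum(weight[x] for x in A)

    # Normalized to 1 and additive over disjoint sets; on a finite field,
    # sigma-additivity demands nothing further.
    assert P(S) == 1
    for A in field(S):
        for B in field(S):
            if not (A & B):
                assert P(A | B) == P(A) + P(B)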

NOTE that this is not a matter of providing a "real definition" of the
"concept" of probability. It is providing a _nominal_ definition of a certain
class of mathematical functions. Of course we expect there to be some useful
association between these functions and the English term "probability" -- but
what that is remains to be seen.

We can still take seriously the question of "interpreting" probability. One
can take the field of sets to be those comprised by kollektivs: infinite
sequences, characterized by some sort of "randomness". Or to be the set of
"measurable" subsets of a given set. Or the set of all subsets of a given
finite set. And the sets themselves can be sets of objects (balls, patients,
insured people), events (rolls of a die, draws of a ball), or worlds (so that,
since propositions can be identified with sets of worlds, probabilities can be
assigned to propositions); and by a slight extension, we can take a function
whose domain is a field of statements also to be a probability function. Even
in ordinary language, then, one meaning of "probability" is quite precise.
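
A tiny worked example of that last extension (the two atomic sentences and the
uniform weights are invented for illustration): identify each proposition with
the set of worlds at which it holds, and let the measure on sets do the rest:

    from fractions import Fraction
    from itertools import product

    # Worlds: truth assignments to two atomic sentences.
    worlds = [dict(zip(("rain", "cold"), vals))
              for vals in product([True, False], repeat=2)]
    weight = {i: Fraction(1, 4) for i in range(len(worlds))}  # by "convention"

    def P(proposition):
        # The probability of a statement is the measure of the set of
        # worlds at which it holds.
        return sum(weight[i] for i, w in enumerate(worlds) if proposition(w))

    print(P(lambda w: w["rain"]))               # 1/2
    print(P(lambda w: w["rain"] or w["cold"]))  # 3/4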

Such reflections led Carnap to the view that there were two important
explicata for "probability," where by an explicatum for a term of ordinary
language he meant a technical term that shared much of the meaning of the
ordinary-language term, but which was precisely defined in a formal system,
and therefore avoided some of the vagueness and all of the ambiguity of the
term of ordinary language. In the case of "probability" one explicatum was an
empirical notion corresponding to the statistical use of the term: a
probability claim was to be construed as a claim about the world, and required
evidential support, like any other such claim. Candidates for explicata of
this notion are relative frequencies, limiting frequencies, etc. The other
explicatum, in which Carnap was mainly interested, was a logical notion, and
was to be given by a logical measure on the sentences of a formal language: a
probability in this sense is a matter of logic. Of course what interests us
are the conditional probabilities P(H|E), where H is a hypothesis we are
interested in and E represents our total evidence. This view has been pursued
recently by Bacchus, Halpern, and others, and for many years by E. T. Jaynes,
in addition to a stable of philosophers.
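
A minimal sketch of what such a logical measure might look like, assuming (as
a crude stand-in) a uniform measure over the state descriptions of a toy
language -- roughly in the spirit of Carnap's m-dagger; the atomic sentences
are hypothetical:

    from fractions import Fraction
    from itertools import product

    atoms = ("Fa", "Fb", "Fc")  # hypothetical atomic sentences
    states = [dict(zip(atoms, vals))
              for vals in product([True, False], repeat=len(atoms))]

    def P(stmt):
        # Uniform "logical" measure over state descriptions.
        return Fraction(sum(1 for s in states if stmt(s)), len(states))

    def P_cond(H, E):
        return P(lambda s: H(s) and E(s)) / P(E)

    H = lambda s: s["Fc"]              # hypothesis: the next individual is F
    E = lambda s: s["Fa"] and s["Fb"]  # evidence: the first two were F
    print(P(H), P_cond(H, E))  # both 1/2: on m-dagger the evidence never
                               # raises the probability -- the "no learning
                               # from experience" defect Carnap recognized.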

So far, no mention of belief! No mention, either, of the grounds on which we
assign probabilities of either sort. But there is a tie embodied in common
usage: We should believe what the evidence makes probable; we should take
probability "as the very guide of life" (Bishop Butler, 1736); we should
accept statistical hypotheses (probabilities construed as empirical) only when
the evidence (statistical or otherwise) renders them practically certain, ...

A logical notion of probability, as imagined by J. M. Keynes, would be
"legislative for rational belief" in the sense that if a person had total
evidence E, he should assign probability P(H|E) to the proposition H. If we
had a logical notion of probability, we could talk about the probability of
statistical hypotheses relative to evidence; thus it was Carnap's original
goal to provide a logical definition of probability for first-order languages
that would be intuitively compelling for almost everybody, in the sense that
Z-F set theory is acceptable to almost everybody.

At this point I'm afraid things get complicated. No measure on the sentences
of a formal language that has been proposed has been regarded as intuitively
compelling. Carnap himself gave up the search for a measure for logical
probability that would have the intuitive appeal of modus ponens as a
principle of logic. Ramsey and de Finetti latched on to the relation between
belief and probability, and both took the point of view that only
probabilistic consistency should constrain degrees of belief. Probability is
normative: one's degrees of belief ought to correspond to some probability
function whose domain is a field of objects corresponding to a set of
statements of one's language.

Some people think there are other constraints that should be imposed on
degrees of belief -- for example, that they should conform to known empirical
frequencies. Some people think that the requirement that degrees of belief be
real-valued is too strong, and that a more useful explication of probability
would have it be interval-valued. Some people argue for an explication of
probability that does not require a prior distribution. Many people agree that
as a guide in life, for making decisions, for determining expectations, what
we need is an explication that has as its domain a field of statements. Many
think that, since probability should also be relative to evidence, the domain
of the function should really be the cross product of a field of statements
and a set of sets of statements comprising possible bodies of knowledge.

To put my own cards on the table, my favorite explication of probability takes
it to have a domain that is the cross product of a field of statements and a
set of possible bodies of knowledge, and to have a range of intervals. It also
requires that every probability be based on a known frequency -- i.e., an item
of statistics in the body of knowledge. This approach requires no prior
probabilities, but it does require a nonmonotonic rule of acceptance: we must
tentatively accept highly probable statistical generalizations. The major
problem faced by this approach is that of determining what "known frequency"
is relevant to a given statement.
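
The shape of such a function can be sketched, though only as a caricature: all
names and numbers below are hypothetical, and the hard problem just mentioned
-- selecting the relevant known frequency -- is reduced to a bare dictionary
lookup:

    from typing import NamedTuple

    class Interval(NamedTuple):
        low: float
        high: float

    # A "body of knowledge": accepted frequency intervals, indexed by a
    # statement and a reference class (all hypothetical).
    knowledge = {
        ("recovers", "patients_on_drug_X"): Interval(0.80, 0.90),
        ("recovers", "patients"):           Interval(0.55, 0.75),
    }

    def prob(statement, reference_class, body):
        # Read the accepted frequency for the chosen reference class;
        # with no relevant statistics, fall back to the vacuous [0, 1].
        return body.get((statement, reference_class), Interval(0.0, 1.0))

    print(prob("recovers", "patients_on_drug_X", knowledge))  # (0.80, 0.90)
    print(prob("recovers", "smokers", knowledge))             # (0.0, 1.0)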

As Carnap might have said, "In my house are many explicata, ..." ranging from
frequencies to limiting frequencies, to bare measures, to normalized
subjective degrees of belief, to belief functions, to ...

Oh yes, random variables. Most people seem to recognize that they are
functions, perhaps with a non-numerical range. Many of us find the term
"variable" inappropriate when applied to a function (whether its range is the
reals or the set {guilty, innocent}). So some of us prefer other locutions,
such as "random quantity", which seems to wear its character more plainly on
its sleeve. The issue, so far as I can tell, has to do with linguistic
fastidiousness rather than the preferred explication of "probability".

So far as I know, the term "random quantity" was invented by L. J. Savage.
He, de Finetti, and I engaged in a long three-way correspondence on the
occasion of my translation of de Finetti's "Prevision" (published as
"Foresight"). We devoted considerable effort to coming up with the most
desirable translation of "nombre aléatoire", which played a very important
role in that work. "Random quantity" was what we came up with, as I recall at
Savage's urging. (It was a somewhat crippled correspondence: I knew almost no
Italian, and was not comfortable in French; de Finetti, the Italian, knew
relatively little English; Savage was the linguist among us, but I'm not sure
how good his Italian was at that time, around 1955.)

The upshot was that the understanding of "random quantity" or "random
variable" was completely independent of the underlying question of
interpreting probability. Although a random quantity ordinarily is thought of
as having a range in the reals, there is no reason it can't have a range of
{guilty, not guilty} or {red, white, blue}. No matter how you interpret
probability, the value of a random variable is determined by its argument: the
ball we picked from the urn, the socks Judea picked from his drawer. How to
understand the probability that a given random quantity has a particular
value, of course, DOES require that we think about how we should explicate
probability in the case at hand. That is no doubt how this discussion, which I
originally thought of as merely terminological, has produced so much insight
into probability.
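
A closing sketch to make that concrete (urn, colors, and weights invented):
the random quantity is just a function, and the probability that it takes a
given value is the measure of the set of arguments it maps to that value:

    from fractions import Fraction

    urn = ["ball1", "ball2", "ball3", "ball4"]
    weight = {b: Fraction(1, 4) for b in urn}   # illustrative measure

    def color(ball):   # the random quantity: a function, non-numerical range
        return {"ball1": "red", "ball2": "white",
                "ball3": "white", "ball4": "blue"}[ball]

    def P_equals(X, v):
        # P(X = v): the measure of the set of arguments that X maps to v.
        return sum(weight[b] for b in urn if X(b) == v)

    print(P_equals(color, "white"))  # 1/2
    print(P_equals(color, "blue"))   # 1/4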