RE: "information gained" sometimes != "entropy reduction" ??

Eric Horvitz (horvitz@microsoft.com)
Tue, 11 Aug 1998 17:07:16 -0700

The expected utility of a decision based on the updated distribution will be
greater than or equal to the expected utility of the decision made before
integrating the evidence.
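
A minimal numeric sketch of this inequality in Python (a hypothetical
two-state, two-action decision problem; all numbers are made up for
illustration):

    import numpy as np

    # Hypothetical two-state, two-action problem; numbers are illustrative.
    prior = np.array([0.5, 0.5])            # P(s) over states s0, s1
    U = np.array([[1.0, 0.0],               # U[a, s]: utility of action a in state s
                  [0.0, 1.0]])
    lik = np.array([[0.8, 0.2],             # lik[e, s] = P(E=e | s), evidence e in {0, 1}
                    [0.2, 0.8]])

    # Best expected utility when acting on the prior alone.
    eu_prior = (U @ prior).max()            # 0.5

    # Expected utility when E is observed first and the decision is made
    # on the posterior, averaged over the possible observations.
    p_e = lik @ prior                       # P(E=e)
    eu_post = sum(p_e[e] * (U @ (lik[e] * prior / p_e[e])).max()
                  for e in range(2))        # 0.8

    assert eu_post >= eu_prior - 1e-12

In this toy case the expected utility rises from 0.5 to 0.8; the inequality
holds for any utilities and likelihoods, since the decision maker can always
ignore the evidence and do no worse than before.
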
Eric

-----Original Message-----
From: Robert Dodier [mailto:dodier@bechtel.Colorado.EDU]
Sent: Tuesday, August 11, 1998 4:52 PM
To: uai@CS.ORST.EDU
Subject: "information gained" sometimes != "entropy reduction" ??

Esteemed colleagues,

I have a brief question concerning terminology, this time
about "information."

As a pleasant learning exercise, I am reinventing the wheel of
Bayesian network inference. As one of the subsidiary outputs, I am
planning to compute the difference between the entropy of the posterior
for some variable before a certain evidence item is introduced and the
entropy of the posterior for the same variable after the evidence.
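
For concreteness, here is a minimal sketch of that computation in Python,
using a hypothetical two-variable network X -> E with made-up probabilities
(entropies in bits):

    import numpy as np

    def entropy(p):
        """Shannon entropy in bits, skipping zero-probability terms."""
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    # Hypothetical network X -> E; all probabilities are made up.
    p_x = np.array([0.7, 0.3])              # prior P(X)
    p_e_given_x = np.array([[0.9, 0.4],     # P(E=e | X=x), rows indexed by e
                            [0.1, 0.6]])

    e_obs = 0                               # the evidence item: E = 0
    joint = p_e_given_x[e_obs] * p_x        # P(E=e_obs, X=x) for each x
    p_x_given_e = joint / joint.sum()       # posterior P(X | E=e_obs)

    print("H before:", entropy(p_x))          # ~0.881 bits
    print("H after: ", entropy(p_x_given_e))  # ~0.634 bits
    print("change:  ", entropy(p_x) - entropy(p_x_given_e))  # ~0.247 bits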

Now what we'll see, I imagine, is that evidence usually reduces
the entropy of the posterior, and I believe it is consistent
with conventional terminology to say "reduction of entropy == gain
of information" -- so many bits per item of evidence.

But I know there is no guarantee that the posterior will have less
entropy after the evidence is introduced. (I often have that feeling
of "now I am more confused than before!")

In this scenario, where is the "information gain"? In absorbing the
evidence, something is gained -- but what? What is the quantity
(if there is one) that's always increased by absorbing evidence?

I can, of course, leave the word "information" out of the picture and
refer simply to "change of entropy". But "information" is so suggestive
and attractive -- I would rather use it if I can.

Your comments are greatly appreciated.

Regards,
Robert Dodier