"information gained" sometimes != "entropy reduction" ??

Robert Dodier (dodier@bechtel.Colorado.EDU)
Tue, 11 Aug 1998 17:51:38 -0600 (MDT)

Esteemed colleagues,

I have a brief question concerning terminology, this time
about "information."

As a pleasant learning exercise, I am reinventing the wheel of
Bayesian network inference. As one of the subsidiary outputs, I am
planning to compute the difference between the entropy of the posterior
for some variable before a certain evidence item is introduced and the
entropy of the posterior of the same variable after the evidence.
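
To make the intent concrete, here is a minimal sketch in Python;
the function names are my own invention, and the posteriors are
assumed to be plain lists of probabilities over a discrete variable:

    import math

    def entropy(posterior):
        # Shannon entropy, in bits, of a discrete distribution.
        return -sum(p * math.log(p, 2) for p in posterior if p > 0)

    def entropy_change(before, after):
        # Positive when the evidence reduced the entropy.
        return entropy(before) - entropy(after)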

Now what we'll usually see, I imagine, is that evidence reduces
the entropy of the posterior, and I believe it is consistent
with conventional terminology to say "reduction of entropy == gain
of information" -- so many bits per item of evidence.

But I know there is no guarantee that the posterior will have less
entropy after the evidence is introduced. (I often have that feeling
of "now I am more confused than before!")

In this scenario, where is the "information gain"? In absorbing the
evidence, something is gained -- but what? What is the quantity
(if there is one) that's always increased by absorbing evidence?

I can, of course, leave the word "information" out of the picture and
refer simply to "change of entropy". But "information" is so suggestive
and attractive -- I would rather use it if I can.

Your comments are greatly appreciated.

Regards,
Robert Dodier