Re: "information gained" sometimes != "entropy reduction" ?

Yang Xiang (yxiang@cs.uregina.ca)
Tue, 25 Aug 1998 09:27:49 -0600 (CST)

>
> I have not followed the discussion from the start, but it seems that there is
> a confusion between the expected entropy when gaining information and the
> actual entropy when you have received the information.
>
> If you have a possible source of information and you take the entropy for all
> possible answers and average with the probabilities of these answers, then
> this mean is never larger than your present entropy.
>
> Certainly, the entropy may increase with specific answers.
>
> I apologize if my comment is out of focus.
>
> /Finn V. Jensen

I agree with Finn's comment.

Mathematically, for any two variables X and Y (the n-variable case is
similar), the joint entropy H(X,Y) is never less than the conditional
entropy of X given Y, H(X|Y):
H(X,Y) >= H(X|Y),
which follows from the chain rule H(X,Y) = H(Y) + H(X|Y) with H(Y) >= 0.
H(X|Y) is the uncertainty that remains about X after Y is learned,
averaged over the possible values of Y; this average also satisfies
H(X|Y) <= H(X), which is the inequality Finn refers to.
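
To make the averaging concrete, here is a small Python sketch; the joint
table P below is a hypothetical example of my own, not one from the thread.
It computes H(X,Y) and H(X|Y) from a joint table and confirms
H(X,Y) >= H(X|Y):

from math import log2

# P[x][y]: a hypothetical joint distribution over binary X and Y
P = [[0.40, 0.10],
     [0.10, 0.40]]

def H_joint(P):
    # H(X,Y) = - Sum_x,y P(x,y) log P(x,y)
    return -sum(p * log2(p) for row in P for p in row if p > 0)

def H_cond(P):
    # H(X|Y) = - Sum_x,y P(x,y) log P(x|y)
    h = 0.0
    for y in range(len(P[0])):
        Py = sum(P[x][y] for x in range(len(P)))   # marginal P(Y=y)
        for x in range(len(P)):
            if P[x][y] > 0:
                h -= P[x][y] * log2(P[x][y] / Py)
    return h

print(H_joint(P))   # about 1.722 bits
print(H_cond(P))    # about 0.722 bits, no larger than H(X,Y)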

On the other hand, we are talking about the posterior entropy
H(X|Y=y0) when a specific value Y=y0 is learned. If we compare the
definition H(X,Y) = - Sum_x,y P(x,y) log P(x,y)
with H(X|Y=y0) = - Sum_x P(x|y0) log P(x|y0),
then it is possible to choose P(X,Y) and y0 so that H(X|Y=y0) is larger
or smaller than H(X,Y), as the examples posted by several others demonstrate.
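
As one concrete illustration, here is a small Python sketch; the joint table
Q below is a hypothetical example chosen to make the effect visible, not one
taken from the earlier postings. The rare answer Y=y0 leaves X completely
uncertain while the common answer Y=y1 pins X down exactly, so the specific
posterior entropy can jump above H(X,Y) even though the average stays below:

from math import log2

def H(probs):
    # entropy of a probability vector, in bits
    return -sum(p * log2(p) for p in probs if p > 0)

# Q[x][y]: hypothetical joint distribution; y0 is rare but uninformative,
# y1 is common and determines X exactly
Q = [[0.01, 0.98],
     [0.01, 0.00]]

P_y0 = Q[0][0] + Q[1][0]                             # P(Y=y0) = 0.02
post_y0 = [Q[x][0] / P_y0 for x in range(2)]         # P(X|Y=y0) = (0.5, 0.5)
post_y1 = [Q[x][1] / (1 - P_y0) for x in range(2)]   # P(X|Y=y1) = (1.0, 0.0)

print(H([q for row in Q for q in row]))              # H(X,Y)    ~ 0.16 bits
print(H(post_y0))                                    # H(X|Y=y0) = 1 bit, larger
print(H(post_y1))                                    # H(X|Y=y1) = 0 bits, smaller
print(P_y0 * H(post_y0) + (1 - P_y0) * H(post_y1))   # H(X|Y)    = 0.02 bits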

Yang Xiang, Ph.D.               Associate Professor
Department of Computer Science  Tel: (306) 585-4088
University of Regina            Fax: (306) 585-4745
Regina, Saskatchewan            E-mail: yxiang@cs.uregina.ca
Canada S4S 0A2                  WWW: http://cs.uregina.ca/~yxiang/