Maybe I do not understand the problem/question, since I do not 
have the messages that started this thread, but the behaviour seems 
normal. Consider a one-dimensional example, where * denotes the 
element-wise product of the two distributions renormalized to sum to 1:
d1 = {1/4, 3/4}, old information (current state)
d2 = {3/4, 1/4}, new information
d1 * d2 = {1/2, 1/2}, entropy(d1*d2) > entropy(d1) -- uniform result, no information
d1 * d1 = {1/10, 9/10}, entropy(d1*d1) < entropy(d1)
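A minimal Python sketch of the numbers above (assuming, as stated, 
that * is the element-wise product renormalized to sum to 1):

import math

def combine(p, q):
    # element-wise product of two discrete distributions, renormalized
    prod = [a * b for a, b in zip(p, q)]
    s = sum(prod)
    return [v / s for v in prod]

def entropy(p):
    # Shannon entropy in bits
    return -sum(v * math.log2(v) for v in p if v > 0)

d1 = [0.25, 0.75]                 # old information (current state)
d2 = [0.75, 0.25]                 # new information
print(entropy(d1))                # ~0.811
print(entropy(combine(d1, d2)))   # 1.0   -> entropy increased
print(entropy(combine(d1, d1)))   # ~0.469 -> entropy decreased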
So the entropy can both increase and decrease when we learn new 
information; it depends on how the new information is related to 
the old information.
In the multidimensional case the situation is the same. Given an 
arbitrary initial distribution (e.g., the 2-dimensional one described 
in the previous message), we can always find a distribution over one 
variable (e.g., the 2-valued variable x) that either increases or 
decreases the joint information (except for some special cases, e.g., 
when they are independent and the joint information does not change). 
In the previous example, if we learn that x=0, the information 
increases (the entropy decreases); see the sketch below.
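The 2-dimensional distribution from the previous message is not 
quoted here, so the joint distribution below is only an assumed 
placeholder. The sketch shows that evidence about x alone can either 
decrease the joint entropy (learning x=0) or increase it 
(contradicting evidence):

import math

def entropy(values):
    # Shannon entropy in bits
    return -sum(v * math.log2(v) for v in values if v > 0)

# assumed 2x2 joint distribution over (x, y); here p(x=0) = 0.9
joint = {(0, 0): 0.45, (0, 1): 0.45, (1, 0): 0.05, (1, 1): 0.05}

def combine_with_x(joint, ev_x):
    # multiply the joint by a distribution over x alone, then renormalize
    prod = {k: v * ev_x[k[0]] for k, v in joint.items()}
    s = sum(prod.values())
    return {k: v / s for k, v in prod.items()}

print(entropy(joint.values()))                              # ~1.47
print(entropy(combine_with_x(joint, [1.0, 0.0]).values()))  # 1.0 -> learning x=0 decreases entropy
print(entropy(combine_with_x(joint, [0.1, 0.9]).values()))  # 2.0 -> contradicting evidence increases entropy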
So learning something does not mean that the certainty (the quantity 
of information) will increase. Usually a loss of information can 
be interpreted as a contradiction between the new information and 
the old information, i.e., the current state of knowledge. Thus in 
probabilistic approaches, the greater the contradiction between two 
propositions, the greater the uncertainty of the joint proposition 
(in contrast to, e.g., fuzzy approaches).
Regards,
Alexandr Savinov
-- 
Alexandr A. Savinov, PhD
Senior Scientific Collaborator, Laboratory of AI Systems
Inst. Math., Moldavian Acad. Sci.
str. Academiei 5, MD-2028 Kishinev, Moldavia
Tel: +3732-73-81-30, Fax: +3732-73-80-27
mailto:savinov@math.md
http://www.geocities.com/ResearchTriangle/7220/