I would like to mention that the same happens in
Markov chains, for which predicting, say, 2
steps into the future can lead to less uncertainty
than predicting 1 step into the future.
E.g. consider the following simple Markov chain
(transitions run upward in time):

        1        ^
       / \       |
      2   3      | time
  .5   \ /   .5  |
        4
Then, if we are in state 4, predicting the next
state is uncertain (states 2 and 3, each with
probability .5), but predicting the state two
steps later gives state 1 with 100% certainty.
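A minimal sketch of this in code, assuming the chain above with state 1 made absorbing (the post does not say where the chain goes after state 1):

```python
import numpy as np

# Transition matrix for the chain above (states 1..4 as indices 0..3).
# From state 4: to 2 or 3 with prob .5 each; states 2 and 3 go to 1.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],  # 1 -> 1 (absorbing; an assumption for illustration)
    [1.0, 0.0, 0.0, 0.0],  # 2 -> 1
    [1.0, 0.0, 0.0, 0.0],  # 3 -> 1
    [0.0, 0.5, 0.5, 0.0],  # 4 -> 2 or 3, each with prob .5
])

def entropy(p):
    """Shannon entropy in bits, ignoring zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

start = np.array([0.0, 0.0, 0.0, 1.0])  # we are in state 4
one_step = start @ P                    # distribution after 1 step
two_steps = start @ P @ P               # distribution after 2 steps

print(entropy(one_step))   # 1.0 bit: states 2 and 3 equally likely
print(entropy(two_steps))  # 0.0 bits: state 1 with certainty
```

So the entropy of the predicted state drops from 1 bit to 0 bits as the horizon grows from 1 to 2 steps.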
The reason is that branches collapse. I knew someone
who developed a method for POMDPs and was using (increasing)
entropy as a measure for defining regions, but I told him
that his method could not work the way he thought it
would, since entropy is not a monotonically increasing
function of the prediction horizon.
Ciao,
Marco Wiering
IDSIA