tree.h, tree.c, queue.h and stack.h. Sorry about that!
In this assignment, you will implement and experiment with value iteration and prioritized sweeping for the "Jack's Car Rental" problem (p. 99 of Sutton and Barto). We will modify the problem statement slightly so that it is easier to complete.
This problem can be defined as follows:
The Poisson distribution generates a value of u with probability (lambda^u / u!) * exp(-lambda), where lambda is the mean parameter of the distribution, as given above.
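As a minimal sketch of how this probability might be computed, the following uses the standard Poisson formula lambda^u / u! * exp(-lambda). The function names PoissonPmf and PoissonTail are hypothetical and are not part of the supplied code. The product is built up term by term so that u! is never formed explicitly.

```cpp
#include <cassert>
#include <cmath>

// Probability that a Poisson variable with mean lambda equals u:
//   P(u) = lambda^u / u! * exp(-lambda)
// Hypothetical helper; not taken from jacks.cc.
double PoissonPmf(int u, double lambda) {
    assert(u >= 0 && lambda > 0.0);
    double p = std::exp(-lambda);      // P(0)
    for (int k = 1; k <= u; ++k) {
        p *= lambda / k;               // P(k) = P(k-1) * lambda / k
    }
    return p;
}

// Probability of u or more events. This can be handy when every
// demand at or above the number of cars on hand leads to the same
// outcome and the probabilities are lumped together.
double PoissonTail(int u, double lambda) {
    assert(u >= 0 && lambda > 0.0);
    double below = 0.0;
    for (int k = 0; k < u; ++k) {
        below += PoissonPmf(k, lambda);
    }
    return 1.0 - below;
}
```

For example, PoissonPmf(2, 3.0) evaluates 4.5 * exp(-3); the mean 3.0 is only an illustrative value.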
Let n[L] be the number of cars at location L at the beginning of the day, let x be the number of cars moved to this location, let in[L] be the number of cars returned, and out[L] be the number of cars requested. Then the number of cars rented is rented[L] = min(n[L] + x, out[L]), because cars returned on one day are not available for rental until the next day. The number of cars remaining at the end of the day is min(10, n[L] + x + in[L] - rented[L]).
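The bookkeeping for a single location over one day can be sketched as follows. This is only an illustration of the two formulas above; the names kMaxCars, DayResult and SimulateDay are hypothetical and do not come from the supplied code.

```cpp
#include <algorithm>
#include <cassert>

// Capacity per location, taken from the min(10, ...) in the text.
const int kMaxCars = 10;

struct DayResult {
    int rented;     // cars actually rented out during the day
    int remaining;  // cars on the lot at the end of the day
};

// n:   cars at the location at the beginning of the day
// x:   cars moved to this location (negative if cars were moved away)
// in:  cars returned during the day
// out: cars requested during the day
DayResult SimulateDay(int n, int x, int in, int out) {
    assert(n + x >= 0 && in >= 0 && out >= 0);
    int available = n + x;                  // returns are not rentable today
    int rented = std::min(available, out);
    int remaining = std::min(kMaxCars, available + in - rented);
    return {rented, remaining};
}
```

For instance, with n = 5, x = 2, in = 3 and out = 4, there are 7 cars available, 4 are rented, and 6 remain at the end of the day.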
The code is organized in four files:
mdp.h defines an abstract class Problem, which encapsulates all of the domain-specific aspects of a problem. It also defines the class MDP, which implements Value Iteration and various helper functions. Your job is to modify the MDP class to implement Prioritized Sweeping.
jacks.h, jacks.cc. These files define the car rental domain by subclassing Problem. Notice that most of the work in this code is involved with constructing the probability transition model. The CPU time required to compute the model is about the same as the time required to perform value iteration!
jacksvi.cc. This is the main program. It just creates an instance of a JacksProblem and then an instance of MDP, and invokes MDP::ValueIteration. Finally, it prints out the resulting policy.
Your code should implement prioritized sweeping. As with my Value Iteration code, it should count the number of primitive Q backups and display this number when it terminates.
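For orientation, here is a rough, self-contained sketch of one common form of prioritized sweeping with a known model. It does not use the Problem or MDP classes from mdp.h. The Outcome, Model and SweepResult types and the PrioritizedSweeping function are hypothetical, and it uses STL containers rather than the supplied library. The priority given to a predecessor (gamma times the size of the last value change) is one simple choice among several, and stale queue entries are skipped rather than removed.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// One possible result of an action: next state, probability, reward.
struct Outcome { int next; double prob; double reward; };

// model[s][a] lists the outcomes of action a in state s (assumed layout).
typedef std::vector<std::vector<std::vector<Outcome>>> Model;

struct SweepResult {
    std::vector<double> V;  // estimated state values
    long backups;           // number of primitive Q backups performed
};

SweepResult PrioritizedSweeping(const Model& model, double gamma,
                                double epsilon) {
    assert(gamma >= 0.0 && gamma < 1.0 && epsilon > 0.0);
    const int n = static_cast<int>(model.size());
    const double inf = std::numeric_limits<double>::infinity();

    // preds[t] holds every state with some action that can reach t.
    std::vector<std::vector<int>> preds(n);
    for (int s = 0; s < n; ++s) {
        assert(!model[s].empty());
        for (const auto& outcomes : model[s])
            for (const Outcome& o : outcomes)
                if (o.prob > 0.0) preds[o.next].push_back(s);
    }

    SweepResult r;
    r.V.assign(n, 0.0);
    r.backups = 0;

    // Every state starts with infinite priority so it is backed up once.
    std::vector<double> priority(n, inf);
    std::priority_queue<std::pair<double, int>> pq;
    for (int s = 0; s < n; ++s) pq.push(std::make_pair(inf, s));

    while (!pq.empty()) {
        std::pair<double, int> top = pq.top();
        pq.pop();
        int s = top.second;
        if (top.first != priority[s]) continue;  // stale entry, skip it
        priority[s] = 0.0;

        // Full backup of s: one primitive Q backup per action.
        double best = -inf;
        for (const auto& outcomes : model[s]) {
            double q = 0.0;
            for (const Outcome& o : outcomes)
                q += o.prob * (o.reward + gamma * r.V[o.next]);
            ++r.backups;
            if (q > best) best = q;
        }
        double delta = std::fabs(best - r.V[s]);
        r.V[s] = best;

        // Queue predecessors whose values may now be noticeably off.
        double p = gamma * delta;
        if (p <= epsilon) continue;
        for (int pred : preds[s]) {
            if (p > priority[pred]) {
                priority[pred] = p;
                pq.push(std::make_pair(p, pred));
            }
        }
    }
    return r;
}
```

The backups field plays the role of the Q backup counter requested above, so the same count can be compared against the one reported by value iteration.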
All of the code that you need for this problem is available in the following tar file. It has been tested under Solaris and on my laptop. I am using modified versions of Tim Budd's library routines for data structures, rather than STL (sorry!). If anyone wants to port this program to STL, that would be great!
Please turn in a hardcopy listing of your program and a comparison of prioritized sweeping and value iteration on this problem.