Description of SimulateFrom

   Simulates a POMDP from a given belief.

   Simulates a POMDP from a given belief ('start') for 'n' steps, using
   the given policy ('V') to decide which actions to execute.

   The function draws a state at random from the 'start' distribution
   and uses it as the hidden state in the simulation. This hidden state is
   used to query the observation and reward models.

   Returns the accumulated discounted reward for the simulation.
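
   Reading the loop in the source code of this function, the returned
   value can be written as the sum below. The symbols are notation
   introduced here for illustration only: r_i stands for the reward
   obtained at step i (the variable 'r1' in the code) and \gamma for the
   discount factor stored in 'P.gamma'.

```latex
r = \sum_{i=1}^{n} \gamma^{\,i-1}\, r_i
```

   For instance, with n = 3, \gamma = 0.9 and a reward of 1 at every
   step, the sum is 1 + 0.9 + 0.81 = 2.71.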
   
   This function is mainly used to collect statistics for the policy
   obtained from a planning session. As the number of elements in 'V'
   grows, the simulation can become quite time consuming. In fact, this
   is the most expensive step of the statistics collection, which is more
   costly than the planning itself.

   See also GetPOMDPSolutionStatistics.

function r=SimulateFrom(P,V,start,n)
%   Simulates a POMDP from a given belief.
%
%   Simulates a POMDP from a given belief ('start') for 'n' steps, using
%   the given policy ('V') to decide which actions to execute.
%
%   The function draws a state at random from the 'start' distribution
%   and uses it as the hidden state in the simulation. This hidden state is
%   used to query the observation and reward models.
%
%   Returns the accumulated discounted reward for the simulation.
%
%   This function is mainly used to collect statistics for the policy
%   obtained from a planning session. As the number of elements in 'V'
%   grows, the simulation can become quite time consuming. In fact, this
%   is the most expensive step of the statistics collection, which is more
%   costly than the planning itself.
%
%   See also GetPOMDPSolutionStatistics.

  A=get(P,'ActionSpace');
  noPolicy=empty(V);   % true when no policy elements are available
  s=rand(start);       % hidden state drawn from the initial belief
  b=start;             % current belief
  r=0;                 % accumulated discounted reward
  gamma=1;             % discount weight applied to the current step
  for i=1:n
    if noPolicy
      a=rand(A);       % no policy: draw an action from the action space
    else
      a=OptimalAction(V,b);
    end
    % Advance the simulation: new hidden state, belief, observation, reward.
    [s,b,o,r1]=SimulationStep(P,b,s,a);
    r=r+gamma*r1;
    gamma=gamma*P.gamma;
  end
