SimulateFrom
PURPOSE
Simulates a POMDP from a given belief.
SYNOPSIS
function r=SimulateFrom(P,V,start,n)
DESCRIPTION
Simulates a POMDP from a given belief ('start') for 'n' steps, using the given policy ('V') to decide which actions to execute. If 'V' is empty, actions are drawn at random from the action space.

The function draws a state at random from the 'start' distribution and uses it as the hidden state in the simulation. This hidden state is used to query the observation and reward models.

Returns the accumulated discounted reward for the simulation.

This function is mainly used to gather statistics for the policy obtained from a planning session. As the number of elements in 'V' increases, the simulation can become quite time-consuming. In fact, this is the most expensive step of the statistics collection, more costly than the planning itself.

See also GetPOMDPSolutionStatistics.
CROSS-REFERENCE INFORMATION
This function calls:
SOURCE CODE
function r=SimulateFrom(P,V,start,n)
% Simulates a POMDP from a given belief.
%
% Simulates a POMDP from a given belief ('start') for 'n' steps, using
% the given policy ('V') to decide which actions to execute.
%
% The function draws a state at random from the 'start' distribution
% and uses it as the hidden state in the simulation. This hidden state is
% used to query the observation and reward models.
%
% Returns the accumulated discounted reward for the simulation.
%
% This function is mainly used to gather statistics for the policy
% obtained from a planning session. As the number of elements in 'V'
% increases, the simulation can become quite time-consuming. In fact,
% this is the most expensive step of the statistics collection, more
% costly than the planning itself.
%
% See also GetPOMDPSolutionStatistics.

A=get(P,'ActionSpace');
noPolicy=empty(V);   % with an empty policy, actions are chosen at random
s=rand(start);       % hidden state drawn from the start belief
b=start;             % current belief
r=0;                 % accumulated discounted reward
gamma=1;             % discount applied to the current step
for i=1:n
    if noPolicy
        a=rand(A);
    else
        a=OptimalAction(V,b);
    end
    [s b o r1]=SimulationStep(P,b,s,a);
    r=r+gamma*r1;
    gamma=gamma*P.gamma;
end
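The accumulation at the end of the loop (`r=r+gamma*r1; gamma=gamma*P.gamma`) amounts to a finite-horizon discounted sum, r = sum over i=1..n of discount^(i-1) * r_i. The following is a minimal, self-contained sketch of just that accumulation pattern. The reward sequence and the discount value are made-up illustration values, not toolbox output, and the variable names `rewards`, `discount`, and `g` are hypothetical.

```matlab
% Illustrative values only (assumptions, not produced by SimulationStep)
rewards  = [1 0 2 1];   % hypothetical per-step rewards r_1..r_4
discount = 0.95;        % hypothetical stand-in for P.gamma

r = 0;   % accumulated discounted reward
g = 1;   % discount factor applied to the current step
for i = 1:numel(rewards)
    r = r + g*rewards(i);   % add the discounted reward of step i
    g = g*discount;         % decay the factor for the next step
end

% The running accumulation should match the closed-form weighted sum
closedForm = sum(discount.^(0:numel(rewards)-1) .* rewards);
assert(abs(r - closedForm) < 1e-12);
```

Keeping a running factor `g` avoids recomputing a power at every step, which is the same choice the function's loop makes with its `gamma` variable.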