Institut de Robòtica i Informàtica Industrial

ComputeAlpha_a

PURPOSE

Compute the alpha_i_n-element for the given action and belief.

SYNOPSIS

function Element_a=ComputeAlpha_a(P,V,b,a,Alphas_j_a_o)

DESCRIPTION

   Compute the alpha_i_n-element for the given action and belief.

   Define the alpha-elements to be used in the backup (the \alpha_n^i
   elements).

   This function implements the last equation of Section 5.2 in
   the paper.

   Note that the Alpha_j_a_o elements can be pre-computed.
   If they are, the Alphas_j_a_o parameter is a non-empty cell array with
   one entry for each alpha-element (of the previous policy), each action,
   and each observation.
   If this parameter is empty, the corresponding alpha-elements are computed
   on the fly inside this function.
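
   As a reading aid, the computation performed by the source code of this
   function can be summarized as below. The notation is an assumption read
   off the code, not copied from the paper: r_a stands for
   GetRewardModelFixedA(P,a), \gamma for get(P,'gamma'), n_o for dim(O),
   \alpha_{j,a,o} for the Alpha_j_a_o elements, and \langle .,. \rangle for
   Expectation(b,g). The symbol \alpha_a^b for the returned Element_a is
   hypothetical.

```latex
\alpha_a^b \;=\; r_a \;+\; \gamma \sum_{o=1}^{n_o} \alpha_{j^*(o),a,o},
\qquad
j^*(o) \;=\; \arg\max_{j} \,\langle \alpha_{j,a,o},\, b \rangle
```

   That is, for each observation the function keeps the single alpha-element
   with the highest expected value under the belief b, sums those elements,
   discounts the sum, and adds the reward model of the fixed action.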

CROSS-REFERENCE INFORMATION

This function calls:
  • size Returns the size of a policy.
  • Expectation Expectation between a belief and an alpha-element.
  • get Get for GBeliefs.
  • GMixture Gaussian mixture constructor.
  • get Get function for the GMixture object.
  • get Gaussian object get function.
  • get Get function for CS_CO_CA_POMDPs.
  • get Get function for CS_CO_DA_POMDPs.
  • get Get function for CS_CO_POMDPs.
  • get Get function for CS_DO_CA_POMDPs.
  • get Get function for CS_DO_DA_POMDPs.
  • ComputeAlpha_j_a_o Computes a particular alpha-element.
  • get Get function for CS_POMDPs.
  • get Get function for DS_CO_CA_POMDPs.
  • get Get function for DS_CO_DA_POMDPs.
  • get Get function for DS_DO_CA_POMDPs.
  • get Get function for DS_DO_DA_POMDPs.
  • get Get function for POMDPs.
  • GetRewardModelFixedA Defines the reward function for a given action.
  • dim Dimensionality of a continuous space.
  • max Upper bound of a CSpace.
  • dim Dimensionality of a discrete space.
This function is called by:
  • Backup Backup for a given belief (continuous state version).
  • Backup Backup for a given belief (discrete state version).

SOURCE CODE

function Element_a=ComputeAlpha_a(P,V,b,a,Alphas_j_a_o)
%   Compute the alpha_i_n-element for the given action and belief.
%
%   Define the alpha-elements to be used in the backup (the \alpha_n^i
%   elements).
%
%   This function implements the last equation of Section 5.2 in
%   the paper.
%
%   Note that the Alpha_j_a_o elements can be pre-computed.
%   If they are, the Alphas_j_a_o parameter is a non-empty cell array with
%   one entry for each alpha-element (of the previous policy), each action,
%   and each observation.
%   If this parameter is empty, the corresponding alpha-elements are computed
%   on the fly inside this function.

  rj=num2cell(1:size(V));        % one index per element of the policy V
  O=get(P,'ObsSpace');
  no=dim(O);
  gamma=get(P,'gamma');
  noP=isempty(Alphas_j_a_o);     % true if nothing was pre-computed
  Element_ao=GMixture;           % accumulator for the sum over observations
  for o=1:no
    if noP
      % compute the Alpha_j_a_o elements on the fly
      Alphas_j=cellfun(@(j)(ComputeAlpha_j_a_o(P,V,j,a,o)),rj,'UniformOutput',false);
    else
      % use the pre-computed elements for this action and observation
      Alphas_j=Alphas_j_a_o(:,a,o);
    end
    % keep the element with the highest expectation under belief b
    [v_a_o,nAlpha_j]=max(cellfun(@(g)(Expectation(b,g)),Alphas_j));
    Element_ao=Element_ao+Alphas_j{nAlpha_j};
  end
  Element_a=GetRewardModelFixedA(P,a)+gamma*Element_ao;
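
The selection-and-sum step above may be easier to follow on plain numbers. The following is a minimal sketch, not part of the toolbox: it assumes a discrete state space in which beliefs, rewards and alpha-elements are ordinary column vectors and the expectation reduces to a dot product. All variable names and values (r_a, Alphas_a_o, the numbers themselves) are hypothetical, and the pre-computed elements are taken for a single fixed action.

```matlab
% Hypothetical discrete-state analogue of ComputeAlpha_a (illustration only).
% Beliefs and alpha-elements are plain column vectors over 2 states, so the
% expectation <alpha,b> is simply alpha' * b.
b     = [0.8; 0.2];          % belief over the two states (assumed)
r_a   = [1; 0];              % reward vector for the fixed action (assumed)
gamma = 0.9;                 % discount factor (assumed)

% Pre-computed elements for the fixed action: rows index j, columns index o.
Alphas_a_o = { [1; 0], [0; 2]; ...
               [0; 1], [1; 1] };

nO = size(Alphas_a_o, 2);    % number of observations
Element_ao = zeros(size(b)); % accumulator for the sum over observations
for o = 1:nO
  % expected value of every candidate element under the belief b
  values = cellfun(@(g) g' * b, Alphas_a_o(:, o));
  % keep only the best candidate for this observation
  [~, best] = max(values);
  Element_ao = Element_ao + Alphas_a_o{best, o};
end

% reward for the action plus the discounted sum of the selected elements
Element_a = r_a + gamma * Element_ao;
disp(Element_a);
```

In the toolbox itself the dot product is replaced by the overloaded Expectation function and the vectors by the policy's alpha-elements, so the same loop serves both the continuous-state and the discrete-state Backup.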



Generated on Wed 05-Aug-2009 15:05:21 by m2html © 2003