Institut de Robòtica i Informàtica Industrial

GetTest1Parameters

PURPOSE

The example in Figure 1.

SYNOPSIS

function [POMDP P]=GetTest1Parameters(varargin)

DESCRIPTION

   The example in Figure 1.

   Example of how to define the POMDP and the parameters for a problem
   with a continuous state space and discrete action and observation
   spaces.
   This example encodes the problem in Figure 1 of the paper, which is
   used for most of the reported experiments (with minor variations in
   some cases).

   Parameters:
     ncBelief: Number of components in the belief mixtures.
     ncAlpha: Number of components in the alpha mixtures.
     actionScale: Scale factor to apply to the right/left displacements.

   Parameters are optional from left to right (if two parameters are
   given, they are taken to be ncBelief and ncAlpha, and so on).
   When a parameter is omitted, the following default values are used:
      ncBelief=4
      ncAlpha=9
      actionScale=2
   Those correspond to the parameters used in Figure 2.
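
   The left-to-right rule above can be illustrated with a minimal,
   self-contained sketch. It is not the function's own code (which tests
   nargin for each parameter in turn); the names defaults, args and params
   are hypothetical stand-ins introduced only for this illustration, with
   args playing the role of varargin.

```matlab
% Sketch of left-to-right optional parameters (illustration only).
% 'defaults', 'args' and 'params' are hypothetical names; 'args' stands in
% for the varargin cell array received by GetTest1Parameters.
defaults = {4, 9, 2};          % ncBelief, ncAlpha, actionScale
args     = {6, 12};            % caller supplied only the first two values

params = defaults;             % start from the documented defaults
params(1:numel(args)) = args;  % overwrite the leading entries that were given

[ncBelief, ncAlpha, actionScale] = params{:};
fprintf('ncBelief=%d ncAlpha=%d actionScale=%d\n', ncBelief, ncAlpha, actionScale);
```

   With two values supplied, only actionScale keeps its default, which is
   the behaviour a call such as GetTest1Parameters(6,12) relies on.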

   To solve the problem you can use

      [POMDP P B V Val Alpha t]=TestOne('Test1');

   To solve it many times collecting statistics on the quality of each
   solution use

       TestRep('Test1','myresults',1:10);

   replacing 'myresults' with the label you want to add to the output
   file names.

   After running the separate experiments (this can be very
   time-consuming!), you can summarize the collected statistics using:

        [tics SM SD]=GetPOMDPAverageStatistics('Test1-myresults')

   All problems (and their parameters) should be encoded in a file like
   this one.

   See also TestOne, TestRep, POMDP, GetHallwayParameters.

CROSS-REFERENCE INFORMATION

This function calls:

This function is called by:

SOURCE CODE

function [POMDP P]=GetTest1Parameters(varargin)
%   The example in Figure 1.
%
%   Example of how to define the POMDP and the parameters for a problem
%   with a continuous state space and discrete action and observation
%   spaces.
%   This example encodes the problem in Figure 1 of the paper, which is
%   used for most of the reported experiments (with minor variations in
%   some cases).
%
%   Parameters:
%     ncBelief: Number of components in the belief mixtures.
%     ncAlpha: Number of components in the alpha mixtures.
%     actionScale: Scale factor to apply to the right/left displacements.
%
%   Parameters are optional from left to right (if two parameters are
%   given, they are taken to be ncBelief and ncAlpha, and so on).
%   When a parameter is omitted, the following default values are used:
%      ncBelief=4
%      ncAlpha=9
%      actionScale=2
%   Those correspond to the parameters used in Figure 2.
%
%   To solve the problem you can use
%
%      [POMDP P B V Val Alpha t]=TestOne('Test1');
%
%   To solve it many times collecting statistics on the quality of each
%   solution use
%
%       TestRep('Test1','myresults',1:10);
%
%   replacing 'myresults' with the label you want to add to the output
%   file names.
%
%   After running the separate experiments (this can be very
%   time-consuming!), you can summarize the collected statistics using:
%
%        [tics SM SD]=GetPOMDPAverageStatistics('Test1-myresults')
%
%   All problems (and their parameters) should be encoded in a file like
%   this one.
%
%   See also TestOne, TestRep, POMDP, GetHallwayParameters.

  if nargin==0
    ncBelief=4;
  else
    ncBelief=varargin{1};
  end

  if nargin<2
    ncAlpha=9;
  else
    ncAlpha=varargin{2};
  end

  if nargin<3
    actionScale=2;
  else
    actionScale=varargin{3};
  end


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the POMDP
    % State space is 1-D in the range -20,20
    S=CSpace(-20,20);

    % 3 actions: left, right, enterDoor
    A=DSpace(3);

    % 4 Observations: left-end, right-end, door, corridor
    O=DSpace(4);

    % discount factor
    gamma=0.95;

    % Action model with continuous states and discrete actions
    mu_a={-actionScale actionScale 0};
    Sigma_a={0.05 0.05 0.05};
    AM=CS_DA_ActionModel(S,A,mu_a,Sigma_a);

    % Observation model with continuous states and discrete observations.
    % Note that we actually define p(o,s) and that we assume p(s) to be
    % uniform. Thus the Gaussians should be evenly distributed in 's' and
    % have an adequate covariance to give a uniform coverage.
    so=1.6;
    om{1}=GMixture(ones(1,5),...
                   {Gaussian(-21,so) Gaussian(-19,so) Gaussian(-17,so) Gaussian(-15,so) Gaussian(-13,so)}); % left-end
    om{2}=GMixture(ones(1,5),...
                   {Gaussian( 21,so) Gaussian( 19,so) Gaussian( 17,so) Gaussian( 15,so) Gaussian( 13,so)}); % right-end
    om{3}=GMixture(ones(1,4),...
                   {Gaussian(-11,so) Gaussian( -5,so) Gaussian(  3,so) Gaussian(  9,so)}); % door
    om{4}=GMixture(ones(1,8),...
                   {Gaussian(-9,so) Gaussian(-7,so) Gaussian(-3,so) Gaussian(-1,so) ...
                    Gaussian( 1,so) Gaussian( 5,so) Gaussian( 7,so) Gaussian(11,so)}); % corridor
    OM=CS_DO_ObsModel(S,O,om);

    % reward model with continuous states and discrete actions
    rm{1}=GMixture([-2 -2 -2],{Gaussian(-21,1) Gaussian(-19,1) Gaussian(-17,1)});
    rm{2}=GMixture([-2 -2 -2],{Gaussian( 21,1) Gaussian( 19,1) Gaussian( 17,1)});
    rm{3}=GMixture([-10 2 -10],{Gaussian(-25,250) Gaussian(3,3) Gaussian(25,250)});
    RM=CS_DA_RewardModel(S,A,rm);

    % Assemble the POMDP
    POMDP=CS_DO_DA_POMDP('Test1',S,A,O,AM,OM,RM,gamma,ncAlpha);

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for sampling beliefs
    % Define the start belief
    g1=Gaussian(-15,30);
    P.start=GBelief(GMixture([1 1 1 1],{g1 g1+10 g1+20 g1+30}),ncBelief);

    % Define the rest of the parameters
    % First the ones for belief sampling
    P.nBeliefs=500;
    P.dBelief=0.1;
    P.stepsXtrial=30;
    P.rMin=-0.5;
    P.rMax= 0.5;


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for testing
    P.maxTime=2500;
    P.stTime=100;
    P.numTrials=100;

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for solving
    P.stopCriteria=@(n,t,vc)(t>P.maxTime);
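
The observation-model comment in the listing above says that the mixtures define p(o,s) and that their components should cover the state space uniformly. The following sketch checks that claim numerically without the toolbox classes. It rests on two assumptions that the page does not confirm: that the second argument of Gaussian(m,v) is a variance, and that a GMixture evaluates to the weighted sum of its normalized component densities. The names gauss, centers, pos, coverage, pOgivenS and ripple are hypothetical.

```matlab
% Numerical check of the "uniform coverage" remark (illustration only).
% ASSUMPTIONS: Gaussian(m,v) takes a variance v, and a mixture evaluates to
% the weighted sum of normalized Gaussian densities. All names are hypothetical.
so = 1.6;
gauss = @(s, m, v) exp(-(s - m).^2 ./ (2*v)) ./ sqrt(2*pi*v);

% Component means per observation, copied from the listing.
centers = {[-21 -19 -17 -15 -13], ...            % left-end
           [ 21  19  17  15  13], ...            % right-end
           [-11  -5   3   9], ...                % door
           [ -9  -7  -3  -1   1   5   7  11]};   % corridor

s = linspace(-20, 20, 401);                % grid over the state space
pos = zeros(numel(centers), numel(s));     % pos(o,:) approximates p(o,s)
for o = 1:numel(centers)
  for m = centers{o}
    pos(o, :) = pos(o, :) + gauss(s, m, so);   % unit weights, as in ones(1,n)
  end
end

coverage = sum(pos, 1);                    % proportional to the implied p(s)
pOgivenS = pos ./ repmat(coverage, numel(centers), 1);   % p(o|s)

interior = abs(s) <= 15;                   % stay away from the range limits
ripple = max(coverage(interior)) - min(coverage(interior));
fprintf('coverage ripple in [-15,15]: %g\n', ripple);
```

Under these assumptions the 22 components sit on every odd position from -21 to 21, so their sum is nearly flat away from the limits of the range, and dividing by it turns p(o,s) into p(o|s).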

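The reward mixture of the third action (enterDoor) in the listing above combines two broad negative components with a narrow positive one. The sketch below evaluates it on a grid to show where entering is rewarded. It assumes, as the page does not state it, that Gaussian(m,v) takes a variance and that the mixture value is the weighted sum of normalized densities. The names gauss, w, mu, v, r, rBest and sBest are hypothetical.

```matlab
% Evaluate the enterDoor reward mixture on a grid (illustration only).
% ASSUMPTIONS: Gaussian(m,v) takes a variance v, and the mixture value is the
% weighted sum of normalized Gaussian densities. All names are hypothetical.
gauss = @(s, m, v) exp(-(s - m).^2 ./ (2*v)) ./ sqrt(2*pi*v);

w  = [-10   2 -10];    % weights of rm{3} in the listing
mu = [-25   3  25];    % component means
v  = [250   3 250];    % component variances (assumed)

s = linspace(-20, 20, 401);
r = zeros(size(s));
for k = 1:numel(w)
  r = r + w(k) * gauss(s, mu(k), v(k));
end

[rBest, iBest] = max(r);
sBest = s(iBest);
fprintf('largest enterDoor reward at s = %.1f\n', sBest);
```

Under these assumptions the reward is positive only in a narrow band around the component centred at 3 and negative towards both ends of the corridor.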


Generated on Wed 05-Aug-2009 15:05:21 by m2html © 2003