GetTest1Parameters

PURPOSE
The example in Figure 1.

SYNOPSIS
function [POMDP P]=GetTest1Parameters(varargin)

DESCRIPTION
The example in Figure 1.

Example of how to define the POMDP and the parameters for a problem with a continuous state space and discrete action and observation spaces. This example encodes the problem in Figure 1 of the paper, which is used for most of the reported experiments (with minor variations in some cases).

Parameters:
  ncBelief: Number of components in the belief mixtures.
  ncAlpha: Number of components in the alpha mixtures.
  actionScale: Scale factor applied to the right/left displacements.

Parameters are optional from left to right (if two parameters are given, they are taken to be ncBelief and ncAlpha, and so on). When a parameter is not given, the following default values are used:
  ncBelief=4
  ncAlpha=9
  actionScale=2
These correspond to the parameters used in Figure 2.

To solve the problem you can use

  [POMDP P B V Val Alpha t]=TestOne('Test1');

To solve it many times, collecting statistics on the quality of each solution, use

  TestRep('Test1','myresults',1:10);

replacing 'myresults' with the label you want to add to the output file names.

After executing the separate experiments (this can be very time-consuming!) you can summarize the obtained statistics using:

  [tics SM SD]=GetPOMDPAverageStatistics('Test1-myresults')

All problems (and their parameters) should be encoded in a file like this one.

See also TestOne, TestRep, POMDP, GetHallwayParameters.

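The left-to-right defaulting described above can be illustrated with a small stand-alone sketch. It does not call the toolbox: the helper `fillArgs` is hypothetical and only mirrors the documented behaviour (missing trailing arguments take the defaults 4, 9 and 2).

```matlab
% Hypothetical helper (not part of the toolbox): complete a partial
% argument list with the documented defaults, from left to right.
defaults = {4, 9, 2};   % ncBelief, ncAlpha, actionScale
fillArgs = @(args) [args, defaults(numel(args)+1:end)];

p0 = fillArgs({});        % no arguments given: all defaults
p2 = fillArgs({6, 12});   % ncBelief and ncAlpha given, actionScale defaulted

fprintf('ncBelief=%d ncAlpha=%d actionScale=%d\n', p0{:});
% prints: ncBelief=4 ncAlpha=9 actionScale=2
fprintf('ncBelief=%d ncAlpha=%d actionScale=%d\n', p2{:});
% prints: ncBelief=6 ncAlpha=12 actionScale=2
```

With this convention a later parameter cannot be set without also supplying the earlier ones, so changing only actionScale means passing ncBelief and ncAlpha explicitly as well.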
SOURCE CODE
function [POMDP P]=GetTest1Parameters(varargin)
% The example in Figure 1.
%
%   Example of how to define the POMDP and the parameters for a problem
%   with a continuous state space and discrete action and observation
%   spaces.
%   This example encodes the problem in Figure 1 of the paper, which
%   is used for most of the reported experiments (with minor variations in
%   some cases).
%
%   Parameters:
%     ncBelief: Number of components in the belief mixtures.
%     ncAlpha: Number of components in the alpha mixtures.
%     actionScale: Scale factor applied to the right/left displacements.
%
%   Parameters are optional from left to right (if two parameters are given,
%   they are taken to be ncBelief and ncAlpha, and so on).
%   When a parameter is not given, the following default values are used:
%     ncBelief=4
%     ncAlpha=9
%     actionScale=2
%   These correspond to the parameters used in Figure 2.
%
%   To solve the problem you can use
%
%     [POMDP P B V Val Alpha t]=TestOne('Test1');
%
%   To solve it many times, collecting statistics on the quality of each
%   solution, use
%
%     TestRep('Test1','myresults',1:10);
%
%   replacing 'myresults' with the label you want to add to the output
%   file names.
%
%   After executing the separate experiments (this can be very
%   time-consuming!) you can summarize the obtained statistics using:
%
%     [tics SM SD]=GetPOMDPAverageStatistics('Test1-myresults')
%
%   All problems (and their parameters) should be encoded in a file like
%   this one.
%
%   See also TestOne, TestRep, POMDP, GetHallwayParameters.

  % Optional arguments are read from left to right; missing ones take defaults
  if nargin==0
    ncBelief=4;
  else
    ncBelief=varargin{1};
  end

  if nargin<2
    ncAlpha=9;
  else
    ncAlpha=varargin{2};
  end

  if nargin<3
    actionScale=2;
  else
    actionScale=varargin{3};
  end


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the POMDP
  % State space is 1-D in the range [-20,20]
  S=CSpace(-20,20);

  % 3 actions: left, right, enterDoor
  A=DSpace(3);

  % 4 observations: left-end, right-end, door, corridor
  O=DSpace(4);

  % Discount factor
  gamma=0.95;

  % Action model with continuous states and discrete actions
  mu_a={-actionScale actionScale 0};
  Sigma_a={0.05 0.05 0.05};
  AM=CS_DA_ActionModel(S,A,mu_a,Sigma_a);

  % Observation model with continuous states and discrete observations.
  % Note that we actually define p(o,s) and that we assume p(s) to be
  % uniform. Thus the Gaussians should be evenly distributed in 's', with
  % a covariance adequate to give uniform coverage.
  so=1.6;
  om{1}=GMixture(ones(1,5),...
                 {Gaussian(-21,so) Gaussian(-19,so) Gaussian(-17,so) Gaussian(-15,so) Gaussian(-13,so)}); % left-end
  om{2}=GMixture(ones(1,5),...
                 {Gaussian( 21,so) Gaussian( 19,so) Gaussian( 17,so) Gaussian( 15,so) Gaussian( 13,so)}); % right-end
  om{3}=GMixture(ones(1,4),...
                 {Gaussian(-11,so) Gaussian( -5,so) Gaussian(  3,so) Gaussian(  9,so)}); % door
  om{4}=GMixture(ones(1,8),...
                 {Gaussian(-9,so) Gaussian(-7,so) Gaussian(-3,so) Gaussian(-1,so) ...
                  Gaussian( 1,so) Gaussian( 5,so) Gaussian( 7,so) Gaussian(11,so)}); % corridor
  OM=CS_DO_ObsModel(S,O,om);

  % Reward model with continuous states and discrete actions
  rm{1}=GMixture([-2 -2 -2],{Gaussian(-21,1) Gaussian(-19,1) Gaussian(-17,1)});
  rm{2}=GMixture([-2 -2 -2],{Gaussian( 21,1) Gaussian( 19,1) Gaussian( 17,1)});
  rm{3}=GMixture([-10 2 -10],{Gaussian(-25,250) Gaussian(3,3) Gaussian(25,250)});
  RM=CS_DA_RewardModel(S,A,rm);

  % Assemble the POMDP
  POMDP=CS_DO_DA_POMDP('Test1',S,A,O,AM,OM,RM,gamma,ncAlpha);

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for sampling beliefs
  % Define the start belief
  g1=Gaussian(-15,30);
  P.start=GBelief(GMixture([1 1 1 1],{g1 g1+10 g1+20 g1+30}),ncBelief);

  % Define the rest of the parameters
  % First the ones for belief sampling
  P.nBeliefs=500;
  P.dBelief=0.1;
  P.stepsXtrial=30;
  P.rMin=-0.5;
  P.rMax= 0.5;


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for testing
  P.maxTime=2500;
  P.stTime=100;
  P.numTrials=100;

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Define the parameters for solving
  % Stop once the elapsed time t exceeds P.maxTime
  P.stopCriteria=@(n,t,vc)(t>P.maxTime);
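
The observation-model comment in the listing notes that the mixtures define p(o,s) under a uniform p(s), so the Gaussians together should cover the state space evenly. The following stand-alone sketch checks this numerically without the toolbox. It assumes that the second argument of Gaussian is a variance and that the mixture weights are used as given; neither is confirmed by this page. The names `gauss`, `mix`, `centres`, `joint` and `cond` are introduced here for illustration only.

```matlab
% Stand-alone numerical check (plain MATLAB/Octave, no toolbox classes).
% ASSUMPTIONS: so=1.6 is a variance; mixture weights are used unnormalised.
v = 1.6;
gauss = @(s, m) exp(-(s - m).^2 / (2*v)) / sqrt(2*pi*v);
% Sum of unit-weight Gaussians centred at c, evaluated at the row vector s
mix = @(s, c) sum(gauss(repmat(s, numel(c), 1), repmat(c(:), 1, numel(s))), 1);

% Component means copied from the listing, one set per observation
centres = {[-21 -19 -17 -15 -13], ...     % 1: left-end
           [ 21  19  17  15  13], ...     % 2: right-end
           [-11  -5   3   9], ...         % 3: door
           [ -9  -7  -3  -1  1  5  7 11]};% 4: corridor

s = -20:0.5:20;                 % grid over the state space
joint = zeros(4, numel(s));     % unnormalised p(o,s)
for o = 1:4
  joint(o, :) = mix(s, centres{o});
end
total = sum(joint, 1);                   % proportional to p(s)
cond = joint ./ repmat(total, 4, 1);     % p(o|s), columns sum to 1

interior = abs(s) <= 10;
ripple = max(total(interior)) - min(total(interior));
[~, best] = max(cond(:, s == 3));        % most likely observation at s=3
fprintf('coverage ripple on [-10,10]: %.2e\n', ripple);
fprintf('most likely observation at s=3: %d\n', best);
```

Under these assumptions the summed mixtures are nearly flat away from the ends of the corridor, and the door observation is the most likely one at s=3, where a door component is centred.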