Institut de Robòtica i Informàtica Industrial

ReadDiscretePOMDPData

PURPOSE

Reads a discrete POMDP from a file.

SYNOPSIS

function POMDPData=ReadDiscretePOMDPData(filename)

DESCRIPTION

   Reads a discrete POMDP from a file.

   Parser for "Tony's POMDP file format"
      http://www.cs.brown.edu/research/ai/pomdp/examples/pomdp-file-spec.html
   This format is only valid for discrete state/actions/observations POMDPs.
 
   The application we had in mind when developing this planner is robot
   navigation. Therefore we assume an observation model and a
   reward model typical of this application:

         p(o|s) -> depends only on the reached state 's' and not on the
                   executed action (as it does in the general POMDP model)

         r_a(s) -> depends only on the reached state 's' and on the executed
                   action 'a', but not on the departing state or the obtained
                   observation (as it does in the general POMDP model).
                   Actually, in a pure robot navigation scenario the
                   reward depends only on the reached state and not on the
                   executed action, but we keep the action dependency to
                   keep the code consistent with the papers.

   POMDPs whose observation or reward model does not fit these assumptions
   produce an error.
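
   As an illustration of why the action-independent observation model is
   convenient, the following is a minimal sketch of a belief update. It
   assumes the layout used by the source code on this page, T{a}(s',s) and
   O{o}(s'); the two-state model and all numbers are hypothetical.

```matlab
% Hand-built two-state model (hypothetical numbers, for illustration only).
T = {[0.9 0.1; 0.1 0.9]};      % T{a}(s',s) = p(s'|s,a); columns sum to 1
O = {[0.8; 0.2], [0.2; 0.8]};  % O{o}(s')  = p(o|s')

b = [0.5; 0.5];                % current belief over states (column vector)
a = 1;                         % executed action
o = 1;                         % received observation

predicted = T{a} * b;          % sum_s p(s'|s,a) b(s)
unnorm    = O{o} .* predicted; % weight by p(o|s'), which does not depend on a
bNext     = unnorm / sum(unnorm);
```

   Because p(o|s') does not depend on the action, a single vector per
   observation is enough; a general POMDP would need one such vector per
   (action, observation) pair.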

   This function is adapted from that defined in the PERSEUS
   software by Matthijs Spaan and Nikos Vlassis.

   Parameters:
        filename  - string denoting POMDP file to be parsed
   Outputs:
        POMDPData     - struct (see below)
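
   A possible usage sketch follows. It writes a small, hypothetical
   two-state model to a temporary file and parses it, assuming
   ReadDiscretePOMDPData.m is on the MATLAB path. The model is written so
   that observations do not depend on the action and rewards depend only on
   the action and the reached state, as this parser requires. The sketch has
   not been run against the function, so treat it as a starting point.

```matlab
% Lines of a tiny POMDP file in Tony's format (hypothetical model).
lines = { ...
  'discount: 0.95', ...
  'values: reward', ...
  'states: 2', ...
  'actions: 2', ...
  'observations: 2', ...
  'T: 0', ...
  '0.9 0.1', ...
  '0.1 0.9', ...
  'T: 1', ...
  'identity', ...
  'O: * : 0', ...
  '0.8 0.2', ...
  'O: * : 1', ...
  '0.2 0.8', ...
  'R: * : * : 0 : * 1.0', ...
  'R: * : * : 1 : * 0.0'};

fname = [tempname '.POMDP'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n',lines{:});   % one file line per cell entry
fclose(fid);

% Parse the file only if the reader is available on the path.
if exist('ReadDiscretePOMDPData','file')
  P = ReadDiscretePOMDPData(fname);
  disp(P.nStates);
  disp(P.T{1});
end
delete(fname);
```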


   POMDPData struct members definition:
 
     nStates       - number of states
     StateNames    - (nStates x X) chars, name of each state *)

     nActions      - number of actions
     ActionNames   - (nActions x X) chars, name of each action *)
     Actions       - {nActions} This is used as a description of the action by
                     Perseus. For discrete-action POMDPs (such as the ones read
                     here) these are just the numbers from 1 to nActions. For
                     continuous action spaces each element of this set can
                     be a vector describing an action or whatever is needed.

     nObs          - number of observations
     ObsNames      - (nObs x X) chars, name of each observation *)

     gamma         - discount factor.
     start         - (nStates x 1) start distribution *)

     T             - {nActions} cell array of (nStates x nStates) matrices
                                                       T{a}(s',s)=p(s'|s,a)

     O             - {nObs} cell array of (nStates x 1) vectors
                                                       O{o}(s')=p(o|s')

     R             - {nActions} cell array of (nStates x 1) vectors
                                                       R{a}(s')=r_a(s')

   Members marked with *) are optional: they might not be present in
   the POMDP file, in which case these members are missing or
   empty.
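
   The struct layout can be checked with a few lines of code. The sketch
   below builds a hypothetical model by hand, using the cell-array layout the
   source code on this page produces (T{a}, O{o}, R{a}), verifies that the
   probabilities are normalized, and computes an expected immediate reward.
   The variable names tol, okT, okO and rExp are illustrative only.

```matlab
% Hypothetical two-state, one-action, two-observation model.
P.nStates = 2; P.nActions = 1; P.nObs = 2;
P.T = {[0.9 0.1; 0.1 0.9]};      % T{a}(s',s)
P.O = {[0.8; 0.2], [0.2; 0.8]};  % O{o}(s')
P.R = {[1; 0]};                  % R{a}(s')

tol = 1e-9;

% Each column of T{a} is a distribution over next states s'.
okT = true;
for a = 1:P.nActions
  okT = okT && all(abs(sum(P.T{a},1) - 1) < tol);
end

% For each reached state s', the observation probabilities sum to one.
obsTotal = zeros(P.nStates,1);
for o = 1:P.nObs
  obsTotal = obsTotal + P.O{o};
end
okO = all(abs(obsTotal - 1) < tol);

% Expected immediate reward of action 1 under belief b:
% sum_s' r_a(s') * sum_s p(s'|s,a) b(s)
b = [0.5; 0.5];
rExp = P.R{1}' * (P.T{1} * b);
```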

CROSS-REFERENCE INFORMATION

This function calls:
  • size Returns the size of a policy.
This function is called by:

SUBFUNCTIONS

SOURCE CODE

function POMDPData=ReadDiscretePOMDPData(filename)
%   Reads a discrete POMDP from a file.
%
%   Parser for "Tony's POMDP file format"
%      http://www.cs.brown.edu/research/ai/pomdp/examples/pomdp-file-spec.html
%   This format is only valid for discrete state/actions/observations POMDPs.
%
%   The application we had in mind when developing this planner is robot
%   navigation. Therefore we assume an observation model and a
%   reward model typical of this application:
%
%         p(o|s) -> depends only on the reached state 's' and not on the
%                   executed action (as it does in the general POMDP model)
%
%         r_a(s) -> depends only on the reached state 's' and on the executed
%                   action 'a', but not on the departing state or the obtained
%                   observation (as it does in the general POMDP model).
%                   Actually, in a pure robot navigation scenario the
%                   reward depends only on the reached state and not on the
%                   executed action, but we keep the action dependency to
%                   keep the code consistent with the papers.
%
%   POMDPs whose observation or reward model does not fit these assumptions
%   produce an error.
%
%   This function is adapted from that defined in the PERSEUS
%   software by Matthijs Spaan and Nikos Vlassis.
%
%   Parameters:
%        filename  - string denoting POMDP file to be parsed
%   Outputs:
%        POMDPData     - struct (see below)
%
%
%   POMDPData struct members definition:
%
%     nStates       - number of states
%     StateNames    - (nStates x X) chars, name of each state *)
%
%     nActions      - number of actions
%     ActionNames   - (nActions x X) chars, name of each action *)
%     Actions       - {nActions} This is used as a description of the action by
%                     Perseus. For discrete-action POMDPs (such as the ones read
%                     here) these are just the numbers from 1 to nActions. For
%                     continuous action spaces each element of this set can
%                     be a vector describing an action or whatever is needed.
%
%     nObs          - number of observations
%     ObsNames      - (nObs x X) chars, name of each observation *)
%
%     gamma         - discount factor.
%     start         - (nStates x 1) start distribution *)
%
%     T             - {nActions} cell array of (nStates x nStates) matrices
%                                                       T{a}(s',s)=p(s'|s,a)
%
%     O             - {nObs} cell array of (nStates x 1) vectors
%                                                       O{o}(s')=p(o|s')
%
%     R             - {nActions} cell array of (nStates x 1) vectors
%                                                       R{a}(s')=r_a(s')
%
%   Members marked with *) are optional: they might not be present in
%   the POMDP file, in which case these members are missing or
%   empty.
%
  if nargin<1
    error('Specify filename to be parsed');
  end

  % read the file as a cell array of lines; shell-style (#) comments are dropped
  file=textread(filename,'%s','delimiter','\n','whitespace','','bufsize',100000,'commentstyle','shell');
  nrLines=length(file);

  % read the preamble
  POMDPData=processPreamble(file);

  if POMDPData.nStates<1
    error('POMDPData has only %d states.',POMDPData.nStates);
  end
  if POMDPData.nActions<1
    error('POMDPData has only %d actions.',POMDPData.nActions);
  end
  if POMDPData.nObs<1
    error('POMDPData has only %d observations.',POMDPData.nObs);
  end

  % allocate memory
  for i=1:POMDPData.nObs
    POMDPData.O{i}=zeros(POMDPData.nStates,1);
  end
  POMDPData.T=cell(1,POMDPData.nActions);
  for i=1:POMDPData.nActions
    POMDPData.T{i}=zeros(POMDPData.nStates,POMDPData.nStates);
  end
  for i=1:POMDPData.nActions
    POMDPData.R{i}=zeros(POMDPData.nStates,1);
  end

  % process each line
  for i=1:nrLines
    if ~isempty(file{i})
      switch file{i}(1)
        case 'T'
          if ~isempty(strfind(file{i},':'))
            POMDPData=processTransition(POMDPData,file,i);
          end
        case 'R'
          if ~isempty(strfind(file{i},':'))
            POMDPData=processReward(POMDPData,file,i);
          end
        case 'O'
          if ~isempty(strfind(file{i},':'))
            POMDPData=processObservation(POMDPData,file,i);
          end
        case 's'
          % strncmp is safe for lines shorter than 6 characters
          if strncmp(file{i},'start:',6)
            [s,f,t]=regexp(file{i},'([-\d\.]+)');
            [foo,d]=size(t);
            if d~=POMDPData.nStates
              % the distribution is given on the next line
              POMDPData.start=parseNextLine(file,i+1,POMDPData.nStates,1)';
            else
              POMDPData.start=zeros(d,1);
              string=file{i};
              for j=1:d
                POMDPData.start(j)=str2double(string(t{j}(1):t{j}(2)));
              end
            end
          end
      end
    end
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function POMDPData = processPreamble(file)
  [nr,members]=getNumberAndMembers(file,'states:');
  POMDPData.nStates=nr;
  POMDPData.StateNames=members;

  [nr,members]=getNumberAndMembers(file,'actions:');
  POMDPData.nActions=nr;
  POMDPData.ActionNames=members;
  for a=1:POMDPData.nActions
    POMDPData.Actions{a}=a; % In discrete-action POMDPs actions are just numbered
  end

  [nr,members]=getNumberAndMembers(file,'observations:');
  POMDPData.nObs=nr;
  POMDPData.ObsNames=members;

  for i=1:length(file)
    if strmatch('discount:',file{i})
      POMDPData.gamma=sscanf(file{i},'discount: %f');
      break;
    end
  end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [nr, members] = getNumberAndMembers(file,baseString)
  % find the first line starting with baseString
  string='';
  for i=1:length(file)
    if strmatch(baseString,file{i})
      string=file{i};
      break;
    end
  end
  if isempty(string)
    error('No "%s" entry found in the POMDP file.',baseString);
  end

  % try to find a number here
  [s,f,t]=regexp(string,sprintf('%s%s',baseString,'\s*(\d+)'));
  if isempty(s)
    % catch 'X: <list of X>' where X={states,actions,observations}
    % first strip baseString
    [s,f,t]=regexp(string,baseString);
    string1=string(f(1)+1:end);
    % see if there are more members on the next lines
    % (stop at the first line with a ':' or at the end of the file)
    stop=0;
    k=0;
    while ~stop && i+k<length(file)
      k=k+1;
      if isempty(strfind(file{i+k},':'))
        string1=strcat([string1 ' ' file{i+k}]);
      else
        stop=1;
      end
    end
    [s,f,t]=regexp(string1,'\s*(\S+)\s*');
    [foo,nr]=size(t);
    members='';
    for a=1:nr
      members=strvcat(members,string1(t{a}(1):t{a}(2)));
    end
  else
    nr=str2double(string(t{1}(1):t{1}(2)));
    members='';
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function POMDPData = processTransition(POMDPData,file,i)
  string=file{i};

  if nnz(string==':')==3
    % catch 'T: <action> : <start-state> : <end-state> <prob>'
    pat='T\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s+([\d\.]+)';
    [s,f,t]=regexp(string,pat);

    if ~isempty(t)
      prob=str2double(string(t{1}(4,1):t{1}(4,2)));
    else % probably the prob is on the next line
      % catch 'T: <action> : <start-state> : <end-state>'
      pat='T\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s*';
      [s,f,t]=regexp(string,pat);
      prob=parseNextLine(file,i+1,1,1);
    end

    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    from=expandState(POMDPData,string(t{1}(2,1):t{1}(2,2)));
    to=expandState(POMDPData,string(t{1}(3,1):t{1}(3,2)));

    % action is a column vector when the wildcard '*' is used
    for a=action'
      POMDPData.T{a}(to,from)=prob;
    end

  elseif nnz(string==':')==2
    % catch 'T: <action> : <start-state>'
    pat='T\s*:\s*(\S+)\s*:\s*(\S+)';
    [s,f,t]=regexp(string,pat);
    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    from=expandState(POMDPData,string(t{1}(2,1):t{1}(2,2)));
    % catch all probs
    % first try if they are at the end of this line
    string=string(t{1}(2,2)+1:end);
    [s,f,t]=regexp(string,'([\d\.]+)');
    [foo,d]=size(t);
    if d~=POMDPData.nStates
      % hmm, probably they are on the next line
      string=file{i+1};
      [s,f,t]=regexp(string,'([\d\.]+)');
      [foo,d]=size(t);
      if d~=POMDPData.nStates
        error('Not the correct number of probabilities on the next line.');
      end
    end

    for to=1:d
      prob=str2double(string(t{to}(1):t{to}(2)));
      for a=action'
        POMDPData.T{a}(to,from)=prob;
      end
    end

  else
    % catch 'T: <action>'
    pat='T\s*:\s*(\S+)\s*';
    [s,f,t]=regexp(string,pat);
    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    values=parseNextLine(file,i+1,POMDPData.nStates,POMDPData.nStates);
    % rows in the file are start states; T{a} is indexed (s',s)
    for a=action'
      POMDPData.T{a}(:,:)=values';
    end
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function POMDPData = processObservation(POMDPData,file,i)
  string=file{i};

  if nnz(string==':')==3
    % catch 'O: <action> : <end-state> : <observation> <prob>'
    pat='O\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s+([\d\.]+)';
    [s,f,t]=regexp(string,pat);

    if ~isempty(t)
      prob=str2double(string(t{1}(4,1):t{1}(4,2)));
    else % probably the prob is on the next line
      % catch 'O: <action> : <end-state> : <observation>'
      pat='O\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s*';
      [s,f,t]=regexp(string,pat);
      prob=parseNextLine(file,i+1,1,1);
    end

    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    to=expandState(POMDPData,string(t{1}(2,1):t{1}(2,2)));
    observation=expandObservation(POMDPData,string(t{1}(3,1):t{1}(3,2)));

    if length(action)~=POMDPData.nActions
      error('The observation Model should be independent of the action');
    end
    % observation is a column vector when the wildcard '*' is used
    for k=1:numel(observation)
      POMDPData.O{observation(k)}(to)=prob; % p(o|s)=p(o|s,a)
    end

  elseif nnz(string==':')==2
    % catch 'O: <action> : <end-state>'
    pat='O\s*:\s*(\S+)\s*:\s*(\S+)';
    [s,f,t]=regexp(string,pat);
    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    to=expandState(POMDPData,string(t{1}(2,1):t{1}(2,2)));
    % catch all probs
    % first try if they are at the end of this line
    string=string(t{1}(2,2)+1:end);
    [s,f,t]=regexp(string,'([\d\.]+)');
    [foo,d]=size(t);
    if d~=POMDPData.nObs
      % hmm, probably they are on the next line
      string=file{i+1};
      [s,f,t]=regexp(string,'([\d\.]+)');
      [foo,d]=size(t);
      if d~=POMDPData.nObs
        error('Not the correct number of probabilities on the next line.');
      end
    end

    if length(action)~=POMDPData.nActions
      error('The observation Model should be independent of the action');
    end

    for obs=1:d
      prob=str2double(string(t{obs}(1):t{obs}(2)));
      POMDPData.O{obs}(to)=prob; % p(o|s)=p(o|s,a)
    end
  else
    % catch 'O: <action>'
    pat='O\s*:\s*(\S+)\s*';
    [s,f,t]=regexp(string,pat);
    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    % rows in the file are end states; transpose to (observation,state)
    values=parseNextLine(file,i+1,POMDPData.nObs,POMDPData.nStates)';

    if length(action)~=POMDPData.nActions
      error('The observation Model should be independent of the action');
    end

    for o=1:POMDPData.nObs
      for j=1:POMDPData.nStates
        POMDPData.O{o}(j)=values(o,j); % p(o|s)=p(o|s,a)
      end
    end
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function POMDPData = processReward(POMDPData,file,i)
  string=file{i};

  if nnz(string==':')==4
    % catch 'R: <action> : <start-state> : <end-state> : <observation> <reward>'
    % Reward can be negative
    pat='R\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s+([-\d\.]+)';
    [s,f,t]=regexp(string,pat);

    if ~isempty(t)
      reward=str2double(string(t{1}(5,1):t{1}(5,2)));
    else % probably the reward is on the next line
      % catch 'R: <action> : <start-state> : <end-state> :
      % <observation>'
      pat='R\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s*';
      [s,f,t]=regexp(string,pat);
      reward=parseNextLine(file,i+1,1,1);
    end

    action=expandAction(POMDPData,string(t{1}(1,1):t{1}(1,2)));
    from=expandState(POMDPData,string(t{1}(2,1):t{1}(2,2)));
    to=expandState(POMDPData,string(t{1}(3,1):t{1}(3,2)));
    observation=expandObservation(POMDPData,string(t{1}(4,1):t{1}(4,2)));

    % only wildcard observations and start states fit the assumed r_a(s') model
    if length(observation)~=POMDPData.nObs
      error('Reward should be independent of the observation');
    end
    if length(from)~=POMDPData.nStates
      error('Reward should be independent of the departing state');
    end

    for k=1:size(action,1)
      POMDPData.R{action(k)}(to)=reward;
    end
  else
    % the shorter 'R:' forms of the file format are not supported
    error('Not yet implemented.');
  end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function values = parseNextLine(file, i, nrCols, nrRows)
  if strmatch('uniform',file{i})
    values=ones(nrRows,nrCols)/nrCols;
  elseif strmatch('identity',file{i})
    values=eye(nrCols);
  else
    [s,f,t]=regexp(file{i},'([-\d\.]+)');
    [foo,d]=size(t);
    if d~=nrCols
      error('Not the correct number of probabilities on the next line.');
    end
    % check whether this is just a single line of numbers or a full
    % matrix
    if i<length(file)
      numbers=sscanf(file{i+1},'%f');
    else
      numbers=[];
    end
    if any(size(numbers)==0)
      values=zeros(1,d);
      string=file{i};
      for j=1:d
        values(j)=str2double(string(t{j}(1):t{j}(2)));
      end
    else
      % find out how many lines (stop at the end of the file)
      i1=i;
      running=1;
      while running && i1<length(file)
        numbers=sscanf(file{i1+1},'%f');
        if any(size(numbers)~=0)
          i1=i1+1;
        else
          running=0;
        end
      end
      values=zeros(i1+1-i,d);
      % parse them all
      for k=i:i1
        [s,f,t]=regexp(file{k},'([-\d\.]+)');
        string=file{k};
        for j=1:d
          values(k+1-i,j)=str2double(string(t{j}(1):t{j}(2)));
        end
      end
    end
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function r = expandState(POMDPData,c)
  r=expandString(c,POMDPData.nStates,POMDPData.StateNames);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function r = expandAction(POMDPData,c)
  r=expandString(c,POMDPData.nActions,POMDPData.ActionNames);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function r = expandObservation(POMDPData,c)
  r=expandString(c,POMDPData.nObs,POMDPData.ObsNames);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function r = expandString(c,nr,members)
  if strcmp(c,'*')
    r=(1:nr)'; % wildcard: all indices, as a column vector
  else
    r=strmatch(c,members,'exact');
    if isempty(r) % apparently c is a numbered state
      r=str2double(c)+1; % Matlab starts at 1, not 0
    end
  end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
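
The listing above extracts fields through the token extents returned by regexp. As a hedged alternative, the sketch below parses one fully specified transition entry with the 'tokens' option, which returns the matched substrings directly. The sample line, the three-state size and the variable names are hypothetical; the 0-based to 1-based shift mirrors what expandString does for numbered identifiers.

```matlab
line = 'T: 1 : 0 : 2 0.25';   % hypothetical entry with numbered action/states
pat  = 'T\s*:\s*(\S+)\s*:\s*(\S+)\s*:\s*(\S+)\s+([\d\.]+)';
tok  = regexp(line, pat, 'tokens', 'once');   % 1x4 cell array of strings

% Identifiers in the file are 0-based; MATLAB indices are 1-based.
a     = str2double(tok{1}) + 1;   % action index
sFrom = str2double(tok{2}) + 1;   % start state s
sTo   = str2double(tok{3}) + 1;   % end state s'
p     = str2double(tok{4});       % transition probability

% Store the entry using the T{a}(s',s) layout, here for 2 actions, 3 states.
T = {zeros(3), zeros(3)};
T{a}(sTo, sFrom) = p;
```

This sketch covers only numbered identifiers; named states and the '*' wildcard would still need a lookup such as the one in expandString.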


Generated on Wed 05-Aug-2009 15:05:21 by m2html © 2003