Contents
Learning Inverse Kinematics
%{
Ref:
[1] A. D'Souza, S. Vijayakumar, and S. Schaal, "Learning Inverse Kinematics".
[2] "Inverse kinematics", https://en.wikipedia.org/wiki/Inverse_kinematics
[3] S. Schaal, LWPR software, http://www-clmc.usc.edu/software/lwpr/

Terminology:
DOF : Degree of Freedom - the number of free variables in the model.

By the definition given in [2], inverse kinematics refers to the use of the kinematic
equations of a robot to determine the joint parameters that provide a desired position
of the end-effector [2]. The researchers in [1] use a powerful supervised learning
algorithm to learn inverse kinematics. Learning inverse kinematics here means real-time
control of the end-effector, which in the paper is realized with resolved motion rate
control (RMRC). Their experiment is based on augmenting the input space and is compared
with the analytical pseudo-inverse solution with optimization.

The inverse kinematics problem, as described in the paper, is the transformation from
kinematic plans in external coordinates to internal coordinates; this inverse
transformation is ill-posed [1]. In forward kinematics the joint configuration is an
n-dimensional vector of joint angles $\theta$ and the end-effector position is an
m-dimensional vector $x$, related by $x = f(\theta)$. Inverse kinematics instead
requires $\theta = f^{-1}(x)$, and an optimal solution must be chosen among all
possible solutions of this latter equation. In [1] a convex criterion $g = g(\theta)$
is proposed so that the problem has a global optimum.

Finding $\theta$ can be solved by two approaches: a global approach, which is
computationally expensive, and local methods. The local approach is feasible in real
time, as stated in [1]: $\dot{\theta}$ is computed from $\dot{x}$, and integrating
$\dot{\theta}$ generates the entire joint-space path [1]. RMRC is such a local
approach. The change of $x$ is described by the Jacobian $J$ of the forward
kinematics: $\dot{x} = J(\theta)\dot{\theta}$. In their project, the 4 degrees of
freedom (DOF) of the eyes are neglected and 26 DOF of the humanoid are considered for
reaching an object. One solution that minimizes $g$ is the pseudo-inverse solution

$$\dot{\theta} = J^{\#}\dot{x} - \alpha (I - J^{\#}J)\,\nabla_{\theta} g$$

whose general form is $\dot{\theta} = J_E^{-1}\dot{x}_{aug}$ [1].
------------------------------------------------------------
LWPR Algorithm:

Learning inverse kinematics has two main advantages: first, it helps when the
computational complexity of analytical solutions becomes high, and second, it does not
produce ill-conditioned matrix inversions or impossible postures for the humanoid [1].
The major problem is that the mapping from $\dot{x}$ to $\dot{\theta}$ is not a convex
function; many solutions have been experimented with, and a spatially localized
learning algorithm is a proper one. Locally Weighted Projection Regression (LWPR) is a
supervised learning algorithm that takes advantage of piecewise linear models; each
region of validity, called a receptive field, is derived from a Gaussian function:

$$w_k = \exp\left(-\tfrac{1}{2}(x - c_k)^T D_k (x - c_k)\right)$$ (1)

where $c_k$ is the center of the k-th linear model and $D_k$ is the distance metric
that determines the shape and size of the region of validity of the linear model [1].
The input is $x$ and the output $\hat{y}$ is the weighted mean of the predictions
$y_k$ of all linear models, equation (2) from [1]:

$$\hat{y} = \frac{\sum_{k=1}^{K} w_k y_k}{\sum_{k=1}^{K} w_k}$$ (2)

LWPR reformulates the learning function and updates each local model incrementally, in
a manner similar to Newton's method. A short illustrative sketch of the pseudo-inverse
step and of equations (1)-(2) is given after the training output below. The code in
this report is based on the lwpr.m library and the test code from [3]
(http://www-clmc.usc.edu/software/lwpr/), especially for the learning phase of the
process.
---------------------------------------------------------------
%}

function [s,R] = test_lwpr_nD_RmodelIterOpti(s,R)
global lwprs;

% Setting up the variables
d    = 5;    % dimension of the rotated input space
n    = 500;  % number of samples for LWPR
inD  = 3;    % dimension of the input
outD = 2;    % dimension of the output

% Random training set using the non-linear function from R3 --> R2
X = (rand(n,inD))*2*pi;                                     % generation of the input space
Y = [cos(X(:,1)).*sin(X(:,2)), sin(X(:,1)).*cos(X(:,3))];   % generation of the output space
YnoNoise = Y;                  % store the noise-free function output
Y = Y + randn(n,outD)*0.01;    % add noise to simulate noisy sensor output
Yorg = Y;
disp('data generated');

% rotate input data into d-dimensional space
R = randn(d);   % random square matrix
R = R'*R;       % make it symmetric
R = orth(R);    % orthonormal basis, as needed by the LWPR library
Xorg = X;
X = [X zeros(n,d-3)]*R;   % rotate the input into the new space

disp('Creating test data');
% a systematic test set on a grid
Xt = [];   % setting up the trajectory data
for i=0:0.05:2*pi,
  Xt = [Xt; i pi/4 pi/4];
end
Yt = [cos(Xt(:,1)).*sin(Xt(:,2)), sin(Xt(:,1)).*cos(Xt(:,3))];   % setting up the test targets

% rotate the test data according to the R matrix
Xtorg = Xt;
Xt = [Xt zeros(length(Xt),d-3)]*R;

% initialize LWPR, use only a diagonal distance metric
ID = 1;   % LWPR model ID
if ~exist('s') | isempty(s)
  lwpr('Init',ID,d,2,1,0,0,1e-7,50,ones(d,1),[1],'lwpr_test');
else
  lwpr('Init',ID,s);
end

% set some initial parameters for the LWPR algorithm, taken from the original sample code
kernel = 'Gaussian';
lwpr('Change',ID,'init_D',eye(d)*25);
lwpr('Change',ID,'init_alpha',ones(d)*100);
lwpr('Change',ID,'w_gen',0.2);
lwpr('Change',ID,'meta',1);
lwpr('Change',ID,'meta_rate',100);
lwpr('Change',ID,'init_lambda',0.995);
lwpr('Change',ID,'final_lambda',0.9999);
lwpr('Change',ID,'tau_lambda',0.9999);

disp('Training the model');
% train the model
for j=1:20
  inds = randperm(n);   % present the training points in random order
  mse = 0;              % reset the mse accumulator
  for i=1:n,
    [yp,w,np] = lwpr('Update',ID,X(inds(i),:)',Y(inds(i),:)');   % incremental LWPR update for each training point
    mse = mse + (Y(inds(i),:)'-yp).^2;                           % accumulate squared error
  end
  nMSE = mse/n/var(Y,1)';   % normalized mean squared error on the training set
  disp(sprintf('#Data=%d #rfs=%d nMSE=%5.3f #proj=%4.2f (TrainingSet)',lwprs(ID).n_data,length(lwprs(ID).rfs),nMSE,np));
end

disp('Predicting information');
% create predictions for the test data (to show what the model has learned)
Yp = zeros(size(Yt));   % initialize the output dataset
for i=1:length(Xt),
  [yp,w]=lwpr('Predict',ID,Xt(i,:)',0.001);   % use LWPR for prediction
  Yp(i,:) = yp;                               % append to the output set
end
ep = Yt-Yp;          % difference between actual and predicted function
mse = mean(ep.^2);   % mean squared error
nmse = mse/var(Y,1);
disp(sprintf('#Data=%d #rfs=%d nMSE=%5.3f #proj=%4.2f (TestSet)',lwprs(ID).n_data,length(lwprs(ID).rfs),nmse,np));

figure(1);
clf;

% plot the raw noisy data
subplot(2,2,1);
axis([-1 1 -1 1]);
plot(Yorg(:,1),Yorg(:,2),'-');
title('Noisy data samples');

% plot the fitted function
subplot(2,2,2);
axis([-1 1 -1 1]);
plot(Yp(:,1),Yp(:,2),'-');
title(sprintf('Mapped Function Set: nMSE=%5.3f',nmse));
data generated
Creating test data
Training the model
#Data=500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=9.422855e-01 #rfs=1.004568e+00 nMSE=2.000 #proj=
#Data=1000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=8.446115e-01 #rfs=9.220652e-01 nMSE=2.000 #proj=
#Data=1500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.858839e-01 #rfs=4.926104e-01 nMSE=2.000 #proj=
#Data=2000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.577122e-01 #rfs=2.454644e-01 nMSE=2.000 #proj=
#Data=2500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=1.390693e-01 #rfs=1.480553e-01 nMSE=2.000 #proj=
#Data=3000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=1.207992e-01 #rfs=1.339724e-01 nMSE=2.000 #proj=
#Data=3500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=6.342840e-02 #rfs=6.727516e-02 nMSE=2.000 #proj=
#Data=4000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.760527e-02 #rfs=6.315051e-02 nMSE=2.000 #proj=
#Data=4500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.744065e-02 #rfs=6.298782e-02 nMSE=2.000 #proj=
#Data=5000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.728728e-02 #rfs=6.284395e-02 nMSE=2.000 #proj=
#Data=5500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.716327e-02 #rfs=6.271979e-02 nMSE=2.000 #proj=
#Data=6000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=4.702750e-02 #rfs=6.259912e-02 nMSE=2.000 #proj=
#Data=6500 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.493273e-03 #rfs=2.253737e-03 nMSE=2.000 #proj=
#Data=7000 #rfs=419 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.384653e-03 #rfs=2.149709e-03 nMSE=2.000 #proj=
#Data=7500 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.289735e-03 #rfs=2.054802e-03 nMSE=2.000 #proj=
#Data=8000 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.195507e-03 #rfs=1.970137e-03 nMSE=2.000 #proj=
#Data=8500 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.104682e-03 #rfs=1.891604e-03 nMSE=2.000 #proj=
#Data=9000 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=2.033538e-03 #rfs=1.821009e-03 nMSE=2.000 #proj=
#Data=9500 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=1.957919e-03 #rfs=1.747626e-03 nMSE=2.000 #proj=
#Data=10000 #rfs=420 nMSE=0.000 #proj=0.00 (TrainingSet)#Data=1.891221e-03 #rfs=1.681543e-03 nMSE=2.000 #proj=
Predicting information
#Data=10000 #rfs=420 nMSE=0.041 #proj=2.00 (TestSet)
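As promised in the introduction, here is a standalone, illustrative sketch of the two building blocks described above. It is not part of lwpr.m and is not used by the rest of this report; every value in it (the Jacobian, the gradient of g, the receptive-field centers, the distance metric, and the local slopes) is an assumed placeholder. Part (a) performs one pseudo-inverse RMRC step with a null-space optimization term, and part (b) evaluates the receptive-field weights of equation (1) and the weighted prediction of equation (2).

% Illustrative sketch only; all quantities below are made-up placeholders.

% --- (a) Pseudo-inverse RMRC step with a null-space optimization term ---
J      = [1 0.5 0.2; 0 1 0.4];   % assumed 2x3 Jacobian of a redundant arm
xdot   = [0.10; -0.05];          % desired end-effector velocity (assumed)
g_grad = [0.2; 0.0; -0.1];       % assumed gradient of the convex criterion g(theta)
alpha  = 0.5;                    % step size of the null-space term (assumed)
Jpinv  = pinv(J);                % J#, the Moore-Penrose pseudo-inverse
theta_dot = Jpinv*xdot - alpha*(eye(3) - Jpinv*J)*g_grad;   % particular + null-space solution

% --- (b) LWPR-style prediction from local linear models, equations (1) and (2) ---
x  = [0.3; 0.7];                 % query input
ck = [0 0; 1 0; 0 1; 1 1]';      % centers c_k of four hypothetical receptive fields
Dk = eye(2)*4;                   % distance metric D_k (learned per field in real LWPR)
bk = [ 0.1  0.5;                 % slopes of the local linear models (made up, 1-D output)
      -0.2  0.3;
       0.4 -0.1;
       0.0  0.2];
K  = size(ck,2);
w  = zeros(K,1);
yk = zeros(K,1);
for k = 1:K
  w(k)  = exp(-0.5*(x - ck(:,k))'*Dk*(x - ck(:,k)));   % equation (1): Gaussian weight
  yk(k) = bk(k,:)*(x - ck(:,k));                       % prediction of the k-th linear model
end
yhat = sum(w.*yk)/sum(w);                              % equation (2): weighted mean prediction

In the full LWPR algorithm each receptive field additionally learns its own distance metric and uses low-dimensional projections for the local regression; the sketch above only mirrors the structure of equations (1) and (2).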
Optimization
%{
In order to obtain the particular solution from [1], the cost function in equation (4)
is minimized; it keeps the prediction close to the observed joint velocity while
pulling it toward the optimal posture, and the weight matrix $W$ in (4) determines the
contribution of each DOF to the cost.
%
% $$\begin{equation} Q = \frac{1}{2} (\dot{\theta}-\hat{\dot{\theta}})^T(\dot{\theta}-\hat{\dot{\theta}}) +\\
% \frac{1}{2}\alpha(\hat{\dot{\theta}}-\frac{\Delta\theta}{\Delta t})^T W(\hat{\dot{\theta}}-\frac{\Delta\theta}{\Delta t}) \end{equation} $$ (4)
%
where $\Delta\theta = \theta_{opt} - \theta$, $W$ is a diagonal weight matrix, and
$\hat{\dot{\theta}}$ is the current prediction given the input $z=(\dot{x},\theta)$ [1].
The cost function is minimized by using equation (5); a short numerical sketch of this
update follows below.
%
% $$\begin{equation} \dot{\theta}_{target} = \dot{\theta} - \alpha W (\hat{\dot{\theta}} - \Delta\theta) \end{equation} $$ (5)
%
%}
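As a small numerical sketch of equation (5): all values below are placeholder assumptions, not values from [1] or from the experiment that follows.

% Minimal sketch of the optimization bias of equation (5); every value is an assumed placeholder.
theta     = [ 0.40; -0.20;  1.10];   % current joint angles (assumed)
theta_opt = [ 0.00;  0.00;  0.00];   % preferred optimal posture theta_opt (assumed)
theta_dot = [ 0.30;  0.10; -0.20];   % sampled joint velocity (assumed)
theta_hat = [ 0.25;  0.05; -0.15];   % current LWPR prediction for this input (assumed)
alpha     = 0.6;                     % strength of the optimization term
W         = diag([1 1 1]);           % diagonal weight: contribution of each DOF
dtheta    = theta_opt - theta;                               % Delta theta as used in (4)
theta_dot_target = theta_dot - alpha*W*(theta_hat - dtheta)  % equation (5)

The target is pulled toward the preferred posture in proportion to alpha and W; the next cell applies an analogous alpha-weighted adjustment, but to the input point, before handing it to lwpr('Update',...).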
Doing single-input learning with the optimization method
Now we update the model with additional data points, presented one at a time, and plot the resulting trajectory.
samples = 500;   % number of sample points
alpha = 0.6;     % the alpha weighting from equation (5)
Ypi = zeros(samples,2);
for i = 1:samples
  idealXi = [i/samples*8*pi pi/4 pi/4];   % set up the optimal (preferred) input point
  Xi = [i/samples*2*pi pi/4 pi/4];        % set up the learning data for a single point
  Yi = [cos(Xi(1)).*sin(Xi(2)), sin(Xi(1)).*cos(Xi(3))];
  Xaug = Xi + transpose(alpha.*eye(3)*(Xi - idealXi)');   % augment the point with the optimization bias
  Xaug = [Xaug zeros(1, d-3)]*R;                          % rotate the input data into d-dimensional space
  [yp,w,np] = lwpr('Update',ID,Xaug(:),Yi(:));            % update LWPR
  Ypi(i,:) = yp;                                          % append to the output set
end

% Plot the data
subplot(2,2,4);
axis([-1 1 -1 1]);
plot(Ypi(:,1),Ypi(:,2),'-');
title('Function with optimizer');
Doing single-input learning without the optimization method
Ypi = zeros(samples,2);
for i = 1:samples
  Xi = [i/samples*2*pi pi/4 pi/4];   % mapping the trajectory to teach
  Yi = [cos(Xi(1)).*sin(Xi(2)), sin(Xi(1)).*cos(Xi(3))];
  Xi = [Xi zeros(1,d-3)]*R;          % rotate input data into d-dimensional space
  [yp,w,np] = lwpr('Update',ID,Xi(:),Yi(:));
  Ypi(i,:) = yp;
end

% Plot the information
subplot(2,2,3);
axis([-1 1 -1 1]);
plot(Ypi(:,1),Ypi(:,2),'-');
title('No optimized function');
Conclusion
%{
In conclusion, batch training alone does not bring the result close to the desired
figure, which in this project is a simple circle. Single-input (incremental) learning
produces a rough figure that is already better than the batch result. Using the
optimization cost function, which biases the solution toward the optimal point
($\theta_{opt}$), the result converges to the optimal point more accurately. The
following flow chart shows how the two tests are run as a comparison between using the
optimizer and not using it:
%                        ________________
%                       | Batch Learning |
%                       |________________|
%                          /          \
%                         /            \_________
%     ___________________/________      _________\__________________
% |->|Trajectory Single Input Gen |    |Trajectory Single Input Gen |<-|
% |  |____________________________|    |____________________________|  |
% |                 |                                 |                |
% |   ______________|_____________          _________|_________       |
% |__|LWPR Single Input Learning |         |Optimizer Function |       |
%    |___________________________|         |___________________|       |
%                                            _________|________________|
%                                           |LWPR Single Input Learning |__|
%                                           |____________________________|
%
%}