Dekalog Blog: Process Noise Covariance Matrix Q for a Kalman Filter

Since my last post I have been working on the process noise covariance matrix Q, with a view to optimising both the Q and R matrices for an Extended Kalman filter to model the cyclic component of price action as a Sine wave. However, my work to date has produced unsatisfactory results and I have decided to give up trying to make it work.

The reasons for this failure are unclear to me, and I don't intend to spend any more time investigating, but some educated guesses are that the underlying model of sinusoid is mismatched and my estimation of the process and/or measurement noise variances is lacking; either way, the end result is that the EKF is diverging and my earlier leading signal 1, leading signal 2 and leading signal 3 posts outline what I think will be a more promising line of investigation in the future.

Nevertheless, below I provide the rough working code that I have been using in my EKF work, and maybe readers will find something of value in it. There is a lot of commenting and some blocked out code and I'm afraid readers will have to wade through this as best they can.

clear all ;
1 ;

pkg load signal ;

% function declarations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Y = ekf_sine_h( x )
% Measurement model function for the sine signal.
% Takes the state input vector x of sine, phase, angular frequency and amplitude and 
% calculates the current value of the sine given the state vector values.

f = x( 2 , : ) ;    % phase value in radians
a = x( 4 , : ) ;    % amplitude
Y = a .* sin( f ) ; % the sine value

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [ cost ] = sine_ekf_Q_optim( Q_mult , data , targets , Q , R )
  
% initial state from noisy data input measurements
s = data( : , 1 ) ;

n = size( data , 1 ) ;

% initial state covariance 
P = eye( n ) ;

% allocate memory
xV = zeros( size( data ) ) ;  % record estmates of states

% basic Jacobian of state transition matrix 
A = [ 0 0 0 0 ;   % sine value
      0 1 1 0 ;   % phase
      0 0 1 0 ;   % angular frequency
      0 0 0 1 ] ; % amplitude
      
H = eye( n ) ; % measurement matrix

Q = Q_mult .* Q ;

  for k = 2 : size( data , 2 )

  % do ekf
  % nonlinear update x and linearisation at current state s
  x = s ;
  x( 2 ) = x( 2 ) + x( 3 ) ;        % advance phase value by angular frequency
  x( 2 ) = mod( x( 2 ) , 2 * pi ) ; % limit phase values to range 0 --- 2 * pi
  x( 1 ) = x( 4 ) * sin( x( 2 ) ) ; % sine value calculation

  % update the 1st row of the jacobian matrix at state vector s values
  A( 1 , : ) = [ 0 s( 4 ) * cos( s( 2 ) ) 0 sin( s( 2 ) ) ] ;

  P = A * P * A' + Q ; % state transition model update of covariance matrix P

  measurement_residual = [ data( 1 , k ) - x( 1 ) ;   % sine residual
                           data( 2 , k ) - x( 2 ) ;   % phase residual
                           data( 3 , k ) - x( 3 ) ;   % angular frequency residual
                           data( 4 , k ) - x( 4 ) ] ; % amplitude residual
                           
  innovation_residual_covariance = H * P * H' + R ;
  kalman_gain = P * H' / innovation_residual_covariance ;

  % update the state vector s with kalman_gain 
  s = x + kalman_gain * measurement_residual ;

  % some reality based post hoc adjustments
  s( 2 ) = abs( s( 2 ) ) ; % no negative phase values
  s( 3 ) = abs( s( 3 ) ) ; % no negative angular frequencies
  s( 4 ) = abs( s( 4 ) ) ; % no negative amplitudes

  % update the state covariance matrix P
  % NOTE
  % The Joseph formula is given by P+ = ( I − KH ) P− ( I − KH )' + KRK', where I is the identity matrix,
  % K is the gain, H is the measurement mapping matrix, R is the measurement noise covariance matrix, 
  % and P−, P+ are the pre and post measurement update estimation error covariance matrices, respectively.  
  % The optimal linear unbiased estimator (equivalently the optimal linear minimum mean square error estimator)  
  % or Kalman filter often utilizes simplified covariance update equations such as P+ = (I−KH)P− and P+ = P− −K(HP−H'+R)K'.  
  % While these alternative formulations require fewer computations than the Joseph formula, they are only valid 
  % when K is chosen as the optimal Kalman gain. In engineering applications, situations arise where the optimal 
  % Kalman gain is not utilized and the Joseph formula must be employed to update the estimation error covariance.  
  % Two examples of such a scenario are underweighting measurements and considering states. 
  % Even when the optimal gain is used, the Joseph formulation is still preferable because it possesses 
  % greater numerical accuracy than the simplified equations.
  P = ( eye( n ) - kalman_gain * H ) * P * ( eye( n ) - kalman_gain * H )' + kalman_gain * R * kalman_gain' ;
   
  xV( : , k ) = s ; % save estimated updated states
   
  endfor
  
Y = ekf_sine_h( xV ) ;
cost = mean( ( Y( 4 : end - 3 ) .- targets( 4 : end - 3 ) ).^2 ) ;
%cost = mean( ( Y .- targets ).^2 ) ;
%output = xV( 1 , : ) ;

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function output = run_sine_ekf_Q_optim( data , Q , R )
  
% initial state from noisy data input measurements
s = data( : , 1 ) ;

n = size( data , 1 ) ;

% initial state covariance 
P = eye( n ) ;

% allocate memory
xV = zeros( size( data ) ) ;  % record estmates of states

% basic Jacobian of state transition matrix 
A = [ 0 0 0 0 ;   % sine value
      0 1 1 0 ;   % phase
      0 0 1 0 ;   % angular frequency
      0 0 0 1 ] ; % amplitude
      
H = eye( n ) ; % measurement matrix

  for k = 2 : size( data , 2 )

  % do ekf
  % nonlinear update x and linearisation at current state s
  x = s ;
  x( 2 ) = x( 2 ) + x( 3 ) ;        % advance phase value by angular frequency
  x( 2 ) = mod( x( 2 ) , 2 * pi ) ; % limit phase values to range 0 --- 2 * pi
  x( 1 ) = x( 4 ) * sin( x( 2 ) ) ; % sine value calculation

  % update the 1st row of the jacobian matrix at state vector s values
  A( 1 , : ) = [ 0 s( 4 ) * cos( s( 2 ) ) 0 sin( s( 2 ) ) ] ;

  P = A * P * A' + Q ; % state transition model update of covariance matrix P

  measurement_residual = [ data( 1 , k ) - x( 1 ) ;   % sine residual
                           data( 2 , k ) - x( 2 ) ;   % phase residual
                           data( 3 , k ) - x( 3 ) ;   % angular frequency residual
                           data( 4 , k ) - x( 4 ) ] ; % amplitude residual
                           
  innovation_residual_covariance = H * P * H' + R ;
  kalman_gain = P * H' / innovation_residual_covariance ;

  % update the state vector s with kalman_gain 
  s = x + kalman_gain * measurement_residual ;

  % some reality based post hoc adjustments
  s( 2 ) = abs( s( 2 ) ) ; % no negative phase values
  s( 3 ) = abs( s( 3 ) ) ; % no negative angular frequencies
  s( 4 ) = abs( s( 4 ) ) ; % no negative amplitudes

  % update the state covariance matrix P
  % NOTE
  % The Joseph formula is given by P+ = ( I − KH ) P− ( I − KH )' + KRK', where I is the identity matrix,
  % K is the gain, H is the measurement mapping matrix, R is the measurement noise covariance matrix, 
  % and P−, P+ are the pre and post measurement update estimation error covariance matrices, respectively.  
  % The optimal linear unbiased estimator (equivalently the optimal linear minimum mean square error estimator)  
  % or Kalman filter often utilizes simplified covariance update equations such as P+ = (I−KH)P− and P+ = P− −K(HP−H'+R)K'.  
  % While these alternative formulations require fewer computations than the Joseph formula, they are only valid 
  % when K is chosen as the optimal Kalman gain. In engineering applications, situations arise where the optimal 
  % Kalman gain is not utilized and the Joseph formula must be employed to update the estimation error covariance.  
  % Two examples of such a scenario are underweighting measurements and considering states. 
  % Even when the optimal gain is used, the Joseph formulation is still preferable because it possesses 
  % greater numerical accuracy than the simplified equations.
  P = ( eye( n ) - kalman_gain * H ) * P * ( eye( n ) - kalman_gain * H )' + kalman_gain * R * kalman_gain' ;
   
  xV( : , k ) = s ; % save estimated updated states
   
  endfor 

output = xV ;

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load data
cd /home/dekalog/Documents/octave/indices ;

raw_data = dlmread( 'raw_data_for_indices_and_strengths' ) ;
delete_hkd_crosses = [ 3 9 12 18 26 31 34 39 43 51 61 ] .+ 1 ; % +1 to account for date column
raw_data( : , delete_hkd_crosses ) = [] ;
raw_data( : , 1 ) = [] ; % delete date vector
% so now
%    1       2       3       4       5       6       7       8      9      10      11      12      13     14
% aud_cad aud_chf aud_jpy aud_nzd aud_sgd aud_usd cad_chf cad_jpy cad_sgd chf_jpy eur_aud eur_cad eur_chf eur_gbp
%
%   15      16      17      18      19      20      21      22      23      24      25      26      27      28
% eur_jpy eur_nzd eur_sgd eur_usd gbp_aud gbp_cad gbp_chf gbp_jpy gbp_nzd gbp_sgd gbp_usd nzd_cad nzd_chf nzd_jpy
%
%    29      30      31      32      33      34      35      36      37      38     39       40      41      42       
% nzd_sgd nzd_usd sgd_chf sgd_jpy usd_cad usd_chf usd_jpy usd_sgd xag_aud xag_cad xag_chf xag_eur xag_gbp xag_jpy
%
%    43      44     45      46      47      48      49      50      51      52     53       54      55
% xag_nzd xag_sgd xag_usd xau_aud xau_cad xau_chf xau_eur xau_gbp xau_jpy xau_nzd xau_sgd xau_usd xau_xag 

% aud_x = x(1) ; cad_x = x(2) ; chf_x = x(3) ; eur_x = x(4) ; gbp_x = x(5) ; hkd_x = x(6) ;
% jpy_x = x(7) ; nzd_x = x(8) ; sgd_x = x(9) ; usd_x = x(10) ; gold_x = x(11) ; silver_x = x(12) ;
all_g_c = dlmread( "all_g_mults_c" ) ; % the currency g mults
all_g_c( : , 7 ) = [] ; % delete hkd index
all_g_s = dlmread( "all_g_sv" ) ;      % the gold and silver g mults   
all_g_c = [ all_g_c all_g_s(:,2:3) ] ; % a combination of the above 2
all_g_c( : , 2 : end ) = cumprod( all_g_c( : , 2 : end) , 1 ) ;
all_g_c( : , 1 ) = [] ; % delete date vector
% so now index ix are
% aud_x = 56 ; cad_x = 57 ; chf_x = 58 ; eur_x = 59 ; gbp_x = 60 ; jpy_x = 61 ; 
% nzd_x = 62 ; sgd_x = 63 ; usd_x = 64 ; gold_x = 65 ; silver_x = 66 ;

all_raw_data = [ raw_data all_g_c ] ; clear -x all_raw_data ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

cd /home/dekalog/Documents/octave/kalman/ekf ;

##  >> bs_phase_error = mean( all_bootstrap_means(:,1)) + 2 * std( all_bootstrap_means(:,1))
##  bs_phase_error =  0.50319
##  >> bs_period_ang_frequency_error = mean( all_bootstrap_means(:,2)) + 2 * std( all_bootstrap_means(:,2))
##  bs_period_ang_frequency_error =  0.17097
##  >> bs_amp_error = mean( all_bootstrap_means(:,3)) + 2 * std( all_bootstrap_means(:,3))
##  bs_amp_error =  0.18179

price_ix = 65 ;
price = all_raw_data( : , price_ix ) ; % plot(price) ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% get measurements of price action
period = autocorrelation_periodogram( price ) ;
measured_angular_frequency = ( 2 * pi ) ./ period ; % plot( measured_angular_frequency ) ;
trend = sma( price , period ) ;
cycle = price .- trend ;
%smooth_cycle = sgolayfilt( cycle , 2 , 7 ) ;
smooth_cycle = smooth_2_5( cycle ) ;
% plot( price,'k',trend,'r') ;
[ ~ , ~ , ~ , ~ , ~ , ~ , measured_phase ] = sinewave_indicator( cycle ) ; % figure(1) ; plot( deg2rad( measured_phase ) ) ;
% figure(2) ; plot( sin( deg2rad(measured_phase)) ) ;
measured_phase = mod( unwrap( deg2rad( measured_phase ) ) , 2  * pi ) ; % figure(1) ; hold on ; plot( measured_phase , 'r' ) ; hold off ;
% figure(2) ; hold on ; plot( sin( measured_phase)) ; hold off ;

measured_amplitude = cycle ;
for ii = 50 : length( cycle ) ;
measured_amplitude( ii ) = sqrt( 2 ) * sqrt( mean( cycle( ii - period( ii ) : ii ).^2 ) ) ; 
endfor % end ii loop
% plot(measured_amplitude) ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% run the ekf optimisation
ekf_cycle = cycle ;

% complete values for R matrix for this data set
data = [ smooth_cycle( 101 : end ) measured_phase( 101 : end ) measured_angular_frequency( 101 : end ) measured_amplitude( 101 : end ) ]' ; 
%measured_sine = sgolayfilt( data( 1 , : ) , 2 , 7 ) ; % figure(1) ; plot( data( 1 , : ) , 'k' , measured_sine , 'r' ) ;
measured_sine = smooth_cycle( 101 : end )' ;
R( 1 , 1 ) = mean( ( data( 1 , : ) .- measured_sine ).^2 ) ; % variance of sine wave value measurements
R( 3 , 3 ) = mean( ( data( 3 , : ) .* 0.17 ).^2 ) ;          % variance of angular frequency measurements
R( 4 , 4 ) = mean( ( data( 4 , : ) .* 0.18 ).^2 ) ;          % variance of amplitude measurements

% get initial process covariance matrix Q
Q = analytical_shrinkage( data' ) ;
Q = 0.1 .* Q ;

lookback = 200 ;

%for ii = 301 : length( price )
  
% complete values for R matrix for this data set
data = [ smooth_cycle(ii-lookback:ii) measured_phase(ii-lookback:ii) measured_angular_frequency(ii-lookback:ii) measured_amplitude(ii-lookback:ii) ]' ; 
%measured_sine = sgolayfilt( data( 1 , : ) , 2 , 7 ) ; % figure(1) ; plot( data( 1 , : ) , 'k' , measured_sine , 'r' ) ;
measured_sine = cycle( ii - lookback : ii )' ;
R( 1 , 1 ) = mean( ( data( 1 , : ) .- measured_sine ).^2 ) ; % variance of sine wave value measurements
R( 3 , 3 ) = mean( ( data( 3 , : ) .* 0.17 ).^2 ) ;          % variance of angular frequency measurements
R( 4 , 4 ) = mean( ( data( 4 , : ) .* 0.18 ).^2 ) ;          % variance of amplitude measurements
  
%  Q = analytical_shrinkage( data' ) ;
   
% optimise the Q matrix for this data set
% declare optimisation function
f = @( Q_mult ) sine_ekf_Q_optim( Q_mult , data , smooth_cycle( ii - lookback : ii )' , Q , R ) ;
% Set options for fminunc
options = optimset( "MaxIter" , 50 ) ;
% initial value
Q_mult = 1 ;
Q_mult = fminunc( f , Q_mult , options ) ;
% adjust Q by optimised Q_mult
Q = Q_mult .* Q ;

  output = run_sine_ekf_Q_optim( data , Q , R ) ;
  %ekf_cycle( ii ) = ekf_sine_h( output( : , end ) ) ;
  ekf_cycle = ekf_sine_h( output ) ;
  
figure(1) ; subplot( 4 , 1 , 1 ) ; plot( data( 1 , : ) , 'k' , 'linewidth' , 2 , ekf_sine_h( output ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 2 ) ; plot( data( 2 , : ) , 'k' , 'linewidth' , 2 , output( 2 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Phase' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 3 ) ; plot( data( 3 , : ) , 'k' , 'linewidth' , 2 , output( 3 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Angular Frequency' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 4 ) ; plot( data( 4 , : ) , 'k' , 'linewidth' , 2 , output( 4 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Amplitude' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
  
%endfor

figure(1) ; plot( cycle(101:end)' , 'k' , 'linewidth' , 2 , ekf_cycle , 'r' , 'linewidth' , 2 , output(1,:) , 'b' ) ;
figure(2) ; plot( price(101:end) , 'k' , 'linewidth' , 2 , trend(101:end).+ekf_cycle' , 'r' , 'linewidth' , 2 ) ;

One thing I will point out is the use of a function called analytical_shrinkage, which I have taken directly from a recent paper, Analytical Nonlinear Shrinkage of Large-Dimensional Covariance Matrices, the MATLAB code being provided as an appendix in the paper. Readers may find this to be particularly useful outside the use I have tried putting it to.

Next week I shall go on my customary, summer working holiday and be away from home until August: therefore there will be a hiatus in blog posts until my return.

2 comments:

Tires24 June 2019 at 04:13
I think the stock market data series are just not applicable to the function domain covered by the Kalman filter
Anonymous24 June 2019 at 05:54
Try to highpass the price data first to remove the longer frequencies. However, if your aim is to trend trade, then perhaps a sine wave is the wrong model to begin with. Look at fractional differencing the data first to get a stationary time series but one that retains trend information.

Dekalog Blog

Pages

Friday, 21 June 2019

Process Noise Covariance Matrix Q for a Kalman Filter

2 comments: