Friday, 21 June 2019

Process Noise Covariance Matrix Q for a Kalman Filter

Since my last post I have been working on the process noise covariance matrix Q, with a view to optimising both the Q and R matrices for an Extended Kalman filter to model the cyclic component of price action as a Sine wave. However, my work to date has produced unsatisfactory results and I have decided to give up trying to make it work.

The reasons for this failure are unclear to me, and I don't intend to spend any more time investigating, but some educated guesses are that the underlying model of sinusoid is mismatched and my estimation of the process and/or measurement noise variances is lacking; either way, the end result is that the EKF is diverging and my earlier leading signal 1, leading signal 2 and leading signal 3 posts outline what I think will be a more promising line of investigation in the future.

Nevertheless, below I provide the rough working code that I have been using in my EKF work, and maybe readers will find something of value in it. There is a lot of commenting and some blocked out code and I'm afraid readers will have to wade through this as best they can.
clear all ;
1 ;

pkg load signal ;

% function declarations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Y = ekf_sine_h( x )
% Measurement model function for the sine signal.
% Takes the state input vector x of sine, phase, angular frequency and amplitude and 
% calculates the current value of the sine given the state vector values.

f = x( 2 , : ) ;    % phase value in radians
a = x( 4 , : ) ;    % amplitude
Y = a .* sin( f ) ; % the sine value

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [ cost ] = sine_ekf_Q_optim( Q_mult , data , targets , Q , R )
  
% initial state from noisy data input measurements
s = data( : , 1 ) ;

n = size( data , 1 ) ;

% initial state covariance 
P = eye( n ) ;

% allocate memory
xV = zeros( size( data ) ) ;  % record estmates of states

% basic Jacobian of state transition matrix 
A = [ 0 0 0 0 ;   % sine value
      0 1 1 0 ;   % phase
      0 0 1 0 ;   % angular frequency
      0 0 0 1 ] ; % amplitude
      
H = eye( n ) ; % measurement matrix

Q = Q_mult .* Q ;

  for k = 2 : size( data , 2 )

  % do ekf
  % nonlinear update x and linearisation at current state s
  x = s ;
  x( 2 ) = x( 2 ) + x( 3 ) ;        % advance phase value by angular frequency
  x( 2 ) = mod( x( 2 ) , 2 * pi ) ; % limit phase values to range 0 --- 2 * pi
  x( 1 ) = x( 4 ) * sin( x( 2 ) ) ; % sine value calculation

  % update the 1st row of the jacobian matrix at state vector s values
  A( 1 , : ) = [ 0 s( 4 ) * cos( s( 2 ) ) 0 sin( s( 2 ) ) ] ;

  P = A * P * A' + Q ; % state transition model update of covariance matrix P

  measurement_residual = [ data( 1 , k ) - x( 1 ) ;   % sine residual
                           data( 2 , k ) - x( 2 ) ;   % phase residual
                           data( 3 , k ) - x( 3 ) ;   % angular frequency residual
                           data( 4 , k ) - x( 4 ) ] ; % amplitude residual
                           
  innovation_residual_covariance = H * P * H' + R ;
  kalman_gain = P * H' / innovation_residual_covariance ;

  % update the state vector s with kalman_gain 
  s = x + kalman_gain * measurement_residual ;

  % some reality based post hoc adjustments
  s( 2 ) = abs( s( 2 ) ) ; % no negative phase values
  s( 3 ) = abs( s( 3 ) ) ; % no negative angular frequencies
  s( 4 ) = abs( s( 4 ) ) ; % no negative amplitudes

  % update the state covariance matrix P
  % NOTE
  % The Joseph formula is given by P+ = ( I − KH ) P− ( I − KH )' + KRK', where I is the identity matrix,
  % K is the gain, H is the measurement mapping matrix, R is the measurement noise covariance matrix, 
  % and P−, P+ are the pre and post measurement update estimation error covariance matrices, respectively.  
  % The optimal linear unbiased estimator (equivalently the optimal linear minimum mean square error estimator)  
  % or Kalman filter often utilizes simplified covariance update equations such as P+ = (I−KH)P− and P+ = P− −K(HP−H'+R)K'.  
  % While these alternative formulations require fewer computations than the Joseph formula, they are only valid 
  % when K is chosen as the optimal Kalman gain. In engineering applications, situations arise where the optimal 
  % Kalman gain is not utilized and the Joseph formula must be employed to update the estimation error covariance.  
  % Two examples of such a scenario are underweighting measurements and considering states. 
  % Even when the optimal gain is used, the Joseph formulation is still preferable because it possesses 
  % greater numerical accuracy than the simplified equations.
  P = ( eye( n ) - kalman_gain * H ) * P * ( eye( n ) - kalman_gain * H )' + kalman_gain * R * kalman_gain' ;
   
  xV( : , k ) = s ; % save estimated updated states
   
  endfor
  
Y = ekf_sine_h( xV ) ;
cost = mean( ( Y( 4 : end - 3 ) .- targets( 4 : end - 3 ) ).^2 ) ;
%cost = mean( ( Y .- targets ).^2 ) ;
%output = xV( 1 , : ) ;

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function output = run_sine_ekf_Q_optim( data , Q , R )
  
% initial state from noisy data input measurements
s = data( : , 1 ) ;

n = size( data , 1 ) ;

% initial state covariance 
P = eye( n ) ;

% allocate memory
xV = zeros( size( data ) ) ;  % record estmates of states

% basic Jacobian of state transition matrix 
A = [ 0 0 0 0 ;   % sine value
      0 1 1 0 ;   % phase
      0 0 1 0 ;   % angular frequency
      0 0 0 1 ] ; % amplitude
      
H = eye( n ) ; % measurement matrix

  for k = 2 : size( data , 2 )

  % do ekf
  % nonlinear update x and linearisation at current state s
  x = s ;
  x( 2 ) = x( 2 ) + x( 3 ) ;        % advance phase value by angular frequency
  x( 2 ) = mod( x( 2 ) , 2 * pi ) ; % limit phase values to range 0 --- 2 * pi
  x( 1 ) = x( 4 ) * sin( x( 2 ) ) ; % sine value calculation

  % update the 1st row of the jacobian matrix at state vector s values
  A( 1 , : ) = [ 0 s( 4 ) * cos( s( 2 ) ) 0 sin( s( 2 ) ) ] ;

  P = A * P * A' + Q ; % state transition model update of covariance matrix P

  measurement_residual = [ data( 1 , k ) - x( 1 ) ;   % sine residual
                           data( 2 , k ) - x( 2 ) ;   % phase residual
                           data( 3 , k ) - x( 3 ) ;   % angular frequency residual
                           data( 4 , k ) - x( 4 ) ] ; % amplitude residual
                           
  innovation_residual_covariance = H * P * H' + R ;
  kalman_gain = P * H' / innovation_residual_covariance ;

  % update the state vector s with kalman_gain 
  s = x + kalman_gain * measurement_residual ;

  % some reality based post hoc adjustments
  s( 2 ) = abs( s( 2 ) ) ; % no negative phase values
  s( 3 ) = abs( s( 3 ) ) ; % no negative angular frequencies
  s( 4 ) = abs( s( 4 ) ) ; % no negative amplitudes

  % update the state covariance matrix P
  % NOTE
  % The Joseph formula is given by P+ = ( I − KH ) P− ( I − KH )' + KRK', where I is the identity matrix,
  % K is the gain, H is the measurement mapping matrix, R is the measurement noise covariance matrix, 
  % and P−, P+ are the pre and post measurement update estimation error covariance matrices, respectively.  
  % The optimal linear unbiased estimator (equivalently the optimal linear minimum mean square error estimator)  
  % or Kalman filter often utilizes simplified covariance update equations such as P+ = (I−KH)P− and P+ = P− −K(HP−H'+R)K'.  
  % While these alternative formulations require fewer computations than the Joseph formula, they are only valid 
  % when K is chosen as the optimal Kalman gain. In engineering applications, situations arise where the optimal 
  % Kalman gain is not utilized and the Joseph formula must be employed to update the estimation error covariance.  
  % Two examples of such a scenario are underweighting measurements and considering states. 
  % Even when the optimal gain is used, the Joseph formulation is still preferable because it possesses 
  % greater numerical accuracy than the simplified equations.
  P = ( eye( n ) - kalman_gain * H ) * P * ( eye( n ) - kalman_gain * H )' + kalman_gain * R * kalman_gain' ;
   
  xV( : , k ) = s ; % save estimated updated states
   
  endfor 

output = xV ;

endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load data
cd /home/dekalog/Documents/octave/indices ;

raw_data = dlmread( 'raw_data_for_indices_and_strengths' ) ;
delete_hkd_crosses = [ 3 9 12 18 26 31 34 39 43 51 61 ] .+ 1 ; % +1 to account for date column
raw_data( : , delete_hkd_crosses ) = [] ;
raw_data( : , 1 ) = [] ; % delete date vector
% so now
%    1       2       3       4       5       6       7       8      9      10      11      12      13     14
% aud_cad aud_chf aud_jpy aud_nzd aud_sgd aud_usd cad_chf cad_jpy cad_sgd chf_jpy eur_aud eur_cad eur_chf eur_gbp
%
%   15      16      17      18      19      20      21      22      23      24      25      26      27      28
% eur_jpy eur_nzd eur_sgd eur_usd gbp_aud gbp_cad gbp_chf gbp_jpy gbp_nzd gbp_sgd gbp_usd nzd_cad nzd_chf nzd_jpy
%
%    29      30      31      32      33      34      35      36      37      38     39       40      41      42       
% nzd_sgd nzd_usd sgd_chf sgd_jpy usd_cad usd_chf usd_jpy usd_sgd xag_aud xag_cad xag_chf xag_eur xag_gbp xag_jpy
%
%    43      44     45      46      47      48      49      50      51      52     53       54      55
% xag_nzd xag_sgd xag_usd xau_aud xau_cad xau_chf xau_eur xau_gbp xau_jpy xau_nzd xau_sgd xau_usd xau_xag 

% aud_x = x(1) ; cad_x = x(2) ; chf_x = x(3) ; eur_x = x(4) ; gbp_x = x(5) ; hkd_x = x(6) ;
% jpy_x = x(7) ; nzd_x = x(8) ; sgd_x = x(9) ; usd_x = x(10) ; gold_x = x(11) ; silver_x = x(12) ;
all_g_c = dlmread( "all_g_mults_c" ) ; % the currency g mults
all_g_c( : , 7 ) = [] ; % delete hkd index
all_g_s = dlmread( "all_g_sv" ) ;      % the gold and silver g mults   
all_g_c = [ all_g_c all_g_s(:,2:3) ] ; % a combination of the above 2
all_g_c( : , 2 : end ) = cumprod( all_g_c( : , 2 : end) , 1 ) ;
all_g_c( : , 1 ) = [] ; % delete date vector
% so now index ix are
% aud_x = 56 ; cad_x = 57 ; chf_x = 58 ; eur_x = 59 ; gbp_x = 60 ; jpy_x = 61 ; 
% nzd_x = 62 ; sgd_x = 63 ; usd_x = 64 ; gold_x = 65 ; silver_x = 66 ;

all_raw_data = [ raw_data all_g_c ] ; clear -x all_raw_data ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

cd /home/dekalog/Documents/octave/kalman/ekf ;

##  >> bs_phase_error = mean( all_bootstrap_means(:,1)) + 2 * std( all_bootstrap_means(:,1))
##  bs_phase_error =  0.50319
##  >> bs_period_ang_frequency_error = mean( all_bootstrap_means(:,2)) + 2 * std( all_bootstrap_means(:,2))
##  bs_period_ang_frequency_error =  0.17097
##  >> bs_amp_error = mean( all_bootstrap_means(:,3)) + 2 * std( all_bootstrap_means(:,3))
##  bs_amp_error =  0.18179

price_ix = 65 ;
price = all_raw_data( : , price_ix ) ; % plot(price) ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% get measurements of price action
period = autocorrelation_periodogram( price ) ;
measured_angular_frequency = ( 2 * pi ) ./ period ; % plot( measured_angular_frequency ) ;
trend = sma( price , period ) ;
cycle = price .- trend ;
%smooth_cycle = sgolayfilt( cycle , 2 , 7 ) ;
smooth_cycle = smooth_2_5( cycle ) ;
% plot( price,'k',trend,'r') ;
[ ~ , ~ , ~ , ~ , ~ , ~ , measured_phase ] = sinewave_indicator( cycle ) ; % figure(1) ; plot( deg2rad( measured_phase ) ) ;
% figure(2) ; plot( sin( deg2rad(measured_phase)) ) ;
measured_phase = mod( unwrap( deg2rad( measured_phase ) ) , 2  * pi ) ; % figure(1) ; hold on ; plot( measured_phase , 'r' ) ; hold off ;
% figure(2) ; hold on ; plot( sin( measured_phase)) ; hold off ;

measured_amplitude = cycle ;
for ii = 50 : length( cycle ) ;
measured_amplitude( ii ) = sqrt( 2 ) * sqrt( mean( cycle( ii - period( ii ) : ii ).^2 ) ) ; 
endfor % end ii loop
% plot(measured_amplitude) ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% run the ekf optimisation
ekf_cycle = cycle ;

% complete values for R matrix for this data set
data = [ smooth_cycle( 101 : end ) measured_phase( 101 : end ) measured_angular_frequency( 101 : end ) measured_amplitude( 101 : end ) ]' ; 
%measured_sine = sgolayfilt( data( 1 , : ) , 2 , 7 ) ; % figure(1) ; plot( data( 1 , : ) , 'k' , measured_sine , 'r' ) ;
measured_sine = smooth_cycle( 101 : end )' ;
R( 1 , 1 ) = mean( ( data( 1 , : ) .- measured_sine ).^2 ) ; % variance of sine wave value measurements
R( 3 , 3 ) = mean( ( data( 3 , : ) .* 0.17 ).^2 ) ;          % variance of angular frequency measurements
R( 4 , 4 ) = mean( ( data( 4 , : ) .* 0.18 ).^2 ) ;          % variance of amplitude measurements

% get initial process covariance matrix Q
Q = analytical_shrinkage( data' ) ;
Q = 0.1 .* Q ;

lookback = 200 ;

%for ii = 301 : length( price )
  
% complete values for R matrix for this data set
data = [ smooth_cycle(ii-lookback:ii) measured_phase(ii-lookback:ii) measured_angular_frequency(ii-lookback:ii) measured_amplitude(ii-lookback:ii) ]' ; 
%measured_sine = sgolayfilt( data( 1 , : ) , 2 , 7 ) ; % figure(1) ; plot( data( 1 , : ) , 'k' , measured_sine , 'r' ) ;
measured_sine = cycle( ii - lookback : ii )' ;
R( 1 , 1 ) = mean( ( data( 1 , : ) .- measured_sine ).^2 ) ; % variance of sine wave value measurements
R( 3 , 3 ) = mean( ( data( 3 , : ) .* 0.17 ).^2 ) ;          % variance of angular frequency measurements
R( 4 , 4 ) = mean( ( data( 4 , : ) .* 0.18 ).^2 ) ;          % variance of amplitude measurements
  
%  Q = analytical_shrinkage( data' ) ;
   
% optimise the Q matrix for this data set
% declare optimisation function
f = @( Q_mult ) sine_ekf_Q_optim( Q_mult , data , smooth_cycle( ii - lookback : ii )' , Q , R ) ;
% Set options for fminunc
options = optimset( "MaxIter" , 50 ) ;
% initial value
Q_mult = 1 ;
Q_mult = fminunc( f , Q_mult , options ) ;
% adjust Q by optimised Q_mult
Q = Q_mult .* Q ;

  output = run_sine_ekf_Q_optim( data , Q , R ) ;
  %ekf_cycle( ii ) = ekf_sine_h( output( : , end ) ) ;
  ekf_cycle = ekf_sine_h( output ) ;
  
figure(1) ; subplot( 4 , 1 , 1 ) ; plot( data( 1 , : ) , 'k' , 'linewidth' , 2 , ekf_sine_h( output ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 2 ) ; plot( data( 2 , : ) , 'k' , 'linewidth' , 2 , output( 2 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Phase' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 3 ) ; plot( data( 3 , : ) , 'k' , 'linewidth' , 2 , output( 3 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Angular Frequency' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
figure(1) ; subplot( 4 , 1 , 4 ) ; plot( data( 4 , : ) , 'k' , 'linewidth' , 2 , output( 4 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'High Pass Amplitude' ) ; legend( 'Measured' , 'EKF Estimated' ) ;
  
%endfor

figure(1) ; plot( cycle(101:end)' , 'k' , 'linewidth' , 2 , ekf_cycle , 'r' , 'linewidth' , 2 , output(1,:) , 'b' ) ;
figure(2) ; plot( price(101:end) , 'k' , 'linewidth' , 2 , trend(101:end).+ekf_cycle' , 'r' , 'linewidth' , 2 ) ;
One thing I will point out is the use of a function called analytical_shrinkage, which I have taken directly from a recent paper, Analytical Nonlinear Shrinkage of Large-Dimensional Covariance Matrices, the MATLAB code being provided as an appendix in the paper. Readers may find this to be particularly useful outside the use I have tried putting it to.

Next week I shall go on my customary, summer working holiday and be away from home until August: therefore there will be a hiatus in blog posts until my return.

Thursday, 6 June 2019

Determining the Noise Covariance Matrix R for a Kalman Filter

An important part of getting a Kalman filter to work well is tuning the process noise covariance matrix Q and the measurement noise covariance matrix R. This post is about obtaining the R matrix, with a post about the Q matrix to come in due course.

In my last post about the alternative version extended kalman filter to model a sine wave I explained that the 4 measurements are the sine wave value itself, and the phase, angular frequency and amplitude of the sine wave. The phase and angular frequencies are "measured" using outputs of the sinewave indicator and the amplitude is calculated as an RMS value over a measured period. The "problem" with this is that the accuracies of such measurements on real data are unknown, and therefore so is the R matrix, because the underlying cycle properties are unknown. My solution to this is offline Monte Carlo simulation.

As a first step the Octave code in the box immediately below
clear all ;
pkg load statistics ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load data

raw_data = dlmread( 'raw_data_for_indices_and_strengths' ) ;
delete_hkd_crosses = [ 3 9 12 18 26 31 34 39 43 51 61 ] .+ 1 ; % +1 to account for date column
raw_data( : , delete_hkd_crosses ) = [] ;
raw_data( : , 1 ) = [] ; % delete data vector 

all_g_c = dlmread( "all_g_mults_c" ) ; % the currency g mults
all_g_c( : , 7 ) = [] ; % delete hkd index

all_g_s = dlmread( "all_g_sv" ) ;      % the gold and silver g mults   
all_g_c = [ all_g_c all_g_s(:,2:3) ] ; % a combination of the above 2
all_g_c( : , 2 : end ) = cumprod( all_g_c( : , 2 : end) , 1 ) ;
all_g_c( : , 1 ) = [] ; % delete date vector

all_raw_data = [ raw_data all_g_c ] ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% set up recording vectors
amplitude = zeros( size( all_raw_data , 1 ) , 1 ) ;
all_amplitudes = zeros( size( all_raw_data ) ) ;
all_periods = all_amplitudes ;

for ii = 1 : size( all_raw_data , 2 )
  
  price = all_raw_data( : , ii ) ;
  period = autocorrelation_periodogram( price ) ;
  all_periods( : , ii ) = period ;
  [ hp , ~ ] = highpass_filter_basic( price ) ; hp( 1 : 99 ) = 0 ;

  for jj = 100 : size( all_raw_data , 1 )
    amplitude( jj ) = sqrt( 2 ) * sqrt( mean( hp( round( jj - period( jj ) / 2 ) : jj , 1 ).^2 ) ) ;
  endfor
  
  % store information about amplitude changes as a multiplicative constant
  all_amplitudes( : , ii ) = amplitude ./ shift( amplitude , 1 ) ;

endfor

% truncate
all_amplitudes( 1 : 101 , : ) = [] ;
all_periods( 1 : 101 , : ) = [] ;

max_period = max( max( all_periods ) ) ;
min_period = min( min( all_periods ) ) ;

% unroll
all_amplitudes = all_amplitudes( : ) ; all_periods = all_periods( : ) ;

amplitude_changes = cell( max_period , 1 ) ;

for ii = min_period : max_period
  ix = find( all_periods == ii ) ;
  amplitude_changes{ ii , 1 } = all_amplitudes( ix , 1 ) ;
endfor

save all_amplitude_changes amplitude_changes ;
is used to get information about cycle amplitudes from real data and store it in a cell array.

This next box shows code that uses this cell array, plus earlier hidden markov modelling code for sine wave modelling, to produce synthetic price data
clear all ;
pkg load signal ;
pkg load statistics ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load data

raw_data = dlmread( 'raw_data_for_indices_and_strengths' ) ;
delete_hkd_crosses = [ 3 9 12 18 26 31 34 39 43 51 61 ] .+ 1 ; % +1 to account for date column
raw_data( : , delete_hkd_crosses ) = [] ;
raw_data( : , 1 ) = [] ; % delete date vector
all_g_c = dlmread( "all_g_mults_c" ) ; % the currency g mults
all_g_c( : , 7 ) = [] ; % delete hkd index
all_g_s = dlmread( "all_g_sv" ) ;      % the gold and silver g mults   
all_g_c = [ all_g_c all_g_s(:,2:3) ] ; % a combination of the above 2
all_g_c( : , 2 : end ) = cumprod( all_g_c( : , 2 : end) , 1 ) ;
all_g_c( : , 1 ) = [] ; % delete date vector
all_raw_data = [ raw_data all_g_c ] ; clear -x all_raw_data ;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

load all_hmm_periods_daily ;
load all_amplitude_changes ;

% choose an ix for all_raw_data
ix = input( "ix? " ) ;
price = all_raw_data( : , ix ) ;
period = autocorrelation_periodogram( price ) ;
[ hp , trend ] = highpass_filter_basic( price ) ;

% estimate variance of cycle noise
hp_filt = sgolayfilt( hp , 2 , 7 ) ;
noise = hp( 101 : end ) .- hp_filt( 101 : end ) ; noise_var = var( noise ) ;
% zero pad beginning of noise vector
noise = [ zeros( 100 , 1 ) ; noise ] ;

% set for plotting purposes
hp( 1 : 50 ) = 0 ; trend( 1 : 50 ) = price( 1 : 50 ) ;

% create sine wave with known phase and period/angular frequency 
[ gen_period , ~ ] = hmmgenerate( length( price ) , transprobest , outprobest ) ;
gen_period = gen_period .+ hmm_min_period_add ;
phase = cumsum( ( 2 * pi ) ./ gen_period ) ; phase = mod( phase , 2 * pi ) ;
gen_sine = sin( phase ) ;

% amplitude vector
amp = ones( size( hp ) ) ;

% RMS initial amplitude 
amp( 100 ) = sqrt( 2 ) * sqrt( mean( hp( 100 - period( 100 ) : 100 ).^2 ) ) ;

% alter sine wave with known amplitude
for ii = 101 : length( hp )
  % get multiple for this period
  random_amp_mults = amplitude_changes{ gen_period( ii ) } ;
  rand_mult = random_amp_mults( randi( size( random_amp_mults , 1 ) , 1 ) ) ;
  amp( ii ) = rand_mult * sqrt( 2 ) * sqrt( mean( hp( ii - 1 - period( ii - 1 ) : ii - 1 ).^2 ) ) ; ;
  gen_sine( ii ) = amp( ii ) * gen_sine( ii ) ; 
endfor

% estimate variance of generated cycle noise
gen_filt = sgolayfilt( gen_sine , 2 , 7 ) ;
gen_noise = gen_sine( 101 : end ) .- gen_filt( 101 : end ) ; gen_noise_var = var( gen_noise ) ;

% a crude adjustment of generated cycle noise to match "real" noise by adding
% noise
gen_sine = gen_sine' .+ noise ;

% set beginning of gen_sine to real cycle for plotting purposes
gen_sine( 1 : 100 ) = hp( 1 : 100 ) ;
new_price = trend .+ gen_sine ;

% a simple correlation check
correlation_cycle_gen_sine = corr( hp , gen_sine ) ;
correlation_price_gen_price = corr( price , new_price ) ;

figure(1) ; plot( gen_sine , 'k' , 'linewidth' , 2 ) ; grid minor on ; title( 'The Generated Cycle' ) ; 
figure(2) ; plot( price , 'b' , 'linewidth' , 2 , trend , 'k' , 'linewidth' , 1 , new_price , 'r' , 'linewidth', 2 ) ; 
title( 'Comparison of Real and Generated Prices' ) ; legend( 'Real Price' , 'Real Trend' , 'Generated Price' ) ; grid minor on ;
figure(3) ; plot( hp , 'k' , 'linewidth' , 2 , gen_sine , 'r' , 'linewidth' , 2 ) ; title( 'Generated Cycle compared with Real Cycle' ) ; 
legend( 'Real Cycle' , 'Generated Cycle' ) ; grid minor on ;
Some typical chart output of this code is the synthetic price series in red compared with the real blue price series
and the underlying cycles.
For this particular run the correlation between the synthetic price series and the real price series is 0.99331, whilst the correlation between the relevant cycles is only 0.19988, over 1500 data points ( not all are shown in the above charts. )

Because the synthetic cycle is definitely a sine wave ( by construction ) with a known phase, angular frequency and amplitude I can run the above sinewave_indicator and RMS measurements on the synthetic prices, compare with the known synthetic cycle, and estimate the measurement noise covariance matrix R.

More in due course. 

Friday, 31 May 2019

Extended Kalman Filter, Alternative Version

Below is alternative code for an Extended Kalman filter for a sine wave, which has 4 states: the sine wave value, the phase, the angular frequency and amplitude and measurements thereof. I have found it necessary to implement this version because I couldn't adjust my earlier version code to accept and measure the additional states without the Cholesky decomposition function chol() exiting and giving errors about the input not being a positive definite matrix.
clear all ;
1 ;

function Y = ekf_sine_h( x )
% Measurement model function for the sine signal.
% Takes the state input vector x of sine, phase, angular frequency and amplitude and 
% calculates the current value of the sine given the state vector values.

f = x( 2 , : ) ;    % phase value in radians
a = x( 4 , : ) ;    % amplitude
Y = a .* sin( f ) ; % the sine value

endfunction

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
pkg load statistics ;
load all_snr ;
load all_hmm_periods_daily ;

N = 250 ; % total dynamic steps i.e.length of data to generate
[ gen_period , gen_states ] = hmmgenerate( N , transprobest , outprobest ) ;
gen_period = gen_period .+ hmm_min_period_add ;
angular_freq_rads = ( 2 * pi ) ./ gen_period ;
real_phase = cumsum( angular_freq_rads ) ; real_phase = mod( real_phase , 2 * pi ) ;
gen_sine = sin( real_phase ) ;

noise_val = mean( [ all_snr(:,1) ; all_snr(:,2) ] ) ;
my_sine_noisy = awgn( gen_sine , noise_val ) ;

[ ~ , ~ , ~ , ~ , ~ , ~ , phase ] = sinewave_indicator( my_sine_noisy ) ;
phase = deg2rad( phase' ) ;
period = autocorrelation_periodogram( my_sine_noisy ) ;
measured_angular_frequency = ( 2 * pi ) ./ period' ;
measured_angular_frequency( 1 : 50 ) = 2 * pi / 20 ;

% for for amplitude
measured_amplitude = ones( 1 , N ) ;
for ii = 50 : N
  measured_amplitude( ii ) = sqrt( 2 ) * sqrt( mean( my_sine_noisy( ii - period(ii) : ii ).^2 ) ) ;  
endfor

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
h = @ekf_sine_h ; 

n = 4 ; % number of states - sine value, phase, angular frequency and amplitude

% matrix for covariance of process
Q = eye( n ) ; 
Q( 1 , 1 ) = var( my_sine_noisy ) ;
Q( 2 , 2 ) = var( phase( 50 : end ) ) ;
Q( 3 , 3 ) = var( measured_angular_frequency( 50 : end ) ) ;
Q( 4 , 4 ) = var( measured_amplitude( 50 : end ) ) ;
% scale the covariance matrix, a tunable factor?
Q = 0.001 .* Q ; 

% measurement noise variance matrix
R = zeros( 4 , 4 ) ; 
R( 1 , 1 ) = var( gen_sine .- my_sine_noisy ) ;                                               % the sine values
R( 2 , 2 ) = var( real_phase( 50 : end ) .- phase( 50 : end ) ) ;                             % the phase values
R( 3 , 3 ) = var( angular_freq_rads( 50 : end ) .- measured_angular_frequency( 50 : end ) ) ; % angular frequency values
R( 4 , 4 ) = var( measured_amplitude( 50 : end ) .- 1 ) ;                                     % amplitude values

% initial state from noisy measurements
s = [ my_sine_noisy( 50 ) ;              % sine
      phase( 50 ) ;                      % phase ( Rads )
      measured_angular_frequency( 50 ) ; % angular_freq_rads
      measured_amplitude( 50 ) ] ;       % amplitude               
 
% initial state covariance 
P = eye( n ) ; 
 
% allocate memory
xV = zeros( n , N ) ;  % record estmates of states        
pV1 = zeros( n , N ) ; % projection measurments
pV2 = zeros( n , N ) ; % projection measurments
pV3 = zeros( n , N ) ; % projection measurments

% basic Jacobian of state transition matrix 
A = [ 0 0 0 0 ;   % sine value
      0 1 1 0 ;   % phase
      0 0 1 0 ;   % angular frequency
      0 0 0 1 ] ; % amplitude
      
H = eye( n ) ; % measurement matrix
      
for k = 51 : N

% do ekf
% nonlinear update x and linearisation at current state s
x = s ;
x( 2 ) = x( 2 ) + x( 3 ) ;        % advance phase value by angular frequency
x( 2 ) = mod( x( 2 ) , 2 * pi ) ; % limit phase values to range 0 --- 2 * pi
x( 1 ) = x( 4 ) * sin( x( 2 ) ) ; % sine value calculation

% update the 1st row of the jacobian matrix at state vector s values
A( 1 , : ) = [ 0 s( 4 ) * cos( s( 2 ) ) 0 sin( s( 2 ) ) ] ;

P = A * P * A' + Q ; % state transition model update of covariance matrix P

measurement_residual = [ my_sine_noisy( k ) - x( 1 ) ;              % sine residual
                         phase( k ) - x( 2 ) ;                      % phase residual
                         measured_angular_frequency( k ) - x( 3 ) ; % angular frequency residual
                         measured_amplitude( k ) - x( 4 ) ] ;       % amplitude residual
                         
innovation_residual_covariance = H * P * H' + R ;
kalman_gain = P * H' / innovation_residual_covariance ;

% update the state vector s with kalman_gain 
s = x + kalman_gain * measurement_residual ;

% some reality based post hoc adjustments
s( 2 ) = abs( s( 2 ) ) ; % no negative phase values
s( 3 ) = abs( s( 3 ) ) ; % no negative angular frequencies
s( 4 ) = abs( s( 4 ) ) ; % no negative amplitudes

% update the state covariance matrix P
% NOTE
% The Joseph formula is given by P+ = ( I − KH ) P− ( I − KH )' + KRK', where I is the identity matrix,
% K is the gain, H is the measurement mapping matrix, R is the measurement noise covariance matrix, 
% and P−, P+ are the pre and post measurement update estimation error covariance matrices, respectively.  
% The optimal linear unbiased estimator (equivalently the optimal linear minimum mean square error estimator)  
% or Kalman filter often utilizes simplified covariance update equations such as P+ = (I−KH)P− and P+ = P− −K(HP−H'+R)K'.  
% While these alternative formulations require fewer computations than the Joseph formula, they are only valid 
% when K is chosen as the optimal Kalman gain. In engineering applications, situations arise where the optimal 
% Kalman gain is not utilized and the Joseph formula must be employed to update the estimation error covariance.  
% Two examples of such a scenario are underweighting measurements and considering states. 
% Even when the optimal gain is used, the Joseph formulation is still preferable because it possesses 
% greater numerical accuracy than the simplified equations.
P = ( eye( n ) - kalman_gain * H ) * P * ( eye( n ) - kalman_gain * H )' + kalman_gain * R * kalman_gain' ;
 
xV( : , k ) = s ;                                                 % save estimated updated states
pV1( : , k ) = s ; pV1( 2 , k ) = pV1( 2 , k ) + pV1( 3 , k ) ;   % for plotting projections
pV2( : , k ) = s ; pV2( 2 , k ) = pV2( 2 , k ) + 2*pV2( 3 , k ) ; % for plotting projections
pV3( : , k ) = s ; pV3( 2 , k ) = pV3( 2 , k ) + 3*pV3( 3 , k ) ; % for plotting projections
 
endfor

% Plotting
figure(1) ; subplot(3,1,1) ; plot( real_phase , 'k', 'linewidth' , 2 , phase , 'b' , 'linewidth' , 2 , xV( 2 , : ) , 'r' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Phase State' ) ; legend( 'Actual Phase State' , 'Measured Phase State' , 'Estimated Phase Sate' ) ; ylabel( 'Radians' ) ;

figure(1) ; subplot(3,1,2) ; plot(angular_freq_rads, 'k', 'linewidth' , 2 , measured_angular_frequency , 'b' , 'linewidth' , 2 , xV( 3 , : ) , 'r' , 'linewidth' , 2 ) ; 
grid minor on ; 
title( 'Angular Frequency State in Radians' ) ; legend( 'Actual Angular Frequency' , 'Measured Angular Frequency' , 'Estimated Angular Frequency' ) ;

figure(1) ; subplot(3,1,3) ; plot( ones(1,N) , 'k', 'linewidth' , 1 , measured_amplitude , 'b' , 'linewidth' , 2 , xV( 4 , : ) , 'r' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Amplitude State' ) ; legend( 'Actual Amplitude' , 'Measured Amplitude' , 'Estimated Amplitude' ) ;

sine_est = h( xV ) ;
sine_1 = h( pV1 ) ; sine_1 = shift( sine_1 , 1 ) ;
sine_2 = h( pV2 ) ; sine_2 = shift( sine_2 , 2 ) ;
sine_3 = h( pV3 ) ; sine_3 = shift( sine_3 , 3 ) ;

figure(2) ; plot( gen_sine , 'k' , 'linewidth' , 2 , my_sine_noisy , '*b' , 'linewidth' , 2 , sine_est , 'r' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Underlying and its Estimate' ) ; legend( 'Real Underlying Sine' , 'Noisy Measured Sine' , 'EKF Estimate of Real Underlying Sine' ) ;

figure(3) ; plot( gen_sine , 'k' , 'linewidth' , 2 , my_sine_noisy , '*b' , 'linewidth' , 2 , sine_1 , 'r' , 'linewidth' , 2 , sine_2 , 'g' , 'linewidth' , 2 , ...
sine_3 , 'b' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Plot of the actual sine wave vs projected estimated sines' ) ; 
legend( 'Actual' , 'Noisy Measured Sine' , 'EKF Projected 1' , 'EKF Projected 2' , 'EKF Projected 3' ) ;
The measurement for the phase is an output of the sinewave indicator function, the period an output of the autocorrelation periodogram function and the amplitude the square root of 2 multiplied by the Root mean square of the sine wave over the autocorrelation periodogram measured period.

As usual the code is liberally commented. More soon. 

Monday, 27 May 2019

Extended Kalman Filter

In the code box below I provide code for an Extended Kalman filter to model a sine wave. This is a mashup of code from a couple of toolboxes I have found online, namely learning-the-extended-kalman-filter and EKF/UKF Tollbox for Matlab/Octave. The modelled states are the phase, angular frequency and amplitude of the sine wave and the measurement is the ( noisy ) sine wave value itself.
clear all ;
1;

function x_n = ekf_sine_f( x )
% Dynamical model function for the sine signal.
% Takes the input vector of phase, angular frequency in radians and 
% amplitude and calculates the next values of the states given the 
% current vector.

% transition matrix
A = [ 1 1 0 ;   % phase in Rads
      0 1 0 ;   % angular_freq_rads
      0 0 1 ] ; % amplitude
      
x_n = A * x ; 
x_n( 1 ) = mod( x_n( 1 ) , 2 * pi ) ;

endfunction

function Y = ekf_sine_h( x )
% Measurement model function for the sine signal.
% Takes the input vector of phase, angular frequency and amplitude and 
% calculates the current value of the sine given the input vector.

f = x( 1 , : ) ;    % phase in radians
a = x( 3 , : ) ;    % amplitude
Y = a .* sin( f ) ; % the sine

endfunction

function [ z , A ] = jacobian_transition( fun , x )
z = fun( x ) ;
A = [ 1 1 0 ;
      0 1 0 ;
      0 0 1 ] ;
endfunction

function [ z , A ] = jacobian_measurement( fun , x )
z = fun( x ) ;
A = [ x( 3 ) * cos( x( 1 ) ) 0 sin( x( 1 ) ) ] ;
endfunction

%%%%%%%%%%%%%%%%%%% Generate a sine wave %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
pkg load statistics ;
load all_snr ;
load all_hmm_periods_daily ;

N = 250 ; % total dynamic steps i.e.length of data to generate
[ gen_period , gen_states ] = hmmgenerate( N , transprobest , outprobest ) ;
gen_period = gen_period .+ hmm_min_period_add ;
angular_freq_rads = ( 2 * pi ) ./ gen_period ;
real_phase = cumsum( angular_freq_rads ) ; real_phase = mod( real_phase , 2 * pi ) ;
gen_sine = sin( real_phase ) ;
noise_val = mean( [ all_snr(:,1) ; all_snr(:,2) ] ) ;
my_sine_noisy = awgn( gen_sine , noise_val ) ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
f = @ekf_sine_f ;
h = @ekf_sine_h ; 

n = 3 ;        % number of states - phase, angular frequency and amplitude
Q = eye( n ) ; % matrix for covariance of process

Q( 1 , 1 ) = var( real_phase ) ;
Q( 2 , 2 ) = var( angular_freq_rads ) ;
Q( 3 , 3 ) = var( sqrt(2) * sqrt( mean( my_sine_noisy.^2 ) ) ) ;
Q = 1 .* Q ; % scale the covariance matrix, a tunable factor ? 

R = var( gen_sine .- my_sine_noisy ) ; % measurement noise variance

% initial state
s = [ real_phase( 6 ) ;                                        % phase ( Rads )
      mean( angular_freq_rads( 1 : 20 ) ) ;                    % angular_freq_rads
      sqrt(2) * sqrt( mean( my_sine_noisy( 1 : 20 ).^2 ) ) ] ; % amplitude               
    
P = eye( n ) ;  % initial state covariance

% allocate memory
xV = zeros( n , N ) ;  % record estmates of states        
sV = ones( 1 , N ) ;   % real amplitudes 
pV1 = zeros( n , N ) ; % projection measurments
pV2 = zeros( n , N ) ; % projection measurments
pV3 = zeros( n , N ) ; % projection measurments

for k = 7 : N
 
z = my_sine_noisy( k ) ; % noisy measurement of sine

% do ekf
[ x1 , A ] = jacobian_transition( f , s ) ;   % nonlinear update and linearisation at current state
P = A * P * A' + Q ;                          % partial update
[ z1 , H ] = jacobian_measurement( h , x1 ) ; % nonlinear measurement and linearisation
P12 = P * H' ;                                % cross covariance
R = chol( H * P12 + R ) ;                     % Cholesky factorisation
U = P12 / R ;                                 % K = U / R' ; Faster because of back substitution
s = x1 + U * ( R' \ ( z - z1 ) ) ;            % Back substitution to get state update
P = P - U * U' ;                              % Covariance update, U * U' = P12 / R / R' * P12' = K * P12  
 
xV( : , k ) = s ;                             % save estimated updated states
pV1( : , k ) = s ; pV1( 1 , k ) = pV1( 1 , k ) + pV1( 2 , k ) ;   % for plotting
pV2( : , k ) = s ; pV2( 1 , k ) = pV2( 1 , k ) + 2*pV2( 2 , k ) ; % for plotting
pV3( : , k ) = s ; pV3( 1 , k ) = pV3( 1 , k ) + 3*pV3( 2 , k ) ; % for plotting
 
endfor

% Plotting
figure(1) ; subplot(3,1,1) ; plot( real_phase , 'k', 'linewidth' , 2 , xV( 1 , : ) , 'r' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Phase State' ) ; legend( 'Actual Phase State' , 'Estimated Phase Sate' ) ; ylabel( 'Radians' ) ;

figure(1) ; subplot(3,1,2) ; plot(angular_freq_rads, 'k', 'linewidth' , 2 , xV(2,:), 'r' , 'linewidth' , 2 ) ; grid minor on ; 
title( 'Angular Frequency State in Radians' ) ; legend( 'Actual Angular Frequency' , 'Estimated Angular Frequency' ) ;

figure(1) ; subplot(3,1,3) ; plot( sV , 'k', 'linewidth' , 1 , xV(3,:), 'r' , 'linewidth' , 2 ) ; grid minor on ;
title( 'Amplitude State' ) ; legend( 'Actual Amplitude' , 'Estimated Amplitude' ) ;

sine_est = h( xV ) ;
sine_1 = h( pV1 ) ; sine_1 = shift( sine_1 , 1 ) ;
sine_2 = h( pV2 ) ; sine_2 = shift( sine_2 , 2 ) ;
sine_3 = h( pV3 ) ; sine_3 = shift( sine_3 , 3 ) ;

figure(2) ; plot( gen_sine , 'k' , 'linewidth' , 2 , my_sine_noisy , '*b' , 'linewidth' , 2 , sine_est , 'r' , 'linewidth' , 2 ) ; 
title( 'Underlying and its Estimate' ) ; legend( 'Real Underlying Sine' , 'Noisy Measured Sine' , 'EKF Estimate of Real Underlying Sine' ) ;

figure(3) ; plot( gen_sine , 'k' , 'linewidth' , 2 , sine_1 , 'r' , 'linewidth' , 2 , sine_2 , 'g' , 'linewidth' , 2 , sine_3 , 'b' , 'linewidth' , 2 ) ;
title( 'Plot of the actual sine wave vs projected estimated sines' ) ; 
legend( 'Actual' , 'EKF Projected 1' , 'EKF Projected 2' , 'EKF Projected 3' ) ;
Typical output plots are the states
the filtered estimate
and shifted, future projections using the current estimated states
The code is well commented so I won't talk further about it in this post as this is part of ongoing work. A fuller discussion in due course.

Friday, 5 April 2019

Tests of Constant and Variable Acceleration Model Kalman Filters

In my last post I said that this next post would report the results of tests on a Constant Acceleration model Kalman filter, and the results are: fail, just like the Constant Velocity model, so I won't bore readers with reporting the details of the failed tests. However, tests of a Variable Acceleration model have been more successful, and so this post is about the results of tests on this model.

The Kalman filter states in this variable acceleration model are position, velocity, acceleration and three constants that are used to calculate the acceleration; b, c and d. The state transition matrix for this model is
F = [ 1 1 0 1/2 1/6 1/12 ; % position
      0 1 0 1 1/2 1/3 ;    % velocity
      0 0 0 1 1 1 ;        % acceleration
      0 0 0 1 0 0 ;        % b
      0 0 0 0 1 0 ;        % c
      0 0 0 0 0 1 ] ;      % d
with a measurement/observation matrix of
H = [ 1 0 0 0 0 0 ; 
      0 1 0 0 0 0 ; 
      0 0 1 0 0 0 ;
      0 0 0 1 0 0 ;
      0 0 0 0 1 0 ;
      0 0 0 0 0 1 ] ; % measurement matrix
Of course there are no "measurements" of b, c and d that can be directly taken from a price series, so my solution to this is to use Octave's fminunc function to minimise this
function [ J ] = kalman_bcd_constant_minimisation( bcd , position_measurements , velocity_measurements , accel_measurements )
pos_proj = position_measurements( 1 ) + velocity_measurements( 1 ) + [ 1/2 1/6 1/12 ] * bcd ;
vel_proj = velocity_measurements( 1 ) + [ 1 1/2 1/3 ] * bcd ;
accel_proj = sum( bcd ) ;  
pos_proj_cost = ( pos_proj - position_measurements( 2 ) )^2 ;
vel_proj_cost = ( vel_proj - velocity_measurements( 2 ) )^2 ;
accel_proj_cost = ( accel_proj - accel_measurements( 2 ) )^2 ;
J = pos_proj_cost + vel_proj_cost + accel_proj_cost ;  
endfunction
where position, velocity and acceleration measurements are calculated as described in my previous post. These cost minimised b, c and d constants are used as the noisy measurements for filter input purposes.

One implementation bug I experienced is that I was unable to use the ALS package to model the process noise as the dlqe function it calls kept crashing my session - so as a result I went with the crude, hand constructed covariance matrix as per the code in my previous post.

I wrote above that the tests on this model were the most successful to date. What I mean by this is that the Innovation Magnitude Bound test, the Innovation Zero-centred tests and Innovation auto correlation plot zero-centred tests on the position, velocity and acceleration states were passed 100% and about half of the Innovation auto correlation plot bound tests passed. The Innovation Whiteness (Auto correlation) tests varied from low single figure to 60+% percentage pass rates. However, I'm not 100% certain that I have correctly implemented this last set of tests so these results are suspect. Despite this reservation I think that I can say the Variable Acceleration Model expressed as a polynomial in time is the simplest kinematic model that is suitable for use on financial time series.

More in due course. 

Monday, 1 April 2019

Test of Constant Velocity Model Kalman Filter

Following on from my previous post, this post is a more detailed description of the testing methodology to test kinematic motion models on financial time series. The rationale behind the test(s) which are described below is different from the usual backtesting in that the test(s) are to determine whether the Kalman filter model is mismatched or not, i.e. whether the model innovations match the assumption that the residuals are Gaussian.

If readers have read the estimation lecture notes linked in my earlier post you'll know that one problem with estimating the above is that it is often difficult or impossible to know the covariance matrices for the process noise and the measurement noise, and getting these wrong effects the performance of the Kalman filter even if the underlying model is accurately specified. My "solution" to this is to use the Savitzky-Golay smoothing filter to smooth the time series and at the same time to get estimates for the velocity, acceleration etc. In real time of course this cannot be done because the Savitzky-Golay filter is a "centred" smoother and thus requires knowledge of data in the future. Using this information a crude estimate of the covariance matrices is thus derived, as shown in Octave code in the code box below:
% Estimate measurements for matrix R and covariance matrix Q by using the best
% fit quadratic using above sgolay filter coeffs
R = zeros( 2 , 2 ) ; Q = R ; % position and velocity model

% variance of position measurement
R( 1 , 1 ) = var( measured_pos( 1 , 7 : end - 6 ) .- real_pos( 1 , 7 : end - 6 ) ) ;

% variance of velocity measurement
R( 2 , 2 ) = var( measured_vel( 1 , 7 : end - 6 ) .- real_vel( 1 , 7 : end - 6 ) ) ;

% covariance of position process noise
Q( 1 , 1 ) = cov( ( real_pos( 1 , 7 : end - 7 ) .+ real_vel( 1 , 7 : end - 7 ) ) .- real_pos( 1 , 8 : end - 6 ) ) ;

% covariance of velocity process noise
Q( 2 , 2 ) = cov( diff( real_vel( 1 , 7 : end - 6 ) ) ) ;

% covariance of position process noise and velocity process noise
Q( 1 , 2 ) = cov( ( real_pos( 1 , 7 : end - 7 ) .+ real_vel( 1 , 7 : end - 7 ) ) .- real_pos( 1 , 8 : end - 6 ) , ...
diff( real_vel( 1 , 7 : end - 6 ) ) ) ;
Q( 2 , 1 ) = Q( 1 , 2 ) ;
where real_pos and real_vel are calculated using the Savitzky-Golay filter and represent the "true" position and velocity of a Constant Velocity model compared to the measured_pos and measured_vel measurements that are taken in real time. These crude estimates of the covariance matrices are used as inputs to the Autocovariance Least Squares Method implemented by means of the Octave ALS package to get a hopefully much more accurate estimate of the noise covariances. Armed with these estimates, the test code as outlined in the above mentioned estimation lecture notes paper and my previous post can then be run.

To avoid having to manually judge success or failure by looking at numerous plots such as this one
I run a few simple statistical tests
% Performance Tests for an Unmatched Kalman Filter

% Test 1 ---> Innovation Magnitude Bound Test
% check the innovation is bounded by +/- 2 * sqrt( S )
test_1 = sum( ( nu( 1 , 7 : end - 3 ) < 2 * s( 1 , 7 : end - 3 ) ) .* ...
( nu( 1 , 7 : end - 3 ) > -2 * s( 1 , 7 : end - 3 ) ) ) / length( s( 1 , 7 : end - 3 ) ) ;
% greater than 0.95 is a pass, i.e. less than 5 % of values out of bounds
% record result 
all_test_results( jj , 1 ) = test_1 ; 

%  Perform a T-test of the null hypothesis 'mean (X) == M' for a
%  sample X from a normal distribution with unknown mean and unknown
%  std deviation.  Under the null, the test statistic T has a
%  Student's t distribution.  The default value of M is 0.
%  If H is 0 the null hypothesis is accepted, if it is 1 the null
%  hypothesis is rejected.
zero_mean_innovation_pos = ttest( nu( 1 , : ) , 0 ) ; % innovations are centred around zero?
% H = 0 is a pass, i.e. centred around zero
% record result 
all_test_results( jj , 2 ) = zero_mean_innovation_pos ; 

% Test 2 ---> Normalised Innovation Squared Test
%  > qchisq( c( 0.05 , 0.95 ) , df = 1 , lower.tail = FALSE )
%  [1] 3.84145882 0.00393214
%  > qchisq( c( 0.05 , 0.95 ) , df = 2 , lower.tail = FALSE )
%  [1] 5.9914645 0.1025866

d_of_freedom = sum( sum( H ) ) ;
switch ( d_of_freedom )
  case 1
  lower_CI = 0.00393214 ;
  upper_CI = 3.84145882 ;
case 2
  lower_CI = 0.1025866 ;
  upper_CI = 5.9914645 ;  
endswitch

normalised_innovation_squared_chi2_test_pos = ( sum_Q_pos > lower_CI ) * ( sum_Q_pos < upper_CI ) ;
% value of 1 is a pass
% record result 
all_test_results( jj , 3 ) = normalised_innovation_squared_chi2_test_pos ;  

% Test 3 ---> Innovation Whiteness (Autocorrelation) Test
% check the innovation is bounded by +/- 2 * std
%level = 2 * std( r_pos( N : 2 * N - 1 ) / r_pos( N ) ) ;
level = 1.96 / sqrt( N - burn_in - 2 ) ;
##test_3 = sum( ( r_pos( N : 2 * N - 1 ) / r_pos( N ) < level ) .* ( r_pos( N : 2 * N - 1 ) / r_pos( N ) > -level ) ) / ...
##length( r_pos( N : 2 * N - 1 ) / r_pos( N ) ) ;

test_3 = sum( ( r_pos( no_lags + 1 : end ) < level ) .* ( r_pos( no_lags + 1 : end ) > -level ) ) ...
/ length( r_pos( no_lags + 1 : end) ) ;

% greater than 0.95 is a pass, i.e. less than 5 % of values out of bounds
% record result 
all_test_results( jj , 4 ) = test_3 ; 

zero_mean_autocorrelation_plot_pos = ttest( r_pos( no_lags + 1 : end ) , 0 ) ; % centred around zero
% H = 0 is a pass, i.e. centred around zero
% record result 
all_test_results( jj , 5 ) = zero_mean_autocorrelation_plot_pos ; 
across a range of forex crosses and my currency indices and see if, say 95%, of them pass these tests.

As a result of my first tests thus outlined on the Constant Velocity Kinematic Model I think can confidently say that this model is "mismatched," despite it sometimes being described as the simplest "useful" model, and therefore readers should not use this model in a Kalman filter on financial time series. It is worth noting that this model is named Model One in the Trend Without Hiccups paper, which in turn is very closely related to a local linear trend model, which the authors name Model Two. Given that the underlying model(s) for both are mismatched it is not surprising that the authors report disappointing results from these models.

In my next post I shall report results on the Constant Acceleration Model.

Wednesday, 20 March 2019

Revisiting the Kalman Filter

Some time ago ( here, here and here ) I posted about the Kalman filter and recently I have been looking at Kalman filters again because of this Trend Without Hiccups paper hosted at SSRN. I also came across this Estimation Lecture paper which provides MATLAB code for the testing of Kalman filters and my Octave suitable version of this code is shown in the code box below.
clear all ;
1 ; % some function declarations
function [ x_pred , p_pred ] = predict( x , P , F , Q )
x_pred = F * x ;
p_pred = F * P *F' + Q ;
endfunction

function [ nu , S ] = innovation( x_pred , p_pred , z , H , R )
nu = z - H * x_pred ;     % innovation
S = H * p_pred * H' + R ; % innovation covariance
endfunction

function [ x_new , p_new ] = innovation_update( x_pred , p_pred , nu , S , H )
K = p_pred * H' / S ;         % Kalman gain
x_new = x_pred + K * nu ;     % new state
p_new = p_pred - K * S * K' ; % new covariance
endfunction

pkg load signal ;     % for xcorr function

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Script to generate some "true" constant velocity data for later assessment
%%% Generates:
%%%  x: the state history which evolves according to 
%%%  x(k+1) = Fx(k) + w(k)
%%%  w: the process noise history (randomly generated)
%%%  z: a set of observations on the state corrupted by noise 
%%%  v: the noise on each observation (randomly generated)
 
N = 100 ;
delT = 1 ;
F = [ 1 delT ;
      0 1 ] ;
      
H = [ 1 0 ] ;

% process and measurement noise variances
sigma2Q = 0.01 ;
sigma2R = 0.1 ;

% process covariance matrix
Q = sigma2Q * [ delT/3 delT/2 ;
                delT/2 delT ] ;
                
P = Q ;
R = sigma2R * [ 1 ] ;

x = zeros( 2 , N ) ;
w = zeros( 2 , N ) ;
z = zeros( 1 , N ) ;
v = zeros( 1 , N ) ;

for ii = 2 : N
w( : , ii ) = randn( 2 , 1 ) .* sigma2Q ;         % generate process noise
x( : , ii ) = F * x( : , ii - 1 ) + w( : , ii ) ; % update state
v( : , ii ) = randn( 1 ) * R ;                    % generate measurement noise
z( : , ii ) = H * x( : , ii ) + v( : , ii ) ;     % get "true" measurement
endfor

% visualise data
%figure( 1 ) ; plot( x( 1 , : ) , 'k' , 'linewidth' , 2  ) ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Script to generate some "true" constant acceleration data for later assessment
%%% Generates:
%%%  x: the state history which evolves according to 
%%%  x(k+1) = Fx(k) + w(k)
%%%  w: the process noise history (randomly generated)
%%%  z: a set of observations on the state corrupted by noise 
%%%  v: the noise on each observation (randomly generated)
 
N = 100 ;
delT = 1 ;
F = [ 1 delT delT/2 ;
      0 1 delT ;
      0 0 1 ] ;

H = [ 1 0 0 ] ;

% process and measurement noise variances
sigma2Q = 0.01 ;
sigma2R = 0.1 ;

% process covariance matrix
Q = sigma2Q * [ delT/20 delT/8 delT/6 ;
                delT/8 delT/3 delT/2 ;
                delT/6 delT/2 delT ] ;
                
P = Q ;
R = sigma2R * [ 1 ] ;

x = zeros( 3 , N ) ;
w = zeros( 3 , N ) ;
z = zeros( 1 , N ) ;
v = zeros( 1 , N ) ;

for ii = 2 : N
w( : , ii ) = randn( 3 , 1 ) .* sigma2Q ;         % generate process noise
x( : , ii ) = F * x( : , ii - 1 ) + w( : , ii ) ; % update state
v( : , ii ) = randn( 1 ) * R ;                    % generate measurement noise
z( : , ii ) = H * x( : , ii ) + v( : , ii ) ;     % get "true" measurement
endfor

% visualise data
%figure( 1 ) ; plot( x( 1 , : ) , 'k' , 'linewidth' , 2  ) ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%% Octave script to assess Kalman filter performance
%%% The script assumes the existence of a vector z of
%%% noise corrupted observations
N = length( z ) ; % number of Klamn filter iterations
Qfactor = 1 ;     % process noise mult factor
Rfactor = 1 ;     % measurement noise mult factor
delT = 1 ;        % time step

F = [ 1 delT ;
      0 1 ] ; % update matrix
      
H = [ 1 0 ] ; % measurement matrix

sigmaQ = Qfactor * sqrt( 0.01 ) ;
sigmaR = Rfactor * sqrt( 0.1 ) ;

Q = sigmaQ^2 * [ 1/3 1/2 ; 
                 1/2 1 ] ; % process noise covariance matrix
                 
P = 10 * Q ;
R = sigmaR^2 * [ 1 ] ;     % measurement noise covariance

% allocate space prior to loop
xhat = zeros( 2 , N ) ; % state estimate
nu = zeros( 1 , N ) ;   % innovation
S = zeros( 1 , N ) ;    % innovation (co)variance
q = zeros( 1 , N ) ;    % normalised innovation squared

for ii = 2 : N
  
  % predict using update matrix F and updated values of P and Q from previous ii loop
  % x_pred is the prediction of state values
  % p_pred is the prediction of the covariance matrix P given previous P and Q
  [ x_pred , p_pred ] = predict( xhat( : , ii - 1 ) , P , F , Q ) ;

  % measurement
  % "nu" is difference between predicted values and measured values of the states,
  % given the measurement matrix H and measurement noise R
  % "S" is a measurement of how the covariance matrix P is predicted to have changed 
  % during the above prediction step, given that this cannot actually be known or
  % directly measured as not all the underlying state changes can be directly measured. 
  [ nu( : , ii ) , S( : , ii ) ] = innovation( x_pred , p_pred , z( ii ) , H , R ) ; % orig
  
  % update step updates the state estimates and covariance matrix P using the Kalman gain,
  % which is internally calculated in the "innovation_update" function
  [ xhat( : , ii ) , P ] = innovation_update( x_pred , p_pred , nu( : , ii ) , S( : , ii ) , H ) ;

  % q is just a record keeping vector for later analysis of normalised innovation squared
  q( : , ii ) = nu( : , ii )' * inv( S( : , ii ) ) * nu( : , ii ) ;

endfor

sumQ = sum( q ) ; % determine Sum q which is Chiˆ2 on N d.o.f.
r = xcorr( nu ) ; % get autocorrealtion of innovation

% plot state and state estimate 
subplot( 2 , 2 , 1 ) ; plot( x( 1 , : ) , 'k' , 'linewidth' , 2 , xhat( 1 , : ) , 'r' , 'linewidth' , 2 ) ;
title( 'State and State Esimate' ) ;legend( 'State' , 'State Estimate' ) ; 

% plot innovation and 2sigma confidence interval
subplot( 2 , 2 , 2 ) ; plot( nu , 'b' , 'linewidth' , 2 , 2 * sqrt( S ) , 'r' , -2 * sqrt( S ) , 'r' , ...
zeros(1,N) , 'r.' , 'linewidth' , 1 ) ;
title( 'Innovation and 2sigma confidence intervals' ) ; legend( 'Innovation' , '2Sigma Levels' ) ; 

% plot normalised innovation squared
subplot( 2 , 2 , 3 ) ; plot( q , 'k' , 'linewidth' , 2 , mean( q ) .* ones( 1 , N ) , 'r' , 'linewidth' , 2 ) ;
title( 'Normalised innovation squared' ) ; 

% plot autocorrelation of innovation (normalised)
subplot( 2 , 2 , 4 ) ; plot( r( N : 2 * N - 1 ) / r( N ) , 'k' , 'linewidth' , 2 , zeros(1,N) , 'k.' , 'linewidth' , 1 ) ; 
title( 'Autocorrelation of innovation (normalised)' ) ;
Soon I shall be using this code, with some additions perhaps, to test various Kinematic model implementations of Kalman filters on financial time series with a view to identifying which models are suitable or not.

More in the near future.