Friday, 5 April 2019

Tests of Constant and Variable Acceleration Model Kalman Filters

In my last post I said that this next post would report the results of tests on a Constant Acceleration model Kalman filter, and the results are: fail, just like the Constant Velocity model, so I won't bore readers with reporting the details of the failed tests. However, tests of a Variable Acceleration model have been more successful, and so this post is about the results of tests on this model.

The Kalman filter states in this variable acceleration model are position, velocity, acceleration and three constants that are used to calculate the acceleration; b, c and d. The state transition matrix for this model is
F = [ 1 1 0 1/2 1/6 1/12 ; % position
      0 1 0 1 1/2 1/3 ;    % velocity
      0 0 0 1 1 1 ;        % acceleration
      0 0 0 1 0 0 ;        % b
      0 0 0 0 1 0 ;        % c
      0 0 0 0 0 1 ] ;      % d
with a measurement/observation matrix of
H = [ 1 0 0 0 0 0 ; 
      0 1 0 0 0 0 ; 
      0 0 1 0 0 0 ;
      0 0 0 1 0 0 ;
      0 0 0 0 1 0 ;
      0 0 0 0 0 1 ] ; % measurement matrix
Of course there are no "measurements" of b, c and d that can be directly taken from a price series, so my solution to this is to use Octave's fminunc function to minimise this
function [ J ] = kalman_bcd_constant_minimisation( bcd , position_measurements , velocity_measurements , accel_measurements )
pos_proj = position_measurements( 1 ) + velocity_measurements( 1 ) + [ 1/2 1/6 1/12 ] * bcd ;
vel_proj = velocity_measurements( 1 ) + [ 1 1/2 1/3 ] * bcd ;
accel_proj = sum( bcd ) ;  
pos_proj_cost = ( pos_proj - position_measurements( 2 ) )^2 ;
vel_proj_cost = ( vel_proj - velocity_measurements( 2 ) )^2 ;
accel_proj_cost = ( accel_proj - accel_measurements( 2 ) )^2 ;
J = pos_proj_cost + vel_proj_cost + accel_proj_cost ;  
endfunction
where position, velocity and acceleration measurements are calculated as described in my previous post. These cost minimised b, c and d constants are used as the noisy measurements for filter input purposes.

One implementation bug I experienced is that I was unable to use the ALS package to model the process noise as the dlqe function it calls kept crashing my session - so as a result I went with the crude, hand constructed covariance matrix as per the code in my previous post.

I wrote above that the tests on this model were the most successful to date. What I mean by this is that the Innovation Magnitude Bound test, the Innovation Zero-centred tests and Innovation auto correlation plot zero-centred tests on the position, velocity and acceleration states were passed 100% and about half of the Innovation auto correlation plot bound tests passed. The Innovation Whiteness (Auto correlation) tests varied from low single figure to 60+% percentage pass rates. However, I'm not 100% certain that I have correctly implemented this last set of tests so these results are suspect. Despite this reservation I think that I can say the Variable Acceleration Model expressed as a polynomial in time is the simplest kinematic model that is suitable for use on financial time series.

More in due course. 

Monday, 1 April 2019

Test of Constant Velocity Model Kalman Filter

Following on from my previous post, this post is a more detailed description of the testing methodology to test kinematic motion models on financial time series. The rationale behind the test(s) which are described below is different from the usual backtesting in that the test(s) are to determine whether the Kalman filter model is mismatched or not, i.e. whether the model innovations match the assumption that the residuals are Gaussian.

If readers have read the estimation lecture notes linked in my earlier post you'll know that one problem with estimating the above is that it is often difficult or impossible to know the covariance matrices for the process noise and the measurement noise, and getting these wrong effects the performance of the Kalman filter even if the underlying model is accurately specified. My "solution" to this is to use the Savitzky-Golay smoothing filter to smooth the time series and at the same time to get estimates for the velocity, acceleration etc. In real time of course this cannot be done because the Savitzky-Golay filter is a "centred" smoother and thus requires knowledge of data in the future. Using this information a crude estimate of the covariance matrices is thus derived, as shown in Octave code in the code box below:
% Estimate measurements for matrix R and covariance matrix Q by using the best
% fit quadratic using above sgolay filter coeffs
R = zeros( 2 , 2 ) ; Q = R ; % position and velocity model

% variance of position measurement
R( 1 , 1 ) = var( measured_pos( 1 , 7 : end - 6 ) .- real_pos( 1 , 7 : end - 6 ) ) ;

% variance of velocity measurement
R( 2 , 2 ) = var( measured_vel( 1 , 7 : end - 6 ) .- real_vel( 1 , 7 : end - 6 ) ) ;

% covariance of position process noise
Q( 1 , 1 ) = cov( ( real_pos( 1 , 7 : end - 7 ) .+ real_vel( 1 , 7 : end - 7 ) ) .- real_pos( 1 , 8 : end - 6 ) ) ;

% covariance of velocity process noise
Q( 2 , 2 ) = cov( diff( real_vel( 1 , 7 : end - 6 ) ) ) ;

% covariance of position process noise and velocity process noise
Q( 1 , 2 ) = cov( ( real_pos( 1 , 7 : end - 7 ) .+ real_vel( 1 , 7 : end - 7 ) ) .- real_pos( 1 , 8 : end - 6 ) , ...
diff( real_vel( 1 , 7 : end - 6 ) ) ) ;
Q( 2 , 1 ) = Q( 1 , 2 ) ;
where real_pos and real_vel are calculated using the Savitzky-Golay filter and represent the "true" position and velocity of a Constant Velocity model compared to the measured_pos and measured_vel measurements that are taken in real time. These crude estimates of the covariance matrices are used as inputs to the Autocovariance Least Squares Method implemented by means of the Octave ALS package to get a hopefully much more accurate estimate of the noise covariances. Armed with these estimates, the test code as outlined in the above mentioned estimation lecture notes paper and my previous post can then be run.

To avoid having to manually judge success or failure by looking at numerous plots such as this one
I run a few simple statistical tests
% Performance Tests for an Unmatched Kalman Filter

% Test 1 ---> Innovation Magnitude Bound Test
% check the innovation is bounded by +/- 2 * sqrt( S )
test_1 = sum( ( nu( 1 , 7 : end - 3 ) < 2 * s( 1 , 7 : end - 3 ) ) .* ...
( nu( 1 , 7 : end - 3 ) > -2 * s( 1 , 7 : end - 3 ) ) ) / length( s( 1 , 7 : end - 3 ) ) ;
% greater than 0.95 is a pass, i.e. less than 5 % of values out of bounds
% record result 
all_test_results( jj , 1 ) = test_1 ; 

%  Perform a T-test of the null hypothesis 'mean (X) == M' for a
%  sample X from a normal distribution with unknown mean and unknown
%  std deviation.  Under the null, the test statistic T has a
%  Student's t distribution.  The default value of M is 0.
%  If H is 0 the null hypothesis is accepted, if it is 1 the null
%  hypothesis is rejected.
zero_mean_innovation_pos = ttest( nu( 1 , : ) , 0 ) ; % innovations are centred around zero?
% H = 0 is a pass, i.e. centred around zero
% record result 
all_test_results( jj , 2 ) = zero_mean_innovation_pos ; 

% Test 2 ---> Normalised Innovation Squared Test
%  > qchisq( c( 0.05 , 0.95 ) , df = 1 , lower.tail = FALSE )
%  [1] 3.84145882 0.00393214
%  > qchisq( c( 0.05 , 0.95 ) , df = 2 , lower.tail = FALSE )
%  [1] 5.9914645 0.1025866

d_of_freedom = sum( sum( H ) ) ;
switch ( d_of_freedom )
  case 1
  lower_CI = 0.00393214 ;
  upper_CI = 3.84145882 ;
case 2
  lower_CI = 0.1025866 ;
  upper_CI = 5.9914645 ;  
endswitch

normalised_innovation_squared_chi2_test_pos = ( sum_Q_pos > lower_CI ) * ( sum_Q_pos < upper_CI ) ;
% value of 1 is a pass
% record result 
all_test_results( jj , 3 ) = normalised_innovation_squared_chi2_test_pos ;  

% Test 3 ---> Innovation Whiteness (Autocorrelation) Test
% check the innovation is bounded by +/- 2 * std
%level = 2 * std( r_pos( N : 2 * N - 1 ) / r_pos( N ) ) ;
level = 1.96 / sqrt( N - burn_in - 2 ) ;
##test_3 = sum( ( r_pos( N : 2 * N - 1 ) / r_pos( N ) < level ) .* ( r_pos( N : 2 * N - 1 ) / r_pos( N ) > -level ) ) / ...
##length( r_pos( N : 2 * N - 1 ) / r_pos( N ) ) ;

test_3 = sum( ( r_pos( no_lags + 1 : end ) < level ) .* ( r_pos( no_lags + 1 : end ) > -level ) ) ...
/ length( r_pos( no_lags + 1 : end) ) ;

% greater than 0.95 is a pass, i.e. less than 5 % of values out of bounds
% record result 
all_test_results( jj , 4 ) = test_3 ; 

zero_mean_autocorrelation_plot_pos = ttest( r_pos( no_lags + 1 : end ) , 0 ) ; % centred around zero
% H = 0 is a pass, i.e. centred around zero
% record result 
all_test_results( jj , 5 ) = zero_mean_autocorrelation_plot_pos ; 
across a range of forex crosses and my currency indices and see if, say 95%, of them pass these tests.

As a result of my first tests thus outlined on the Constant Velocity Kinematic Model I think can confidently say that this model is "mismatched," despite it sometimes being described as the simplest "useful" model, and therefore readers should not use this model in a Kalman filter on financial time series. It is worth noting that this model is named Model One in the Trend Without Hiccups paper, which in turn is very closely related to a local linear trend model, which the authors name Model Two. Given that the underlying model(s) for both are mismatched it is not surprising that the authors report disappointing results from these models.

In my next post I shall report results on the Constant Acceleration Model.