To that end, I have indulged in a bit of feature engineering and written the following Octave function,
## Copyright (C) 2019 dekalog
##
## This program is free software: you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program. If not, see
## www.gnu.org.
## -*- texinfo -*-
## @deftypefn {} {[@var{emb1}, @var{emb2}, @var{prob_matrix}] =} cyclic_embedding (@var{price}, @var{period})
##
## Inputs are a price vector and a period vector.
##
## The function normalises the price between -1 and +1 over an adaptive period lookback.
##
## The outputs are two matrices of 3 columns each - the first column is normalised price
## and the second and third are delay embeddings of Tau and 2 x Tau, Tau being equal to
## one quarter of the adaptive period vector, the theoretical ideal Tau for sinusoidal
## waveforms.
##
## The first output matrix, EMB1, normalises the Tau and 2 x Tau columns according to the
## most recent max high/min low; EMB2 Tau and 2 x Tau are normalised according to the
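## max high/min low in force at the delay embedding time.
##
## The third output, PROB_MATRIX, has 3 columns: the Weibull CDF values used as the
## probabilities of price being at a cyclic high, at a cyclic low, and at neither.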
##
## @seealso{}
## @end deftypefn
## Author: dekalog
## Created: 2019-09-18
function [ emb1 , emb2 , prob_matrix ] = cyclic_embedding( price , period )
price_smooth = price ;
emb1 = repmat( price , 1 , 3 ) ;
emb2 = emb1 ;
coeffs = generalised_sgolay_filter_coeffs( 5 , 2 , 0 ) ; coeffs = coeffs' ;
## initialising loop
price_smooth( 1 : 3 ) = coeffs( 1 : 3 , : ) * price( 1 : 5 ) ;
for ii = 4 : 48
price_smooth( ii ) = coeffs( 3 , : ) * price( ii - 2 : ii + 2 ) ;
endfor
price_smooth( 49 : 50 ) = coeffs( 4 : 5 , : ) * price( 46 : 50 ) ;
## end initialising loop
coeffs( 1 : 2 , : ) = [] ;
for ii = 51 : size( price , 1 )
price_smooth( ii - 2 : ii ) = coeffs * price( ii - 4 : ii ) ;
max_r = max( price_smooth( ii - period( ii ) : ii ) ) ;
min_r = min( price_smooth( ii - period( ii ) : ii ) ) ;
## period is exactly divisible by 4 ( and 2 ), e.g. 8 12 16 20 24 28 32 36 40 44 48 etc?
if ( rem( period( ii ) , 4 ) == 0 )
emb1( ii , 1 ) = 2 * ( ( price_smooth( ii ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 2 ) = 2 * ( ( price_smooth( ii - round( period( ii ) / 4 ) ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 3 ) = 2 * ( ( price_smooth( ii - round( period( ii ) / 2 ) ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb2( ii , 1 ) = emb1( ii , 1 ) ;
emb2( ii , 2 ) = emb2( ii - round( period( ii ) / 4 ) , 1 ) ;
emb2( ii , 3 ) = emb2( ii - round( period( ii ) / 2 ) , 1 ) ;
## periods 10 14 18 22 26 30 34 38 42 46 50
elseif ( rem( period( ii ) , 2 ) == 0 && rem( period( ii ) , 4 ) == 2 )
emb1( ii , 1 ) = 2 * ( ( price_smooth( ii ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 2 ) = 2 * ( ( ( 0.5*price_smooth( ii - round( period( ii ) / 4 ) ) + 0.5*price_smooth( ii - round( period( ii ) / 4 ) + 1 ) ) - min_r )...
/ ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 3 ) = 2 * ( ( price_smooth( ii - round( period( ii ) / 2 ) ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb2( ii , 1 ) = emb1( ii , 1 ) ;
emb2( ii , 2 ) = 0.5*emb2( ii - round( period( ii ) / 4 ) , 1 ) + 0.5*emb2( ii - round( period( ii ) / 4 ) + 1 , 1 ) ;
emb2( ii , 3 ) = emb2( ii - round( period( ii ) / 2 ) , 1 ) ;
## periods 9 13 17 21 25 29 33 37 41 45 49
elseif ( rem( period( ii ) , 2 ) == 1 && rem( period( ii ) , 4 ) == 1 )
emb1( ii , 1 ) = 2 * ( ( price_smooth( ii ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 2 ) = 2 * ( ( ( 0.75*price_smooth( ii - round( period( ii ) / 4 ) ) + 0.25*price_smooth( ii - round( period( ii ) / 4 ) - 1 ) ) - min_r )...
/ ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 3 ) = 2 * ( ( ( 0.5*price_smooth( ii - round( period( ii ) / 2 ) ) + 0.5*price_smooth( ii - round( period( ii ) / 2 ) + 1 ) ) - min_r )...
/ ( max_r - min_r ) - 0.5 ) ;
emb2( ii , 1 ) = emb1( ii , 1 ) ;
emb2( ii , 2 ) = 0.75*emb2( ii - round( period( ii ) / 4 ) , 1 ) + 0.25*emb2( ii - round( period( ii ) / 4 ) - 1 , 1 ) ;
emb2( ii , 3 ) = 0.5*emb2( ii - round( period( ii ) / 2 ) , 1 ) + 0.5*emb2( ii - round( period( ii ) / 2 ) + 1 , 1 ) ;
## periods 11 15 19 23 27 31 35 39 43 47
elseif ( rem( period( ii ) , 2 ) == 1 && rem( period( ii ) , 4 ) == 3 )
emb1( ii , 1 ) = 2 * ( ( price_smooth( ii ) - min_r ) / ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 2 ) = 2 * ( ( ( 0.75*price_smooth( ii - round( period( ii ) / 4 ) ) + 0.25*price_smooth( ii - round( period( ii ) / 4 ) + 1 ) ) - min_r )...
/ ( max_r - min_r ) - 0.5 ) ;
emb1( ii , 3 ) = 2 * ( ( ( 0.5*price_smooth( ii - round( period( ii ) / 2 ) ) + 0.5*price_smooth( ii - round( period( ii ) / 2 ) + 1 ) ) - min_r )...
/ ( max_r - min_r ) - 0.5 ) ;
emb2( ii , 1 ) = emb1( ii , 1 ) ;
emb2( ii , 2 ) = 0.75*emb2( ii - round( period( ii ) / 4 ) , 1 ) + 0.25*emb2( ii - round( period( ii ) / 4 ) + 1 , 1 ) ;
emb2( ii , 3 ) = 0.5*emb2( ii - round( period( ii ) / 2 ) , 1 ) + 0.5*emb2( ii - round( period( ii ) / 2 ) + 1 , 1 ) ;
endif
endfor ## end of embedding features creation
feature_peak = emb2 * [ 1 0 ; -1 1 ; 0 -1 ] * [ 1 ; 1 ] ;
feature_trough = zeros( size( feature_peak ) ) ;
ix = find( feature_peak < 0 ) ;
feature_trough( ix ) = abs( feature_peak( ix ) ) ;
feature_peak( feature_peak <= 0 ) = 0 ;
## https://stats.stackexchange.com/questions/132652/how-to-determine-which-distribution-fits-my-data-best
## Weibull distribution with shape = 219.68 and scale = 1.94 for turn +/- 1 bar
## Weibull distribution with shape = 85.88 and scale = 1.84 for turn +/- 2 bar
## This comes from Bayesian testing of cutoff value of feature_peak/feature_trough for highs/lows +/- 1 bar.
## The function used to get these values is "bayes_train_cyclic_turn_prob_of_embedding.m" which calls
## "bayes_optim_of_cyclic_embedding_conv_function" in /home/dekalog/Documents/octave/turning_points
scale = 1.84 ; shape = 85.88 ; ## these are the +/- 2 bar values listed above
prob_matrix = zeros( size( feature_peak , 1 ) , 3 ) ;
prob_matrix( : , 1 ) = wblcdf( feature_peak , scale , shape ) ;
prob_matrix( : , 2 ) = wblcdf( feature_trough , scale , shape ) ;
prob_matrix( : , 3 ) = 1 - sum( prob_matrix( : , 1 : 2 ) , 2 ) ; ## plain "-": the ".-" operator was removed in recent Octave versions
endfunction
which again is a work in progress. It takes a price vector and a period vector as input and outputs features as per the plots in the above linked ideal cyclic tau post, plus a matrix of probabilities for price action being at a cyclic high/low. This matrix is the result of Monte Carlo Bayesian optimisation over ideal sine wave prices to get a cutoff value for the derived feature in the above function. The probabilities are taken from the Weibull distribution, which was chosen by following the routine outlined in this "how to determine which distribution fits my data best" forum post and using the R statistical software platform.
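To make the last step of the function more concrete, here is a minimal, self-contained sketch on an ideal sine wave. It is not the function itself: it assumes a constant period of 20 bars, skips the smoothing and adaptive normalisation, and writes the Weibull CDF out by hand (the hypothetical helper weibull_cdf) so that no extra package is needed. It shows that the matrix product used for feature_peak reduces to the first embedding column minus the third, and how the hard-coded scale and shape turn that difference into probabilities.

```octave
## Illustrative sketch only: ideal sine wave, fixed period, no smoothing.
period = 20 ;                            ## assumed constant cycle period in bars
tau = period / 4 ;                       ## quarter period delay
t = ( 1 : 200 )' ;
price = sin( 2 * pi * t / period ) ;     ## already lies between -1 and +1

## delay embedding: [ x(t) , x(t - tau) , x(t - 2*tau) ]
ix = ( 2 * tau + 1 : numel( price ) )' ;
emb = [ price( ix ) , price( ix - tau ) , price( ix - 2 * tau ) ] ;

## same matrix product as in cyclic_embedding; it reduces to emb(:,1) - emb(:,3)
feature = emb * [ 1 0 ; -1 1 ; 0 -1 ] * [ 1 ; 1 ] ;
feature_peak = max( feature , 0 ) ;      ## positive part, large near cyclic highs
feature_trough = max( -feature , 0 ) ;   ## negative part, large near cyclic lows

## Weibull CDF written out by hand (hypothetical helper, not a library call)
weibull_cdf = @( x , scale , shape ) 1 - exp( -( x ./ scale ) .^ shape ) ;
scale = 1.84 ; shape = 85.88 ;           ## hard-coded values from the function above
prob_peak = weibull_cdf( feature_peak , scale , shape ) ;
prob_trough = weibull_cdf( feature_trough , scale , shape ) ;
prob_neither = 1 - prob_peak - prob_trough ;
```

On this noiseless input the feature reaches 2 at a sine peak and -2 at a trough. Because the shape parameter is so large, the CDF behaves almost like a step around the scale value, which is why it can serve as a soft cutoff.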
The hard-coded values for the cutoff in the above code are from the results of optimisation on pure sine waves. As I write this post there are optimisation routines running on sine waves with 20 dB noise added.
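For readers who want to generate similar test data, the following is a rough sketch of one way to add noise to a sine wave. It assumes that "20 dB noise" means white Gaussian noise at a signal-to-noise ratio of 20 dB, which is an interpretation and not something stated above; the period, series length and seed are likewise arbitrary choices for illustration.

```octave
## Sketch under the assumption that "20 dB" is the signal-to-noise ratio.
period = 20 ;                                   ## assumed cycle period in bars
t = ( 1 : 2000 )' ;
signal = sin( 2 * pi * t / period ) ;

snr_db = 20 ;
signal_power = mean( signal .^ 2 ) ;
noise_power = signal_power / 10 ^ ( snr_db / 10 ) ;   ## P_noise = P_signal / 10^(SNR/10)

randn( "state" , 42 ) ;                         ## fixed seed so the run is repeatable
noise = sqrt( noise_power ) * randn( size( signal ) ) ;
noisy_price = signal + noise ;

## check the SNR actually achieved on this sample
achieved_snr_db = 10 * log10( signal_power / mean( noise .^ 2 ) ) ;
```

The achieved SNR will only approximate 20 dB on any finite sample, since the noise power is itself estimated from random draws.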
More in due course.