A few posts ago I stated that I was reassessing my dominant cycle code and to that end I have been doing a fair bit of research on the web and found some really cool stuff, such as working Octave/MATLAB code for Empirical Mode Decomposition, available here, and a couple of toolboxes here and here. However, the purpose of this post is to introduce a new idea that is inspired by a Master's thesis, authored by Yizheng Liao.

The big idea contained in this thesis is online instantaneous frequency measurement by means of measuring zero crossings of a signal and its Hilbert transform, otherwise known as the analytic signal. A simple demonstration is shown below

where the cyan line is the original (real part) signal and red the Hilbert transform (imaginary part) of a noisy sine wave signal. The Hilbert transform of sin(x) is -cos(x), which is the equivalent of sin(x-90) using degrees rather than radian notation for the phase of x. Since we can delay the phase by any amount, it is possible to have an array of phase delays to produce more zero crossings per full cycle period than just the 4 that occur with the analytic signal only. This is shown below

where, as before, the cyan is the noisy sine wave signal, the dark blue is the original smooth sine wave signal to which the noise is added, the red is the measured sine wave extracted from the noisy signal and the green, magenta and yellow lines are sine plots of the phase of the red sine delayed by 30, 60, 90, 120, 150 and 180 degrees. This last 180 degree delay line is actually superfluous as it crosses the zero line at the same time as the original signal, but I show it just for completeness. The white cursor line shows the zero line and the beginning of a complete cycle. Apart from being able to increase the number of measurable zero crossings, another advantage I can see from using this sine wave extraction method is that the extracted sine waves are much smoother than the analytic signal, hence the zero crossings themselves can be used rather than the crossings of hysteresis lines (read the paper!), avoiding any delays due to waiting to cross said hysteresis lines.

For those who are interested, the sine wave indicator I used for this is John Ehler's sine wave indicator, a nice exposition of which is here and code available here.

More on this in due course.

# Dekalog Blog

"Trading is statistics and time series analysis." This blog details my progress in developing a systematic trading system for use on the futures and forex markets, with discussion of the various indicators and other inputs used in the creation of the system. Also discussed are some of the issues/problems encountered during this development process. Within the blog posts there are links to other web pages that are/have been useful to me.

## Monday, 16 June 2014

## Sunday, 25 May 2014

### Updated Cauchy-Schwarz Matching Algorithm

Following on from my previous post, below is a code box showing a slightly improved Cauchy-Schwarz matching algorithm, improved in the sense that this implementation has a slightly better effect size over random when the test runs of the previous post's version are compared with this version.

Firstly, when training a feedforward neural network it is normal that a certain number of samples are held out of the training set for use as a cross validation set. The point of this is to ensure that the trained NN will generalise well to as yet unseen data. In the case of my rolling training regime this does not apply. The NN that is being trained for the "current bar" will be used once to classify the "current bar" and then thrown away. The "next bar" will have a completely new NN trained specifically for it, which in its turn will be discarded, and so on and so on along the whole price history. There is no need to ensure generalisation of any specifically trained NN. This being the case, all the training set examples are used in the training and early stopping is implemented by a crude heuristic of classification accuracy on the training set: training stops when the classification error rate on the whole training set is <= 5%. Further experience with this in the future may lead me to make some adjustments, but for now this is what I am going with.

A second reason for adopting this approach stems from my reading of this book wherein it is stated that on financial time series the "traditional" machine learning error metrics can be misleading. It cites a (theoretical?) example of a profitable trading system that has been trained/optimised for maximum profit but has a counter-intuitive, negative R-squared. The explanation for this lies in the heavy tails of price distribution(s). It is in these tails that the extreme returns reside and where the big profits/losses are to be made. However, by using a more traditional error metric such as least squares a ML algorithm might concentrate on the central area of a price distribution in order to reduce the error metric on the majority of price instances and thereby ignore the tails, producing a nice, low error but a useless system. The converse can be true for a good system, in that the ML least squares metric can be rubbish but the relevant performance metric (max profit, min draw down, risk adjusted return etc.) of the system great.

It is for these reasons that I have adopted my current approach.

```
function [ top_matches ] = rolling_cauchy_schwarz_matching_algo_2( open_ch, high_ch, low_ch, close_ch, period )
% pre-allocate vectors in memory
cauchy_schwarz_values = zeros( size(close_ch,1) , 1 ) ;
top_matches = zeros( size(close_ch,1), 100 ) ;
% select price bar to train nn on
for jj = size(close_ch,1)-250 : size(close_ch,1)
lookback = period( jj ) ;
sample_to_match = [ close_ch( jj-lookback : jj )' high_ch( jj-9 : jj )' low_ch( jj-9 : jj )' ( close_ch( jj-4 : jj ).-open_ch( jj-4 : jj ) )' ] ;
norm_sample_to_match = norm( sample_to_match ) ;
% for this jj train_bar, calculate cauchy_schwarz matching values in the historical record up to index jj-2
for ii = 50 : jj - 2
cauchy_schwarz_values(ii) = abs( sample_to_match * [ close_ch( ii-lookback : ii ) ; high_ch( ii-9 : ii ) ; low_ch( ii-9 : ii ) ; ( close_ch( ii-4 : ii ).-open_ch( ii-4 : ii ) ) ] ) / ( norm_sample_to_match * norm( [ close_ch( ii-lookback : ii , 1 ) ; high_ch( ii-9 : ii ) ; low_ch( ii-9 : ii ) ; ( close_ch( ii-4 : ii ).-open_ch( ii-4 : ii ) ) ] ) ) ;
end % end of ii loop
% get the top 100 matches for this price bar
[ s, sort_index ] = sort( cauchy_schwarz_values ) ;
top_matches( jj, : ) = sort_index( end-99 : end )' ;
end % end of jj loop
end % end of function
```

The inputs are channel normalised prices, with the length of the channel being adaptive to the dominant cycle period. This function is called as part of a rolling neural net training regime to select the top n (n = 100 in this case) matches in the historical record as training data. The actual NN training code is a close adaptation of the code in my neural net walkforward training post, but with a couple of important caveats which are discussed below.Firstly, when training a feedforward neural network it is normal that a certain number of samples are held out of the training set for use as a cross validation set. The point of this is to ensure that the trained NN will generalise well to as yet unseen data. In the case of my rolling training regime this does not apply. The NN that is being trained for the "current bar" will be used once to classify the "current bar" and then thrown away. The "next bar" will have a completely new NN trained specifically for it, which in its turn will be discarded, and so on and so on along the whole price history. There is no need to ensure generalisation of any specifically trained NN. This being the case, all the training set examples are used in the training and early stopping is implemented by a crude heuristic of classification accuracy on the training set: training stops when the classification error rate on the whole training set is <= 5%. Further experience with this in the future may lead me to make some adjustments, but for now this is what I am going with.

A second reason for adopting this approach stems from my reading of this book wherein it is stated that on financial time series the "traditional" machine learning error metrics can be misleading. It cites a (theoretical?) example of a profitable trading system that has been trained/optimised for maximum profit but has a counter-intuitive, negative R-squared. The explanation for this lies in the heavy tails of price distribution(s). It is in these tails that the extreme returns reside and where the big profits/losses are to be made. However, by using a more traditional error metric such as least squares a ML algorithm might concentrate on the central area of a price distribution in order to reduce the error metric on the majority of price instances and thereby ignore the tails, producing a nice, low error but a useless system. The converse can be true for a good system, in that the ML least squares metric can be rubbish but the relevant performance metric (max profit, min draw down, risk adjusted return etc.) of the system great.

It is for these reasons that I have adopted my current approach.

Labels:
Data Mining,
Machine Learning,
Octave,
RBM

## Wednesday, 16 April 2014

### Effect Size of Cauchy-Schwarz Matching Algorithm

In my last post I talked about using the Cauchy-Schwarz Inequality to match similar periods of price history to one another. This post is about the more rigorous testing of this idea.

I decided to use the Effect size as the test of choice, for which there are nice introductions here and here. A basic description of the way I implemented the test is as follows:-

Running the code on the EURUSD forex pair and plotting histograms gives this:

where figures 1 and 2 are for the Cauchy-Schwarz values and figures 3 and 4 are Distance correlation values for comparative purposes, and which I won't discuss in this post.

On seeing this for the first time I was somewhat surprised as I had expected the effect size distribution(s) to be approximately normal because all the test calculations are based on averages. However, it was a pleasant surprise due to the peak in values at the right hand side, showing a possible substantial effect size. To make things clearer here are the percentiles of the four histograms above:

All in all a successful test, which encourages me to adopt the Cauchy-Schwarz inequality, but before I do there are one or two more tweaks I would like to test. This will be the subject of my next post.

I decided to use the Effect size as the test of choice, for which there are nice introductions here and here. A basic description of the way I implemented the test is as follows:-

- Randomly pick a section of price history, which will be used as the price history for the selection algorithm to match
- Take the 5 consecutive bars immediately following the above section of price history and store as the "target"
- Create a control group of random matches to the above "target" by randomly selecting 10 separate 5 bar pieces of price history and calculating the Cauchy-Schwarz values of these 10 compared to the target and record the average value of these values. Repeat this step N times to create a distribution of randomly matched, average target-to-random-price Cauchy-Schwarz values. By virtue of the Central limit theorem it can be expected that this distribution is approximately normal
- Using the matching algorithm (as described in the previous post) get the closest 10 matches in the price history to the random selection from step 1
- Get the 5 consecutive bars immediately following the 10 matches from step 4 and calculate their Cauchy-Schwarz values viz-a-viz the "target" and record the average value of these 10 values. This average value is the "experimental" value
- Using the mean and standard deviation of the control group distribution from step 3, calculate the effect size of the experimental value and record this effect size value
- Repeat all the above steps M times to form an effect size value distribution

```
clear all
% load price file of interest
filename = input( 'Enter filename for prices, e.g. es or esmatrix: ' , 's' ) ;
data = load( "-ascii" , filename ) ;
% get tick size
switch filename
case { "cc" }
tick = 1 ;
case { "gc" "lb" "pl" "sm" "sp" }
tick = 0.1 ;
case { "ausyen" "bo" "cl" "ct" "dx" "euryen" "gbpyen" "sb" "usdyen" }
tick = 0.01 ;
case { "c" "ng" }
tick = 0.001 ;
case { "auscad" "aususd" "euraus" "eurcad" "eurchf" "eurgbp" "eurusd" "gbpchf" "gbpusd" "ho" "rb" "usdcad" "usdchf" }
tick = 0.0001 ;
case { "c" "o" "s" "es" "nd" "w" }
tick = 0.25 ;
case { "fc" "lc" "lh" "pb" }
tick = 0.025 ;
case { "ed" }
tick = 0.0025 ;
case { "si" }
tick = 0.5 ;
case { "hg" "kc" "oj" "pa" }
tick = 0.05 ;
case { "ty" "us" }
tick = 0.015625 ;
case { "ccmatrix" }
tick = 1 ;
case { "gcmatrix" "lbmatrix" "plmatrix" "smmatrix" "spmatrix" }
tick = 0.1 ;
case { "ausyenmatrix" "bomatrix" "clmatrix" "ctmatrix" "dxmatrix" "euryenmatrix" "gbpyenmatrix" "sbmatrix" "usdyenmatrix" }
tick = 0.01 ;
case { "cmatrix" "ngmatrix" }
tick = 0.001 ;
case { "auscadmatrix" "aususdmatrix" "eurausmatrix" "eurcadmatrix" "eurchfmatrix" "eurgbpmatrix" "eurusdmatrix" "gbpchfmatrix" "gbpusdmatrix" "homatrix" "rbmatrix" "usdcadmatrix" "usdchfmatrix" }
tick = 0.0001 ;
case { "cmatrix" "omatrix" "smatrix" "esmatrix" "ndmatrix" "wmatrix" }
tick = 0.25 ;
case { "fcmatrix" "lcmatrix" "lhmatrix" "pbmatrix" }
tick = 0.025 ;
case { "edmatrix" }
tick = 0.0025 ;
case { "simatrix" }
tick = 0.5 ;
case { "hgmatrix" "kcmatrix" "ojmatrix" "pamatrix" }
tick = 0.05 ;
case { "tymatrix" "usmatrix" }
tick = 0.015625 ;
endswitch
open = data( : , 4 ) ;
high = data( : , 5 ) ;
low = data( : , 6 ) ;
close = data( : , 7 ) ;
price = vwap( open, high, low, close, tick ) ;
clear -exclusive price tick
% first, get the lookback parameters on real prices
[ sine, sinelead, period ] = sinewave_indicator( price ) ;
[ max_price, min_price, channel_price ] = adaptive_lookback_max_min( price, period, tick ) ;
smooth_price = smooth_2_5( price ) ;
[ max_smooth_price, min_smooth_price, smooth_channel_price ] = adaptive_lookback_max_min( smooth_price, period, tick ) ;
cauchy_schwarz_values = zeros( size(channel_price,1) , 1 ) ;
cauchy_schwarz_values_smooth = zeros( size(channel_price,1) , 1 ) ;
% set up all recording vectors
N = 10 ; % must be >= 10
% record these values
matches_values = zeros( N, 1 ) ;
matches_smooth_values = zeros( N, 1 ) ;
distcorr_values = zeros( N, 1 ) ;
distcorr_values_smooth = zeros( N, 1 ) ;
% vectors to record averages
random_matches_values_averages = zeros( 750, 1 ) ;
random_matches_smooth_values_averages = zeros( 750, 1 ) ;
random_distcorr_averages = zeros( 750, 1 ) ;
random_distcorr_smooth_averages = zeros( 750, 1 ) ;
% effect size vectors
effect_size = zeros( 750, 1 ) ;
effect_size_smooth = zeros( 750, 1 ) ;
effect_size_distcorr = zeros( 750, 1 ) ;
effect_size_distcorr_smooth = zeros( 750, 1 ) ;
for kk = 1 : 750
% first, get a random pick from the price history and all its associated values
sample_index = randperm( (size(price,1)-55), 1 ) .+ 50 ;
lookback = period( sample_index ) ;
sample_to_match = channel_price( sample_index-lookback : sample_index )' ;
sample_to_match_smooth = smooth_channel_price( sample_index-lookback : sample_index )' ;
projection_to_match = ( ( price( (sample_index+1):(sample_index+5) ) .- min_price(sample_index) ) ./ ( max_price(sample_index)-min_price(sample_index) ) )' ;
projection_to_match_smooth = ( ( price( (sample_index+1):(sample_index+5) ) .- min_smooth_price(sample_index) ) ./ ( max_smooth_price(sample_index)-min_smooth_price(sample_index) ) )' ;
% for this pick, calculate cauchy_schwarz_values
for ii = 50 : size( price, 1 )
cauchy_schwarz_values(ii) = abs( sample_to_match * channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match) * norm( channel_price( ii-lookback : ii , 1 ) ) ) ;
cauchy_schwarz_values_smooth(ii) = abs( sample_to_match_smooth * smooth_channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match_smooth) * norm( smooth_channel_price( ii-lookback : ii , 1 ) ) ) ;
end
% now set the values for sample_to_match +/- 2 to zero to avoid matching with itself
cauchy_schwarz_values( sample_index-2 : sample_index+2 ) = 0.0 ;
cauchy_schwarz_values_smooth( sample_index-2 : sample_index+2 ) = 0.0 ;
% set the last six values to zero to allow for projections
cauchy_schwarz_values( end-5 : end ) = 0.0 ;
cauchy_schwarz_values_smooth( end-5 : end ) = 0.0 ;
% get the top N matches
for ii = 1 : N
[ max_val, ix ] = max( cauchy_schwarz_values ) ;
norm_price_proj_match = ( ( price( ((ix)+1):((ix)+5) ) .- min_price(ix) ) ./ ( max_price(ix)-min_price(ix) ) ) ;
matches_values(ii) = abs( projection_to_match * norm_price_proj_match ) / ( norm(projection_to_match) * norm( norm_price_proj_match ) ) ;
cauchy_schwarz_values( ix-2 : ix+2 ) = 0.0 ;
[ max_val, ix ] = max( cauchy_schwarz_values_smooth ) ;
norm_price_smooth_proj_match = ( ( price( ((ix)+1):((ix)+5) ) .- min_smooth_price(ix) ) ./ ( max_smooth_price(ix)-min_smooth_price(ix) ) ) ;
matches_smooth_values(ii) = abs( projection_to_match_smooth * norm_price_smooth_proj_match ) / ( norm(projection_to_match_smooth) * norm( norm_price_smooth_proj_match ) ) ;
cauchy_schwarz_values_smooth( ix-2 : ix+2 ) = 0.0 ;
distcorr_values(ii) = distcorr( projection_to_match', norm_price_proj_match ) ;
distcorr_values_smooth(ii) = distcorr( projection_to_match_smooth', norm_price_smooth_proj_match ) ;
end % end of top N matches loop
% get and record averages for the top N matches
matches_values_average = mean( matches_values ) ;
matches_smooth_values_average = mean( matches_smooth_values ) ;
distcorr_average = mean( distcorr_values ) ;
distcorr_smooth_average = mean( distcorr_values_smooth ) ;
% now create a null distribution of random price projections
% randomly choosen from prices
for jj = 1 : 750
random_index = randperm( (size(price,1)-55), 10 ) .+ 50 ;
for ii = 1 : 10
norm_price_proj_match = ( ( price( (random_index(ii)+1):(random_index(ii)+5) ) .- min_price(random_index(ii)) ) ./ ( max_price(random_index(ii))-min_price(random_index(ii)) ) ) ;
matches_values(ii) = abs( projection_to_match * norm_price_proj_match ) / ( norm(projection_to_match) * norm( norm_price_proj_match ) ) ;
norm_price_smooth_proj_match = ( ( price( (random_index(ii)+1):(random_index(ii)+5) ) .- min_smooth_price(random_index(ii)) ) ./ ( max_smooth_price(random_index(ii))-min_smooth_price(random_index(ii)) ) ) ;
matches_smooth_values(ii) = abs( projection_to_match_smooth * norm_price_smooth_proj_match ) / ( norm(projection_to_match_smooth) * norm( norm_price_smooth_proj_match ) ) ;
distcorr_values(ii) = distcorr( projection_to_match', norm_price_proj_match ) ;
distcorr_values_smooth(ii) = distcorr( projection_to_match_smooth', norm_price_smooth_proj_match ) ;
end % end of random index ii loop
random_matches_values_averages(jj) = mean( matches_values ) ;
random_matches_smooth_values_averages(jj) = mean( matches_smooth_values ) ;
random_distcorr_averages(jj) = mean( distcorr_values ) ;
random_distcorr_smooth_averages(jj) = mean( distcorr_values_smooth ) ;
end % end jj loop
effect_size(kk) = ( matches_values_average - mean( random_matches_values_averages ) ) / std( random_matches_values_averages ) ;
effect_size_smooth(kk) = ( matches_smooth_values_average - mean( random_matches_smooth_values_averages ) ) / std( random_matches_smooth_values_averages ) ;
effect_size_distcorr(kk) = ( distcorr_average - mean( random_distcorr_averages ) ) / std( random_distcorr_averages ) ;
effect_size_distcorr_smooth(kk) = ( distcorr_smooth_average - mean( random_distcorr_smooth_averages ) ) / std( random_distcorr_smooth_averages ) ;
end % end kk loop
all_effect_sizes = [ effect_size, effect_size_smooth, effect_size_distcorr, effect_size_distcorr_smooth ] ;
dlmwrite( 'all_effect_sizes', all_effect_sizes )
```

__Results__Running the code on the EURUSD forex pair and plotting histograms gives this:

where figures 1 and 2 are for the Cauchy-Schwarz values and figures 3 and 4 are Distance correlation values for comparative purposes, and which I won't discuss in this post.

On seeing this for the first time I was somewhat surprised as I had expected the effect size distribution(s) to be approximately normal because all the test calculations are based on averages. However, it was a pleasant surprise due to the peak in values at the right hand side, showing a possible substantial effect size. To make things clearer here are the percentiles of the four histograms above:

```
0.00000 -5.08931 -4.79836 -3.05912 -3.65668
0.01000 -3.61724 -3.20229 -2.46932 -2.45201
0.02000 -3.39841 -2.81969 -2.21764 -2.20515
0.03000 -3.00404 -2.49009 -1.89562 -2.05380
0.04000 -2.66393 -2.35174 -1.80412 -1.91032
0.05000 -2.52514 -2.03670 -1.68800 -1.71335
0.06000 -2.22298 -1.91877 -1.59624 -1.61089
0.07000 -2.07188 -1.88256 -1.52058 -1.48763
0.08000 -1.93247 -1.79727 -1.45786 -1.42828
0.09000 -1.71065 -1.66522 -1.36500 -1.35917
0.10000 -1.59803 -1.58943 -1.31570 -1.31809
0.11000 -1.44325 -1.53087 -1.24996 -1.28199
0.12000 -1.38234 -1.44477 -1.20741 -1.21903
0.13000 -1.22440 -1.32961 -1.17397 -1.17619
0.14000 -1.14728 -1.29863 -1.12755 -1.10768
0.15000 -1.05431 -1.19564 -1.09108 -1.08591
0.16000 -0.93505 -1.10204 -1.06018 -1.04149
0.17000 -0.88272 -1.05314 -1.00478 -1.00248
0.18000 -0.79723 -1.01394 -0.96389 -0.97786
0.19000 -0.66914 -0.98012 -0.92679 -0.96108
0.20000 -0.58700 -0.88085 -0.89990 -0.91932
0.21000 -0.52548 -0.84929 -0.86971 -0.87901
0.22000 -0.44446 -0.82412 -0.83585 -0.84796
0.23000 -0.40282 -0.76732 -0.80526 -0.82919
0.24000 -0.36407 -0.68691 -0.75698 -0.80794
0.25000 -0.32960 -0.65915 -0.73488 -0.77562
0.26000 -0.21295 -0.61977 -0.64435 -0.73739
0.27000 -0.13202 -0.57937 -0.60995 -0.70502
0.28000 -0.07516 -0.50076 -0.54194 -0.67219
0.29000 -0.00845 -0.43592 -0.51490 -0.61872
0.30000 0.04592 -0.35829 -0.49879 -0.59214
0.31000 0.08091 -0.29488 -0.47284 -0.56236
0.32000 0.11649 -0.24116 -0.44727 -0.52599
0.33000 0.20059 -0.20343 -0.38769 -0.48137
0.34000 0.29594 -0.17594 -0.32956 -0.46426
0.35000 0.33832 -0.12867 -0.31033 -0.44284
0.36000 0.38473 -0.10445 -0.28196 -0.41119
0.37000 0.42759 -0.07363 -0.25178 -0.37141
0.38000 0.45809 -0.03128 -0.21921 -0.33732
0.39000 0.51545 0.00103 -0.19434 -0.30017
0.40000 0.56191 0.05818 -0.16896 -0.26556
0.41000 0.60728 0.09308 -0.15057 -0.23521
0.42000 0.63342 0.13244 -0.13961 -0.21845
0.43000 0.67951 0.17094 -0.11061 -0.20428
0.44000 0.69882 0.22192 -0.05734 -0.19437
0.45000 0.75193 0.25773 -0.03497 -0.16183
0.46000 0.79911 0.30891 -0.00695 -0.13580
0.47000 0.84183 0.35623 0.01927 -0.11969
0.48000 0.91024 0.38352 0.05030 -0.10521
0.49000 0.94791 0.42460 0.06230 -0.07570
0.50000 1.01034 0.48288 0.08379 -0.05241
0.51000 1.04269 0.54956 0.11360 -0.03448
0.52000 1.07527 0.62407 0.13003 -0.00864
0.53000 1.10908 0.65434 0.16910 0.01793
0.54000 1.12665 0.69819 0.19257 0.03546
0.55000 1.13850 0.75071 0.20893 0.05331
0.56000 1.17187 0.78859 0.24099 0.08191
0.57000 1.19397 0.82243 0.25359 0.10432
0.58000 1.22162 0.87152 0.26988 0.13012
0.59000 1.24032 0.91341 0.29813 0.16376
0.60000 1.26567 0.96977 0.32279 0.20620
0.61000 1.29286 1.00221 0.36456 0.23991
0.62000 1.32750 1.03669 0.37966 0.28647
0.63000 1.35170 1.07326 0.43526 0.31652
0.64000 1.38017 1.12882 0.45922 0.35653
0.65000 1.39101 1.15719 0.47552 0.37813
0.66000 1.41716 1.17241 0.49585 0.41064
0.67000 1.44582 1.21725 0.50760 0.42996
0.68000 1.46310 1.26081 0.56082 0.44876
0.69000 1.47664 1.27710 0.58793 0.49889
0.70000 1.49066 1.31164 0.60148 0.54122
0.71000 1.49891 1.34165 0.64747 0.57689
0.72000 1.50470 1.36688 0.67315 0.59469
0.73000 1.51436 1.38746 0.70662 0.63938
0.74000 1.52604 1.41351 0.75330 0.66263
0.75000 1.54430 1.43842 0.78925 0.67884
0.76000 1.55633 1.46536 0.81250 0.69540
0.77000 1.56282 1.48012 0.84801 0.72899
0.78000 1.57245 1.49574 0.86657 0.73934
0.79000 1.58277 1.51564 0.90696 0.76147
0.80000 1.59149 1.53226 0.93265 0.81038
0.81000 1.59883 1.54450 0.97456 0.85287
0.82000 1.60587 1.55777 1.00809 0.90534
0.83000 1.61216 1.56334 1.02570 0.96566
0.84000 1.61803 1.57583 1.05052 1.02102
0.85000 1.62568 1.58589 1.07218 1.03485
0.86000 1.63091 1.59593 1.11747 1.09383
0.87000 1.64307 1.60745 1.14659 1.16075
0.88000 1.65033 1.61638 1.17268 1.21484
0.89000 1.65691 1.62442 1.21196 1.24922
0.90000 1.66307 1.63321 1.25644 1.30013
0.91000 1.67429 1.64781 1.30644 1.33641
0.92000 1.68702 1.66001 1.34919 1.37382
0.93000 1.69829 1.67226 1.39081 1.41904
0.94000 1.70893 1.68142 1.47874 1.48799
0.95000 1.72625 1.70083 1.62107 1.58719
0.96000 1.73656 1.71328 1.82299 1.63232
0.97000 1.77279 1.74188 1.99231 1.72630
0.98000 1.89750 1.79882 2.19662 1.94227
0.99000 2.34395 2.06873 2.34937 2.24499
1.00000 3.73384 4.27923 4.11659 2.74557
```

where the first column contains the percentiles, and the 2nd, 3rd, 4th and 5th columns correspond to figures 1, 2, 3 and 4 above, and contain the effect size values. Looking at the 1st column it can be seen that if Cohen's "scale" is applied, over 50% of the effect size values can be describe as "large," with an approximate further 15% being "medium" effect.All in all a successful test, which encourages me to adopt the Cauchy-Schwarz inequality, but before I do there are one or two more tweaks I would like to test. This will be the subject of my next post.

## Sunday, 6 April 2014

### The Cauchy-Schwarz Inequality

In my previous post I said I was looking into my code for the dominant cycle, mostly with a view to either improving my code or perhaps replacing/augmenting it with some other method of calculating the cycle period. To this end I have recently enrolled on a discrete time signals and systems course offered by edx. One of the lectures was about the Cauchy-Schwarz inequality, which is what this post is about.

The basic use I have in mind is to use the inequality to select sections of price history that are most similar to one another and use these as training cases for neural net training. My initial Octave code is given in the code box below:-

First off, although the above code randomly selects a section of price history to match, I deliberately hand chose a section to match for illustrative purposes in this post. Below is the section

where the section ends at the point where the vertical cursor crosses the price and begins at the high just below the horizontal cursor, for a look back period of 16 bars. For context, here is a zoomed out view.

I chose this section because it represents a "difficult" set of prices, i.e. moving sideways at the end of a retracement and perhaps reacting to a previous low acting as resistance, as well as being in a Fibonacci retracement zone.

The first set of code outputs is this chart

which shows the Cauchy-Schwarz values for the whole range of the price series, with the upper pane being values for the raw price matching and the lower pane being the smoothed price matching. Note that in the code the values are set to zero after the max function has selected the best match and so the spikes down to zero show the points in time where the top N, in this case 10, matches were taken from.

The next chart output shows the the normalised prices that the matching is done against, with the cyan being the original sample (the same in all subplots), the red being the raw price matches and the yellow being the smoothed price matches.

The closest match is the top left subplot, and then reading horizontally and down to the 10th best in the bottom right subplot.

The next plot shows the price matches un-normalised, for the raw price matching, with the original sample being blue,

and next for the smoothed matching,

and finally, side by side for easy visual comparison.

After plotting all the above, the code prints to terminal some details thus:

The basic use I have in mind is to use the inequality to select sections of price history that are most similar to one another and use these as training cases for neural net training. My initial Octave code is given in the code box below:-

```
clear all
% load price file of interest
filename = 'eurusdmatrix' ; %input( 'Enter filename for prices, e.g. es or esmatrix: ' , 's' ) ;
data = load( "-ascii" , filename ) ;
% get tick size
switch filename
case { "cc" }
tick = 1 ;
case { "gc" "lb" "pl" "sm" "sp" }
tick = 0.1 ;
case { "ausyen" "bo" "cl" "ct" "dx" "euryen" "gbpyen" "sb" "usdyen" }
tick = 0.01 ;
case { "c" "ng" }
tick = 0.001 ;
case { "auscad" "aususd" "euraus" "eurcad" "eurchf" "eurgbp" "eurusd" "gbpchf" "gbpusd" "ho" "rb" "usdcad" "usdchf" }
tick = 0.0001 ;
case { "c" "o" "s" "es" "nd" "w" }
tick = 0.25 ;
case { "fc" "lc" "lh" "pb" }
tick = 0.025 ;
case { "ed" }
tick = 0.0025 ;
case { "si" }
tick = 0.5 ;
case { "hg" "kc" "oj" "pa" }
tick = 0.05 ;
case { "ty" "us" }
tick = 0.015625 ;
case { "ccmatrix" }
tick = 1 ;
case { "gcmatrix" "lbmatrix" "plmatrix" "smmatrix" "spmatrix" }
tick = 0.1 ;
case { "ausyenmatrix" "bomatrix" "clmatrix" "ctmatrix" "dxmatrix" "euryenmatrix" "gbpyenmatrix" "sbmatrix" "usdyenmatrix" }
tick = 0.01 ;
case { "cmatrix" "ngmatrix" }
tick = 0.001 ;
case { "auscadmatrix" "aususdmatrix" "eurausmatrix" "eurcadmatrix" "eurchfmatrix" "eurgbpmatrix" "eurusdmatrix" "gbpchfmatrix" "gbpusdmatrix" "homatrix" "rbmatrix" "usdcadmatrix" "usdchfmatrix" }
tick = 0.0001 ;
case { "cmatrix" "omatrix" "smatrix" "esmatrix" "ndmatrix" "wmatrix" }
tick = 0.25 ;
case { "fcmatrix" "lcmatrix" "lhmatrix" "pbmatrix" }
tick = 0.025 ;
case { "edmatrix" }
tick = 0.0025 ;
case { "simatrix" }
tick = 0.5 ;
case { "hgmatrix" "kcmatrix" "ojmatrix" "pamatrix" }
tick = 0.05 ;
case { "tymatrix" "usmatrix" }
tick = 0.015625 ;
endswitch
open = data( : , 4 ) ;
high = data( : , 5 ) ;
low = data( : , 6 ) ;
close = data( : , 7 ) ;
period = data( : , 12 ) ;
price = vwap( open, high, low, close, tick ) ;
[ max_adj, min_adj, channel_price ] = adaptive_lookback_max_min( price, period, tick ) ;
smooth_price = smooth_2_5( price ) ;
[ max_adj, min_adj, smooth_channel_price ] = adaptive_lookback_max_min( smooth_price, period, tick ) ;
clear -exclusive channel_price smooth_channel_price period price
% randomly choose vwap prices to match
sample_index = randperm( size(channel_price,1), 1 )
lookback = period( sample_index )
sample_to_match = channel_price( sample_index-lookback : sample_index )' ;
sample_to_match_smooth = smooth_channel_price( sample_index-lookback : sample_index )' ;
cauchy_schwarz_values = zeros( size(channel_price,1) , 1 ) ;
cauchy_schwarz_values_smooth = zeros( size(channel_price,1) , 1 ) ;
for ii = 50 : size(channel_price,1)
% match_size = size( channel_price( ii-lookback : ii ) )
cauchy_schwarz_values(ii) = abs( sample_to_match * channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match) * norm( channel_price( ii-lookback : ii , 1 ) ) ) ;
cauchy_schwarz_values_smooth(ii) = abs( sample_to_match_smooth * smooth_channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match_smooth) * norm( smooth_channel_price( ii-lookback : ii , 1 ) ) ) ;
end
% now set the values for sample_to_match +/- 2 to zero to avoid matching with itself
cauchy_schwarz_values( sample_index-2 : sample_index+2 ) = 0.0 ;
cauchy_schwarz_values_smooth( sample_index-2 : sample_index+2 ) = 0.0 ;
N = 10 ; % must be >= 10
% get index values of the top N matches
matches = zeros( N, 1 ) ;
matches_smooth = zeros( N, 1 ) ;
% record these values
matches_values = zeros( N, 1 ) ;
matches_smooth_values = zeros( N, 1 ) ;
for ii = 1: N
[ max_val, ix ] = max( cauchy_schwarz_values ) ;
matches(ii) = ix ;
matches_values(ii) = cauchy_schwarz_values(ix) ;
cauchy_schwarz_values( ix-2 : ix+2 ) = 0.0 ;
[ max_val, ix ] = max( cauchy_schwarz_values_smooth ) ;
matches_smooth(ii) = ix ;
matches_smooth_values(ii) = cauchy_schwarz_values_smooth(ix) ;
cauchy_schwarz_values_smooth( ix-2 : ix+2 ) = 0.0 ;
end
% Plot for visual inspection
clf ;
% the matched index values
figure(1) ;
subplot(2,1,1) ; plot( cauchy_schwarz_values, 'c' ) ;
subplot(2,1,2) ; plot( cauchy_schwarz_values_smooth, 'c' ) ;
set( gcf() , 'color' , [0 0 0] )
% the top N matched price sequences
figure(2) ;
subplot(5, 2, 1) ; plot( sample_to_match, 'c', channel_price( matches(1)-lookback : matches(1) ), 'r', channel_price( matches_smooth(1)-lookback : matches_smooth(1) ), 'y' ) ;
subplot(5, 2, 2) ; plot( sample_to_match, 'c', channel_price( matches(2)-lookback : matches(2) ), 'r', channel_price( matches_smooth(2)-lookback : matches_smooth(2) ), 'y' ) ;
subplot(5, 2, 3) ; plot( sample_to_match, 'c', channel_price( matches(3)-lookback : matches(3) ), 'r', channel_price( matches_smooth(3)-lookback : matches_smooth(3) ), 'y' ) ;
subplot(5, 2, 4) ; plot( sample_to_match, 'c', channel_price( matches(4)-lookback : matches(4) ), 'r', channel_price( matches_smooth(4)-lookback : matches_smooth(4) ), 'y' ) ;
subplot(5, 2, 5) ; plot( sample_to_match, 'c', channel_price( matches(5)-lookback : matches(5) ), 'r', channel_price( matches_smooth(5)-lookback : matches_smooth(5) ), 'y' ) ;
subplot(5, 2, 6) ; plot( sample_to_match, 'c', channel_price( matches(6)-lookback : matches(6) ), 'r', channel_price( matches_smooth(6)-lookback : matches_smooth(6) ), 'y' ) ;
subplot(5, 2, 7) ; plot( sample_to_match, 'c', channel_price( matches(7)-lookback : matches(7) ), 'r', channel_price( matches_smooth(7)-lookback : matches_smooth(7) ), 'y' ) ;
subplot(5, 2, 8) ; plot( sample_to_match, 'c', channel_price( matches(8)-lookback : matches(8) ), 'r', channel_price( matches_smooth(8)-lookback : matches_smooth(8) ), 'y' ) ;
subplot(5, 2, 9) ; plot( sample_to_match, 'c', channel_price( matches(9)-lookback : matches(9) ), 'r', channel_price( matches_smooth(9)-lookback : matches_smooth(9) ), 'y' ) ;
subplot(5, 2, 10) ; plot( sample_to_match, 'c', channel_price( matches(10)-lookback : matches(10) ), 'r', channel_price( matches_smooth(10)-lookback : matches_smooth(10) ), 'y' ) ;
set( gcf() , 'color' , [0 0 0] )
figure(3)
subplot(5, 2, 1) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(1)-lookback : matches(1) ) ) ;
subplot(5, 2, 2) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(2)-lookback : matches(2) ) ) ;
subplot(5, 2, 3) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(3)-lookback : matches(3) ) ) ;
subplot(5, 2, 4) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(4)-lookback : matches(4) ) ) ;
subplot(5, 2, 5) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(5)-lookback : matches(5) ) ) ;
subplot(5, 2, 6) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(6)-lookback : matches(6) ) ) ;
subplot(5, 2, 7) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(7)-lookback : matches(7) ) ) ;
subplot(5, 2, 8) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(8)-lookback : matches(8) ) ) ;
subplot(5, 2, 9) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(9)-lookback : matches(9) ) ) ;
subplot(5, 2, 10) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(10)-lookback : matches(10) ) ) ;
figure(4)
subplot(5, 2, 1) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(1)-lookback : matches_smooth(1) ) ) ;
subplot(5, 2, 2) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(2)-lookback : matches_smooth(2) ) ) ;
subplot(5, 2, 3) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(3)-lookback : matches_smooth(3) ) ) ;
subplot(5, 2, 4) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(4)-lookback : matches_smooth(4) ) ) ;
subplot(5, 2, 5) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(5)-lookback : matches_smooth(5) ) ) ;
subplot(5, 2, 6) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(6)-lookback : matches_smooth(6) ) ) ;
subplot(5, 2, 7) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(7)-lookback : matches_smooth(7) ) ) ;
subplot(5, 2, 8) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(8)-lookback : matches_smooth(8) ) ) ;
subplot(5, 2, 9) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(9)-lookback : matches_smooth(9) ) ) ;
subplot(5, 2, 10) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(10)-lookback : matches_smooth(10) ) ) ;
% Print results to terminal
results = zeros( N , 2 ) ;
for ii = 1 : N
results(ii,1) = distcorr( sample_to_match', channel_price( matches(ii)-lookback : matches(ii) ) ) ;
results(ii,2) = distcorr( sample_to_match', channel_price( matches_smooth(ii)-lookback : matches_smooth(ii) ) ) ;
end
results = [ matches_values matches_smooth_values results ]
```

After some basic "housekeeping" code to load the price file of interest and normalise the prices, a random section of the price history is selected and then, in a loop, the top N matches in the history are found using the inequality as the metric for matching. A value of 0 means that the price series being compared are orthogonal, and hence as dissimilar to each other as possible, whilst a value of 1 means the opposite. There are two types of matching; the raw price matched with raw price, and a smoothed price matched with smoothed price.First off, although the above code randomly selects a section of price history to match, I deliberately hand chose a section to match for illustrative purposes in this post. Below is the section

where the section ends at the point where the vertical cursor crosses the price and begins at the high just below the horizontal cursor, for a look back period of 16 bars. For context, here is a zoomed out view.

I chose this section because it represents a "difficult" set of prices, i.e. moving sideways at the end of a retracement and perhaps reacting to a previous low acting as resistance, as well as being in a Fibonacci retracement zone.

The first set of code outputs is this chart

which shows the Cauchy-Schwarz values for the whole range of the price series, with the upper pane being values for the raw price matching and the lower pane being the smoothed price matching. Note that in the code the values are set to zero after the max function has selected the best match and so the spikes down to zero show the points in time where the top N, in this case 10, matches were taken from.

The next chart output shows the the normalised prices that the matching is done against, with the cyan being the original sample (the same in all subplots), the red being the raw price matches and the yellow being the smoothed price matches.

The closest match is the top left subplot, and then reading horizontally and down to the 10th best in the bottom right subplot.

The next plot shows the price matches un-normalised, for the raw price matching, with the original sample being blue,

and next for the smoothed matching,

and finally, side by side for easy visual comparison.

*N.b. For all the smoothed plots above, although the matching is done on smoothed prices, the unsmoothed, raw prices for these matches are plotted.*After plotting all the above, the code prints to terminal some details thus:

lookback = 16

results =

0.95859 0.98856 0.89367 0.86361

0.95733 0.98753 0.93175 0.86839

0.95589 0.98697 0.87398 0.67945

0.95533 0.98538 0.85346 0.83079

0.95428 0.98293 0.91212 0.77225

0.94390 0.98292 0.79350 0.66563

0.93908 0.98150 0.71753 0.77458

0.93894 0.97992 0.86839 0.72492

0.93345 0.97969 0.74456 0.79060

0.93286 0.97940 0.86361 0.61103

results =

0.95859 0.98856 0.89367 0.86361

0.95733 0.98753 0.93175 0.86839

0.95589 0.98697 0.87398 0.67945

0.95533 0.98538 0.85346 0.83079

0.95428 0.98293 0.91212 0.77225

0.94390 0.98292 0.79350 0.66563

0.93908 0.98150 0.71753 0.77458

0.93894 0.97992 0.86839 0.72492

0.93345 0.97969 0.74456 0.79060

0.93286 0.97940 0.86361 0.61103

which, column wise, are the Cauchy-Schwarz values for the raw price matching and the smoothed price matching, and the Distance correlation values for the raw price matching and the smoothed price matching respectively.

The code used to calculate the Distance correlation is given below.

```
% Copyright (c) 2013, Shen Liu
% All rights reserved.
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are
% met:
% * Redistributions of source code must retain the above copyright
% notice, this list of conditions and the following disclaimer.
% * Redistributions in binary form must reproduce the above copyright
% notice, this list of conditions and the following disclaimer in
% the documentation and/or other materials provided with the distribution
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.
function dcor = distcorr(x,y)
% This function calculates the distance correlation between x and y.
% Reference: http://en.wikipedia.org/wiki/Distance_correlation
% Date: 18 Jan, 2013
% Author: Shen Liu (shen.liu@hotmail.com.au)
% Check if the sizes of the inputs match
if size(x,1) ~= size(y,1) ;
error('Inputs must have the same number of rows')
end
% Delete rows containing unobserved values
N = any([isnan(x) isnan(y)],2) ;
x(N,:) = [] ;
y(N,:) = [] ;
% Calculate doubly centered distance matrices for x and y
a = pdist([x,x]) ; % original MATLAB call is to pdist2( x, x )
mcol = mean(a) ;
mrow = mean(a,2) ;
ajbar = ones(size(mrow))*mcol ;
akbar = mrow*ones(size(mcol)) ;
abar = mean(mean(a))*ones(size(a)) ;
A = a - ajbar - akbar + abar ;
b = pdist([y,y]) ;
mcol = mean(b) ;
mrow = mean(b,2) ;
bjbar = ones(size(mrow))*mcol ;
bkbar = mrow*ones(size(mcol)) ;
bbar = mean(mean(b))*ones(size(b)) ;
B = b - bjbar - bkbar + bbar ;
% Calculate squared sample distance covariance and variances
dcov = sum(sum(A.*B))/(size(mrow,1)^2) ;
dvarx = sum(sum(A.*A))/(size(mrow,1)^2) ;
dvary = sum(sum(B.*B))/(size(mrow,1)^2) ;
% Calculate the distance correlation
dcor = sqrt(dcov/sqrt(dvarx*dvary)) ;
```

These results show promise, and I intend to apply a more rigorous test to them for the subject of a future post.## Thursday, 20 March 2014

### Update on Recent Work

It has been almost two months since my last post and during this time I have been working on a few different things, all related in one way or another to my desire to create a rolling NN training regime. First off, I have been giving some thought as to the exact methodology to use, and two had come to mind

- a rolling look back period of n bars, similar to a moving average
- selecting non consecutive periods of price history with similarity to the most recent history

As can be seen there is nice separation between the market types and the SVM achieves over 98% cross-validation accuracy on this training set. Despite this, when applied to real market data I am yet again disappointed by the performance and choose for now to no longer pursue this avenue of investigation.

In addition to all the above, I have "discovered" the Octave sourceforge nan package, which I may begin to investigate more fully in due course. I have also been working through another Massive Open Online Course, this time Statistical Learning, which is in its last week at the moment. In this last week of the course I have been alerted to a possible new area of investigation, Distance correlation, which I had heard of before but not fully appreciated.

Finally, I have also been reassessing the code I use for calculating dominant cycle periods. It is these last two, distance correlation and the period code, that I'm going to look at more fully over the coming days.

## Monday, 27 January 2014

### Creating Neural Net Training Features

In my last post I wrote about how I had successfully coded a basic walk forward neural net training regime and that I was going to then work on creating useful training features. To that end, for the past few weeks, I have been doing a lot of online searching and luckily I have come across the Comp-Engine timeseries website. On this site there is a code tab with a plethora of ways of extracting time series features for classification purposes, and what's more the code is MATLAB code, which is almost fully compatible with the Octave software I do my development in. The only drawback I can see is that some of the code references or depends on MATLAB toolboxes, which might not be freely available to non MATLAB license holders. It is my intention for the immediate future to investigate this resource more fully.

Labels:
Feature extraction,
Neural nets,
Octave

## Thursday, 9 January 2014

### Neural Net Walk Forward Octave Training Code

Following on from my last post I have now completed the basic refactoring of my existing NN Octave code to a "Walk Forward" training regime. After some simple housekeeping code to load/extract data and create inputs the basic nuts and bolts of this new, walk forward Octave code is given in the code box below.

```
% get the corresponding targets - map the target labels as binary vectors of 1's and 0's for the class labels
n_classes = 3 ; % 3 outputs in softmax layer
targets = eye( n_classes )( l_s_n_pos_vec, : ) ; % there are 3 labels
% get NN features for all data
[ binary_features , scaled_features ] = windowed_nn_input_features( open, high, low, close, tick ) ;
tic ;
% a uniformly distributed randomness source for seeding & reproductibility purposes
global randomness_source
load randomness_source.mat
% vector to hold final NN classification results
nn_classification = zeros( length(close) , 2 ) ;
ii_begin = 2000 ;
ii_end = 2500 ;
for ii = ii_begin : ii_end
% get the look back period
lookback_period = max( period(ii-2,1)+2, period(ii,1) ) ;
% extract features and targets from windowed_nn_input_features output
training_binary_features = binary_features( ii-lookback_period:ii , : ) ;
training_scaled_features = scaled_features( ii-lookback_period:ii , : ) ;
training_targets = targets( ii-lookback_period:ii , : ) ;
% set some default values for NN training
n_hid = 2.0 * ( size( training_binary_features, 2 ) + size( training_scaled_features, 2 ) ) ; % units in hidden layer
lr_rbm = 0.02 ; % default value for learning rate for rbm training
learning_rate = 0.5 ; % default value for learning rate for back prop training
n_iterations = 500 ; % number of training iterations to perform
% get the rbm trained weights for the input layer to hidden layer for training_binary_features
rbm_w_binary = train_rbm_binary_features( training_binary_features, n_hid, lr_rbm, n_iterations ) ;
% and now get the rbm trained weights of the input layer to hidden layer for training_scaled_features
rbm_w_scaled = train_rbm_scaled_features( training_scaled_features, n_hid, lr_rbm, n_iterations ) ;
% now stack these to form the initial input layer to hidden layer weight matrix
initial_input_weights = [ rbm_w_binary ; rbm_w_scaled ] ;
% now feedforward through hidden layer to get the hidden layer output
hidden_layer_output = logistic( [ training_binary_features training_scaled_features ] * initial_input_weights ) ;
% now add the hidden layer bias unit to the above output of the hidden layer and RBM train
hidden_layer_output = [ ones( size(hidden_layer_output,1), 1 ) hidden_layer_output ] ;
initial_softmax_weights = train_rbm_output_w( hidden_layer_output, n_classes, lr_rbm, n_iterations ) ;
% now using the above rbm trained weight matrices as initial weights instead of random initialisation,
% do the backpropagation training with targets by dropping the last two sets of features for which
% there are no targets and adding in a column of ones for the input layer bias unit
% First, extract the relevant features
all_features = [ training_binary_features(1:end-2,:) training_scaled_features(1:end-2,:) ] ;
training_targets = training_targets(1:end-2,:) ;
% and do the backprop training
[ final_input_w, final_softmax_w ] = rbm_backprop_training( all_features, training_targets, initial_input_weights, initial_softmax_weights, n_hid, n_iterations, learning_rate, 0.9, false) ;
% get the NN classification of the ii-th bar for this iteration
val_1 = logistic( [ training_binary_features(end-1,:) training_scaled_features(end-1,:) ] * final_input_w ) ;
val_2 = [ 1 val_1 ] * final_softmax_w ;
class_normalizer = log_sum_exp_over_cols( val_2 ) ;
log_class_prob = val_2 - repmat( class_normalizer , [ 1 , size( val_2 , 2 ) ] ) ;
class_prob = exp( log_class_prob ) ;
[ dump , choice_1 ] = max( class_prob , [] , 2 ) ;
if ( choice_1 == 3 )
choice_1 = ff_pos_vec(ii-2) ;
end
val_1 = logistic( [ training_binary_features(end,:) training_scaled_features(end,:) ] * final_input_w ) ;
val_2 = [ 1 val_1 ] * final_softmax_w ;
class_normalizer = log_sum_exp_over_cols( val_2 ) ;
log_class_prob = val_2 - repmat( class_normalizer , [ 1 , size( val_2 , 2 ) ] ) ;
class_prob = exp( log_class_prob ) ;
[ dump , choice ] = max( class_prob , [] , 2 ) ;
nn_classification( ii , 1 ) = choice ;
if ( choice == 3 )
choice = choice_1 ;
end
nn_classification( ii , 2 ) = choice ;
end % end if ii loop
toc ;
% write to file for plotting in Gnuplot
axis = (ii_begin:1:ii_end)';
output_v2 = [axis,open(ii_begin:ii_end,1),high(ii_begin:ii_end,1),low(ii_begin:ii_end,1),close(ii_begin:ii_end,1),nn_classification(ii_begin:ii_end,2),ff_pos_vec(ii_begin:ii_end,:) ] ;
dlmwrite('output_v2',output_v2)
```

This code is quite heavily commented, but to make things clearer here is what it does:-- creates a matrix of binary features and a matrix of scaled features ( in the range 0 to 1 ) by calling the C++ .oct function "windowed_nn_input_features." The binary features matrix includes the input layer bias unit
- RBM trains separately on each of the above features matrices to get weight matrices
- horizontally stacks the two weight matrices from step 2 to create a single, initial input layer to hidden layer weight matrix
- matrix multiplies the input features by the combined RBM trained matrix from step 3 and feeds forward through the logistic function hidden layer to create a hidden layer output
- adds a bias unit to the hidden layer output from step 4 and then RBM trains to get a Softmax weight matrix
- uses the initial weight matrices from steps 3 and 5 instead of random initialisation for backpropagation training of a Feedforward neural network via the "rbm_backprop_training" function
- uses the trained NN from step 6 to make prediction/classify the most recent candlestick bar and records this NN prediction/classification. Steps 2 to 7 are contained in a for loop which slides a moving window across the input data
- finally writes output to file

Subscribe to:
Posts (Atom)