I decided to use effect size as the test of choice, for which there are nice introductions here and here. A basic description of how I implemented the test is as follows:
- Randomly pick a section of price history, which will be used as the price history for the selection algorithm to match
- Take the 5 consecutive bars immediately following the above section of price history and store as the "target"
- Create a control group of random matches to the above "target": randomly select 10 separate 5 bar pieces of price history, calculate the Cauchy-Schwarz value of each of these 10 against the target, and record the average of these values. Repeat this step N times (750 in the code below) to create a distribution of randomly matched, average target-to-random-price Cauchy-Schwarz values. By virtue of the central limit theorem, this distribution can be expected to be approximately normal
- Using the matching algorithm (as described in the previous post) get the closest 10 matches in the price history to the random selection from step 1
- Get the 5 consecutive bars immediately following each of the 10 matches from step 4, calculate their Cauchy-Schwarz values vis-à-vis the "target" and record the average of these 10 values. This average value is the "experimental" value
- Using the mean and standard deviation of the control group distribution from step 3, calculate the effect size of the experimental value and record this effect size value
- Repeat all the above steps M times to form an effect size value distribution
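The core arithmetic of the steps above can be illustrated on made-up data. The following is only a minimal sketch: the helper name `cs_value`, the synthetic "target" and the use of uniform random vectors in place of real 5 bar price pieces are all assumptions for illustration, and the noisy copies of the target merely stand in for the projections that the matching algorithm would return.

```octave
% Minimal sketch on synthetic data; cs_value is a hypothetical helper name.
% Cauchy-Schwarz value: |a.b| / ( ||a|| * ||b|| ), which lies in [0, 1].
cs_value = @(a, b) abs( dot( a, b ) ) / ( norm( a ) * norm( b ) ) ;

target = [ 0.2 0.4 0.5 0.7 0.9 ] ; % stand-in for a normalised 5 bar "target"

% control group: averages of 10 random comparisons, repeated n_repeats times
n_repeats = 200 ;
control_averages = zeros( n_repeats, 1 ) ;
values = zeros( 10, 1 ) ;
for jj = 1 : n_repeats
  for ii = 1 : 10
    values(ii) = cs_value( target, rand( 1, 5 ) ) ; % random 5 bar stand-in
  end
  control_averages(jj) = mean( values ) ;
end

% "experimental" value: average over 10 noisy copies of the target,
% standing in for the projections of the 10 closest matches
for ii = 1 : 10
  values(ii) = cs_value( target, target + 0.05 * randn( 1, 5 ) ) ;
end
experimental = mean( values ) ;

% effect size of the experimental value relative to the control distribution
effect = ( experimental - mean( control_averages ) ) / std( control_averages ) ;
printf( "effect size: %f\n", effect ) ;
```

Because the experimental value is expressed in units of the control group's standard deviation, effect sizes from different random picks are comparable with each other, which is what makes it sensible to pool them into one distribution.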
clear all
% load price file of interest
filename = input( 'Enter filename for prices, e.g. es or esmatrix: ' , 's' ) ;
data = load( "-ascii" , filename ) ;
% get tick size
switch filename
case { "cc" }
tick = 1 ;
case { "gc" "lb" "pl" "sm" "sp" }
tick = 0.1 ;
case { "ausyen" "bo" "cl" "ct" "dx" "euryen" "gbpyen" "sb" "usdyen" }
tick = 0.01 ;
case { "ng" }
tick = 0.001 ;
case { "auscad" "aususd" "euraus" "eurcad" "eurchf" "eurgbp" "eurusd" "gbpchf" "gbpusd" "ho" "rb" "usdcad" "usdchf" }
tick = 0.0001 ;
case { "c" "o" "s" "es" "nd" "w" }
tick = 0.25 ;
case { "fc" "lc" "lh" "pb" }
tick = 0.025 ;
case { "ed" }
tick = 0.0025 ;
case { "si" }
tick = 0.5 ;
case { "hg" "kc" "oj" "pa" }
tick = 0.05 ;
case { "ty" "us" }
tick = 0.015625 ;
case { "ccmatrix" }
tick = 1 ;
case { "gcmatrix" "lbmatrix" "plmatrix" "smmatrix" "spmatrix" }
tick = 0.1 ;
case { "ausyenmatrix" "bomatrix" "clmatrix" "ctmatrix" "dxmatrix" "euryenmatrix" "gbpyenmatrix" "sbmatrix" "usdyenmatrix" }
tick = 0.01 ;
case { "ngmatrix" }
tick = 0.001 ;
case { "auscadmatrix" "aususdmatrix" "eurausmatrix" "eurcadmatrix" "eurchfmatrix" "eurgbpmatrix" "eurusdmatrix" "gbpchfmatrix" "gbpusdmatrix" "homatrix" "rbmatrix" "usdcadmatrix" "usdchfmatrix" }
tick = 0.0001 ;
case { "cmatrix" "omatrix" "smatrix" "esmatrix" "ndmatrix" "wmatrix" }
tick = 0.25 ;
case { "fcmatrix" "lcmatrix" "lhmatrix" "pbmatrix" }
tick = 0.025 ;
case { "edmatrix" }
tick = 0.0025 ;
case { "simatrix" }
tick = 0.5 ;
case { "hgmatrix" "kcmatrix" "ojmatrix" "pamatrix" }
tick = 0.05 ;
case { "tymatrix" "usmatrix" }
tick = 0.015625 ;
endswitch
open = data( : , 4 ) ;
high = data( : , 5 ) ;
low = data( : , 6 ) ;
close = data( : , 7 ) ;
price = vwap( open, high, low, close, tick ) ;
clear -exclusive price tick
% first, get the lookback parameters on real prices
[ sine, sinelead, period ] = sinewave_indicator( price ) ;
[ max_price, min_price, channel_price ] = adaptive_lookback_max_min( price, period, tick ) ;
smooth_price = smooth_2_5( price ) ;
[ max_smooth_price, min_smooth_price, smooth_channel_price ] = adaptive_lookback_max_min( smooth_price, period, tick ) ;
cauchy_schwarz_values = zeros( size(channel_price,1) , 1 ) ;
cauchy_schwarz_values_smooth = zeros( size(channel_price,1) , 1 ) ;
% set up all recording vectors
N = 10 ; % must be 10: the control group loop below fills exactly 10 entries of the vectors sized by N
% record these values
matches_values = zeros( N, 1 ) ;
matches_smooth_values = zeros( N, 1 ) ;
distcorr_values = zeros( N, 1 ) ;
distcorr_values_smooth = zeros( N, 1 ) ;
% vectors to record averages
random_matches_values_averages = zeros( 750, 1 ) ;
random_matches_smooth_values_averages = zeros( 750, 1 ) ;
random_distcorr_averages = zeros( 750, 1 ) ;
random_distcorr_smooth_averages = zeros( 750, 1 ) ;
% effect size vectors
effect_size = zeros( 750, 1 ) ;
effect_size_smooth = zeros( 750, 1 ) ;
effect_size_distcorr = zeros( 750, 1 ) ;
effect_size_distcorr_smooth = zeros( 750, 1 ) ;
for kk = 1 : 750
% first, get a random pick from the price history and all its associated values
sample_index = randperm( (size(price,1)-55), 1 ) + 50 ;
lookback = period( sample_index ) ;
sample_to_match = channel_price( sample_index-lookback : sample_index )' ;
sample_to_match_smooth = smooth_channel_price( sample_index-lookback : sample_index )' ;
projection_to_match = ( ( price( (sample_index+1):(sample_index+5) ) - min_price(sample_index) ) ./ ( max_price(sample_index)-min_price(sample_index) ) )' ;
projection_to_match_smooth = ( ( price( (sample_index+1):(sample_index+5) ) - min_smooth_price(sample_index) ) ./ ( max_smooth_price(sample_index)-min_smooth_price(sample_index) ) )' ;
% for this pick, calculate cauchy_schwarz_values
for ii = 50 : size( price, 1 )
cauchy_schwarz_values(ii) = abs( sample_to_match * channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match) * norm( channel_price( ii-lookback : ii , 1 ) ) ) ;
cauchy_schwarz_values_smooth(ii) = abs( sample_to_match_smooth * smooth_channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match_smooth) * norm( smooth_channel_price( ii-lookback : ii , 1 ) ) ) ;
end
% now set the values for sample_to_match +/- 2 to zero to avoid matching with itself
cauchy_schwarz_values( sample_index-2 : sample_index+2 ) = 0.0 ;
cauchy_schwarz_values_smooth( sample_index-2 : sample_index+2 ) = 0.0 ;
% set the last six values to zero to allow for projections
cauchy_schwarz_values( end-5 : end ) = 0.0 ;
cauchy_schwarz_values_smooth( end-5 : end ) = 0.0 ;
% get the top N matches
for ii = 1 : N
[ max_val, ix ] = max( cauchy_schwarz_values ) ;
norm_price_proj_match = ( ( price( ((ix)+1):((ix)+5) ) - min_price(ix) ) ./ ( max_price(ix)-min_price(ix) ) ) ;
matches_values(ii) = abs( projection_to_match * norm_price_proj_match ) / ( norm(projection_to_match) * norm( norm_price_proj_match ) ) ;
cauchy_schwarz_values( ix-2 : ix+2 ) = 0.0 ;
[ max_val, ix ] = max( cauchy_schwarz_values_smooth ) ;
norm_price_smooth_proj_match = ( ( price( ((ix)+1):((ix)+5) ) - min_smooth_price(ix) ) ./ ( max_smooth_price(ix)-min_smooth_price(ix) ) ) ;
matches_smooth_values(ii) = abs( projection_to_match_smooth * norm_price_smooth_proj_match ) / ( norm(projection_to_match_smooth) * norm( norm_price_smooth_proj_match ) ) ;
cauchy_schwarz_values_smooth( ix-2 : ix+2 ) = 0.0 ;
distcorr_values(ii) = distcorr( projection_to_match', norm_price_proj_match ) ;
distcorr_values_smooth(ii) = distcorr( projection_to_match_smooth', norm_price_smooth_proj_match ) ;
end % end of top N matches loop
% get and record averages for the top N matches
matches_values_average = mean( matches_values ) ;
matches_smooth_values_average = mean( matches_smooth_values ) ;
distcorr_average = mean( distcorr_values ) ;
distcorr_smooth_average = mean( distcorr_values_smooth ) ;
% now create a null distribution of random price projections
% randomly chosen from prices
for jj = 1 : 750
random_index = randperm( (size(price,1)-55), 10 ) + 50 ;
for ii = 1 : 10
norm_price_proj_match = ( ( price( (random_index(ii)+1):(random_index(ii)+5) ) - min_price(random_index(ii)) ) ./ ( max_price(random_index(ii))-min_price(random_index(ii)) ) ) ;
matches_values(ii) = abs( projection_to_match * norm_price_proj_match ) / ( norm(projection_to_match) * norm( norm_price_proj_match ) ) ;
norm_price_smooth_proj_match = ( ( price( (random_index(ii)+1):(random_index(ii)+5) ) - min_smooth_price(random_index(ii)) ) ./ ( max_smooth_price(random_index(ii))-min_smooth_price(random_index(ii)) ) ) ;
matches_smooth_values(ii) = abs( projection_to_match_smooth * norm_price_smooth_proj_match ) / ( norm(projection_to_match_smooth) * norm( norm_price_smooth_proj_match ) ) ;
distcorr_values(ii) = distcorr( projection_to_match', norm_price_proj_match ) ;
distcorr_values_smooth(ii) = distcorr( projection_to_match_smooth', norm_price_smooth_proj_match ) ;
end % end of random index ii loop
random_matches_values_averages(jj) = mean( matches_values ) ;
random_matches_smooth_values_averages(jj) = mean( matches_smooth_values ) ;
random_distcorr_averages(jj) = mean( distcorr_values ) ;
random_distcorr_smooth_averages(jj) = mean( distcorr_values_smooth ) ;
end % end jj loop
effect_size(kk) = ( matches_values_average - mean( random_matches_values_averages ) ) / std( random_matches_values_averages ) ;
effect_size_smooth(kk) = ( matches_smooth_values_average - mean( random_matches_smooth_values_averages ) ) / std( random_matches_smooth_values_averages ) ;
effect_size_distcorr(kk) = ( distcorr_average - mean( random_distcorr_averages ) ) / std( random_distcorr_averages ) ;
effect_size_distcorr_smooth(kk) = ( distcorr_smooth_average - mean( random_distcorr_smooth_averages ) ) / std( random_distcorr_smooth_averages ) ;
end % end kk loop
all_effect_sizes = [ effect_size, effect_size_smooth, effect_size_distcorr, effect_size_distcorr_smooth ] ;
dlmwrite( 'all_effect_sizes', all_effect_sizes )
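The way the top N matches are picked in the code above, by repeatedly taking the maximum and then zeroing a window of +/- 2 bars around it so that near-duplicate neighbours are not chosen again, can be seen in isolation in the small sketch below. The scores are made-up numbers and the variable names are hypothetical. Unlike the code above, the sketch also clamps the window to the ends of the vector, which is an addition for safety rather than something the original does.

```octave
% Hypothetical similarity scores for 12 candidate bars (made-up numbers).
scores = [ 0.10 0.95 0.97 0.96 0.20 0.30 0.90 0.85 0.15 0.40 0.50 0.60 ]' ;
n_matches = 3 ;   % number of matches to pick
half_window = 2 ; % bars either side of a match to exclude

picked = zeros( n_matches, 1 ) ;
work = scores ; % working copy that is progressively zeroed
for ii = 1 : n_matches
  [ ~, ix ] = max( work ) ; % best remaining candidate
  picked(ii) = ix ;
  lo = max( 1, ix - half_window ) ;             % clamp to start of vector
  hi = min( numel( work ), ix + half_window ) ; % clamp to end of vector
  work( lo : hi ) = 0 ; % exclude the match and its neighbours
end
disp( picked' ) % prints 3 7 12
```

Bars 2 and 4 score higher than bar 7, but they fall inside the window around bar 3 and so are skipped, which is the intended behaviour: each chosen match comes from a distinct stretch of the price history.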
Results
Running the code on the EURUSD forex pair and plotting histograms gives this:
where figures 1 and 2 are for the Cauchy-Schwarz values, and figures 3 and 4 are for Distance correlation values, which are included for comparative purposes and which I won't discuss in this post.
On seeing this for the first time I was somewhat surprised as I had expected the effect size distribution(s) to be approximately normal because all the test calculations are based on averages. However, it was a pleasant surprise due to the peak in values at the right hand side, showing a possible substantial effect size. To make things clearer here are the percentiles of the four histograms above:
0.00000 -5.08931 -4.79836 -3.05912 -3.65668
0.01000 -3.61724 -3.20229 -2.46932 -2.45201
0.02000 -3.39841 -2.81969 -2.21764 -2.20515
0.03000 -3.00404 -2.49009 -1.89562 -2.05380
0.04000 -2.66393 -2.35174 -1.80412 -1.91032
0.05000 -2.52514 -2.03670 -1.68800 -1.71335
0.06000 -2.22298 -1.91877 -1.59624 -1.61089
0.07000 -2.07188 -1.88256 -1.52058 -1.48763
0.08000 -1.93247 -1.79727 -1.45786 -1.42828
0.09000 -1.71065 -1.66522 -1.36500 -1.35917
0.10000 -1.59803 -1.58943 -1.31570 -1.31809
0.11000 -1.44325 -1.53087 -1.24996 -1.28199
0.12000 -1.38234 -1.44477 -1.20741 -1.21903
0.13000 -1.22440 -1.32961 -1.17397 -1.17619
0.14000 -1.14728 -1.29863 -1.12755 -1.10768
0.15000 -1.05431 -1.19564 -1.09108 -1.08591
0.16000 -0.93505 -1.10204 -1.06018 -1.04149
0.17000 -0.88272 -1.05314 -1.00478 -1.00248
0.18000 -0.79723 -1.01394 -0.96389 -0.97786
0.19000 -0.66914 -0.98012 -0.92679 -0.96108
0.20000 -0.58700 -0.88085 -0.89990 -0.91932
0.21000 -0.52548 -0.84929 -0.86971 -0.87901
0.22000 -0.44446 -0.82412 -0.83585 -0.84796
0.23000 -0.40282 -0.76732 -0.80526 -0.82919
0.24000 -0.36407 -0.68691 -0.75698 -0.80794
0.25000 -0.32960 -0.65915 -0.73488 -0.77562
0.26000 -0.21295 -0.61977 -0.64435 -0.73739
0.27000 -0.13202 -0.57937 -0.60995 -0.70502
0.28000 -0.07516 -0.50076 -0.54194 -0.67219
0.29000 -0.00845 -0.43592 -0.51490 -0.61872
0.30000 0.04592 -0.35829 -0.49879 -0.59214
0.31000 0.08091 -0.29488 -0.47284 -0.56236
0.32000 0.11649 -0.24116 -0.44727 -0.52599
0.33000 0.20059 -0.20343 -0.38769 -0.48137
0.34000 0.29594 -0.17594 -0.32956 -0.46426
0.35000 0.33832 -0.12867 -0.31033 -0.44284
0.36000 0.38473 -0.10445 -0.28196 -0.41119
0.37000 0.42759 -0.07363 -0.25178 -0.37141
0.38000 0.45809 -0.03128 -0.21921 -0.33732
0.39000 0.51545 0.00103 -0.19434 -0.30017
0.40000 0.56191 0.05818 -0.16896 -0.26556
0.41000 0.60728 0.09308 -0.15057 -0.23521
0.42000 0.63342 0.13244 -0.13961 -0.21845
0.43000 0.67951 0.17094 -0.11061 -0.20428
0.44000 0.69882 0.22192 -0.05734 -0.19437
0.45000 0.75193 0.25773 -0.03497 -0.16183
0.46000 0.79911 0.30891 -0.00695 -0.13580
0.47000 0.84183 0.35623 0.01927 -0.11969
0.48000 0.91024 0.38352 0.05030 -0.10521
0.49000 0.94791 0.42460 0.06230 -0.07570
0.50000 1.01034 0.48288 0.08379 -0.05241
0.51000 1.04269 0.54956 0.11360 -0.03448
0.52000 1.07527 0.62407 0.13003 -0.00864
0.53000 1.10908 0.65434 0.16910 0.01793
0.54000 1.12665 0.69819 0.19257 0.03546
0.55000 1.13850 0.75071 0.20893 0.05331
0.56000 1.17187 0.78859 0.24099 0.08191
0.57000 1.19397 0.82243 0.25359 0.10432
0.58000 1.22162 0.87152 0.26988 0.13012
0.59000 1.24032 0.91341 0.29813 0.16376
0.60000 1.26567 0.96977 0.32279 0.20620
0.61000 1.29286 1.00221 0.36456 0.23991
0.62000 1.32750 1.03669 0.37966 0.28647
0.63000 1.35170 1.07326 0.43526 0.31652
0.64000 1.38017 1.12882 0.45922 0.35653
0.65000 1.39101 1.15719 0.47552 0.37813
0.66000 1.41716 1.17241 0.49585 0.41064
0.67000 1.44582 1.21725 0.50760 0.42996
0.68000 1.46310 1.26081 0.56082 0.44876
0.69000 1.47664 1.27710 0.58793 0.49889
0.70000 1.49066 1.31164 0.60148 0.54122
0.71000 1.49891 1.34165 0.64747 0.57689
0.72000 1.50470 1.36688 0.67315 0.59469
0.73000 1.51436 1.38746 0.70662 0.63938
0.74000 1.52604 1.41351 0.75330 0.66263
0.75000 1.54430 1.43842 0.78925 0.67884
0.76000 1.55633 1.46536 0.81250 0.69540
0.77000 1.56282 1.48012 0.84801 0.72899
0.78000 1.57245 1.49574 0.86657 0.73934
0.79000 1.58277 1.51564 0.90696 0.76147
0.80000 1.59149 1.53226 0.93265 0.81038
0.81000 1.59883 1.54450 0.97456 0.85287
0.82000 1.60587 1.55777 1.00809 0.90534
0.83000 1.61216 1.56334 1.02570 0.96566
0.84000 1.61803 1.57583 1.05052 1.02102
0.85000 1.62568 1.58589 1.07218 1.03485
0.86000 1.63091 1.59593 1.11747 1.09383
0.87000 1.64307 1.60745 1.14659 1.16075
0.88000 1.65033 1.61638 1.17268 1.21484
0.89000 1.65691 1.62442 1.21196 1.24922
0.90000 1.66307 1.63321 1.25644 1.30013
0.91000 1.67429 1.64781 1.30644 1.33641
0.92000 1.68702 1.66001 1.34919 1.37382
0.93000 1.69829 1.67226 1.39081 1.41904
0.94000 1.70893 1.68142 1.47874 1.48799
0.95000 1.72625 1.70083 1.62107 1.58719
0.96000 1.73656 1.71328 1.82299 1.63232
0.97000 1.77279 1.74188 1.99231 1.72630
0.98000 1.89750 1.79882 2.19662 1.94227
0.99000 2.34395 2.06873 2.34937 2.24499
1.00000 3.73384 4.27923 4.11659 2.74557
where the first column contains the percentiles, and the 2nd, 3rd, 4th and 5th columns correspond to figures 1, 2, 3 and 4 above and contain the effect size values. Looking at the 2nd column (figure 1), it can be seen that if Cohen's "scale" is applied, over 50% of the effect size values can be described as "large," with approximately a further 15% being "medium" effect.
All in all a successful test, which encourages me to adopt the Cauchy-Schwarz inequality, but before I do there are one or two more tweaks I would like to test. This will be the subject of my next post.
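For anyone wanting to produce a similar percentile table and the Cohen's scale proportions from their own run, something along the following lines should work. This is a sketch only: the six effect size values are synthetic stand-ins, the variable names are hypothetical, and the bands assume Cohen's conventional thresholds of 0.2, 0.5 and 0.8 with each band running up to the next threshold, which may not be exactly how the proportions quoted above were counted.

```octave
% Synthetic stand-in for one column of effect sizes. With real results one
% would instead load the saved file, e.g.
%   all_es = dlmread( 'all_effect_sizes' ) ; es = all_es( : , 1 ) ;
es = [ -1.0 ; 0.1 ; 0.3 ; 0.6 ; 0.9 ; 1.5 ] ;

% percentile table: first column the percentile, second the effect size
p = ( 0 : 0.25 : 1 )' ; % use 0 : 0.01 : 1 for a table like the one above
percentile_table = [ p, quantile( es, p ) ] ;
disp( percentile_table )

% proportions in each band, using Cohen's conventional thresholds
frac_large = mean( es >= 0.8 ) ;
frac_medium = mean( es >= 0.5 & es < 0.8 ) ;
frac_small = mean( es >= 0.2 & es < 0.5 ) ;
printf( "large: %.3f medium: %.3f small: %.3f\n", frac_large, frac_medium, frac_small ) ;
```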