Thursday 31 October 2013

Exploratory Data Analysis of Brownian Motion Model

For my first steps in investigating the earlier proposed Brownian motion model I thought I would use some Exploratory Data Analysis techniques. I have initially decided to use lookback periods of 5 bars and 21 bars: the 5 bar period, on daily data, is a crude attempt to capture the essence of weekly price movement, and the 21 bar period is intended to capture the majority of dominant cycle periods in daily data. Both parameters are also non-optimised Fibonacci numbers. Each lookback period creates a channel over the price bars, and since the close can be above, within or below each of the two channels, it is possible to postulate nine (3 x 3) separate and distinct price "regimes".
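For reference, the two channels can be written out as below. This is only a reading of the code box at the end of this post, not a separate derivation: the symbols (c for the close, a for the absolute log difference, U and L for the upper and lower channel bounds) are notation introduced here for illustration, and the moving average is assumed to be a plain trailing mean over the last n bars, since sma() is the author's own function.

```latex
a_t = \left| \ln c_t - \ln c_{t-1} \right|, \qquad
\bar{a}^{(n)}_t = \frac{1}{n} \sum_{i=0}^{n-1} a_{t-i}, \qquad n \in \{5, 21\}

U^{(n)}_t = \exp\!\left( \ln c_{t-n} + \sqrt{n}\, \bar{a}^{(n)}_t \right), \qquad
L^{(n)}_t = \exp\!\left( \ln c_{t-n} - \sqrt{n}\, \bar{a}^{(n)}_t \right)
```

The square root of n scaling is what ties the channel width to the Brownian motion idea: the typical one bar move is scaled up to an n bar horizon around the log close of n bars ago.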

These "regimes" are used to bin the normalised price change at time t into nine separate distributions. The normalising factor is the 5 bar simple moving average of absolute natural log differences at time t-1, and the regime is determined by the position of the close at t-1. The rationale behind basing these metrics on preceding bars is explained in this earlier post, and the implementation is made clear in the code box at the end of this post. The following 4 charts are box plots of these nine distributions, in consecutive order, for the EURUSD, GBPUSD, USDCHF and USDYEN forex pairs:



Looking at these, I'm struck by two things. Firstly, the middle three box plots are for regimes where price is within the 21 bar channel: numbering from the left, boxes 4, 5 and 6 are respectively below, within and above the 5 bar channel. These are essentially different degrees of a sideways market, and it can be seen that there are far more outliers here than in the other box plots, perhaps indicating that high volatility breakouts tend to occur more frequently in sideways markets.

The second striking thing is, again numbering from the left, box plots 3 and 7. These are, respectively, the regimes below the 21 bar channel but above the 5 bar channel, and above the 21 bar channel but below the 5 bar channel; essentially the distributions for reactions against prevailing trends on the 21 bar time scale. Here it can be seen that there are far fewer outliers and generally shorter whiskers, indicating that reactions against trends tend to be less volatile in nature.
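The visual impression of "more" or "fewer" outliers could also be put into numbers. The following is a minimal sketch, not part of the original analysis: it assumes the conventional box plot rule that a point is an outlier when it lies more than 1.5 times the interquartile range beyond the quartiles, and it uses two made-up toy bins in place of the real ones. The names lower_fence, upper_fence and outlier_fraction are hypothetical helpers, and the quartile method used by quantile() may differ slightly from the one the plotting function uses.

```octave
% Toy stand-ins for two of the nine bins (made-up numbers, for illustration only).
bin_a = [ -0.4 ; -0.2 ; -0.1 ; 0.0 ; 0.1 ; 0.2 ; 0.3 ; 6.0 ] ; % one extreme value
bin_b = [ -0.4 ; -0.2 ; -0.1 ; 0.0 ; 0.1 ; 0.2 ; 0.3 ; 0.4 ] ; % no extreme values

% Fences at 1.5 * IQR beyond the lower and upper quartiles (assumed whisker rule).
lower_fence = @(x) quantile( x , 0.25 ) - 1.5 * ( quantile( x , 0.75 ) - quantile( x , 0.25 ) ) ;
upper_fence = @(x) quantile( x , 0.75 ) + 1.5 * ( quantile( x , 0.75 ) - quantile( x , 0.25 ) ) ;

% Fraction of the points in a bin that fall outside the fences.
outlier_fraction = @(x) mean( ( x < lower_fence( x ) ) | ( x > upper_fence( x ) ) ) ;

printf( "bin_a: %.3f  bin_b: %.3f\n" , outlier_fraction( bin_a ) , outlier_fraction( bin_b ) ) ;
```

Applied to the nine bins produced by the code box at the end of this post, a fraction like this would allow the regimes to be compared on an equal footing, since the bins do not all contain the same number of bars.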

It is encouraging to see that there are such differences in these box plots as it suggests that my idea of binning does separate out the different price change characteristics. However, I believe things can be improved a bit in this regard and this will form the subject matter of my next post.
clear all

data = load("-ascii","usdyen") ;
close = data(:,7) ;
logdiff = [ 0 ; diff( log(close) ) ] ;
abslogdiff = abs( logdiff ) ;

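% sma() is not a core Octave function; it is assumed here to be a user-supplied
% trailing simple moving average, called as sma( data , length )
sq_rt = sqrt( 5 ) ;
sma5 = sma(abslogdiff,5) ;
% channel centred on the log close 5 bars back; circshift wraps around, so the
% first 5 rows are not meaningful
ub5 = exp(circshift(log(close),5)+(sma5.*sq_rt)) ;
lb5 = exp(circshift(log(close),5)-(sma5.*sq_rt)) ;

% same construction for the 21 bar channel
sq_rt = sqrt( 21 ) ;
sma21 = sma(abslogdiff,21) ;
ub21 = exp(circshift(log(close),21)+(sma21.*sq_rt)) ;
lb21 = exp(circshift(log(close),21)-(sma21.*sq_rt)) ;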

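% allocate close bar to a market_type (regime) code:
%    4 /  3 /  2 : above the 21 bar channel and above / within / below the 5 bar channel
%    1 /  0 / -1 : within the 21 bar channel and above / within / below the 5 bar channel
%   -2 / -3 / -4 : below the 21 bar channel and above / within / below the 5 bar channel
% 0 is the default, so any bar not matched by the tests below stays in the "within both" regime
market_type = zeros( size(close,1) , 1 ) ;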

% above ub21 **********************
val_1 = close > ub21 ;
val_2 = close > ub5 ;
matches = val_1 .* val_2 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = 4 ;

val_1 = close > ub21 ;
val_2 = close <= ub5 ;
val_3 = close >= lb5 ;
matches = val_1 .* val_2 .* val_3 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = 3 ;

val_1 = close > ub21 ;
val_2 = close < lb5 ;
matches = val_1 .* val_2 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = 2 ;
%**********************************

% below lb21 **********************
val_1 = close < lb21 ;
val_2 = close < lb5 ;
matches = val_1 .* val_2 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = -4 ;

val_1 = close < lb21 ;
val_2 = close <= ub5 ;
val_3 = close >= lb5 ;
matches = val_1 .* val_2 .* val_3 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = -3 ;

val_1 = close < lb21 ;
val_2 = close > ub5 ;
matches = val_1 .* val_2 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = -2 ;
%**********************************

% between ub21 & lb21**************
val_1 = close <= ub21 ;
val_2 = close >= lb21 ;
val_3 = close > ub5 ;
matches = val_1 .* val_2 .* val_3 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = 1 ;

val_1 = close <= ub21 ;
val_2 = close >= lb21 ;
val_3 = close < lb5 ;
matches = val_1 .* val_2 .* val_3 ;
ix = find( matches == 1 ) ;
market_type( ix , 1 ) = -1 ;
%**********************************

% now allocate to bins
normalised_logdiffs = logdiff ./ circshift( sma5 , 1 ) ; % change at t scaled by sma5 at t-1

% shift market_type to agree with normalised_logdiffs
shifted_market_type = circshift( market_type , 1 ) ; % regime of the close at t-1

% drop the first 29 bars (indicator warm-up), slicing both vectors identically so
% that each mask below selects the same rows it was computed from
valid_logdiffs = normalised_logdiffs(30:end,1) ;
valid_market_type = shifted_market_type(30:end,1) ;

bin_4 = valid_logdiffs( valid_market_type == 4 ) ;
bin_3 = valid_logdiffs( valid_market_type == 3 ) ;
bin_2 = valid_logdiffs( valid_market_type == 2 ) ;
bin_1 = valid_logdiffs( valid_market_type == 1 ) ;
bin_0 = valid_logdiffs( valid_market_type == 0 ) ;
bin_1m = valid_logdiffs( valid_market_type == -1 ) ;
bin_2m = valid_logdiffs( valid_market_type == -2 ) ;
bin_3m = valid_logdiffs( valid_market_type == -3 ) ;
bin_4m = valid_logdiffs( valid_market_type == -4 ) ;

y_axis_max = max( normalised_logdiffs(30:end,1) ) ;
y_axis_min = min( normalised_logdiffs(30:end,1) ) ;

clf ;
subplot(191) ; boxplot(bin_4m,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(192) ; boxplot(bin_3m,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(193) ; boxplot(bin_2m,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(194) ; boxplot(bin_1m,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(195) ; boxplot(bin_0,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(196) ; boxplot(bin_1,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(197) ; boxplot(bin_2,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(198) ; boxplot(bin_3,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
subplot(199) ; boxplot(bin_4,NOTCHED=1) ; ylim([y_axis_min y_axis_max]) ;
