Wednesday 26 November 2014

Test of the MFE/MAE Indicator

Continuing from my last post, wherein I stated I was going to conduct a more pertinent statistical test of the returns of the bars(s) immediately following the N best, Cauchy Schwarz matching algorithm matched bars in the price history, readers may recall that the basic premise behind this algorithm is that by matching current price action to the N best matches, the price action after these matches can be used to infer what will occur after the current price action. However, rather than test the price action directly I have decided to apply the test to the MFE/MAE indicator. There are several reasons for this, which are enumerated below.
  1. I intend to use the indicator as the target function for future Neural net training
  2. the indicator represents a reward to risk ratio, which indirectly reflects price action itself, but without the noise of said action
  3. this reward to risk ratio is of much more direct concern, from a trading perspective, than accurately predicting price
  4. since the indicator is now included as a feature in the matching algorithm, testing the indicator is, very indirectly, a test of the matching algorithm too
The test I have in mind is a hybrid of a hypothesis test, Cross validation and the application of Statistical process control.  Consider the chart below:
This shows two sampling distributions of the mean for Long MFE/MAE indicator values > 0.5, the upper pane for sample sizes of 20 and the lower pane for 75. For simplicity I shall only discuss the Long > 0.5 version of the indicator, but everything that follows applies equally to the Short version. As expected the upper pane shows greater variance, and for the envisioned test a whole series of these sampling distributions will be produced for different sampling rates. The way I intend it to work is as follows:
  • take a single bar in the history and see what the value of the MFE/MAE indicator value is 3 bars later (assume > 0.5 for this exposition, so we compare to long sampling distributions only)
  • get the top 20 matched bars for the above selected bar and the corresponding 20 indicator values for 3 bars later and take the mean of these 20 indicator values
  • check if this mean falls within the sampling distribution of the mean of 20, as shown in the upper pane above by the vertical black line at 0.8 on the x axis. If it does fall with the sampling distribution, we accept the null hypothesis that the 20 best matches in history future indicator values and the value of the indicator after the bar to be matched come from the same distribution
  • repeat the immediately preceding step for means of 21, 22, ... etc until such time as the null hypothesis can be rejected, shown in the lower pane above. At this point, we then then declare an upper bound on the historical number of matches for the bar to be predicted
For any single bar to be predicted we can then produce the following chart, which is completely artificial and just for illustrative purposes:
where the cyan and red lines are the +/- 2 standard deviations above/below a notional mean value for the whole distribution of approximately 0.85, and the chart can be considered to be a type of control chart. The upper and lower control lines converge towards the right, reflecting the decreasing variance of increasingly large N sample means, as shown in the first chart above. The green line represents the cumulative N sample mean of the best N historical matches' future values. I have shown it as decreasing as it is to be expected that as more N matches are included, the greater the chance that incorrect matches, unexpected price reversals etc. will be caught up in this mean calculation, resulting in the mean value moving into the left tail of the sampling distribution. This effect combines with the shrinking variance to reach a critical point (rejection of the null hypothesis) at which the green line exits below the lower control line.

The purpose of all the above is provide a principled manner to choose the number N matches from the Cauchy-Schwarz matching algorithm to supply instances of training data to the envisioned neural net training. An incidental benefit of this approach is that it is indirectly a hypothesis test of the fundamental assumption underlying the matching algorithm; namely that past price action has predictive ability for future price action, and furthermore, it is a test of the MFE/MAE indicator. Discussion of the results of these tests in a future post.       

Wednesday 12 November 2014

First Use for the MFE/MAE Indicator

This first use is as an input to my Cauchy-Schwarz matching algorithm, previous posts about which can be read herehere and here. The screen shot below shows what I would characterise as a "good" set of matches: 
The top left pane shows the original section of the price series to be matched, and the panes labelled #1, #5, etc. are the best match, 5th best match and so on respectively. The last 3 rightmost bars in each pane are "future" price bars, i.e. the 4th bar in from the right is the target bar that is being matched, matched over all the bars to the left or in the past of this target bar.

I consider the above to be a set of "good" matches because, for the #1 through #25 matches for "future" bars:
  • if one considers the logic of the mfe/mae indicator each pane gives indicator readings of "long," which all agree with the original "future" bars
  • similarly the mae (maximum adverse excursion) occurs on the day immediately following the matched day
  • the mfe (maximum favourable excursion) occurs on the 3rd "future" bar, with the slight exception of pane #10
  • the marked to market returns of an entry at the open of the 1st "future" bar to the close of the 3rd "future" bar all show a profit, as does the original pane
However, it can be seen that the above noted "goodness" breaks down for panes #25 and #30, which leads me to postulate that there is an upper bound on the number of matches for which there is predictive ability for "future" returns.

In the above linked posts the test statistic used to judge the predictive efficacy of the matching algorithm was effect size. However, I think a more pertinent test statistic to use would be the average bar return over the bars immediately following a matched bar, and a discussion of this will be the subject of my next post.  

Wednesday 5 November 2014

A New MFE/MAE Indicator.

After stopping my investigation of tools for spectral analysis over the last few weeks I have been doing another mooc, this time Learning from Data, and also working on the idea of one of my earlier posts.

In the above linked post there is a video showing the idea as a "paint bar" study. However, I thought it would be a good idea to render it as an indicator, the C++ Octave .oct code for which is shown in the code box below.
DEFUN_DLD ( adjustable_mfe_mae_from_open_indicator, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} adjustable_mfe_mae_from_open_indicator (@var{open,high,low,close,lookback_length})\n\
This function takes four input series for the OHLC and a value for lookback length. The main outputs are\n\
two indicators, long and short, that show the ratio of the MFE over the MAE from the open of the specified\n\
lookback in the past. The indicators are normalised to the range 0 to 1 by a sigmoid function and a MFE/MAE\n\
ratio of 1:1 is shifted in the sigmoid function to give a 'neutral' indicator reading of 0.5. A third output\n\
is the max high - min low range over the lookback_length normalised by the range of the daily support and\n\
resistance levels S1 and R1 calculated for the first bar of the lookback period. This is also normalised to\n\
give a reading of 0.5 in the sigmoid function if the ratio is 1:1. The point of this third output is to give\n\
some relative scale to the unitless MFE/MAE ratio and to act as a measure of strength or importance of the\n\
MFE/MAE ratio.\n\
@end deftypefn" )

{
octave_value_list retval_list ;
int nargin = args.length () ;

// check the input arguments
if ( nargin != 5 )
   {
   error ( "Invalid arguments. Arguments are price series for open, high, low and close and value for lookback length." ) ;
   return retval_list ;
   }

if ( args(4).length () != 1 )
   {
   error ( "Invalid argument. Argument 5 is a scalar value for the lookback length." ) ;
   return retval_list ;
   }
   
   int lookback_length = args(4).int_value() ;

if ( args(0).length () < lookback_length )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be >= lookback length." ) ;
   return retval_list ;
   }
   
if ( args(1).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }
   
if ( args(2).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }
   
if ( args(3).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }   

if (error_state)
   {
   error ( "Invalid arguments. Arguments are price series for open, high, low and close and value for lookback length." ) ;
   return retval_list ;
   }
// end of input checking
  
// inputs
ColumnVector open = args(0).column_vector_value () ;
ColumnVector high = args(1).column_vector_value () ;
ColumnVector low = args(2).column_vector_value () ;
ColumnVector close = args(3).column_vector_value () ;
// outputs
ColumnVector long_mfe_mae = args(0).column_vector_value () ;
ColumnVector short_mfe_mae = args(0).column_vector_value () ;
ColumnVector range = args(0).column_vector_value () ;

// variables
double max_high = *std::max_element( &high(0), &high( lookback_length ) ) ;
double min_low = *std::min_element( &low(0), &low( lookback_length ) ) ;
double pivot_point = ( high(0) + low(0) + close(0) ) / 3.0 ;
double s1 = 2.0 * pivot_point - high(0) ;
double r1 = 2.0 * pivot_point - low(0) ;

for ( octave_idx_type ii (0) ; ii < lookback_length ; ii++ ) // initial ii loop
    {

      // long_mfe_mae
      if ( open(0) > min_low ) // the "normal" situation
      {
 long_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - open(0) ) / ( open(0) - min_low ) - 1.0 ) ) ) ;
      }
      else if ( open(0) == min_low )
      {
 long_mfe_mae(ii) = 1.0 ;
      }
      else
      {
 long_mfe_mae(ii) = 0.5 ;
      }
           
      // short_mfe_mae
      if ( open(0) < max_high ) // the "normal" situation
      {
 short_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( open(0) - min_low ) / ( max_high - open(0) ) - 1.0 ) ) ) ;
      }
      else if ( open(0) == max_high )
      {
 short_mfe_mae(ii) = 1.0 ;
      }
      else
      {
 short_mfe_mae(ii) = 0.5 ;
      }
      
      range(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - min_low ) / ( r1 - s1 ) - 1.0 ) ) ) ;
      
    } // end of initial ii loop

for ( octave_idx_type ii ( lookback_length ) ; ii < args(0).length() ; ii++ ) // main ii loop
    { 
    // assign variable values  
    max_high = *std::max_element( &high( ii - lookback_length + 1 ), &high( ii + 1 ) ) ;
    min_low = *std::min_element( &low( ii - lookback_length + 1 ), &low( ii + 1 ) ) ;
    pivot_point = ( high(ii-lookback_length) + low(ii-lookback_length) + close(ii-lookback_length) ) / 3.0 ;
    s1 = 2.0 * pivot_point - high(ii-lookback_length) ;
    r1 = 2.0 * pivot_point - low(ii-lookback_length) ;

      // long_mfe_mae
      if ( open( ii - lookback_length + 1 ) > min_low && open( ii - lookback_length + 1 ) < max_high ) // the "normal" situation
      {
 long_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - open( ii - lookback_length + 1 ) ) / ( open( ii - lookback_length + 1 ) - min_low ) - 1.0 ) ) ) ;
      }
      else if ( open( ii - lookback_length + 1 ) == min_low )
      {
 long_mfe_mae(ii) = 1.0 ;
      }
      else
      {
 long_mfe_mae(ii) = 0.0 ;
      }
   
      // short_mfe_mae
      if ( open( ii - lookback_length + 1 ) > min_low && open( ii - lookback_length + 1 ) < max_high ) // the "normal" situation
      {
 short_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( open( ii - lookback_length + 1 ) - min_low ) / ( max_high - open( ii - lookback_length + 1 ) ) - 1.0 ) ) ) ;
      }
      else if ( open( ii - lookback_length + 1 ) == max_high )
      {
 short_mfe_mae(ii) = 1.0 ;
      }
      else
      {
 short_mfe_mae(ii) = 0.0 ;
      }
      
      range(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - min_low ) / ( r1 - s1 ) - 1.0 ) ) ) ;

    } // end of main ii loop

retval_list(2) = range ;    
retval_list(1) = short_mfe_mae ;
retval_list(0) = long_mfe_mae ;

return retval_list ;
  
} // end of function
The way to interpret this is as follows:
  • if the "long" indicator reading is above 0.5, go long
  • if the "short" is above 0.5, go short
  • if both are below 0.5, go flat
An alternative, if the indicator reading is flat, is to maintain any previous non flat position. I won't show a chart of the indicator itself as it just looks like a very noisy oscillator, but the equity curve(s) of it, without the benefit of foresight, on the EURUSD forex pair are shown below.
The yellow equity curve is the cumulative, close to close, tick returns of a buy and hold strategy, the blue is the return going flat when indicated, and the red maintaining the previous position when flat is indicated. Not much to write home about. However, this second chart shows the return when one has the benefit of the "peek into the future" as discussed in my earlier post.
The colour of the curves are as before except for the addition of the green equity curve, which is the cumulative, vwap value to vwap value tick returns, a simple representation of what an equity curve with realistic slippage might look like. This second set of equity curves shows the promise of what could be achievable if a neural net to accurately predict future values of the above indicator can be trained. More in an upcoming post.