Showing posts with label Naive Bayesian Classifier. Show all posts

Wednesday, 5 December 2012

Neural Net Market Classifier to Replace Bayesian Market Classifier

I have now completed the cross validation test I wanted to run, which compares my current Bayesian classifier with the recently retrained "reserve neural net"; the results are shown in the code box below. The test consists of 50,000 random iterations over my usual "ideal" 5 market types, with the market classifications from both of the above classifiers being compared with the actual, known market type. There are 2 points of comparison in each iteration: the last price bar in the sequence, identified as "End," and a randomly picked price bar from the 4 immediately preceding it, identified as "Random."
Number of times to loop: 50000
Elapsed time is 1804.46 seconds.

Random NN
Complete Accuracy percentage: 50.354000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.288000
Predicted = unr & actual = uwr: 6.950000
Predicted = dwr & actual = dnr: 1.268000
Predicted = dnr & actual = dwr: 6.668000
Predicted = uwr & actual = cyc: 3.750000
Predicted = dwr & actual = cyc: 6.668000
Predicted = cyc & actual = uwr: 2.242000
Predicted = cyc & actual = dwr: 2.032000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.140000
Predicted = unr & actual = dwr: 2.140000
Predicted = dwr & actual = uwr: 2.500000
Predicted = dnr & actual = uwr: 2.500000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.838000
Predicted = dnr & actual = unr: 0.716000

End NN
Complete Accuracy percentage: 48.280000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.248000
Predicted = unr & actual = uwr: 7.630000
Predicted = dwr & actual = dnr: 0.990000
Predicted = dnr & actual = dwr: 7.392000
Predicted = uwr & actual = cyc: 3.634000
Predicted = dwr & actual = cyc: 7.392000
Predicted = cyc & actual = uwr: 1.974000
Predicted = cyc & actual = dwr: 1.718000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.170000
Predicted = unr & actual = dwr: 2.170000
Predicted = dwr & actual = uwr: 2.578000
Predicted = dnr & actual = uwr: 2.578000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 1.050000
Predicted = dnr & actual = unr: 0.886000

Random Bayes
Complete Accuracy percentage: 19.450000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 7.554000
Predicted = unr & actual = uwr: 2.902000
Predicted = dwr & actual = dnr: 7.488000
Predicted = dnr & actual = dwr: 2.712000
Predicted = uwr & actual = cyc: 5.278000
Predicted = dwr & actual = cyc: 2.712000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.730000
Predicted = unr & actual = dwr: 5.730000
Predicted = dwr & actual = uwr: 5.642000
Predicted = dnr & actual = uwr: 5.642000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.162000
Predicted = dnr & actual = unr: 0.128000

End Bayes
Complete Accuracy percentage: 24.212000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 8.400000
Predicted = unr & actual = uwr: 2.236000
Predicted = dwr & actual = dnr: 7.866000
Predicted = dnr & actual = dwr: 1.960000
Predicted = uwr & actual = cyc: 6.142000
Predicted = dwr & actual = cyc: 1.960000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.110000
Predicted = unr & actual = dwr: 5.110000
Predicted = dwr & actual = uwr: 4.842000
Predicted = dnr & actual = uwr: 4.842000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.048000
Predicted = dnr & actual = unr: 0.040000
A Quick Analysis
  • Looking at the figures for complete accuracy it can be seen that the Bayesian classifier is not much better than randomly guessing, with 19.45% and 24.21% for "Random Bayes" and "End Bayes" respectively. The corresponding accuracy figures for the NN are 50.35% and 48.28%.
  • In the "dubious, difficult to trade" mis-classification category Bayes gets this wrong approx. 22% and 20% of the time, whilst for the NN both figures halve to approx. 9.5%.
  • In the "acceptable" mis-classification category Bayes gets approx. 29% and 29%, with the NN being more or less the same.
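As a quick sanity check, the rounded "dubious" category totals quoted above can be recomputed directly from the percentages printed in the code box (Python is used here purely as a calculator; the figures are copied from the test output):

```python
# Sum the four "dubious, difficult to trade" mis-classification percentages
# for each classifier/test point, as printed in the results above.
dubious = {
    "Random Bayes": [5.730, 5.730, 5.642, 5.642],
    "End Bayes":    [5.110, 5.110, 4.842, 4.842],
    "Random NN":    [2.140, 2.140, 2.500, 2.500],
    "End NN":       [2.170, 2.170, 2.578, 2.578],
}
totals = {name: round(sum(vals), 3) for name, vals in dubious.items()}
# The Bayes totals come to roughly 22.7% and 19.9%, and the NN totals to
# roughly 9.3% and 9.5%, matching the "approx." figures quoted above.
```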
Although this is not a completely rigorous test, I am satisfied that the NN has shown its superiority over the Bayesian classifier. I also believe that there is significant scope to improve the NN even more by adding features, changing the architecture, using a Softmax output unit, etc. As a result, I have decided to gracefully retire the Bayesian classifier and deploy the NN classifier in its place.
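For anyone curious about the mechanics, the test loop described at the top of this post can be sketched as below. This is an illustrative outline only, not my actual Octave code, and the function names are hypothetical:

```python
# Hypothetical sketch of the cross validation loop: on each iteration an
# "ideal" market type sequence is generated, then both the last bar ("End")
# and a randomly picked one of the 4 bars immediately preceding it ("Random")
# are classified and compared with the known market type.
import random

def cross_validate(generate_sequence, classify, n_iter=50_000, seed=42):
    rng = random.Random(seed)
    tallies = {"End": 0, "Random": 0}
    for _ in range(n_iter):
        bars, actual_type = generate_sequence(rng)  # known market type
        end_idx = len(bars) - 1
        rand_idx = end_idx - rng.randint(1, 4)      # one of the 4 preceding bars
        if classify(bars, end_idx) == actual_type:
            tallies["End"] += 1
        if classify(bars, rand_idx) == actual_type:
            tallies["Random"] += 1
    return {point: count / n_iter for point, count in tallies.items()}
```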

Thursday, 2 August 2012

Results of Comparative Cross Validation Tests

As expected the NN achieved 100% accuracy, and my prediction of 20% to 30% accuracy for my current Naive Bayesian Classifier was more or less right - in various runs of sample sizes up to 50,000 it achieved accuracy rates of 30% to 33%. A screen shot of both classifiers applied to the last 200 days' worth of S&P futures prices is shown below, with the Naive Bayesian in the upper pane and the NN in the lower pane.
However, despite its vastly superior performance in the tests, I don't really like the look of the NN on real data - it appears to be more erratic, or noisier, than the Bayesian classifier. I suspect that the NN may be overly complex, with 54 nodes in its one hidden layer. I shall try to improve the NN by reducing the number of hidden layer nodes to 25, and then see how that looks on real data.

Tuesday, 31 July 2012

Successful Completion of Neural Net Cross Validation Tests

In my last post I suggested that I was unsure of my coding of the cross validation test I had written, so I have taken a new coding approach and completely rewritten the test, which I'm happy to say has been very successful. Using this newly coded implementation the out of sample accuracy of the trained neural nets is 100%. As before, these tests were run overnight, but this time covering a total of 2,400,000 separate test examples thanks to increased code efficiency.

The next test I'm going to code, more out of curiosity than anything else, is a concurrent cross validation test of both my new neural net classifier algorithm and my Naive Bayesian Classifier together. I expect the NN to again obtain results similar to the above, but anticipate that the Naive Bayesian Classifier will perform quite poorly, achieving between 20% and 30% accuracy. I expect such low performance simply because the Naive Bayesian Classifier was developed using just 5 exemplar market type examples, compared to 25 for the NN.

Thursday, 12 July 2012

Brute Force Classifier in Action

As an update to my recent post, here is a short video of the brute force similarity search classifier in action.

Non-embedded view here.
The colour-coded candlestick bars are coloured thus: purple = a cyclic market classification; green = up with retracement; blue = up with no retracement; yellow = down with retracement; red = down with no retracement. The upper price series is the classification as per the brute force algorithm and the lower is the classification as per my Naive Bayesian classifier, shown for comparative purposes. The cyan trend line is my implementation of a Kalman filter; where this trend line extends out at the hard right-hand edge of the chart it becomes the Kalman filter's prediction for the next 10 bars, based on extending the pattern that was selected during the run of the brute force algorithm.

I will leave it up to readers to judge for themselves the efficacy of this new indicator, but I think it shows some promise, and I have some ideas about how it can be improved. This, however, is work for the future. For now I intend to crack on with working on my neural net classification algorithm. 

Wednesday, 6 June 2012

Creation of a Simple Benchmark Suite

For some time now I have been toying with the idea of creating a simple benchmark suite to compare my own back test system performance with that of some public domain trading systems. I decided to select a few examples from the Trading Blox Users' Guide, specifically:
  • exponential moving average crossovers of periods 10-20 and 20-50
  • triple moving average crossover system with periods 10-20-50
  • Bollinger band breakouts of periods 20 and 50 with 1 & 2 standard deviations for exits and entries
  • Donchian channel breakouts with periods 20-10 and 50-25
This is a total of 7 systems, and in the Rcpp code below these form a sort of "committee" to vote to be either long/short 1 contract, or neutral.
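The committee vote itself is simple: each of the 7 benchmark systems contributes a position of +1 (long), -1 (short) or 0 (neutral), and the committee takes the sign of the sum. A minimal sketch, with Python used purely for illustration:

```python
# Committee voting: sum the 7 individual system positions and be long 1
# contract if the sum is positive, short 1 contract if negative, else flat.
def committee_position(votes):
    total = sum(votes)
    return 1 if total > 0 else -1 if total < 0 else 0

committee_position([1, 1, -1, 0, 1, -1, 1])  # net +2 -> long 1 contract
```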

# This function takes as inputs vectors of opening and closing prices
# and creates a basic benchmark output suite of system equity curves 
# for the following basic trend following systems
# exponential moving average crossovers of 10-20 and 20-50
# triple moving average crossovers of 10-20-50
# bollinger band breakouts of 20 and 50 with 1 & 2 standard deviations
# donchian channel breakouts of 20-10 and 50-25
# The libraries required to compile this function are
# "Rcpp," "inline" and "compiler." The function is compiled by the command
# > source("basic_benchmark_equity.r") in the console/terminal.

library(Rcpp) # load the required library
library(inline) # load the required library
library(compiler) # load the required library

src <- '
#include <algorithm> // needed for std::max_element and std::min_element below
#include <cmath>     // needed for sqrt

Rcpp::NumericVector open(a) ;
Rcpp::NumericVector close(b) ;
Rcpp::NumericVector market_mode(c) ;
Rcpp::NumericVector kalman(d) ;
Rcpp::NumericVector tick_size(e) ;
Rcpp::NumericVector tick_value(f) ;
int n = open.size() ;
Rcpp::NumericVector sma_10(n) ;
Rcpp::NumericVector sma_20(n) ;
Rcpp::NumericVector sma_50(n) ;
Rcpp::NumericVector std_20(n) ;
Rcpp::NumericVector std_50(n) ;

// create equity output vectors
Rcpp::NumericVector market_mode_long_eq(n) ;
Rcpp::NumericVector market_mode_short_eq(n) ;
Rcpp::NumericVector market_mode_composite_eq(n) ;
Rcpp::NumericVector sma_10_20_eq(n) ; 
Rcpp::NumericVector sma_20_50_eq(n) ; 
Rcpp::NumericVector tma_eq(n) ; 
Rcpp::NumericVector bbo_20_eq(n) ;
Rcpp::NumericVector bbo_50_eq(n) ;
Rcpp::NumericVector donc_20_eq(n) ;
Rcpp::NumericVector donc_50_eq(n) ;
Rcpp::NumericVector composite_eq(n) ; 

// position vectors for benchmark systems
Rcpp::NumericVector sma_10_20_pv(1) ;
sma_10_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector sma_20_50_pv(1) ;
sma_20_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector tma_pv(1) ;
tma_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_20_pv(1) ;
bbo_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_50_pv(1) ;
bbo_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_20_pv(1) ;
donc_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_50_pv(1) ;
donc_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector comp_pv(1) ;
comp_pv[0] = 0.0 ; // initialise to zero, no position

// fill the equity curve vectors with zeros for "burn in" period
// and create the initial values for all indicators

for ( int ii = 0 ; ii < 50 ; ii++ ) {
    
    if ( ii >= 40 ) {
    sma_10[49] += close[ii] ; }

    if ( ii >= 30 ) {
    sma_20[49] += close[ii] ; }

    sma_50[49] += close[ii] ;

    std_20[ii] = 0.0 ;
    std_50[ii] = 0.0 ;

    market_mode_long_eq[ii] = 0.0 ;
    market_mode_short_eq[ii] = 0.0 ;
    market_mode_composite_eq[ii] = 0.0 ;
    sma_10_20_eq[ii] = 0.0 ;
    sma_20_50_eq[ii] = 0.0 ; 
    bbo_20_eq[ii] = 0.0 ;
    bbo_50_eq[ii] = 0.0 ;
    tma_eq[ii] = 0.0 ;
    donc_20_eq[ii] = 0.0 ;
    donc_50_eq[ii] = 0.0 ; 
    composite_eq[ii] = 0.0 ; } // end of initialising loop

sma_10[49] = sma_10[49] / 10.0 ;
sma_20[49] = sma_20[49] / 20.0 ;
sma_50[49] = sma_50[49] / 50.0 ;

// the main calculation loop
for ( int ii = 50 ; ii < n-2 ; ii++ ) {

    // calculate the smas with a rolling update: drop the oldest close, add the newest
    sma_10[ii] = sma_10[ii-1] + ( close[ii] - close[ii-10] ) / 10.0 ;
    sma_20[ii] = sma_20[ii-1] + ( close[ii] - close[ii-20] ) / 20.0 ;
    sma_50[ii] = sma_50[ii-1] + ( close[ii] - close[ii-50] ) / 50.0 ;

      // calculate the standard deviations
      for ( int jj = 0 ; jj < 50 ; jj++ ) {
    
      if ( jj < 20 ) {
      std_20[ii] += ( close[ii-jj] - sma_20[ii] ) * ( close[ii-jj] - sma_20[ii] )  ; } // end of jj if

      std_50[ii] += ( close[ii-jj] - sma_50[ii] ) * ( close[ii-jj] - sma_50[ii] ) ; } // end of standard deviation loop

    std_20[ii] = sqrt( std_20[ii] / 20.0 ) ;
    std_50[ii] = sqrt( std_50[ii] / 50.0 ) ;

    //-------------------------------------------------------------------------------------------------------------------

    // calculate the equity values of the market modes
    // market_mode uwr and unr long signals
    if ( market_mode[ii] == 1 || market_mode[ii] == 2 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode dwr and dnr short signals
    if ( market_mode[ii] == 3 || market_mode[ii] == 4 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    // calculate the equity values of the market modes
    // market_mode cyc long signals
    if ( market_mode[ii] == 0 && kalman[ii] > kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode cyc short signals
    if ( market_mode[ii] == 0 && kalman[ii] < kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    //----------------------------------------------------------------------------------------------------------------------------

    // calculate the equity values and positions of each benchmark system
    // sma_10_20_eq
    if ( sma_10[ii] > sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] < sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] > sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] < sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // sma_20_50_eq
    if ( sma_20[ii] > sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] < sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] > sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] < sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // tma_eq
    if ( tma_pv[0] == 0.0 ) {

      // default position
      tma_eq[ii] = tma_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short

    } // end of tma_pv == 0.0 loop

    if ( tma_pv[0] == 1.0 ) {

      // default long position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( sma_10[ii] < sma_20[ii] && sma_10[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit long, go neutral

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short
    
    } // end of tma_pv == 1.0 loop

    if ( tma_pv[0] == -1.0 ) {

      // default short position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_10[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit short, go neutral

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

    } // end of tma_pv == -1.0 loop

    //------------------------------------------------------------------------------------------------------------

    // bbo_20_eq
    if ( bbo_20_pv[0] == 0.0 ) {

      // default position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short

    } // end of bbo_20_pv == 0.0 loop

    if ( bbo_20_pv[0] == 1.0 ) {

      // default long position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_20[ii] + std_20[ii] && close[ii] > sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short
    
    } // end of bbo_20_pv == 1.0 loop

    if ( bbo_20_pv[0] == -1.0 ) {

      // default short position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] - std_20[ii] && close[ii] < sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

    } // end of bbo_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // bbo_50_eq
    if ( bbo_50_pv[0] == 0.0 ) {

      // default position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short

    } // end of bbo_50_pv == 0.0 loop

    if ( bbo_50_pv[0] == 1.0 ) {

      // default long position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_50[ii] + std_50[ii] && close[ii] > sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short
    
    } // end of bbo_50_pv == 1.0 loop

    if ( bbo_50_pv[0] == -1.0 ) {

      // default short position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] - std_50[ii] && close[ii] < sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

    } // end of bbo_50_pv == -1.0 loop

    //-----------------------------------------------------------------------------------------------------

    // donc_20_eq
    if ( donc_20_pv[0] == 0.0 ) {

      // default position
      donc_20_eq[ii] = donc_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short

    } // end of donc_20_pv == 0.0 loop

    if ( donc_20_pv[0] == 1.0 ) {

      // default long position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-10], &close[ii] ) && close[ii] > *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short
    
    } // end of donc_20_pv == 1.0 loop

    if ( donc_20_pv[0] == -1.0 ) {

      // default short position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-10], &close[ii] ) && close[ii] < *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

    } // end of donc_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // donc_50_eq
    if ( donc_50_pv[0] == 0.0 ) {

      // default position
      donc_50_eq[ii] = donc_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short

    } // end of donc_50_pv == 0.0 loop

    if ( donc_50_pv[0] == 1.0 ) {

      // default long position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-25], &close[ii] ) && close[ii] > *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short
    
    } // end of donc_50_pv == 1.0 loop

    if ( donc_50_pv[0] == -1.0 ) {

      // default short position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-25], &close[ii] ) && close[ii] < *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

    } // end of donc_50_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // composite_eq
    comp_pv[0] = sma_10_20_pv[0] + sma_20_50_pv[0] + tma_pv[0] + bbo_20_pv[0] + bbo_50_pv[0] + donc_20_pv[0] + donc_50_pv[0] ;
    
    if ( comp_pv[0] > 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; } // long

    if ( comp_pv[0] < 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; } // short 

    if ( comp_pv[0] == 0 ) {
    composite_eq[ii] = composite_eq[ii-1] ; } // neutral 

} // end of main for loop

// Now fill in the last two spaces in the equity vectors
market_mode_long_eq[n-1] = market_mode_long_eq[n-3] ;
market_mode_long_eq[n-2] = market_mode_long_eq[n-3] ;

market_mode_short_eq[n-1] = market_mode_short_eq[n-3] ;
market_mode_short_eq[n-2] = market_mode_short_eq[n-3] ;

market_mode_composite_eq[n-1] = market_mode_composite_eq[n-3] ;
market_mode_composite_eq[n-2] = market_mode_composite_eq[n-3] ;

sma_10_20_eq[n-1] = sma_10_20_eq[n-3] ;
sma_10_20_eq[n-2] = sma_10_20_eq[n-3] ;

sma_20_50_eq[n-1] = sma_20_50_eq[n-3] ;
sma_20_50_eq[n-2] = sma_20_50_eq[n-3] ;

tma_eq[n-1] = tma_eq[n-3] ;
tma_eq[n-2] = tma_eq[n-3] ;

bbo_20_eq[n-1] = bbo_20_eq[n-3] ;
bbo_20_eq[n-2] = bbo_20_eq[n-3] ;

bbo_50_eq[n-1] = bbo_50_eq[n-3] ;
bbo_50_eq[n-2] = bbo_50_eq[n-3] ;

donc_20_eq[n-1] = donc_20_eq[n-3] ;
donc_20_eq[n-2] = donc_20_eq[n-3] ;

donc_50_eq[n-1] = donc_50_eq[n-3] ;
donc_50_eq[n-2] = donc_50_eq[n-3] ;

composite_eq[n-1] = composite_eq[n-3] ;
composite_eq[n-2] = composite_eq[n-3] ;

return List::create(
  _["market_mode_long_eq"] = market_mode_long_eq ,
  _["market_mode_short_eq"] = market_mode_short_eq ,
  _["market_mode_composite_eq"] = market_mode_composite_eq ,
  _["sma_10_20_eq"] = sma_10_20_eq ,
  _["sma_20_50_eq"] = sma_20_50_eq , 
  _["tma_eq"] = tma_eq ,
  _["bbo_20_eq"] = bbo_20_eq ,
  _["bbo_50_eq"] = bbo_50_eq ,
  _["donc_20_eq"] = donc_20_eq ,
  _["donc_50_eq"] = donc_50_eq ,
  _["composite_eq"] = composite_eq ) ; '

basic_benchmark_equity <- cxxfunction(signature(a = "numeric", b = "numeric", c = "numeric",
                                d = "numeric", e = "numeric", f = "numeric"), body=src, 
                                plugin = "Rcpp")
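The moving averages in the main loop above are maintained incrementally rather than recomputed from scratch on every bar. A minimal sketch of that rolling-mean update, with Python used purely for illustration and checked against a direct recomputation:

```python
# Incremental simple moving average: each step subtracts the price dropping
# out of the window and adds the new price, both scaled by the window length.
def rolling_sma(close, window):
    sma = [0.0] * len(close)
    sma[window - 1] = sum(close[:window]) / window  # seed value, as in the burn in loop
    for i in range(window, len(close)):
        sma[i] = sma[i - 1] + (close[i] - close[i - window]) / window
    return sma

prices = [100.0, 101.5, 99.0, 102.0, 103.5, 101.0, 100.5, 104.0]
inc = rolling_sma(prices, 5)
full = sum(prices[-5:]) / 5  # direct recomputation of the last value
# inc[-1] and full agree to floating point precision
```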

This Rcpp function is then called thus, using RStudio:

library(xts)  # load the required library

# load the "indicator" file 
data <- read.csv(file="usdyenind",header=FALSE,sep=",")
tick_size <- 0.01
tick_value <- 12.50

# extract other vectors of interest
open <- data[,2]
close <- data[,5]
market_mode <- data[,228]
kalman <- data[,283]

results <- basic_benchmark_equity(open,close,market_mode,kalman,tick_size,tick_value)

# coerce the above results list object to a data frame object
results_df <- data.frame( results )
df_max <- max(results_df) # for scaling of results plot
df_min <- min(results_df) # for scaling of results plot

# and now create an xts object for plotting
results_xts <- xts(results_df,as.Date(data[,'V1']))

# a nice plot of the results_xts object
par(col="#0000FF")
plot(results_xts[,'market_mode_long_eq'],main="USDYEN Pair",ylab="$ Equity Value",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#B0171F")
plot(results_xts[,'market_mode_short_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#00FF00")
plot(results_xts[,'market_mode_composite_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#808080")
plot(results_xts[,'sma_10_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'sma_20_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'tma_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="black")
plot(results_xts[,'composite_eq'],main="",ylim=c(df_min,df_max),type="l")


This outputs .png files, which I have strung together in this video.
Non-embedded view here.
The light grey equity curves are the individual curves for the benchmark systems and the black is the "committee" 1 contract equity curve. Also shown are the long and short 1 contract equity curves (blue and red respectively), along with a green combined equity curve for these, for my Naive Bayesian Classifier following these simple rules:
  • be long 1 contract if the market type is uwr, unr or cyclic with my Kalman filter pointing upwards
  • be short 1 contract if the market type is dwr, dnr or cyclic with my Kalman filter pointing downwards
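These rules can be sketched as a simple position-mapping function. This is an illustrative Python sketch only; the function name, the string mode labels and the scalar Kalman slope input are my assumptions, not the actual system code:

```python
def target_position(market_mode, kalman_slope):
    """Map a market-mode label and the Kalman filter's slope to a
    1-contract target position: +1 long, -1 short, 0 flat.
    Mode labels (uwr, unr, dwr, dnr, cyc) follow the post's notation."""
    if market_mode in ("uwr", "unr"):
        return 1
    if market_mode in ("dwr", "dnr"):
        return -1
    if market_mode == "cyc":
        # cyclic markets: side with the direction of the Kalman filter
        if kalman_slope > 0:
            return 1
        if kalman_slope < 0:
            return -1
    return 0
```

In the cyclic mode the Kalman filter's direction breaks the tie; in the four trending modes the market type alone fixes the position.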
Again, this is just a toy example of the use of my Bayesian Classifier. After I have completed Andrew Ng's machine learning course, as mentioned in my previous post, I will have a go at coding a neural net, which may end up replacing the Bayesian Classifier.

Saturday, 14 April 2012

Bayesian Classifier Sanity Check, Part 2

Here are the results of the sliced_charts script when applied to the "down with no retracement" market mode, and again it is pleasing to see such high percentages.
octave:1> sliced_charts
Enter matrix e.g. gcmatrix: clmatrix
points_of_interest =  132
dwr_to_dnr =  118
dwr_to_dnr_percent =  0.89394
cyc_to_dnr =  13
cyc_to_dnr_percent =  0.098485
octave:2> sliced_charts
Enter matrix e.g. gcmatrix: homatrix
points_of_interest =  124
dwr_to_dnr =  100
dwr_to_dnr_percent =  0.80645
cyc_to_dnr =  20
cyc_to_dnr_percent =  0.16129
octave:3> sliced_charts
Enter matrix e.g. gcmatrix: ngmatrix
points_of_interest =  150
dwr_to_dnr =  127
dwr_to_dnr_percent =  0.84667
cyc_to_dnr =  21
cyc_to_dnr_percent =  0.14000
octave:4> sliced_charts
Enter matrix e.g. gcmatrix: rbmatrix
points_of_interest =  121
dwr_to_dnr =  104
dwr_to_dnr_percent =  0.85950
cyc_to_dnr =  13
cyc_to_dnr_percent =  0.10744
octave:5> sliced_charts
Enter matrix e.g. gcmatrix: ccmatrix
points_of_interest =  153
dwr_to_dnr =  132
dwr_to_dnr_percent =  0.86275
cyc_to_dnr =  19
cyc_to_dnr_percent =  0.12418
octave:6> sliced_charts
Enter matrix e.g. gcmatrix: kcmatrix
points_of_interest =  148
dwr_to_dnr =  123
dwr_to_dnr_percent =  0.83108
cyc_to_dnr =  23
cyc_to_dnr_percent =  0.15541
octave:7> sliced_charts
Enter matrix e.g. gcmatrix: ojmatrix
points_of_interest =  147
dwr_to_dnr =  120
dwr_to_dnr_percent =  0.81633
cyc_to_dnr =  21
cyc_to_dnr_percent =  0.14286
octave:8> sliced_charts
Enter matrix e.g. gcmatrix: sbmatrix
points_of_interest =  109
dwr_to_dnr =  94
dwr_to_dnr_percent =  0.86239
cyc_to_dnr =  12
cyc_to_dnr_percent =  0.11009
octave:9> sliced_charts
Enter matrix e.g. gcmatrix: ctmatrix
points_of_interest =  143
dwr_to_dnr =  122
dwr_to_dnr_percent =  0.85315
cyc_to_dnr =  14
cyc_to_dnr_percent =  0.097902
octave:10> sliced_charts
Enter matrix e.g. gcmatrix: lbmatrix
points_of_interest =  142
dwr_to_dnr =  120
dwr_to_dnr_percent =  0.84507
cyc_to_dnr =  20
cyc_to_dnr_percent =  0.14085
octave:11> sliced_charts
Enter matrix e.g. gcmatrix: hgmatrix
points_of_interest =  119
dwr_to_dnr =  101
dwr_to_dnr_percent =  0.84874
cyc_to_dnr =  10
cyc_to_dnr_percent =  0.084034
octave:12> sliced_charts
Enter matrix e.g. gcmatrix: smatrix
points_of_interest =  117
dwr_to_dnr =  107
dwr_to_dnr_percent =  0.91453
cyc_to_dnr =  8
cyc_to_dnr_percent =  0.068376
octave:13> sliced_charts
Enter matrix e.g. gcmatrix: smmatrix
points_of_interest =  123
dwr_to_dnr =  105
dwr_to_dnr_percent =  0.85366
cyc_to_dnr =  17
cyc_to_dnr_percent =  0.13821
octave:14> sliced_charts
Enter matrix e.g. gcmatrix: bomatrix
points_of_interest =  136
dwr_to_dnr =  121
dwr_to_dnr_percent =  0.88971
cyc_to_dnr =  11
cyc_to_dnr_percent =  0.080882
octave:15> sliced_charts
Enter matrix e.g. gcmatrix: cmatrix
points_of_interest =  137
dwr_to_dnr =  111
dwr_to_dnr_percent =  0.81022
cyc_to_dnr =  24
cyc_to_dnr_percent =  0.17518
octave:16> sliced_charts
Enter matrix e.g. gcmatrix: omatrix
points_of_interest =  138
dwr_to_dnr =  108
dwr_to_dnr_percent =  0.78261
cyc_to_dnr =  24
cyc_to_dnr_percent =  0.17391
octave:17> sliced_charts
Enter matrix e.g. gcmatrix: wmatrix
points_of_interest =  145
dwr_to_dnr =  122
dwr_to_dnr_percent =  0.84138
cyc_to_dnr =  21
cyc_to_dnr_percent =  0.14483
octave:18> sliced_charts
Enter matrix e.g. gcmatrix: lcmatrix
points_of_interest =  136
dwr_to_dnr =  113
dwr_to_dnr_percent =  0.83088
cyc_to_dnr =  22
cyc_to_dnr_percent =  0.16176
octave:19> sliced_charts
Enter matrix e.g. gcmatrix: fcmatrix
points_of_interest =  123
dwr_to_dnr =  105
dwr_to_dnr_percent =  0.85366
cyc_to_dnr =  15
cyc_to_dnr_percent =  0.12195
octave:20> sliced_charts
Enter matrix e.g. gcmatrix: lhmatrix
points_of_interest =  147
dwr_to_dnr =  122
dwr_to_dnr_percent =  0.82993
cyc_to_dnr =  21
cyc_to_dnr_percent =  0.14286
octave:21> sliced_charts
Enter matrix e.g. gcmatrix: gcmatrix
points_of_interest =  132
dwr_to_dnr =  115
dwr_to_dnr_percent =  0.87121
cyc_to_dnr =  16
cyc_to_dnr_percent =  0.12121
octave:22> sliced_charts
Enter matrix e.g. gcmatrix: simatrix
points_of_interest =  127
dwr_to_dnr =  105
dwr_to_dnr_percent =  0.82677
cyc_to_dnr =  16
cyc_to_dnr_percent =  0.12598
octave:23> sliced_charts
Enter matrix e.g. gcmatrix: plmatrix
points_of_interest =  128
dwr_to_dnr =  112
dwr_to_dnr_percent =  0.87500
cyc_to_dnr =  14
cyc_to_dnr_percent =  0.10938
octave:24> sliced_charts
Enter matrix e.g. gcmatrix: pamatrix
points_of_interest =  120
dwr_to_dnr =  96
dwr_to_dnr_percent =  0.80000
cyc_to_dnr =  18
cyc_to_dnr_percent =  0.15000
octave:25> sliced_charts
Enter matrix e.g. gcmatrix: usmatrix
points_of_interest =  94
dwr_to_dnr =  78
dwr_to_dnr_percent =  0.82979
cyc_to_dnr =  14
cyc_to_dnr_percent =  0.14894
octave:26> sliced_charts
Enter matrix e.g. gcmatrix: tymatrix
points_of_interest =  104
dwr_to_dnr =  99
dwr_to_dnr_percent =  0.95192
cyc_to_dnr =  2
cyc_to_dnr_percent =  0.019231
octave:27> sliced_charts
Enter matrix e.g. gcmatrix: edmatrix
points_of_interest =  108
dwr_to_dnr =  96
dwr_to_dnr_percent =  0.88889
cyc_to_dnr =  10
cyc_to_dnr_percent =  0.092593
octave:28> sliced_charts
Enter matrix e.g. gcmatrix: dxmatrix
points_of_interest =  116
dwr_to_dnr =  94
dwr_to_dnr_percent =  0.81034
cyc_to_dnr =  20
cyc_to_dnr_percent =  0.17241
octave:29> sliced_charts
Enter matrix e.g. gcmatrix: spmatrix
points_of_interest =  106
dwr_to_dnr =  85
dwr_to_dnr_percent =  0.80189
cyc_to_dnr =  19
cyc_to_dnr_percent =  0.17925
octave:30> sliced_charts
Enter matrix e.g. gcmatrix: esmatrix
points_of_interest =  109
dwr_to_dnr =  86
dwr_to_dnr_percent =  0.78899
cyc_to_dnr =  20
cyc_to_dnr_percent =  0.18349
octave:31> sliced_charts
Enter matrix e.g. gcmatrix: ndmatrix
points_of_interest =  107
dwr_to_dnr =  83
dwr_to_dnr_percent =  0.77570
cyc_to_dnr =  18
cyc_to_dnr_percent =  0.16822
octave:32> sliced_charts
Enter matrix e.g. gcmatrix: eurusdmatrix
points_of_interest =  72
dwr_to_dnr =  60
dwr_to_dnr_percent =  0.83333
cyc_to_dnr =  11
cyc_to_dnr_percent =  0.15278
octave:33> sliced_charts
Enter matrix e.g. gcmatrix: gbpusdmatrix
points_of_interest =  85
dwr_to_dnr =  72
dwr_to_dnr_percent =  0.84706
cyc_to_dnr =  11
cyc_to_dnr_percent =  0.12941
octave:34> sliced_charts
Enter matrix e.g. gcmatrix: usdchfmatrix
points_of_interest =  86
dwr_to_dnr =  69
dwr_to_dnr_percent =  0.80233
cyc_to_dnr =  14
cyc_to_dnr_percent =  0.16279
octave:35> sliced_charts
Enter matrix e.g. gcmatrix: usdyenmatrix
points_of_interest =  100
dwr_to_dnr =  80
dwr_to_dnr_percent =  0.80000
cyc_to_dnr =  16
cyc_to_dnr_percent =  0.16000
octave:36> sliced_charts
Enter matrix e.g. gcmatrix: eurchfmatrix
points_of_interest =  100
dwr_to_dnr =  83
dwr_to_dnr_percent =  0.83000
cyc_to_dnr =  15
cyc_to_dnr_percent =  0.15000
octave:37> sliced_charts
Enter matrix e.g. gcmatrix: eurgbpmatrix
points_of_interest =  91
dwr_to_dnr =  79
dwr_to_dnr_percent =  0.86813
cyc_to_dnr =  12
cyc_to_dnr_percent =  0.13187
octave:38> sliced_charts
Enter matrix e.g. gcmatrix: euryenmatrix
points_of_interest =  94
dwr_to_dnr =  75
dwr_to_dnr_percent =  0.79787
cyc_to_dnr =  15
cyc_to_dnr_percent =  0.15957
octave:39> sliced_charts
Enter matrix e.g. gcmatrix: eurausmatrix
points_of_interest =  99
dwr_to_dnr =  81
dwr_to_dnr_percent =  0.81818
cyc_to_dnr =  18
cyc_to_dnr_percent =  0.18182
octave:40> sliced_charts
Enter matrix e.g. gcmatrix: eurcadmatrix
points_of_interest =  92
dwr_to_dnr =  81
dwr_to_dnr_percent =  0.88043
cyc_to_dnr =  9
cyc_to_dnr_percent =  0.097826
octave:41> sliced_charts
Enter matrix e.g. gcmatrix: usdcadmatrix
points_of_interest =  97
dwr_to_dnr =  86
dwr_to_dnr_percent =  0.88660
cyc_to_dnr =  7
cyc_to_dnr_percent =  0.072165
octave:42> sliced_charts
Enter matrix e.g. gcmatrix: gbpchfmatrix
points_of_interest =  87
dwr_to_dnr =  70
dwr_to_dnr_percent =  0.80460
cyc_to_dnr =  14
cyc_to_dnr_percent =  0.16092
octave:43> sliced_charts
Enter matrix e.g. gcmatrix: gbpyenmatrix
points_of_interest =  85
dwr_to_dnr =  74
dwr_to_dnr_percent =  0.87059
cyc_to_dnr =  9
cyc_to_dnr_percent =  0.10588
octave:44> sliced_charts
Enter matrix e.g. gcmatrix: auscadmatrix
points_of_interest =  72
dwr_to_dnr =  59
dwr_to_dnr_percent =  0.81944
cyc_to_dnr =  13
cyc_to_dnr_percent =  0.18056
octave:45> sliced_charts
Enter matrix e.g. gcmatrix: aususdmatrix
points_of_interest =  74
dwr_to_dnr =  59
dwr_to_dnr_percent =  0.79730
cyc_to_dnr =  9
cyc_to_dnr_percent =  0.12162
octave:46> sliced_charts
Enter matrix e.g. gcmatrix: ausyenmatrix
points_of_interest =  65
dwr_to_dnr =  49
dwr_to_dnr_percent =  0.75385
cyc_to_dnr =  13
cyc_to_dnr_percent =  0.20000
But what does this all mean? For simplicity I shall discuss the "up with no retracement" case only, but everything applies equally, in reverse, to "down with no retracement."

My original concern was that the market classifications might have been erratically switching directly from "dnr" to "unr", with resultant lurches from short to long market positions, incurring whipsaw losses from false signals along the way. However, the high percentages from these simple tests show that this is not actually the case. Based on this, one can assume that
  • when the market classification changes to "unr" a long position will already be held because either
  1. a cyclic long position is held, having been initiated at the most recent low cyclic turn, or
  2. a "uwr" long position is held, having been initiated at the most recent rebound from a resistance/retracement level
  • which means that this market classification change is not necessarily a new entry signal, but rather a signal to change to a trend-following exit criterion
Therefore, my next test(s) will assume that such a long position is held when the market classification changes to "unr" and will continue to be held until a trend-following exit occurs, coupled with a "re-entry" signal if appropriate. My intent at the moment is simply to plot equity curves of these long positions, on the supposition that these equity curves will be "additions" or "extensions" to the equity curves generated by signals in other market modes. More details in a coming post.

Friday, 13 April 2012

My Naive Bayesian Classifier Sanity Check

Following on from my previous post, as part of my initial attempts to "come up with a robust rule set that combines all these disparate indicators into a coherent whole," I have written a simple Octave script to conduct a basic sanity check of my Naive Bayesian Classifier. The purpose of this is to ensure that the various classifications are not wildly fluctuating between bull and bear market modes with no discernible order. This particular test identifies which market modes are indicated immediately prior to an "up with no retracement" mode being indicated. This code box shows the Octave terminal output of the test script, which is named "sliced_charts".
octave:1> sliced_charts
Enter matrix e.g. gcmatrix: clmatrix
points_of_interest =  147
uwr_to_unr =  126
uwr_to_unr_percent =  0.85714
cyc_to_unr =  16
cyc_to_unr_percent =  0.10884
octave:2> sliced_charts
Enter matrix e.g. gcmatrix: homatrix
points_of_interest =  130
uwr_to_unr =  106
uwr_to_unr_percent =  0.81538
cyc_to_unr =  20
cyc_to_unr_percent =  0.15385
octave:3> sliced_charts
Enter matrix e.g. gcmatrix: ngmatrix
points_of_interest =  123
uwr_to_unr =  104
uwr_to_unr_percent =  0.84553
cyc_to_unr =  15
cyc_to_unr_percent =  0.12195
octave:4> sliced_charts
Enter matrix e.g. gcmatrix: rbmatrix
points_of_interest =  162
uwr_to_unr =  134
uwr_to_unr_percent =  0.82716
cyc_to_unr =  23
cyc_to_unr_percent =  0.14198
octave:5> sliced_charts
Enter matrix e.g. gcmatrix: ccmatrix
points_of_interest =  123
uwr_to_unr =  95
uwr_to_unr_percent =  0.77236
cyc_to_unr =  24
cyc_to_unr_percent =  0.19512
octave:6> sliced_charts
Enter matrix e.g. gcmatrix: kcmatrix
points_of_interest =  114
uwr_to_unr =  98
uwr_to_unr_percent =  0.85965
cyc_to_unr =  14
cyc_to_unr_percent =  0.12281
octave:7> sliced_charts
Enter matrix e.g. gcmatrix: ojmatrix
points_of_interest =  129
uwr_to_unr =  109
uwr_to_unr_percent =  0.84496
cyc_to_unr =  17
cyc_to_unr_percent =  0.13178
octave:8> sliced_charts
Enter matrix e.g. gcmatrix: sbmatrix
points_of_interest =  141
uwr_to_unr =  113
uwr_to_unr_percent =  0.80142
cyc_to_unr =  23
cyc_to_unr_percent =  0.16312
octave:9> sliced_charts
Enter matrix e.g. gcmatrix: ctmatrix
points_of_interest =  114
uwr_to_unr =  90
uwr_to_unr_percent =  0.78947
cyc_to_unr =  16
cyc_to_unr_percent =  0.14035
octave:10> sliced_charts
Enter matrix e.g. gcmatrix: lbmatrix
points_of_interest =  113
uwr_to_unr =  89
uwr_to_unr_percent =  0.78761
cyc_to_unr =  19
cyc_to_unr_percent =  0.16814
octave:11> sliced_charts
Enter matrix e.g. gcmatrix: hgmatrix
points_of_interest =  132
uwr_to_unr =  118
uwr_to_unr_percent =  0.89394
cyc_to_unr =  14
cyc_to_unr_percent =  0.10606
octave:12> sliced_charts
Enter matrix e.g. gcmatrix: smatrix
points_of_interest =  141
uwr_to_unr =  118
uwr_to_unr_percent =  0.83688
cyc_to_unr =  20
cyc_to_unr_percent =  0.14184
octave:13> sliced_charts
Enter matrix e.g. gcmatrix: smmatrix
points_of_interest =  140
uwr_to_unr =  110
uwr_to_unr_percent =  0.78571
cyc_to_unr =  29
cyc_to_unr_percent =  0.20714
octave:14> sliced_charts
Enter matrix e.g. gcmatrix: bomatrix
points_of_interest =  121
uwr_to_unr =  97
uwr_to_unr_percent =  0.80165
cyc_to_unr =  16
cyc_to_unr_percent =  0.13223
octave:15> sliced_charts
Enter matrix e.g. gcmatrix: cmatrix
points_of_interest =  128
uwr_to_unr =  103
uwr_to_unr_percent =  0.80469
cyc_to_unr =  23
cyc_to_unr_percent =  0.17969
octave:16> sliced_charts
Enter matrix e.g. gcmatrix: omatrix
points_of_interest =  128
uwr_to_unr =  104
uwr_to_unr_percent =  0.81250
cyc_to_unr =  20
cyc_to_unr_percent =  0.15625
octave:17> sliced_charts
Enter matrix e.g. gcmatrix: wmatrix
points_of_interest =  105
uwr_to_unr =  91
uwr_to_unr_percent =  0.86667
cyc_to_unr =  11
cyc_to_unr_percent =  0.10476
octave:18> sliced_charts
Enter matrix e.g. gcmatrix: lcmatrix
points_of_interest =  144
uwr_to_unr =  119
uwr_to_unr_percent =  0.82639
cyc_to_unr =  23
cyc_to_unr_percent =  0.15972
octave:19> sliced_charts
Enter matrix e.g. gcmatrix: fcmatrix
points_of_interest =  132
uwr_to_unr =  110
uwr_to_unr_percent =  0.83333
cyc_to_unr =  19
cyc_to_unr_percent =  0.14394
octave:20> sliced_charts
Enter matrix e.g. gcmatrix: lhmatrix
points_of_interest =  136
uwr_to_unr =  113
uwr_to_unr_percent =  0.83088
cyc_to_unr =  21
cyc_to_unr_percent =  0.15441
octave:21> sliced_charts
Enter matrix e.g. gcmatrix: gcmatrix
points_of_interest =  137
uwr_to_unr =  118
uwr_to_unr_percent =  0.86131
cyc_to_unr =  16
cyc_to_unr_percent =  0.11679
octave:22> sliced_charts
Enter matrix e.g. gcmatrix: simatrix
points_of_interest =  121
uwr_to_unr =  102
uwr_to_unr_percent =  0.84298
cyc_to_unr =  15
cyc_to_unr_percent =  0.12397
octave:23> sliced_charts
Enter matrix e.g. gcmatrix: plmatrix
points_of_interest =  149
uwr_to_unr =  126
uwr_to_unr_percent =  0.84564
cyc_to_unr =  18
cyc_to_unr_percent =  0.12081
octave:24> sliced_charts
Enter matrix e.g. gcmatrix: pamatrix
points_of_interest =  129
uwr_to_unr =  103
uwr_to_unr_percent =  0.79845
cyc_to_unr =  22
cyc_to_unr_percent =  0.17054
octave:25> sliced_charts
Enter matrix e.g. gcmatrix: usmatrix
points_of_interest =  143
uwr_to_unr =  122
uwr_to_unr_percent =  0.85315
cyc_to_unr =  16
cyc_to_unr_percent =  0.11189
octave:26> sliced_charts
Enter matrix e.g. gcmatrix: tymatrix
points_of_interest =  148
uwr_to_unr =  128
uwr_to_unr_percent =  0.86486
cyc_to_unr =  17
cyc_to_unr_percent =  0.11486
octave:27> sliced_charts
Enter matrix e.g. gcmatrix: edmatrix
points_of_interest =  150
uwr_to_unr =  118
uwr_to_unr_percent =  0.78667
cyc_to_unr =  26
cyc_to_unr_percent =  0.17333
octave:28> sliced_charts
Enter matrix e.g. gcmatrix: dxmatrix
points_of_interest =  141
uwr_to_unr =  122
uwr_to_unr_percent =  0.86525
cyc_to_unr =  14
cyc_to_unr_percent =  0.099291
octave:29> sliced_charts
Enter matrix e.g. gcmatrix: spmatrix
points_of_interest =  158
uwr_to_unr =  123
uwr_to_unr_percent =  0.77848
cyc_to_unr =  31
cyc_to_unr_percent =  0.19620
octave:30> sliced_charts
Enter matrix e.g. gcmatrix: esmatrix
points_of_interest =  160
uwr_to_unr =  125
uwr_to_unr_percent =  0.78125
cyc_to_unr =  30
cyc_to_unr_percent =  0.18750
octave:31> sliced_charts
Enter matrix e.g. gcmatrix: ndmatrix
points_of_interest =  129
uwr_to_unr =  116
uwr_to_unr_percent =  0.89922
cyc_to_unr =  12
cyc_to_unr_percent =  0.093023
octave:32> sliced_charts
Enter matrix e.g. gcmatrix: eurusdmatrix
points_of_interest =  79
uwr_to_unr =  69
uwr_to_unr_percent =  0.87342
cyc_to_unr =  9
cyc_to_unr_percent =  0.11392
octave:33> sliced_charts
Enter matrix e.g. gcmatrix: gbpusdmatrix
points_of_interest =  84
uwr_to_unr =  72
uwr_to_unr_percent =  0.85714
cyc_to_unr =  11
cyc_to_unr_percent =  0.13095
octave:34> sliced_charts
Enter matrix e.g. gcmatrix: usdchfmatrix
points_of_interest =  80
uwr_to_unr =  73
uwr_to_unr_percent =  0.91250
cyc_to_unr =  6
cyc_to_unr_percent =  0.075000
octave:35> sliced_charts
Enter matrix e.g. gcmatrix: usdyenmatrix
points_of_interest =  74
uwr_to_unr =  62
uwr_to_unr_percent =  0.83784
cyc_to_unr =  8
cyc_to_unr_percent =  0.10811
octave:36> sliced_charts
Enter matrix e.g. gcmatrix: eurchfmatrix
points_of_interest =  93
uwr_to_unr =  80
uwr_to_unr_percent =  0.86022
cyc_to_unr =  10
cyc_to_unr_percent =  0.10753
octave:37> sliced_charts
Enter matrix e.g. gcmatrix: eurgbpmatrix
points_of_interest =  91
uwr_to_unr =  78
uwr_to_unr_percent =  0.85714
cyc_to_unr =  11
cyc_to_unr_percent =  0.12088
octave:38> sliced_charts
Enter matrix e.g. gcmatrix: euryenmatrix
points_of_interest =  103
uwr_to_unr =  88
uwr_to_unr_percent =  0.85437
cyc_to_unr =  14
cyc_to_unr_percent =  0.13592
octave:39> sliced_charts
Enter matrix e.g. gcmatrix: eurausmatrix
points_of_interest =  73
uwr_to_unr =  64
uwr_to_unr_percent =  0.87671
cyc_to_unr =  9
cyc_to_unr_percent =  0.12329
octave:40> sliced_charts
Enter matrix e.g. gcmatrix: eurcadmatrix
points_of_interest =  76
uwr_to_unr =  56
uwr_to_unr_percent =  0.73684
cyc_to_unr =  17
cyc_to_unr_percent =  0.22368
octave:41> sliced_charts
Enter matrix e.g. gcmatrix: usdcadmatrix
points_of_interest =  73
uwr_to_unr =  60
uwr_to_unr_percent =  0.82192
cyc_to_unr =  13
cyc_to_unr_percent =  0.17808
octave:42> sliced_charts
Enter matrix e.g. gcmatrix: gbpchfmatrix
points_of_interest =  98
uwr_to_unr =  84
uwr_to_unr_percent =  0.85714
cyc_to_unr =  12
cyc_to_unr_percent =  0.12245
octave:43> sliced_charts
Enter matrix e.g. gcmatrix: gbpyenmatrix
points_of_interest =  97
uwr_to_unr =  84
uwr_to_unr_percent =  0.86598
cyc_to_unr =  11
cyc_to_unr_percent =  0.11340
octave:44> sliced_charts
Enter matrix e.g. gcmatrix: auscadmatrix
points_of_interest =  107
uwr_to_unr =  92
uwr_to_unr_percent =  0.85981
cyc_to_unr =  14
cyc_to_unr_percent =  0.13084
octave:45> sliced_charts
Enter matrix e.g. gcmatrix: aususdmatrix
points_of_interest =  97
uwr_to_unr =  82
uwr_to_unr_percent =  0.84536
cyc_to_unr =  14
cyc_to_unr_percent =  0.14433
octave:46> sliced_charts
Enter matrix e.g. gcmatrix: ausyenmatrix
points_of_interest =  103
uwr_to_unr =  89
uwr_to_unr_percent =  0.86408
cyc_to_unr =  11
cyc_to_unr_percent =  0.10680
  • The clmatrix indicates which market is being tested, e.g. crude oil (cl)
  • points_of_interest is the number of times the market mode changes to "up with no retracement" from a previously different classification
  • uwr_to_unr is the number of times such a change is from "up with retracement" to "up with no retracement"
  • cyc_to_unr is similarly from "cyclic" to "up with no retracement"
  • the *_percent values are the previous two expressed as a proportion of the points of interest (despite the name, these are fractions, not percentages)
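The counting logic behind these outputs can be sketched as follows. This is a hypothetical Python re-implementation of the idea, not the actual sliced_charts script, and the function and argument names are mine:

```python
def transition_stats(modes, target="unr", from_a="uwr", from_b="cyc"):
    """Count changes of classification into `target` from any other mode
    (the "points of interest") and report what fraction of those changes
    came from `from_a` and `from_b` respectively."""
    points_of_interest = a_count = b_count = 0
    for prev, curr in zip(modes, modes[1:]):
        if curr == target and prev != target:
            points_of_interest += 1
            if prev == from_a:
                a_count += 1
            elif prev == from_b:
                b_count += 1
    if points_of_interest == 0:
        return 0, 0.0, 0.0
    return (points_of_interest,
            a_count / points_of_interest,
            b_count / points_of_interest)

# toy sequence of daily market-mode classifications
modes = ["cyc", "uwr", "unr", "unr", "uwr", "unr", "cyc", "unr"]
poi, frac_uwr, frac_cyc = transition_stats(modes)
```

For the toy sequence above there are three changes into "unr": two from "uwr" and one from "cyc", matching the proportions the real script reports per market.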
It is pleasing to see such high percentages overall. I will perform a similar analysis for "down with retracement" and discuss the significance of this in my next post.

Friday, 30 March 2012

Kalman Filter Octave Coding Completed

I am pleased to say that the first phase of my Kalman filter coding, namely writing Octave code, is now complete. In doing so I have used/adapted code from the MATLAB toolbox available here. The second phase of coding, at some future date, will be to convert this code into a C++ .oct function. My code is a stripped-down version of the 2D CWPA (continuous Wiener process acceleration) demo, which models price as a moving object with position, velocity and acceleration, and which is described in detail, along with my model assumptions, below.

The first thing I had to decide was what to actually model, and I decided on VWAP. The framework of the Kalman filter is that it tracks an underlying process that is not necessarily directly observable but for which measurements are available. VWAP calculated from OHLC bars fits this framework nicely. If one had access to high-frequency intraday tick data the VWAP could be calculated exactly, but since the only information available for my purposes is the daily OHLC, the daily OHLC approximation of VWAP is the observable measurement of the "unobservable" exact VWAP.

The next thing I considered was the measurement noise of the filter. Some algebraic manipulation of the VWAP approximation formula (see here) led me to choose two thirds (or 0.666) of the Hi-Lo range of the bar as the measurement noise associated with any single VWAP approximation, this being the maximum possible range of values that the VWAP can take given a bar's OHLC values.
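As an illustration, the VWAP approximation and the per-bar measurement noise can be written as follows. This is a Python sketch mirroring the calculation in the Octave script later in this post; the function names are mine:

```python
def vwap_approx(o, h, l, c, tick):
    """OHLC approximation of VWAP, rounded to the nearest tick:
    (O + C + (H + L) / 2) / 3, as in the Octave script."""
    raw = (o + c + (h + l) / 2.0) / 3.0
    return round(raw / tick) * tick

def measurement_noise(h, l):
    """Per-bar measurement noise: two thirds (0.666) of the Hi-Lo
    range, the maximum range of values the VWAP approximation can
    take given a bar's OHLC values."""
    return 0.666 * (h - l)
```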

Finally, for the process noise I employed a simple heuristic: half of the bar-to-bar variation in successive VWAPs is treated as noise, the other half being attributed to the underlying process itself.
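This heuristic, together with the robust MAD-based variance estimate described next, can be sketched as follows. This is an illustrative Python version of the calculation performed in the Octave script; the helper names are mine:

```python
def _median(xs):
    """Plain median of a list of numbers."""
    xs = sorted(xs)
    n = len(xs)
    mid = n // 2
    return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2.0

def mad_sigma(values):
    """Robust standard-deviation estimate: 1.4826 * median absolute
    deviation, the consistency factor for normally distributed data."""
    med = _median(values)
    return 1.4826 * _median([abs(v - med) for v in values])

def process_noise_variance(vwap):
    """Half the bar-to-bar change in successive VWAPs is treated as
    noise; its robust variance seeds the filter's process noise."""
    noise = [(b - a) / 2.0 for a, b in zip(vwap, vwap[1:])]
    sigma = mad_sigma(noise)
    return sigma * sigma
```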

Having decided on the above, the next step was to initialise the filter covariances. To do this I decided to use the Median Absolute Deviation (MAD) of the noise processes as a consistent estimator of the standard deviation, applying the scale factor of 1.4826 appropriate for normally distributed data (the Kalman filter assumes Gaussian noise) to calculate the noise variances (see this wiki for more details). I had a concern about "look-ahead bias" with this approach, but a simple test dispelled these fears. This code box

   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9
   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4
   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0
   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3
   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2
   1298.8   1298.7   1298.7   1298.8   1298.7   1298.7   1298.7   1298.7
   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0
   1306.1   1306.2   1306.2   1306.1   1306.2   1306.2   1306.2   1306.2
   1304.9   1305.0   1305.0   1304.9   1305.0   1305.0   1305.0   1305.0
   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3
   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0
   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1
   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3
   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3
   1306.5   1306.5   1306.5   1306.5   1306.4   1306.4   1306.4   1306.4
   1314.6   1314.5   1314.5   1314.6   1314.5   1314.5   1314.5   1314.5
   1325.1   1325.0   1325.0   1325.1   1325.0   1325.0   1325.0   1325.0
   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7
   1336.7   1336.8   1336.8   1336.7   1336.8   1336.8   1336.8   1336.8
   1339.7   1339.8   1339.8   1339.7   1339.8   1339.8   1339.8   1339.8
   1341.6   1341.7   1341.7   1341.6   1341.7   1341.7   1341.7   1341.7
   1338.3   1338.4   1338.4   1338.3   1338.4   1338.4   1338.4   1338.4
   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6
   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1
   1340.4   1340.4   1340.4   1340.4   1340.3   1340.3   1340.3   1340.3
   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3
   1349.7   1349.7   1349.7   1349.7   1349.6   1349.6   1349.6   1349.6
   1357.6   1357.6   1357.6   1357.6   1357.5   1357.5   1357.5   1357.5
   1355.2   1355.3   1355.3   1355.2   1355.3   1355.3   1355.3   1355.3
   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6
   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6
   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2
   1362.8   1362.7   1362.7   1362.8   1362.7   1362.7   1362.7   1362.7
   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7
   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6
   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1
   1360.8   1360.9   1360.9   1360.8   1360.9   1360.9   1360.9   1360.9
   1348.8   1348.9   1348.9   1348.8   1348.9   1348.9   1348.9   1348.9
   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8
   1349.0   1348.9   1348.9   1349.0   1348.9   1348.9   1348.9   1348.9
   1361.7   1361.6   1361.6   1361.7   1361.5   1361.5   1361.5   1361.5
   1368.0   1368.0   1368.0   1368.0   1367.9   1367.9   1367.9   1367.9
   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2
   1390.3   1390.4   1390.4   1390.3   1390.4   1390.4   1390.4   1390.4
   1394.1   1394.2   1394.2   1394.1   1394.2   1394.2   1394.2   1394.2
   1397.7   1397.8   1397.8   1397.7   1397.8   1397.8   1397.8   1397.8
   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6
   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8
   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2
   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2
   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3
shows the last 50 values of the Kalman filter with different amounts of data used for the calculations for the initialisation of the filter. The leftmost column shows filter values using all available data for initialisation, the next all data except the most recent 50 values, then all data except the most recent 100 values etc., with the rightmost column being calculated using all data except for the most recent 350 values. This last column is akin to using the data through to the end of 2010 and nothing after that date. Comparison between the leftmost and rightmost columns shows virtually insignificant differences. If one were to begin trading the right hand edge of the chart today, initialisation would be done using all available data. If one then traded for the next one and a half years and then re-initialised the filter using all this "new" data, there would be no practical difference in the filter values over this one and a half year period. So, although there may be "look-ahead bias," frankly it doesn't matter. Such is the power of robust statistics and the recursive calculations of the Kalman filter combined!

This next code box shows my Octave code for the Kalman filter
data = load("-ascii","esmatrix") ;
tick = 0.25 ;

n = length(data(:,4))
finish = input('enter finish, no greater than n  ')

if ( finish > length(data(:,4)) )
   finish = 0 % i.e. all available data is used
end

open = data(:,4) ;
high = data(:,5) ;
low = data(:,6) ;
close = data(:,7) ;
market_type = data(:,230) ;

clear data

vwap = round( ( ( open + close + ( (high + low) ./ 2 ) ) ./ 3 ) ./ tick) .* tick ;
vwap_process_noise = ( vwap - shift(vwap,1) ) ./ 2.0 ;
median_vwap_process_noise = median(vwap_process_noise(2:end-finish,1)) ;
vwap_process_noise_deviations = vwap_process_noise(2:end-finish,1) - median_vwap_process_noise ;
MAD_process_noise = median( abs( vwap_process_noise_deviations ) ) ;

% convert this to variance under the assumption of a normal distribution
std_vwap_noise = 1.4826 * MAD_process_noise ;
process_noise_variance = std_vwap_noise * std_vwap_noise 

measurement_noise = 0.666 .* ( high - low ) ;
median_measurement_noise = median( measurement_noise(1:end-finish,1) ) ;
measurement_noise_deviations = measurement_noise(1:end-finish,1) - median_measurement_noise ;
MAD_measurement_noise = median( abs( measurement_noise_deviations ) ) ;

% convert this to variance under the assumption of a normal distribution
std_measurement_noise = 1.4826 * MAD_measurement_noise ;
measurement_noise_variance = std_measurement_noise * std_measurement_noise

% Transition matrix for the continuous-time system.
F = [0 0 1 0 0 0;
     0 0 0 1 0 0;
     0 0 0 0 1 0;
     0 0 0 0 0 1;
     0 0 0 0 0 0;
     0 0 0 0 0 0];

% Noise effect matrix for the continuous-time system.
L = [0 0;
     0 0;
     0 0;
     0 0;
     1 0;
     0 1];

% Process noise variance
q = process_noise_variance ;
Qc = diag([q q]);

% Discretisation of the continuous-time system.
[A,Q] = lti_disc(F,L,Qc,1); % last item is dt stepsize set to 1

% Measurement model.
H = [1 0 0 0 0 0;
     0 1 0 0 0 0];

% Variance in the measurements.
r1 = measurement_noise_variance ;
R = diag([r1 r1]);

% Initial guesses for the state mean and covariance.
m = [0 vwap(1,1) 0 0 0 0]';
P = diag([0.1 0.1 0.1 0.1 0.5 0.5]) ;

% Space for the estimates.
MM = zeros(size(m,1), length(vwap));

% create vectors for eventual plotting
predict_plot = zeros(length(vwap),1) ;
MM_plot = zeros(length(vwap),1) ;
sigmaP_plus = zeros(length(vwap),1) ;
sigmaP_minus = zeros(length(vwap),1) ;

% Filtering steps.
for ii = 1:length(vwap)
   [m,P] = kf_predict(m,P,A,Q);

   predict_plot(ii,1) = m(2,1) ;

   [m,P] = kf_update(m,P,vwap(ii,:),H,R);
   MM(:,ii) = m;

   MM_plot(ii,1) = m(2,1) ;

   % sigmaP is for storing the current error covariance for plotting purposes
   sigmaP = sqrt(diag(P)) ; 
   sigmaP_plus(ii,1) = MM_plot(ii,1) + 2 * sigmaP(1) ;
   sigmaP_minus(ii,1) = MM_plot(ii,1) - 2 * sigmaP(1) ;
end

% output in terminal for checking purposes
% NB: kalman_last_50 must already exist in the workspace before the first
% run, e.g. initialise it with kalman_last_50 = [] ; each subsequent run
% of this script then appends its final filter values as a new column
kalman_last_50 = [kalman_last_50,MM_plot(end-50:end,1)] 

% output for plotting in Gnuplot
x_axis = ( 1:length(vwap) )' ;
A = [x_axis,open,high,low,close,vwap,MM_plot,sigmaP_plus,sigmaP_minus,predict_plot,market_type] ;
dlmwrite('my_cosy_kalman_plot',A)
Note that this code calls three functions (lti_disc, kf_predict and kf_update) which are part of the above-mentioned MATLAB toolbox. Readers who wish to replicate my results will have to download said toolbox and place these functions where they can be called by this script.
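For readers without the toolbox, the predict/update cycle that kf_predict and kf_update implement can be illustrated with a minimal scalar Kalman filter. This is a simplified 1-state random-walk sketch in Python, not the 6-state CWPA model used above:

```python
def kalman_1d(measurements, q, r, m0, p0):
    """Minimal 1-state Kalman filter tracking a random-walk state with
    direct measurement. q = process noise variance, r = measurement
    noise variance, m0/p0 = initial state mean and covariance."""
    m, p = m0, p0
    estimates = []
    for z in measurements:
        # predict step (state transition is identity for a random walk)
        p = p + q
        # update step: blend prediction and measurement by the Kalman gain
        k = p / (p + r)
        m = m + k * (z - m)
        p = (1.0 - k) * p
        estimates.append(m)
    return estimates
```

With q = 0 the filter simply averages the measurements ever more confidently; with larger q it tracks new measurements more aggressively, which is the trade-off the MAD-based variance estimates above are tuning.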

Below is a screen shot of my Kalman filter in action.
This shows the S&P E-mini contract (daily bars) up to a week or so ago. The white line is the Kalman filter, the dotted white lines are the plus and minus 2-sigma levels taken from the covariance matrix, and the red and light blue triangles show the output of the kf_predict function, prior to being updated by the kf_update function, but only shown if above (red) or below (blue) the 2-sigma level. As can be seen, while price is obviously trending most points are within these levels. The colour coding of the bars is based upon the market type as determined by my Naive Bayesian Classifier, Mark 2.
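The triangle-plotting rule described above (show the raw prediction only when it breaches the 2-sigma band) can be sketched as follows; this is illustrative Python with hypothetical names, not the actual plotting code:

```python
def flag_outliers(predictions, filtered, sigma):
    """For each bar, flag where the raw kf_predict value lies relative
    to the filtered estimate's +/- 2 sigma band:
    +1 above the band (red triangle), -1 below (blue), 0 inside."""
    flags = []
    for pred, m, s in zip(predictions, filtered, sigma):
        if pred > m + 2.0 * s:
            flags.append(1)
        elif pred < m - 2.0 * s:
            flags.append(-1)
        else:
            flags.append(0)
    return flags
```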

This next screen shot
shows price bars immediately prior to the first screen shot, where price is certainly not trending, and it is interesting to note that the kf_predict triangles now appear at the turns in price. This suggests that the kf_predict output might serve as a complementary indicator to my Perfect Oscillator function
and Delta
along with my stable of other turn indicators. The next thing I will have to do is come up with a robust rule set that combines all these disparate indicators into a coherent whole. I am also now going to use the Kalman filter output as the input to all my other indicators. Up till now I have been using the typical price, (High+Low+Close)/3, as my input, but I think the Kalman filtered VWAP for "today's" price action is a much more meaningful price input than "tomorrow's" pivot point!

Tuesday, 23 August 2011

Naive Bayes Classifier, Mark 2.

It has taken some time, but I have finally been able to incorporate the Trend Vigor indicator into my Naive Bayesian classifier, but with a slight twist. Instead of being purely Bayesian, the classifier has evolved into a hybrid Bayesian/clustering classifier. The reason for this is that the Trend Vigor indicator has no varying distribution of values but tends to return values that are so close to each other that they can be considered a single value, as mentioned in an earlier post of mine. This can be clearly seen in the short 3D visualisation animation below. The x, y and z axes each represent an input to the classifier, and about 7 seconds into the video you can see the Trend Vigor axis in the foreground with almost straight vertical lines for its "distributions" for each market type. However, it can also be seen that there are spaces in 3D where only combined values for one specific market type appear, particularly evident in the "tails" of the no retracement markets ( the outermost blue and magenta distributions in the video. )



( Non embedded view here )

The revised version of the classifier takes advantage of this fact. Through a series of conditional statements, each 3D datum point is checked to see whether it falls in any of these mutually exclusive spaces and, if it does, it is classified as belonging to the market type that has "ownership" of the space in which it lies. If the point cannot be classified via this simple form of clustering, it is assigned a market type through Bayesian analysis.

This Bayesian analysis has also been revised to take into account the value of the Trend Vigor indicator. Since these values have no distribution to speak of, a simple linear model is used. If a point is equidistant between two Trend Vigor classifications it is assigned a 0.5 probability of belonging to each, this probability rising linearly to 1.0 if the point falls exactly on one of the vertical lines mentioned above, with a corresponding decrease in the probability assigned to the other market type classification. There is also a boundary condition, beyond which the probability of belonging to a particular market type is set to 0.0.
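A minimal sketch of this linear model, assuming two adjacent cluster centre values c1 < c2 (the centre values themselves come from the Monte Carlo testing described in an earlier post; the boundary handling here is my own simplification):

```python
def tv_probabilities(x, c1, c2):
    """Linear membership probabilities between two adjacent Trend Vigor
    cluster centres c1 < c2. Returns (p_c1, p_c2)."""
    if x <= c1:
        return 1.0, 0.0      # at or beyond a centre line: full membership
    if x >= c2:
        return 0.0, 1.0
    # probability rises linearly from 0.5 at the midpoint to 1.0 on a centre line
    p2 = (x - c1) / (c2 - c1)
    return 1.0 - p2, p2
```

For example, a Trend Vigor value exactly halfway between the uwr and unr centres gets a 0.5 probability of belonging to each.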

The proof of the pudding is in the eating, and this next chart shows the classification error rate when the classifier is applied to my usual "ideal" time series.


The y axis is the percentage of ideal time series runs in which the market type was mis-classified, and the x axis is the period of the cyclic component of the time series being tested. In this test I am only concerned with the results for periods greater than 10, as in real data I have never seen extracted periods shorter than this. As can be seen, the sideways market and both the up and down with no retracement markets have zero mis-classification rates, apart from a small blip at period 12, which is within the 5% mis-classification error rate I had set as my target earlier.

Of more concern was the apparent large mis-classification error rate of the retracement markets ( the green and black lines in the chart. ) However, further investigation of these errors revealed them not to be "errors" as such but more a quirk of the classifier, which lends itself to exploitation. Almost all of the "errors" occur consecutively at the same phase of the cyclic component, at all periods, and the "error" appears in the same direction. By this I mean that if the true market type is up with retracement, the "error" indicates an up with no retracement market; if the true market type is down with retracement, the "error" indicates a down with no retracement market. The two charts below show this visually for both the up and down with retracement markets and are typical representations of the "error" being discussed.


The first pane in each chart shows one complete cycle in which the whole cycle, including the most recent datum point, is correctly classified as being an up with retracement market ( upper chart ) and a down with retracement market ( lower chart. ) The second pane shows a snapshot of the cycle after it has progressed in time through its phase, with the last point being the last point that is mis-classified. The "difference" between each chart's respective two panes at the right hand edge shows the portion of the time series that is mis-classified.

It can be seen that the mis-classification occurs at the end of the retracement, immediately prior to the actual turn. This behaviour could easily be exploited via a trading rule. For example, assume that the market has been classified as an up with retracement market and a retracement short trade has been taken. As the retracement proceeds our trade moves into profit, but then the market classification changes to up with no retracement. Remember that the classifier (never?) mis-classifies such no retracement markets. What would one want to do in such a situation? Obviously one would want to exit the current short trade and go long, and in so doing would be exiting the short and initiating the possible long at precisely the right time: just before the market turns upwards! This mis-classification "error" could, on real data, turn out to be very serendipitous.
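As a sketch of the rule just described (the state labels and the position handling are my own illustrative assumptions, not code from the actual system):

```python
def on_new_classification(prev_mt, new_mt, position):
    """Reversal rule for the retracement trade described above.
    prev_mt/new_mt are market types: 'uwr', 'unr', 'dwr', 'dnr' or 'cyc'.
    position is 'short', 'long' or None."""
    # short a retracement in an uptrend: a flip to "no retracement"
    # marks the end of the retracement, so reverse to long
    if position == "short" and prev_mt == "uwr" and new_mt == "unr":
        return "exit_short_and_go_long"
    # mirror image for a long retracement trade in a downtrend
    if position == "long" and prev_mt == "dwr" and new_mt == "dnr":
        return "exit_long_and_go_short"
    return "hold"
```

The point is simply that the classification change itself becomes the exit-and-reverse signal.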

All in all, I think this revised, Mark 2 version of my market classifier is a marked improvement on its predecessor.

Friday, 29 July 2011

Update on the Trend Vigor indicator

I have now completed some preliminary Monte Carlo testing of the Trend Vigor indicator and I thought this would be a good opportunity to share with readers the general approach I have been taking when talking about my Monte Carlo testing on predefined, "ideal" market types.

Firstly, I create my "ideal" market types by using an Octave .oct function, the code for which is given below. In this code can also be seen my implementation of the Trend Vigor indicator, which I think is slightly different from Ehler's. The code is commented, so no further description is required here.

.oct function code
// This function takes as arguments a single value input for a sinewave period and a single value input for a degrees increment value. A sinewave of the 
// given period is created and then trends are added to the sinewave such that 5 hypothesised "ideal" market types are created: these are

// 1) a perfectly cyclic sideways market with no trend i.e. just the sinewave component
// 2) an uptrending market with cyclic retracements (uwr) such that retracements are down to the 50% Fibonacci ratio level
// 3) an uptrending market with no retracements (unr) i.e. the uptrend completely swamps the downward cyclic component counter to the trend
// 4) a downtrending market with cyclic retracements (dwr) such that retracements are up to the 50% Fibonacci ratio level
// 5) a downtrending market with no retracements (dnr) i.e. the downtrend completely swamps the upward cyclic component counter to the trend

// The cyclic component of these markets is then extracted using the bandpass indicator. These vector lengths are 500 in order to allow time for the 
// bandpass calculations to settle down. The peak to peak amplitude is then recovered from the bandpass at the end of each market vector. The trend slope 
// is also calculated at the end of each market vector. The trend vigor indicator value is then calculated thus
// trend_slope/p_to_p. The bandpass delta is set to 0.2.

// The idea is that 
// for a sideways market the value should be about 0
// for a trending with retracement market the value should be > 0 && < 1 or < 0 && > -1, depending on whether it is an up trend or a down trend
// for a trending with no retracement market the value should be > 1 or < -1, depending on whether it is an up trend or a down trend

// The original sinewave is then repeatedly phase shifted by the degrees increment value, the above process repeated and new values are calculated. 
// All the calculated values for each market type are the vector outputs of the function. 

#include <octave/oct.h>
#include <octave/dColVector.h>
#include <math.h>
#define PI 3.14159265

DEFUN_DLD (trend_vigor_dist, args, , "Inputs are period & degrees increment value, outputs are vectors of trend vigor values for each market type")
{
    octave_value_list retval_list;

    if (args(0).length () != 1)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }

    if (args(1).length () != 1)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }

    if (error_state)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }
    // read inputs
    int period = args(0).int_value ();
    double degrees_inc = args(1).double_value ();
    double period_inc = 360.0 / double(period);
    int length = period + 1; // the length of the "lookback". Is equal to period + 1 to encompass a full period

    // vectors to hold created market values
    ColumnVector sideways_vec(500); // vector to hold sideways market values
    ColumnVector sideways_bandpass_vec(500); // vector to hold bandpass of sideways market values

    ColumnVector uwr_vec(500); // vector to hold uwr market values
    ColumnVector uwr_bandpass_vec(500); // vector to hold bandpass of uwr market values
 
    ColumnVector unr_vec(500); // vector to hold unr market values
    ColumnVector unr_bandpass_vec(500); // vector to hold bandpass of unr market values

    ColumnVector dwr_vec(500); // vector to hold dwr market values
    ColumnVector dwr_bandpass_vec(500); // vector to hold bandpass of dwr market values
 
    ColumnVector dnr_vec(500); // vector to hold dnr market values
    ColumnVector dnr_bandpass_vec(500); // vector to hold bandpass of dnr market values

    // calculate the market trend_incs
    double uwr_trend_inc = 12 / ( 5 * double(period) );
    double unr_trend_inc = 4 / double(period);
    double dwr_trend_inc = -( 12 / ( 5 * double(period) ) );
    double dnr_trend_inc = -( 4 / double(period) );

    // declare variables for bandpass and trend vigor calculations
    double delta = 0.2;
    double beta = cos( (360.0/period)*PI/180.0 );
    double gamma = 1.0 / cos( (720.0*delta/period)*PI/180.0 );
    double alpha = gamma - sqrt(gamma*gamma - 1.0); 
    double power_side;
    double power_uwr;
    double power_unr;
    double power_dwr;
    double power_dnr;
    double rms;
    double p_to_p;

    // create output vectors
    int output_vec_length = int ( 360 / degrees_inc );
    ColumnVector sideways_dist ( output_vec_length ); // create output column of correct length for sideways market
    ColumnVector uwr_dist  ( output_vec_length ); // create output column of correct length for uwr market
    ColumnVector unr_dist  ( output_vec_length ); // create output column of correct length for unr market
    ColumnVector dwr_dist  ( output_vec_length ); // create output column of correct length for dwr market
    ColumnVector dnr_dist  ( output_vec_length ); // create output column of correct length for dnr market

    for (octave_idx_type ii (0); ii < output_vec_length; ii++) 
    {

        // Create the market types and their bandpasses for this ii iteration
        for (octave_idx_type jj (0); jj < 500; jj++) 
        {

        // First create the sideways market type
        sideways_vec(jj) = sin( (degrees_inc*ii + period_inc*jj) * PI / 180 );
                
                if ( jj < 2 )
                {
                sideways_bandpass_vec(jj) = 0;
                }
                else
                {
                sideways_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(sideways_vec(jj) - sideways_vec(jj-2)) + beta*(1.0 + alpha)*sideways_bandpass_vec(jj-1) - alpha*sideways_bandpass_vec(jj-2);
                }

        // next, the up with retracement market (uwr)
        uwr_vec(jj) = sideways_vec(jj) + jj*uwr_trend_inc;

                if ( jj < 2 )
                {
                uwr_bandpass_vec(jj) = 0;
                }
                else
                {
                uwr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(uwr_vec(jj) - uwr_vec(jj-2)) + beta*(1.0 + alpha)*uwr_bandpass_vec(jj-1) - alpha*uwr_bandpass_vec(jj-2);
                }

        // next, the up with no retracement market (unr)
        unr_vec(jj) = sideways_vec(jj) + jj*unr_trend_inc;

                if ( jj < 2 )
                {
                unr_bandpass_vec(jj) = 0;
                }
                else
                {
                unr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(unr_vec(jj) - unr_vec(jj-2)) + beta*(1.0 + alpha)*unr_bandpass_vec(jj-1) - alpha*unr_bandpass_vec(jj-2);
                }

        // next, the down with retracement market (dwr)
        dwr_vec(jj) = sideways_vec(jj) + jj*dwr_trend_inc;

                if ( jj < 2 )
                {
                dwr_bandpass_vec(jj) = 0;
                }
                else
                {
                dwr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(dwr_vec(jj) - dwr_vec(jj-2)) + beta*(1.0 + alpha)*dwr_bandpass_vec(jj-1) - alpha*dwr_bandpass_vec(jj-2);
                }

        // next, the down with no retracement market (dnr)
        dnr_vec(jj) = sideways_vec(jj) + jj*dnr_trend_inc;

                if ( jj < 2 )
                {
                dnr_bandpass_vec(jj) = 0;
                }
                else
                {
                dnr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(dnr_vec(jj) - dnr_vec(jj-2)) + beta*(1.0 + alpha)*dnr_bandpass_vec(jj-1) - alpha*dnr_bandpass_vec(jj-2);
                }

        } // end of jj loop to create the different markets and their bandpasses

    // now loop over end of each market vector to create the distributions

     power_side = 0.0;
     power_uwr = 0.0;
     power_unr = 0.0;
     power_dwr = 0.0;
     power_dnr = 0.0;

     for (octave_idx_type jj (0); jj < length; jj++)
         {
         power_side = power_side + sideways_bandpass_vec(499-jj)*sideways_bandpass_vec(499-jj) + sideways_bandpass_vec(499-jj-int(period/4.0))*sideways_bandpass_vec(499-jj-int(period/4.0)) ;
         power_uwr = power_uwr + uwr_bandpass_vec(499-jj)*uwr_bandpass_vec(499-jj) + uwr_bandpass_vec(499-jj-int(period/4.0))*uwr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_unr = power_unr + unr_bandpass_vec(499-jj)*unr_bandpass_vec(499-jj) + unr_bandpass_vec(499-jj-int(period/4.0))*unr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_dwr = power_dwr + dwr_bandpass_vec(499-jj)*dwr_bandpass_vec(499-jj) + dwr_bandpass_vec(499-jj-int(period/4.0))*dwr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_dnr = power_dnr + dnr_bandpass_vec(499-jj)*dnr_bandpass_vec(499-jj) + dnr_bandpass_vec(499-jj-int(period/4.0))*dnr_bandpass_vec(499-jj-int(period/4.0)) ;
         }

     // fill the distribution vectors
     rms = sqrt( power_side / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     sideways_dist(ii) = ( sideways_vec(499) - sideways_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_uwr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     uwr_dist(ii) = ( uwr_vec(499) - uwr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_unr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     unr_dist(ii) = ( unr_vec(499) - unr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_dwr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     dwr_dist(ii) = ( dwr_vec(499) - dwr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_dnr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     dnr_dist(ii) = ( dnr_vec(499) - dnr_vec(499-period) ) / p_to_p ;

    } // end of main ii loop

    retval_list(4) = dnr_dist;
    retval_list(3) = dwr_dist;
    retval_list(2) = unr_dist;
    retval_list(1) = uwr_dist;
    retval_list(0) = sideways_dist;

    return retval_list;
}
 
This function is called by this simple Octave script
clear all

inc = input( "Enter phase increment: ");

for ii = 6:50

[sideways_dist,uwr_dist,unr_dist,dwr_dist,dnr_dist] = trend_vigor_dist(ii,inc);
A=[sideways_dist,uwr_dist,unr_dist,dwr_dist,dnr_dist];
file = strcat( int2str(ii),"_period_dist" );
dlmwrite(file,A)

endfor 
which writes the output of the tests to named files which are to be used for further analysis in R.

Firstly, using R, I wanted to see what the distribution of the results looks like, so this R script
rm(list=ls())
data <- as.matrix(read.csv(file="20_period_dist",header=FALSE,sep=","))
side <- density(data[,1])
uwr <- density(data[,2])
max_uwr_y <- max(uwr$y)
max_uwr_x <- max(uwr$x)
min_uwr_x <- min(uwr$x)
unr <- density(data[,3])
max_unr_y <- max(unr$y)
max_unr_x <- max(unr$x)
min_unr_x <- min(unr$x)
dwr <- density(data[,4])
max_dwr_y <- max(dwr$y)
max_dwr_x <- max(dwr$x)
min_dwr_x <- min(dwr$x)
dnr <- density(data[,5])
max_dnr_y <- max(dnr$y)
max_dnr_x <- max(dnr$x)
min_dnr_x <- min(dnr$x)
plot_max_y <- max(max_uwr_y,max_unr_y,max_dwr_y,max_dnr_y)
plot_max_x <- max(max_uwr_x,max_unr_x,max_dwr_x,max_dnr_x)
plot_min_x <- min(min_uwr_x,min_unr_x,min_dwr_x,min_dnr_x)
par(mfrow=c(2,1))
plot(uwr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="red")
par(new=TRUE)
plot(unr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="blue")
par(new=TRUE)
plot(dwr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="green")
par(new=TRUE)
plot(dnr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="black")
plot(side)
 
produces plots such as this output
where the top plot shows the distributions of the uwr, unr, dwr and dnr markets, and the lower plot the sideways market. In this particular case, the spread of each distribution is so narrow ( measured differences on the order of thousandths ) that for practical purposes the distributions can be treated as single values. This simple R bootstrap script gets the average value of each distribution to be used as this single point value.
rm(list=ls())

data <- as.matrix(read.csv(file="trend_vigor_dist_results",header=FALSE,sep=","))
side <- data[,1]
uwr <- data[,2]
unr <- data[,3]
dwr <- data[,4]
dnr <- data[,5]
side_samples <- matrix(0,50000)
uwr_samples <- matrix(0,50000)
unr_samples <- matrix(0,50000)
dwr_samples <- matrix(0,50000)
dnr_samples <- matrix(0,50000)

for(ii in 1:50000) {
side_samples[ii] <- mean(sample(side, replace=T))
uwr_samples[ii] <- mean(sample(uwr, replace=T))
unr_samples[ii] <- mean(sample(unr, replace=T))
dwr_samples[ii] <- mean(sample(dwr, replace=T))
dnr_samples[ii] <- mean(sample(dnr, replace=T))
}

side_mean <- mean(side_samples)
uwr_mean <- mean(uwr_samples)
unr_mean <- mean(unr_samples)
dwr_mean <- mean(dwr_samples)
dnr_mean <- mean(dnr_samples) 
For readers' interest, the actual values are 0.829, 1.329, -0.829 and -1.329 with 0 for the sideways market.

This final plot is the same Natural Gas plot as in my previous post, but with the above values substituted for Ehler's default values of 1 and -1.
What I intend to do now is use these values as the means of normal distributions, with varying standard deviations, as inputs for my Naive Bayes classifier. Further Monte Carlo testing will be done to find standard deviation values that keep the classifier's false classification rate, when tested using the "ideal" markets code above, within acceptable limits, most probably a 5% classification error rate.
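As an illustration of the intended use of these means (the sigma value below is a placeholder; finding suitable standard deviations is exactly what the Monte Carlo testing is for), each market type's Trend Vigor likelihood can be modelled as a normal density centred on its bootstrap mean:

```python
import math

# bootstrap means of the Trend Vigor distributions from the R script above
TV_MEANS = {"side": 0.0, "uwr": 0.829, "unr": 1.329, "dwr": -0.829, "dnr": -1.329}

def classify_tv(x, sigma=0.1):
    """Pick the market type whose normal density at Trend Vigor value x is
    highest; sigma is the free parameter to be set by Monte Carlo testing."""
    def pdf(mu):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return max(TV_MEANS, key=lambda mt: pdf(TV_MEANS[mt]))
```

With a common sigma this reduces to nearest-mean classification, but in the full classifier these densities would be combined with the other indicators' likelihoods.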

Tuesday, 26 July 2011

A "new" indicator for possible inclusion in my Naive Bayesian Classifier

I have recently come across another interesting indicator on Ehler's website, information about which is available for download in the seminars section. It is the Trend Vigor indicator and below is my Octave .oct function implementation of it, first shown on the continuous, back-adjusted Natural Gas contract
and next the EURUSD spot forex market, both daily bars.
My implementation is slightly different from Ehler's in that there is no smoothing of the trend slope prior to calculation of the indicator. My reason for this is that smoothing will change the probability distribution of the indicator values, and since my Naive Bayes Classifier uses Kernel Density Estimation of a Mixture model, I prefer not to "corrupt" this density calculation with the subjective choice of a smoothing algorithm. As before, all parameters for this part of the classifier will be determined by using Monte Carlo techniques on "ideal," synthetically modelled market prices.
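For context, a Gaussian-kernel density estimate simply places a small normal "bump" at each observed indicator value and sums them, which is why a smoothing step beforehand would distort it. A minimal sketch (the bandwidth here is illustrative, not the value my classifier uses):

```python
import math

def kde_density(x, samples, bandwidth):
    """Gaussian-kernel density estimate at x from a list of observed values."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    # one Gaussian bump per observation, summed and normalised
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
```

Each class-conditional density in the classifier is estimated this way from the Monte Carlo indicator values for that market type.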

In the Natural Gas chart above it can be seen that the latter half of the chart is mostly sideways action, as determined by the Trend Vigor indicator, whilst my current version of the classifier ( the colour coded candlesticks ) does not give a correspondingly similar determination for the whole of this latter half of the chart. This is also the case in the second EURUSD chart. It is my hope that including the Trend Vigor indicator as an input to the classifier will improve the classifier's classification ability.

More on this in due course.

Tuesday, 21 June 2011

Completion of Naive Bayesian Classifier coding

I am pleased to say that my .oct coding of the Naive Bayesian classifier is now complete. The purpose of the function is to take as inputs various measurements of the current state of the time series and to classify that time series as being in one of the five following states:-
  • trending sideways with a cyclic action
  • trending upwards with 50% retracements of previous up-legs
  • trending upwards with no retracements
  • trending downwards with 50% retracements of previous down-legs
  • trending downwards with no retracements
These classifications will determine the most appropriate trading approach to take given the current state of the "tradable," and two screen shots of the classifier are given below, rendered as a "paint bar" study in the upper candlestick chart.
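The Bayesian part of the classification can be sketched in outline: under the naive independence assumption, each input measurement contributes a class-conditional likelihood, and the state with the highest posterior wins. The likelihood functions and priors below are purely illustrative stand-ins for the ones estimated from my Monte Carlo runs.

```python
import math

def naive_bayes_classify(obs, likelihoods, priors):
    """obs: dict of indicator name -> measured value.
    likelihoods: dict of state -> dict of indicator name -> density function.
    priors: dict of state -> prior probability."""
    best_state, best_score = None, float("-inf")
    for state, prior in priors.items():
        # sum log-likelihoods to avoid underflow from multiplying densities
        score = math.log(prior)
        for ind, value in obs.items():
            score += math.log(likelihoods[state][ind](value))
        if score > best_score:
            best_state, best_score = state, score
    return best_state
```

In the real classifier the density functions would be the kernel density estimates discussed in earlier posts.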

This first image shows just three classifications: trending sideways with a cycle in cyan, trending either up or down with 50% retracement in green, and trending up or down without retracement when the bars are blue or red (blue when close > open, red when close < open). Additionally, the coloured triangles show when the Cybercycle leading functions in the first subgraph cross, giving a count down to the predicted cyclic highs and lows, the blue "0" triangle being the predicted actual high or low. The green "fp high" and red "fp low" lines are the respective full period channel highs and lows of the price series, with the dotted yellow line being the midpoint of the two.

The second subgraph predicts turning points based on zero line crosses of the Cybercycle in the first subgraph. Read more about this indicator here.

The third subgraph is a plot of the sine of the phase of the Cybercycle, with leading signals, superimposed over a stochastic of the price bars. I may discuss this particular indicator in more depth in a future post.

This second screen shot is similar to the first, except that the classifications for retracement and no retracement are separated out into upward and downwards (chart key uwr = up with retracement, unr = up no retracement etc.). In this graph the coloured triangles represent the leading function crosses of the sine in the third subgraph.


Personally I am pleased that the effort to produce this indicator (almost six weeks of Monte Carlo simulation and statistical analysis in R, plus C++ coding) has resulted in a useful addition to my stable of indicators. I think there could still be some tweaks/improvements/additions that could be added, but for the moment I will leave the indicator as it is. The next thing I have set myself to do is implement my collection of C++ .oct functions as DLLs for Metatrader 4 (maybe an EA?) on a demo account so that I can utilise/test them intraday on forex data. Hopefully this will turn out to be a relatively simple case of "dragging and dropping" the already written C++ code.