Saturday 30 June 2012

Machine Learning Course Completed

I'm pleased to say that I have now completed Andrew Ng's machine learning course, which is offered through Coursera. This post is not intended to be a review of the course, which in my opinion is extremely good and very useful, but more of a reflection of my thoughts and what I think will be useful for me personally.

Firstly, I was pleasantly surprised that the software/programming language of instruction was Octave, which regular readers of this blog will know is my main software of choice. Apart from learning the concepts of ML, I also picked up some handy tips for Octave programming, and more importantly for me I now have a set of working Octave ML functions that I can use immediately in my system development.

In my previous post I mentioned that my first attempt at using ML will be to use a Neural Net to classify market types. As background to this, readers might be interested in a pdf file of the video lectures, available from here, which was put together and posted on the course discussion forum by another student - I think this is very good and all credit to said student, José Soares Augusto.

Due to the honour code ( or honor code for American readers ) of the course I will be unable to post the code that I wrote for the programming assignments. However, I do feel that I can post the code shown in the code box below, as the copyright notice allows it. A few slight changes I made are noted in the copyright notice. This is a minimisation function that was used in the training of the Neural Net assignment and was provided in the assignment download.
function [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
% Minimize a continuous differentialble multivariate function. Starting point
% is given by "X" (D by 1), and the function named in the string "f", must
% return a function value and a vector of partial derivatives. The Polack-
% Ribiere flavour of conjugate gradients is used to compute search directions,
% and a line search using quadratic and cubic polynomial approximations and the
% Wolfe-Powell stopping criteria is used together with the slope ratio method
% for guessing initial step sizes. Additionally a bunch of checks are made to
% make sure that exploration is taking place and that extrapolation will not
% be unboundedly large. The "length" gives the length of the run: if it is
% positive, it gives the maximum number of line searches, if negative its
% absolute gives the maximum allowed number of function evaluations. You can
% (optionally) give "length" a second component, which will indicate the
% reduction in function value to be expected in the first line-search (defaults
% to 1.0). The function returns when either its length is up, or if no further
% progress can be made (ie, we are at a minimum, or so close that due to
% numerical problems, we cannot get any closer). If the function terminates
% within a few iterations, it could be an indication that the function value
% and derivatives are not consistent (ie, there may be a bug in the
% implementation of your "f" function). The function returns the found
% solution "X", a vector of function values "fX" indicating the progress made
% and "i" the number of iterations (line searches or function evaluations,
% depending on the sign of "length") used.
%
% Usage: [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
%
% See also: checkgrad 
%
% Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13
%
% (C) Copyright 1999, 2000 & 2001, Carl Edward Rasmussen
% 
% Permission is granted for anyone to copy, use, or modify these
% programs and accompanying documents for purposes of research or
% education, provided this copyright notice is retained, and note is
% made of any changes that have been made.
% 
% These programs and documents are distributed without any warranty,
% express or implied.  As the programs were written for research
% purposes only, they have not been tested to the degree that would be
% advisable in any important application.  All use of these programs is
% entirely at the user's own risk.
%
% [ml-class] Changes Made:
% 1) Function name and argument specifications
% 2) Output display
%
% Dekalog Changes Made:
% Some lines have been altered, changing | to || and & to &&.
% This is to avoid "possible Matlab-style short-circuit operator" warnings
% being given when code is run under Octave. The lines where these changes
% have been made are indicated by comments at the end of each respective line. 

% Read options
if exist('options', 'var') && ~isempty(options) && isfield(options, 'MaxIter')
    length = options.MaxIter;
else
    length = 100;
end


RHO = 0.01;                            % a bunch of constants for line searches
SIG = 0.5;       % RHO and SIG are the constants in the Wolfe-Powell conditions
INT = 0.1;    % don't reevaluate within 0.1 of the limit of the current bracket
EXT = 3.0;                    % extrapolate maximum 3 times the current bracket
MAX = 20;                         % max 20 function evaluations per line search
RATIO = 100;                                      % maximum allowed slope ratio

argstr = ['feval(f, X'];                      % compose string used to call function
for i = 1:(nargin - 3)
  argstr = [argstr, ',P', int2str(i)];
end
argstr = [argstr, ')'];

if max(size(length)) == 2, red=length(2); length=length(1); else red=1; end
S=['Iteration '];

i = 0;                                            % zero the run length counter
ls_failed = 0;                             % no previous line search has failed
fX = [];
[f1 df1] = eval(argstr);                      % get function value and gradient
i = i + (length<0);                                            % count epochs?!
s = -df1;                                        % search direction is steepest
d1 = -s'*s;                                                 % this is the slope
z1 = red/(1-d1);                                  % initial step is red/(|s|+1)

while i < abs(length)                                      % while not finished
  i = i + (length>0);                                      % count iterations?!

  X0 = X; f0 = f1; df0 = df1;                   % make a copy of current values
  X = X + z1*s;                                             % begin line search
  [f2 df2] = eval(argstr);
  i = i + (length<0);                                          % count epochs?!
  d2 = df2'*s;
  f3 = f1; d3 = d1; z3 = -z1;             % initialize point 3 equal to point 1
  if length>0, M = MAX; else M = min(MAX, -length-i); end
  success = 0; limit = -1;                     % initialize quanteties
  while 1
    while ((f2 > f1+z1*RHO*d1) || (d2 > -SIG*d1)) && (M > 0) % | and & changed to || and && to avoid "possible Matlab-style short-circuit operator" warning 
      limit = z1;                                         % tighten the bracket
      if f2 > f1
        z2 = z3 - (0.5*d3*z3*z3)/(d3*z3+f2-f3);                 % quadratic fit
      else
        A = 6*(f2-f3)/z3+3*(d2+d3);                                 % cubic fit
        B = 3*(f3-f2)-z3*(d3+2*d2);
        z2 = (sqrt(B*B-A*d2*z3*z3)-B)/A;       % numerical error possible - ok!
      end
      if isnan(z2) || isinf(z2)     % | changed to || to avoid "possible Matlab-style short-circuit operator" warning 
        z2 = z3/2;                  % if we had a numerical problem then bisect
      end
      z2 = max(min(z2, INT*z3),(1-INT)*z3);  % don't accept too close to limits
      z1 = z1 + z2;                                           % update the step
      X = X + z2*s;
      [f2 df2] = eval(argstr);
      M = M - 1; i = i + (length<0);                           % count epochs?!
      d2 = df2'*s;
      z3 = z3-z2;                    % z3 is now relative to the location of z2
    end
    if f2 > f1+z1*RHO*d1 || d2 > -SIG*d1                    % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      break;                                                % this is a failure
    elseif d2 > SIG*d1
      success = 1; break;                                             % success
    elseif M == 0
      break;                                                          % failure
    end
    A = 6*(f2-f3)/z3+3*(d2+d3);                      % make cubic extrapolation
    B = 3*(f3-f2)-z3*(d3+2*d2);
    z2 = -d2*z3*z3/(B+sqrt(B*B-A*d2*z3*z3));        % num. error possible - ok!
    if ~isreal(z2) || isnan(z2) || isinf(z2) || z2 < 0   % num prob or wrong sign? % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      if limit < -0.5                               % if we have no upper limit
        z2 = z1 * (EXT-1);                 % the extrapolate the maximum amount
      else
        z2 = (limit-z1)/2;                                   % otherwise bisect
      end
    elseif (limit > -0.5) && (z2+z1 > limit)       % extraplation beyond max?   % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = (limit-z1)/2;                                               % bisect
    elseif (limit < -0.5) && (z2+z1 > z1*EXT)      % extrapolation beyond limit % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = z1*(EXT-1.0);                           % set to extrapolation limit
    elseif z2 < -z3*INT
      z2 = -z3*INT;
    elseif (limit > -0.5) && (z2 < (limit-z1)*(1.0-INT))   % too close to limit? % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = (limit-z1)*(1.0-INT);
    end
    f3 = f2; d3 = d2; z3 = -z2;                  % set point 3 equal to point 2
    z1 = z1 + z2; X = X + z2*s;                      % update current estimates
    [f2 df2] = eval(argstr);
    M = M - 1; i = i + (length<0);                             % count epochs?!
    d2 = df2'*s;
  end                                                      % end of line search

  if success                                         % if line search succeeded
    f1 = f2; fX = [fX' f1]';
    fprintf('%s %4i | Cost: %4.6e\r', S, i, f1);
    s = (df2'*df2-df1'*df2)/(df1'*df1)*s - df2;      % Polack-Ribiere direction
    tmp = df1; df1 = df2; df2 = tmp;                         % swap derivatives
    d2 = df1'*s;
    if d2 > 0                                      % new slope must be negative
      s = -df1;                              % otherwise use steepest direction
      d2 = -s'*s;    
    end
    z1 = z1 * min(RATIO, d1/(d2-realmin));          % slope ratio but max RATIO
    d1 = d2;
    ls_failed = 0;                              % this line search did not fail
  else
    X = X0; f1 = f0; df1 = df0;  % restore point from before failed line search
    if ls_failed || i > abs(length)         % line search failed twice in a row % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      break;                             % or we ran out of time, so we give up
    end
    tmp = df1; df1 = df2; df2 = tmp;                         % swap derivatives
    s = -df1;                                                    % try steepest
    d1 = -s'*s;
    z1 = 1/(1-d1);                     
    ls_failed = 1;                                    % this line search failed
  end
  if exist('OCTAVE_VERSION')
    fflush(stdout);
  end
end
fprintf('\n');

Finally, the last set of videos talked about "Artificial Data Synthesis," otherwise known as creating your own data for training purposes. This is basically what I had planned to do anyway ( see previous post ), but it is nice to learn that it is standard, accepted practice in the ML world. The first such way of creating data, in the context of Photo OCR, is shown below
where various font libraries are used against random backgrounds. I think this very much mirrors my planned approach of training on repeated sets of my "ideal time series" construct. However, another approach which could be used is "data distortion," shown in this next image
which is an approach that my creating synthetic data using FFT might be useful for, or alternatively a correlation and cointegration approach as shown in R code in this Quantitative Finance thread.

All in all, I'm quite excited by the possibilities of my new found knowledge, and I fully expect that in time, after development and testing, any Neural Net I develop will in fact replace my current Naive Bayesian classifier.

Friday 22 June 2012

Neural Net to Replace my Bayesian Classifier?

Having more or less completed the machine learning course alluded to in my last post I thought I would have a go at programming a neural net as a possible replacement for my Naive Bayesian Classifier. In said machine learning course there was a task to programme a neural net to recognise the digits 0 to 9 inclusive, as shown below.
It struck me whilst completing this task that I could use the code I was writing to recognise price patterns as a way of classifying the market into one of my 5 states. Quickly writing a pre-processing script in Octave I have been able to produce scaled "snapshots" of my "ideal" markets types thus:-
As can be seen, different market types and periods produce distinctive looking plots, and it is my hope that a neural net can be trained to identify them and thus act as a market classifier. More in a future post.

Wednesday 6 June 2012

Creation of a Simple Benchmark Suite

For some time now I have been toying with the idea of creating a simple benchmark suite to compare my own back test system performance with that of some public domain trading systems. I decided to select a few examples from the Trading Blox Users' Guide, specifically:
  • exponential moving average crossovers of periods 10-20 and 20-50
  • triple moving average crossover system with periods 10-20-50
  • Bollinger band breakouts of periods 20 and 50 with 1 & 2 standard deviations for exits and entries
  • donchian channel breakouts with periods 20-10 and 50-25
This is a total of 7 systems, and in the Rcpp code below these form a sort of "committee" to vote to be either long/short 1 contract, or neutral.

# This function takes as inputs vectors of opening and closing prices
# and creates a basic benchmark output suite of system equity curves 
# for the following basic trend following systems
# exponential moving average crossovers of 10-20 and 20-50
# triple moving average crossovers of 10-20-50
# bollinger band breakouts of 20 and 50 with 1 & 2 standard deviations
# donchian channel breakouts of 20-10 and 50-25
# The libraries required to compile this function are
# "Rcpp," "inline" and "compiler." The function is compiled by the command
# > source("basic_benchmark_equity.r") in the console/terminal.

library(Rcpp) # load the required library
library(inline) # load the required library
library(compiler) # load the required library

src <- '
#include 
#include 

Rcpp::NumericVector open(a) ;
Rcpp::NumericVector close(b) ;
Rcpp::NumericVector market_mode(c) ;
Rcpp::NumericVector kalman(d) ;
Rcpp::NumericVector tick_size(e) ;
Rcpp::NumericVector tick_value(f) ;
int n = open.size() ;
Rcpp::NumericVector sma_10(n) ;
Rcpp::NumericVector sma_20(n) ;
Rcpp::NumericVector sma_50(n) ;
Rcpp::NumericVector std_20(n) ;
Rcpp::NumericVector std_50(n) ;

// create equity output vectors
Rcpp::NumericVector market_mode_long_eq(n) ;
Rcpp::NumericVector market_mode_short_eq(n) ;
Rcpp::NumericVector market_mode_composite_eq(n) ;
Rcpp::NumericVector sma_10_20_eq(n) ; 
Rcpp::NumericVector sma_20_50_eq(n) ; 
Rcpp::NumericVector tma_eq(n) ; 
Rcpp::NumericVector bbo_20_eq(n) ;
Rcpp::NumericVector bbo_50_eq(n) ;
Rcpp::NumericVector donc_20_eq(n) ;
Rcpp::NumericVector donc_50_eq(n) ;
Rcpp::NumericVector composite_eq(n) ; 

// position vectors for benchmark systems
Rcpp::NumericVector sma_10_20_pv(1) ;
sma_10_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector sma_20_50_pv(1) ;
sma_20_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector tma_pv(1) ;
tma_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_20_pv(1) ;
bbo_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_50_pv(1) ;
bbo_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_20_pv(1) ;
donc_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_50_pv(1) ;
donc_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector comp_pv(1) ;
comp_pv[0] = 0.0 ; // initialise to zero, no position

// fill the equity curve vectors with zeros for "burn in" period
// and create the initial values for all indicators

for ( int ii = 0 ; ii < 50 ; ii++ ) {
    
    if ( ii >= 40 ) {
    sma_10[49] += close[ii] ; }

    if ( ii >= 30 ) {
    sma_20[49] += close[ii] ; }

    sma_50[49] += close[ii] ;

    std_20[ii] = 0.0 ;
    std_50[ii] = 0.0 ;

    market_mode_long_eq[ii] = 0.0 ;
    market_mode_short_eq[ii] = 0.0 ;
    market_mode_composite_eq = 0.0 ;
    sma_10_20_eq[ii] = 0.0 ;
    sma_20_50_eq[ii] = 0.0 ; 
    bbo_20_eq[ii] = 0.0 ;
    bbo_50_eq[ii] = 0.0 ;
    tma_eq[ii] = 0.0 ;
    donc_20_eq[ii] = 0.0 ;
    donc_50_eq[ii] = 0.0 ; 
    composite_eq[ii] = 0.0 ; } // end of initialising loop

    sma_10[49] = sma_10[49] / 10.0 ;
    sma_20[49] = sma_20[49] / 20.0 ;
    sma_50[49] = sma_50[49] / 50.0 ;

// the main calculation loop
for ( int ii = 50 ; ii < n-2 ; ii++ ) {

    // calculate the smas
    sma_10[ii] = ( sma_10[ii-1] - sma_10[ii-10] / 10.0 ) + ( close[ii] / 10.0 ) ;
    sma_20[ii] = ( sma_20[ii-1] - sma_20[ii-20] / 20.0 ) + ( close[ii] / 20.0 ) ;
    sma_50[ii] = ( sma_50[ii-1] - sma_50[ii-50] / 50.0 ) + ( close[ii] / 50.0 ) ;

      // calculate the standard deviations
      for ( int jj = 0 ; jj < 50 ; jj++ ) {
    
      if ( jj < 20 ) {
      std_20[ii] += ( close[ii-jj] - sma_20[ii] ) * ( close[ii-jj] - sma_20[ii] )  ; } // end of jj if

      std_50[ii] += ( close[ii-jj] - sma_50[ii] ) * ( close[ii-jj] - sma_50[ii] ) ; } // end of standard deviation loop

    std_20[ii] = sqrt( std_20[ii] / 20.0 ) ;
    std_50[ii] = sqrt( std_50[ii] / 50.0 ) ;

    //-------------------------------------------------------------------------------------------------------------------

    // calculate the equity values of the market modes
    // market_mode uwr and unr long signals
    if ( market_mode[ii] == 1 || market_mode[ii] == 2 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode dwr and dnr short signals
    if ( market_mode[ii] == 3 || market_mode[ii] == 4 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    // calculate the equity values of the market modes
    // market_mode cyc long signals
    if ( market_mode[ii] == 0 && kalman[ii] > kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode cyc short signals
    if ( market_mode[ii] == 0 && kalman[ii] < kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    //----------------------------------------------------------------------------------------------------------------------------

    // calculate the equity values and positions of each benchmark system
    // sma_10_20_eq
    if ( sma_10[ii] > sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] < sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] > sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] < sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // sma_20_50_eq
    if ( sma_20[ii] > sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] < sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] > sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] < sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // tma_eq
    if ( tma_pv[0] == 0.0 ) {

      // default position
      tma_eq[ii] = tma_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short

    } // end of tma_pv == 0.0 loop

    if ( tma_pv[0] == 1.0 ) {

      // default long position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( sma_10[ii] < sma_20[ii] && sma_10[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit long, go neutral

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short
    
    } // end of tma_pv == 1.0 loop

    if ( tma_pv[0] == -1.0 ) {

      // default short position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_10[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit short, go neutral

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

    } // end of tma_pv == -1.0 loop

    //------------------------------------------------------------------------------------------------------------

    // bbo_20_eq
    if ( bbo_20_pv[0] == 0.0 ) {

      // default position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short

    } // end of bbo_20_pv == 0.0 loop

    if ( bbo_20_pv[0] == 1.0 ) {

      // default long position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_20[ii] + std_20[ii] && close[ii] > sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short
    
    } // end of bbo_20_pv == 1.0 loop

    if ( bbo_20_pv[0] == -1.0 ) {

      // default short position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] - std_20[ii] && close[ii] < sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

    } // end of bbo_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // bbo_50_eq
    if ( bbo_50_pv[0] == 0.0 ) {

      // default position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short

    } // end of bbo_50_pv == 0.0 loop

    if ( bbo_50_pv[0] == 1.0 ) {

      // default long position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_50[ii] + std_50[ii] && close[ii] > sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short
    
    } // end of bbo_50_pv == 1.0 loop

    if ( bbo_50_pv[0] == -1.0 ) {

      // default short position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] - std_50[ii] && close[ii] < sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

    } // end of bbo_50_pv == -1.0 loop

    //-----------------------------------------------------------------------------------------------------

    // donc_20_eq
    if ( donc_20_pv[0] == 0.0 ) {

      // default position
      donc_20_eq[ii] = donc_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short

    } // end of donc_20_pv == 0.0 loop

    if ( donc_20_pv[0] == 1.0 ) {

      // default long position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-10], &close[ii] ) && close[ii] > *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short
    
    } // end of donc_20_pv == 1.0 loop

    if ( donc_20_pv[0] == -1.0 ) {

      // default short position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-10], &close[ii] ) && close[ii] < *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

    } // end of donc_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // donc_50_eq
    if ( donc_50_pv[0] == 0.0 ) {

      // default position
      donc_50_eq[ii] = donc_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short

    } // end of donc_50_pv == 0.0 loop

    if ( donc_50_pv[0] == 1.0 ) {

      // default long position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-25], &close[ii] ) && close[ii] > *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short
    
    } // end of donc_50_pv == 1.0 loop

    if ( donc_50_pv[0] == -1.0 ) {

      // default short position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-25], &close[ii] ) && close[ii] < *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

    } // end of donc_50_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // composite_eq
    comp_pv[0] = sma_10_20_pv[0] + sma_20_50_pv[0] + tma_pv[0] + bbo_20_pv[0] + bbo_50_pv[0] + donc_20_pv[0] + donc_50_pv[0] ;
    
    if ( comp_pv[0] > 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; } // long

    if ( comp_pv[0] < 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; } // short 

    if ( comp_pv[0] == 0 ) {
    composite_eq[ii] = composite_eq[ii-1] ; } // neutral 

} // end of main for loop

// Now fill in the last two spaces in the equity vectors
market_mode_long_eq[n-1] = market_mode_long_eq[n-3] ;
market_mode_long_eq[n-2] = market_mode_long_eq[n-3] ;

market_mode_short_eq[n-1] = market_mode_short_eq[n-3] ;
market_mode_short_eq[n-2] = market_mode_short_eq[n-3] ;

market_mode_composite_eq[n-1] = market_mode_composite_eq[n-3] ;
market_mode_composite_eq[n-2] = market_mode_composite_eq[n-3] ;

sma_10_20_eq[n-1] = sma_10_20_eq[n-3] ;
sma_10_20_eq[n-2] = sma_10_20_eq[n-3] ;

sma_20_50_eq[n-1] = sma_20_50_eq[n-3] ;
sma_20_50_eq[n-2] = sma_20_50_eq[n-3] ;

tma_eq[n-1] = tma_eq[n-3] ;
tma_eq[n-2] = tma_eq[n-3] ;

bbo_20_eq[n-1] = bbo_20_eq[n-3] ;
bbo_20_eq[n-2] = bbo_20_eq[n-3] ;

bbo_50_eq[n-1] = bbo_50_eq[n-3] ;
bbo_50_eq[n-2] = bbo_50_eq[n-3] ;

donc_20_eq[n-1] = donc_20_eq[n-3] ;
donc_20_eq[n-2] = donc_20_eq[n-3] ;

donc_50_eq[n-1] = donc_50_eq[n-3] ;
donc_50_eq[n-2] = donc_50_eq[n-3] ;

composite_eq[n-1] = composite_eq[n-3] ;
composite_eq[n-2] = composite_eq[n-3] ;

return List::create(
  _["market_mode_long_eq"] = market_mode_long_eq ,
  _["market_mode_short_eq"] = market_mode_short_eq ,
  _["market_mode_composite_eq"] = market_mode_composite_eq ,
  _["sma_10_20_eq"] = sma_10_20_eq ,
  _["sma_20_50_eq"] = sma_20_50_eq , 
  _["tma_eq"] = tma_eq ,
  _["bbo_20_eq"] = bbo_20_eq ,
  _["bbo_50_eq"] = bbo_50_eq ,
  _["donc_20_eq"] = donc_20_eq ,
  _["donc_50_eq"] = donc_50_eq ,
  _["composite_eq"] = composite_eq ) ; '

basic_benchmark_equity <- cxxfunction(signature(a = "numeric", b = "numeric", c = "numeric",
                                d = "numeric", e = "numeric", f = "numeric"), body=src, 
                                plugin = "Rcpp")

This Rcpp function is then called thus, using Rstudio

library(xts)  # load the required library

# load the "indicator" file 
data <- read.csv(file="usdyenind",head=FALSE,sep=,)
tick_size <- 0.01
tick_value <- 12.50

# extract other vectors of interest
open <- data[,2]
close <- data[,5]
market_mode <- data[,228]
kalman <- data[,283]

results <- basic_benchmark_equity(open,close,market_mode,kalman,tick_size,tick_value)

# coerce the above results list object to a data frame object
results_df <- data.frame( results )
df_max <- max(results_df) # for scaling of results plot
df_min <- min(results_df) # for scaling of results plot

# and now create an xts object for plotting
results_xts <- xts(results_df,as.Date(data[,'V1']))

# a nice plot of the results_xts object
par(col="#0000FF")
plot(results_xts[,'market_mode_long_eq'],main="USDYEN Pair",ylab="$ Equity Value",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#B0171F")
plot(results_xts[,'market_mode_short_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#00FF00")
plot(results_xts[,'market_mode_composite_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#808080")
plot(results_xts[,'sma_10_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'sma_20_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'tma_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="black")
plot(results_xts[,'composite_eq'],main="",ylim=c(df_min,df_max),type="l")


to output .png files which I have strung together in this video
Non-embedded view here.
The light grey equity curves are the individual curves for the benchmark systems and the black is the "committee" 1 contract equity curve. Also shown are the long and short 1 contract equity curves ( blue and red respectively ), along with a green combined equity curve for these, for my Naive Bayesian Classifier following the simple rules
  • be long 1 contract if the market type is uwr, unr or cyclic with my Kalman filter pointing upwards
  • be short 1 contract if the market type is dwr, dnr or cyclic with my Kalman filter pointing downwards
Again this is just a toy example of the use of my Bayesian Classifier.  After I have completed Andrew Ng's machine learning course, as mentioned in my previous post, I will have a go at coding a neural net which may end up replacing the Bayesian Classifier.