Showing posts with label Trend Vigor indicator. Show all posts
Showing posts with label Trend Vigor indicator. Show all posts

Tuesday, 23 August 2011

Naive Bayes Classifier, Mark 2.

It has taken some time, but I have finally been able to incorporate the Trend Vigor indicator into my Naive Bayesian classifier, but with a slight twist. Instead of being purely Bayesian, the classifier has evolved to become a hybrid Bayesian/clustering classifier. The reason for this is that the Trend Vigor indicator has no varying distribution of values but tends to return values that are so close to each other that they can be considered a single value, as mentioned in an earlier post of mine. This can be clearly seen in the short 3D visualisation animation below. The x, y and z axis each represent an input to the classifier, and about 7 seconds into the video you can see the Trend Vigor axis in the foreground with almost straight vertical lines for its "distributions" for each market type. However, it can also be seen that there are spaces in 3D where only combined values for one specific market type appear, particularly evident in the "tails" of the no retracement markets ( the outermost blue and magenta distributions in the video. )



( Non embedded view here )

The revised version of the classifier takes advantage of this fact. Through a series conditional statements each 3D datum point is checked to see if it falls in any of these mutually exclusive spaces and if it does, it is classified as belonging to the market type that has "ownership" of the space in which it lies. If the point cannot be classified via this simple form of clustering then it is assigned a market type through Bayesian analysis.

This Bayesian analysis has also been revised to take into account the value of the Trend Vigor indicator. Since these values have no distribution to speak of a simple linear model is used. If a point is equidistant between two Trend Vigor classifications it is assigned a 0.5 probability of belong to each, this probability rising in linear fashion to 1.0 if it falls exactly on one of the vertical lines mentioned above, with a corresponding decrease in probability assigned to the other market type classification. There is also a boundary condition applied where the probability is set to 0.0 for belonging to a particular market type.

The proof of the pudding is in the eating, and this next chart shows the classification error rate when the classifier is applied to my usual "ideal" time series.


The y axis is the percentage of ideal time series runs in which market type was mis-classified, and the x axis is the period of the cyclic component of the time series being tested. In this test I am only concerned with the results for periods greater than 10 as in real data I have never seen extracted periods less than this. As can be seen the sideways market and both the up and down with no retracement markets have zero mis-classification rates, apart from a small blip at period 12, which is within the 5% mis-classification error rate I had set as my target earlier.

Of more concern was the apparent large mis-classification error rate of the retracement markets ( the green and black lines in the chart. ) However, further investigation of these errors revealed them not to be "errors" as such but more a quirk of the classifier, which lends itself to exploitation. Almost all of the "errors" occur consecutively at the same phase of the cyclic component, at all periods, and the "error" appears in the same direction. By this I mean that if the true market type is up with retracement, the "error" indicates an up with no retracement market; if the true market type is down with retracement, the "error" indicates a down with no retracement market. The two charts below show this visually for both the up and down with retracement markets and are typical representations of the "error" being discussed.


The first pane in each chart shows one complete cycle in which the whole cycle, including the most recent datum point, are correctly classified as being an up with retracement market ( upper chart ) and a down with retracement market ( lower chart. ) The second pane shows a snapshot of the cycle after it has progressed in time through its phase with the last point being the last point that is mis-classified. The "difference" between each chart's respective two panes at the right hand edge shows the portion of the time series that is mis-classified.

It can be seen that the mis-classification occurs at the end of the retracement, immediately prior to the actual turn. This behaviour could easily be exploited via a trading rule. For example, assume that the market has been classified as an up with retracement market and a retracement short trade has been taken. As the retracement proceeds our trade moves into profit but then the market classification changes to up with no retracement. Remember that the classifier (never?) mis-classifies such no retracement markets. What would one want to do in such a situation? Obviously one would want to exit the current short trade and go long, and in so doing would be exiting the short and initiating the possible long at precisely the right time; just before the market turn upwards! This mis-classification "error" could, on real data, turn out to be very serendipitous.

All in all, I think this revised, Mark 2 version of my market classifier is a marked improvement on its predecessor.

Friday, 29 July 2011

Update on the Trend Vigor indicator

I have now completed some preliminary Monte Carlo testing of the Trend Vigor indicator and I thought this would be a good opportunity to share with readers the general approach I have been taking when talking about my Monte Carlo testing on predefined, "ideal" market types.

Firstly, I create my "ideal" market types by using an Octave .oct function, the code for which is given below. In this code can also be seen my implementation of the code for the Trend Vigor indicator, which I think is slightly different from Elher's. The code is commented, so no further description is required here.

.oct function code
// This function takes as arguments a single value input for a sinewave period and a single value input for a degrees increment value. A sinewave of the 
// given period is created and then trends are added to the sinewave such that 5 hypothesised "ideal" market types are created: these are

// 1) a perfectly cyclic sideways market with no trend i.e. just the sinewave component
// 2) an uptrending market with cyclic retracements (uwr) such that retracements are down to the 50% Fibonacci ratio level
// 3) an uptrending market with no retracemnets (unr) i.e. the uptrend completely swamps the downward cyclic component counter to the trend
// 4) a downtrending market with cyclic retracements (dwr) such that retracements are up to the 50% Fibonacci ratio level
// 5) a downtrending market with no retracements (dnr) i.e. the downtrend completely swamps the upward cyclic component counter to the trend

// The cyclic component of these markets is then extracted using the bandpass indicator. These vector lengths are 500 in order to allow time for the 
// bandpass calculations to settle down. The peak to peak amplitude is then recovered from the bandpass at the end of each market vector. The trend slope 
// is also calculated at the end of each market vector. The trend vigor indicator value is then calculated thus
// trend_slope/p_to_p. The bandpass delta is set to 0.2.

// The idea is that 
// for a sideways market the value should be about 0
// for a trending with retracement market the value should be > 0 && < 1 or < 0 && > -1, depending on whether an up trend or down trend
// for a trending with no retracement market the ratio should be > 1  or > -1, depending on whether an up trend or down trend 

// The original sinewave is then repeatedly phase shifted by the degrees increment value, the above process repeated and new values are calculated. 
// All the calculated values for each market type are the vector outputs of the function. 

#include 
#include 
#include 
#define PI 3.14159265

DEFUN_DLD (trend_vigor_dist, args, , "Inputs are period & degrees increment value, outputs are vectors of repeated median slope values")
{
    octave_value_list retval_list;

    if (args(0).length () < 1 | args(0).length () > 1)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }

    if (args(1).length () < 1 | args(1).length () > 1)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }

    if (error_state)
    {
        error ("Invalid arguments. Inputs are single value period length and single value degrees increment value");
        return retval_list;
    }
    // read inputs
    int period = args(0).int_value ();
    double degrees_inc = args(1).double_value ();
    double period_inc = 360.0 / double(period);
    int length = period + 1; // the length of the "lookback". Is equal to period + 1 to encompass a full period

    // vectors to hold created market values
    ColumnVector sideways_vec(500); // vector to hold sideways market values
    ColumnVector sideways_bandpass_vec(500); // vector to hold bandpass of sideways market values

    ColumnVector uwr_vec(500); // vector to hold uwr market values
    ColumnVector uwr_bandpass_vec(500); // vector to hold bandpass of uwr market values
 
    ColumnVector unr_vec(500); // vector to hold unr market values
    ColumnVector unr_bandpass_vec(500); // vector to hold bandpass of unr market values

    ColumnVector dwr_vec(500); // vector to hold dwr market values
    ColumnVector dwr_bandpass_vec(500); // vector to hold bandpass of dwr market values
 
    ColumnVector dnr_vec(500); // vector to hold dnr market values
    ColumnVector dnr_bandpass_vec(500); // vector to hold bandpass of dnr market values

    // calculate the market trend_incs
    double uwr_trend_inc = 12 / ( 5 * double(period) );
    double unr_trend_inc = 4 / double(period);
    double dwr_trend_inc = -( 12 / ( 5 * double(period) ) );
    double dnr_trend_inc = -( 4 / double(period) );

    // declare variables for bandpass and trend vigor calculations
    double delta = 0.2;
    double beta = cos( (360.0/period)*PI/180.0 );
    double gamma = 1.0 / cos( (720.0*delta/period)*PI/180.0 );
    double alpha = gamma - sqrt(gamma*gamma - 1.0); 
    double power_side;
    double power_uwr;
    double power_unr;
    double power_dwr;
    double power_dnr;
    double rms;
    double p_to_p;

    // create output vectors
    int output_vec_length = int ( 360 / degrees_inc );
    ColumnVector sideways_dist ( output_vec_length ); // create output column of correct length for sideways market
    ColumnVector uwr_dist  ( output_vec_length ); // create output column of correct length for uwr market
    ColumnVector unr_dist  ( output_vec_length ); // create output column of correct length for unr market
    ColumnVector dwr_dist  ( output_vec_length ); // create output column of correct length for dwr market
    ColumnVector dnr_dist  ( output_vec_length ); // create output column of correct length for dnr market

    for (octave_idx_type ii (0); ii < output_vec_length; ii++) 
    {

        // Create the market types and their bandpasses for this ii iteration
 for (octave_idx_type jj (0); jj < 500; jj++) 
        {

        // First create the sideways market type
        sideways_vec(jj) = sin( (degrees_inc*ii + period_inc*jj) * PI / 180 );
                
                if ( jj < 2 )
                {
                sideways_bandpass_vec(jj) = 0;
                }
                else
                {
                sideways_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(sideways_vec(jj) - sideways_vec(jj-2)) + beta*(1.0 + alpha)*sideways_bandpass_vec(jj-1) - alpha*sideways_bandpass_vec(jj-2);
                }

        // next, the uwr retracement market (uwr)
        uwr_vec(jj) = sideways_vec(jj) + jj*uwr_trend_inc;

                if ( jj < 2 )
                {
                uwr_bandpass_vec(jj) = 0;
                }
                else
                {
                uwr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(uwr_vec(jj) - uwr_vec(jj-2)) + beta*(1.0 + alpha)*uwr_bandpass_vec(jj-1) - alpha*uwr_bandpass_vec(jj-2);
                }

        // next, the unr retracement market (unr)
        unr_vec(jj) = sideways_vec(jj) + jj*unr_trend_inc;

                if ( jj < 2 )
                {
                unr_bandpass_vec(jj) = 0;
                }
                else
                {
                unr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(unr_vec(jj) - unr_vec(jj-2)) + beta*(1.0 + alpha)*unr_bandpass_vec(jj-1) - alpha*unr_bandpass_vec(jj-2);
                }

        // next, the dwr retracement market (dwr)
        dwr_vec(jj) = sideways_vec(jj) + jj*dwr_trend_inc;

                if ( jj < 2 )
                {
                dwr_bandpass_vec(jj) = 0;
                }
                else
                {
                dwr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(dwr_vec(jj) - dwr_vec(jj-2)) + beta*(1.0 + alpha)*dwr_bandpass_vec(jj-1) - alpha*dwr_bandpass_vec(jj-2);
                }

        // next, the dnr retracement market (dnr)
        dnr_vec(jj) = sideways_vec(jj) + jj*dnr_trend_inc;

                if ( jj < 2 )
                {
                dnr_bandpass_vec(jj) = 0;
                }
                else
                {
                dnr_bandpass_vec(jj) = 0.5*(1.0 - alpha)*(dnr_vec(jj) - dnr_vec(jj-2)) + beta*(1.0 + alpha)*dnr_bandpass_vec(jj-1) - alpha*dnr_bandpass_vec(jj-2);
                }

        } // end of jj loop to create the different markets and their bandpasses

    // now loop over end of each market vector to create the distributions

     power_side = 0.0;
     power_uwr = 0.0;
     power_unr = 0.0;
     power_dwr = 0.0;
     power_dnr = 0.0;

     for (octave_idx_type jj (0); jj < length; jj++)
         {
         power_side = power_side + sideways_bandpass_vec(499-jj)*sideways_bandpass_vec(499-jj) + sideways_bandpass_vec(499-jj-int(period/4.0))*sideways_bandpass_vec(499-jj-int(period/4.0)) ;
         power_uwr = power_uwr + uwr_bandpass_vec(499-jj)*uwr_bandpass_vec(499-jj) + uwr_bandpass_vec(499-jj-int(period/4.0))*uwr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_unr = power_unr + unr_bandpass_vec(499-jj)*unr_bandpass_vec(499-jj) + unr_bandpass_vec(499-jj-int(period/4.0))*unr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_dwr = power_dwr + dwr_bandpass_vec(499-jj)*dwr_bandpass_vec(499-jj) + dwr_bandpass_vec(499-jj-int(period/4.0))*dwr_bandpass_vec(499-jj-int(period/4.0)) ;
         power_dnr = power_dnr + dnr_bandpass_vec(499-jj)*dnr_bandpass_vec(499-jj) + dnr_bandpass_vec(499-jj-int(period/4.0))*dnr_bandpass_vec(499-jj-int(period/4.0)) ;
         }

     // fill the distribution vectors
     rms = sqrt( power_side / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     sideways_dist(ii) = ( sideways_vec(499) - sideways_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_uwr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     uwr_dist(ii) = ( uwr_vec(499) - uwr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_unr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     unr_dist(ii) = ( unr_vec(499) - unr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_dwr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     dwr_dist(ii) = ( dwr_vec(499) - dwr_vec(499-period) ) / p_to_p ;

     rms = sqrt( power_dnr / (period+1) ) ;
     p_to_p = 2.0 * 1.414 * rms ;
     dnr_dist(ii) = ( dnr_vec(499) - dnr_vec(499-period) ) / p_to_p ;

    } // end of main ii loop

    retval_list(4) = dnr_dist;
    retval_list(3) = dwr_dist;
    retval_list(2) = unr_dist;
    retval_list(1) = uwr_dist;
    retval_list(0) = sideways_dist;

    return retval_list;
}
 
This function is called by this simple Octave script
clear all

inc = input( "Enter phase increment: ");

for ii = 6:50

[sideways_dist,uwr_dist,unr_dist,dwr_dist,dnr_dist] = trend_vigor_dist(ii,inc);
A=[sideways_dist,uwr_dist,unr_dist,dwr_dist,dnr_dist];
file = strcat( int2str(ii),"_period_dist" );
dlmwrite(file,A)

endfor 
which writes the output of the tests to named files which are to be used for further analysis in R.

Firstly, using R, I wanted to see what the distribution of the results looks like, so this R script
rm(list=ls())
data <- as.matrix(read.csv(file="20_period_dist",head=FALSE,sep=,))
side <- density(data[,1])
uwr <- density(data[,2])
max_uwr_y <- max(uwr$y)
max_uwr_x <- max(uwr$x)
min_uwr_x <- min(uwr$x)
unr <- density(data[,3])
max_unr_y <- max(unr$y)
max_unr_x <- max(unr$x)
min_unr_x <- min(unr$x)
dwr <- density(data[,4])
max_dwr_y <- max(dwr$y)
max_dwr_x <- max(dwr$x)
min_dwr_x <- min(dwr$x)
dnr <- density(data[,5])
max_dnr_y <- max(dnr$y)
max_dnr_x <- max(dnr$x)
min_dnr_x <- min(dnr$x)
plot_max_y <- max(max_uwr_y,max_unr_y,max_dwr_y,max_dnr_y)
plot_max_x <- max(max_uwr_x,max_unr_x,max_dwr_x,max_dnr_x)
plot_min_x <- min(min_uwr_x,min_unr_x,min_dwr_x,min_dnr_x)
par(mfrow=c(2,1))
plot(uwr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="red")
par(new=TRUE)
plot(unr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="blue")
par(new=TRUE)
plot(dwr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="green")
par(new=TRUE)
plot(dnr,xlim=c(plot_min_x,plot_max_x),ylim=c(0,plot_max_y),col="black")
plot(side)
 
produces plots such as this output
where the top plot shows the distributions of the uwr, unr, dwr and dnr markets, and the lower plot the sideways market. In this particular case, the spread of each distribution is so narrow ( measured differences of the order of thousandths of a decimal place ) that I consider that for practical purposes the distributions can be treated as single values. This simple R boot strap script gets the average value of the distributions to be used as this single point value.
rm(list=ls())

data <- as.matrix(read.csv(file="trend_vigor_dist_results",head=FALSE,sep=,))
side <- data[,1]
uwr <- data[,2]
unr <- data[,3]
dwr <- data[,4]
dnr <- data[,5]
side_samples <- matrix(0,50000)
uwr_samples <- matrix(0,50000)
unr_samples <- matrix(0,50000)
dwr_samples <- matrix(0,50000)
dnr_samples <- matrix(0,50000)

for(ii in 1:50000) {
side_samples[ii] <- mean(sample(side, replace=T))
uwr_samples[ii] <- mean(sample(uwr, replace=T))
unr_samples[ii] <- mean(sample(unr, replace=T))
dwr_samples[ii] <- mean(sample(dwr, replace=T))
dnr_samples[ii] <- mean(sample(dnr, replace=T))
}

side_mean <- mean(side_samples)
uwr_mean <- mean(uwr_samples)
unr_mean <- mean(unr_samples)
dwr_mean <- mean(dwr_samples)
dnr_mean <- mean(dnr_samples) 
For readers' interest, the actual values are 0.829, 1.329, -0.829 and -1.329 with 0 for the sideways market.

This final plot is the same Natural Gas plot as in my previous post, but with the above values substituted for Ehler's default values of 1 and -1.
What I intend to do now is use these values as the means of normal distributions with varying standard deviations as inputs for my Naive Bayes classifier. Further Monte Carlo testing will be done such that values for the standard deviations are obtained that result in the classifier giving false classifications, when tested using the "ideal" markets code above, within acceptable limits, most probably a 5% classification error rate.

Tuesday, 26 July 2011

A "new" indicator for possible inclusion in my Naive Bayesian Classifier

I have recently come across another interesting indicator on Ehler's website, information about which is available for download in the seminars section. It is the Trend Vigor indicator and below is my Octave .oct function implementation of it, first shown on the continuous, back-adjusted Natural Gas contract
and next the EURUSD spot forex market, both daily bars.
My implementation is slightly different from Ehler's in that there is no smoothing of the trend slope prior to calculation of the indicator. My reason for this is that smoothing will change the probability distribution of the indicator values, and since my Naive Bayes Classifier uses Kernel Density Estimation of a Mixture model, I prefer not to "corrupt" this density calculation with the subjective choice of a smoothing algorithm. As before, all parameters for this part of the classifier will be determined by using Monte Carlo techniques on "ideal," synthetically modelled market prices.

In the Natural Gas chart above it can be seen that the latter half of the chart is mostly sideways action, as determined by the Trend Vigor indicator, whilst my current version of the classifier ( the colour coded candlesticks ) does not give a corresponding similar determination for the whole of this later half of the chart. This is also the case in the second EURUSD chart. It is my hope that including the Trend Vigor indicator as an input to the classifier will improve the classifier's classification ability.

More on this in due course.