Dekalog Blog

Wednesday, 8 May 2013

Random Entries

Over on the Tradersplace forum there is a thread about random entries, early on in which I replied:

"...I don't think the random entry tests as outlined above prove anything. First off, I think there is a subtle bias in the results. I could say that a 50/50 win/lose random entry by rolling dice with payoffs of -1, -2, -3, +4, +5 and +6 is good because there is an expected +9 every 6 rolls so random entries are great, whereas payoffs of +1, +2, +3, -4, -5 and -6 is bad with expectation -9 every 6 rolls, so random entries suck. These two "tests" prove nothing about random entries because the payoffs aren't taken into account.

In a similar fashion a 50/50 choice to be long or short means nothing if there is an extended trend in one direction or another such that the final price of the tradeable is much higher or lower in February 2013 than it was in January 1990. Of course a 50% chance to get on the right side of such a market will on average show a profit, just as with the first dice example above, but this profit results from the bias (trend) in the data and not the efficacy of the entry or exit. At a bare minimum, the returns of the tradeable should be detrended to remove this bias before any tests are conducted."
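The dice argument in the quoted reply can be checked in a few lines. This is only an illustrative sketch: the variable names are mine, and the two payoff vectors are simply the ones given in the quote.

```octave
% Two six-sided "payoff dice" taken from the quoted reply.
good_payoffs = [-1 -2 -3  4  5  6] ;
bad_payoffs  = [ 1  2  3 -4 -5 -6] ;

% Both dice win on exactly half of their faces ...
good_win_rate = mean( good_payoffs > 0 ) ;
bad_win_rate  = mean( bad_payoffs > 0 ) ;

% ... but the expected total over six rolls has the opposite sign.
good_expectation = sum( good_payoffs ) ;
bad_expectation  = sum( bad_payoffs ) ;

fprintf('Win rates: %.2f and %.2f\n', good_win_rate, bad_win_rate) ;
fprintf('Expected payoff per 6 rolls: %+d and %+d\n', good_expectation, bad_expectation) ;
```

The win rate is identical in both cases, so any difference in outcome comes entirely from the payoffs and not from the 50/50 entry.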

Later on in the thread the OP asked me to expand on my views, and this post is my exposition on the matter. First off, I am going to use simulated markets because I think this will make it easier to see the points I am trying to make. If real data were used, as in the OP's tests, along with the other assumptions such as 1% risk, choice of markets, etc., the fundamental truth about random entries would be obfuscated by the "noise" these choices introduce. To start with, I have created the following interactive Octave script:

clear all

fprintf('\nMarket oscillations are represented by a sine wave.\n')

period = input('Choose a period for this sine wave: ')

% create sideways market - a sine wave
sideways_market = sind(0:(360.0/period):3600) + 2 ;
clf ;
plot(sideways_market) ;
title('Sideways Market') ;

fprintf('\nDisplayed is a plot of the chosen period sine wave, representing a sideways market.\nPress enter to continue.\n')
pause ;

sideways_returns = diff(log(sideways_market)) ;

fprintf('\nThe total log return (sum of log differences) of this market is %f\n', sum(sideways_returns) ) ;

fprintf('\nPress enter to see next market type.\n') ;
pause ;

% and create a trending market
trending_market = ((0:1:size(sideways_market,2)-1).*1.333/period) + sideways_market ;
clf ;
plot(trending_market) ;
title('Trending Market') ;

fprintf('\nNow displayed is the same sine wave with a trend component to represent\nan uptrending market with 50%% retracements.\nPress enter to see returns.\n')
pause ;

trending_returns = diff(log(trending_market)) ;

fprintf('\nThe total log return of this trending market is %f\n', sum(trending_returns) ) ;

fprintf('\nNow we will do the Monte Carlo testing.\n')
iter = input('Choose number of iterations for loop: ')

sideways_iter = zeros(iter,1) ;
trend_iter = zeros(iter,1) ;

for ii = 1:iter

% create the random position vector
ix = rand( size(sideways_returns,2) , 1 ) ;
ix( ix >= 0.5 ) = 1 ;
ix( ix < 0.5 ) = -1 ;

sideways_iter( ii , 1 ) = sideways_returns * ix ;
trend_iter( ii , 1 ) = trending_returns * ix ;

end

clf ;
hist(sideways_iter)
title('Returns Histogram for Sideways Market')

fprintf('\nNow displayed is a histogram of sideways market returns.\nPress enter to see trending market returns.\n')
pause ;

clf ;
hist(trend_iter)
title('Returns Histogram for Trending Market')

fprintf('\nFinally, now displayed is a histogram of the trending market returns.\n')

the output of which is shown in the desktop recording video below.

Link to Youtube version of video.
The purpose of this basic coding introduction is to show that random entries combined with random exits have zero expected value, as evidenced by the Monte Carlo return histograms for both the sideways market and the trending market being centred about zero. The video might not show this clearly, so readers are invited to run the code for themselves. I would suggest values for the period in the range of 10 to 50.
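For readers who would rather not step through the prompts, here is a non-interactive sketch of the same Monte Carlo test. The fixed seed, the period of 20 and the 5000 iterations are my assumptions for repeatability, and the loop has been vectorised. Because each bar's position is +1 or -1 with equal probability and independent of the returns, the expected run return is zero whatever the returns are.

```octave
rand('state', 42) ;                      % fixed seed, an assumption for repeatability
period = 20 ; iter = 5000 ;              % assumed values instead of terminal prompts

sideways_market = sind(0:(360.0/period):3600) + 2 ;
trending_market = (0:numel(sideways_market)-1) .* 1.333 / period + sideways_market ;
sideways_returns = diff(log(sideways_market)) ;
trending_returns = diff(log(trending_market)) ;

% one column of random +1/-1 positions per Monte Carlo run
ix = 2 * ( rand( numel(sideways_returns) , iter ) >= 0.5 ) - 1 ;

sideways_iter = sideways_returns * ix ;  % 1 x iter vector of run returns
trend_iter    = trending_returns * ix ;

% standard error of each Monte Carlo mean
sideways_se = std(sideways_iter) / sqrt(iter) ;
trend_se    = std(trend_iter) / sqrt(iter) ;

fprintf('Sideways mean %f (s.e. %f)\n', mean(sideways_iter), sideways_se) ;
fprintf('Trending mean %f (s.e. %f)\n', mean(trend_iter), trend_se) ;
```

Both means should sit within a few standard errors of zero, even though the trending market itself has a large positive total log return.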

However, in the linked thread the OP did not use random exits but instead used a "trailing 5 ATR stop based on a 20 day simple average ATR," i.e. a random entry with a heuristically pre-defined exit criterion. To emulate this, in the next code box I have coded a trailing stop exit based on the standard deviation of the bar to bar differences. I deemed this approach necessary for the simulated markets being used, as there are no OHLC bars from which to calculate ATR. As in the earlier code box, the code creates the simulated markets with user input prompted from the terminal and displays them. It then performs the random entry Monte Carlo routine with a user chosen "standard deviation stop multiplier," produces histograms of the MC routine results, displays simple summary statistics and finally displays plots of the markets along with the relevant trailing stops, with the stop levels in red.

clear all

fprintf('\nMarket oscillations are represented by a sine wave.\n')

period = input('Choose a period for this sine wave: ')

% create sideways market - a sine wave
sideways_market = sind(0:(360.0/period):3600) + 2 ;
clf ;
plot(sideways_market) ;
title('Sideways Market') ;

fprintf('\nDisplayed is a plot of the chosen period sine wave, representing a sideways market.\nPress enter to continue.\n')
pause ;

sideways_returns = diff(log(sideways_market)) ;

fprintf('\nThe total log return (sum of log differences) of this market is %f\n', sum(sideways_returns) ) ;

fprintf('\nPress enter to see next market type.\n') ;
pause ;

% and create a trending market
trending_market = ((0:1:size(sideways_market,2)-1).*1.333/period) + sideways_market ;
clf ;
plot(trending_market) ;
title('Trending Market') ;

fprintf('\nNow displayed is the same sine wave with a trend component to represent\nan uptrending market with 50%% retracements.\nPress enter to see returns.\n')
pause ;

trending_returns = diff(log(trending_market)) ;

fprintf('\nThe total log return of this trending market is %f\n', sum(trending_returns) ) ;

fprintf('\nNow we will do the Monte Carlo testing.\n')
iter = input('Choose number of iterations for loop: ')
stop_mult = input('Choose a standard deviation stop multiplier: ')

sideways_iter = zeros(iter,1) ;
trend_iter = zeros(iter,1) ;

sideways_position_vector = zeros( size(sideways_returns,2) , 1 ) ;
sideways_std_stop = stop_mult * std( diff(sideways_market) ) ;

trending_position_vector = zeros( size(trending_returns,2) , 1 ) ;
trending_std_stop = stop_mult * std( diff(trending_market) ) ;

for ii = 1:iter

position = rand(1) ;
position( position >= 0.5 ) = 1 ;
position( position < 0.5 ) = -1 ;

sideways_position_vector(1,1) = position ;
trending_position_vector(1,1) = position ;

sideways_long_stop = sideways_market(1,1) - sideways_std_stop ;
sideways_short_stop = sideways_market(1,1) + sideways_std_stop ;

trending_long_stop = trending_market(1,1) - trending_std_stop ;
trending_short_stop = trending_market(1,1) + trending_std_stop ;

  for jj = 2:size(sideways_returns,2)
  
    if sideways_position_vector(jj-1,1)==1 && sideways_market(1,jj)<=sideways_long_stop
       position = rand(1) ;
       position( position >= 0.5 ) = 1 ;
       position( position < 0.5 ) = -1 ;
       sideways_position_vector(jj,1) = position ;
       sideways_long_stop = sideways_market(1,jj) - sideways_std_stop ;
       sideways_short_stop = sideways_market(1,jj) + sideways_std_stop ;
    elseif sideways_position_vector(jj-1,1)==-1 && sideways_market(1,jj)>=sideways_short_stop  
       position = rand(1) ;
       position( position >= 0.5 ) = 1 ;
       position( position < 0.5 ) = -1 ;
       sideways_position_vector(jj,1) = position ;
       sideways_long_stop = sideways_market(1,jj) - sideways_std_stop ;
       sideways_short_stop = sideways_market(1,jj) + sideways_std_stop ;
    else
       sideways_position_vector(jj,1) = sideways_position_vector(jj-1,1) ;
       sideways_long_stop = max( sideways_long_stop , sideways_market(1,jj) - sideways_std_stop ) ; 
       sideways_short_stop = min( sideways_short_stop , sideways_market(1,jj) + sideways_std_stop ) ;
    end 
    
    if trending_position_vector(jj-1,1)==1 && trending_market(1,jj)<=trending_long_stop
       position = rand(1) ;
       position( position >= 0.5 ) = 1 ;
       position( position < 0.5 ) = -1 ;
       trending_position_vector(jj,1) = position ;
       trending_long_stop = trending_market(1,jj) - trending_std_stop ;
       trending_short_stop = trending_market(1,jj) + trending_std_stop ;
    elseif trending_position_vector(jj-1,1)==-1 && trending_market(1,jj)>=trending_short_stop  
       position = rand(1) ;
       position( position >= 0.5 ) = 1 ;
       position( position < 0.5 ) = -1 ;
       trending_position_vector(jj,1) = position ;
       trending_long_stop = trending_market(1,jj) - trending_std_stop ;
       trending_short_stop = trending_market(1,jj) + trending_std_stop ;
    else
       trending_position_vector(jj,1) = trending_position_vector(jj-1,1) ;
       trending_long_stop = max( trending_long_stop , trending_market(1,jj) - trending_std_stop ) ; 
       trending_short_stop = min( trending_short_stop , trending_market(1,jj) + trending_std_stop ) ;
    end 
  
  end

sideways_iter( ii , 1 ) = sideways_returns * sideways_position_vector ;
trend_iter( ii , 1 ) = trending_returns * trending_position_vector ;

end

mean_sideways_iter = mean( sideways_iter ) ;
std_sideways_iter = std( sideways_iter ) ;
sideways_dist = ( mean_sideways_iter - sum(sideways_returns) ) / std_sideways_iter 

mean_trend_iter = mean( trend_iter ) ;
std_trend_iter = std( trend_iter ) ;
trend_dist = ( mean_trend_iter - sum(trending_returns) ) / std_trend_iter 

clf ;
hist(sideways_iter)
title('Returns Histogram for Sideways Market')
xlabel('Log Returns')

fprintf('\nNow displayed is a histogram of random sideways market returns.\nPress enter to see trending market returns.\n')
pause ;

clf ;
hist(trend_iter)
title('Returns Histogram for Trending Market')
xlabel('Log Returns')

fprintf('\nFinally, now displayed is a histogram of the random trending market returns.\n')

sideways_trailing_stop_1 = sideways_market + sideways_std_stop ;
sideways_trailing_stop_2 = sideways_market - sideways_std_stop ;

trending_trailing_stop_1 = trending_market + trending_std_stop ;
trending_trailing_stop_2 = trending_market - trending_std_stop ;

fprintf('\nNow let us look at the stops for the sideways market\nPress enter\n')
pause ;

clf ;
plot(sideways_market(1,1:75),'b',sideways_trailing_stop_1(1,1:75),'r',sideways_trailing_stop_2(1,1:75),'r')

fprintf('\nTo look at the stops for the trending market\nPress enter\n')
pause ;

clf ;
plot(trending_market(1,1:75),'b',trending_trailing_stop_1(1,1:75),'r',trending_trailing_stop_2(1,1:75),'r')
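The trailing stop logic in the script above can be seen in isolation in the following sketch for a single long position. The toy price series and the multiplier of 1.5 are hypothetical values chosen by me for illustration. As in the script, the stop only ever ratchets upwards, and a bar is tested against the stop level carried over from the previous bar.

```octave
prices = [10 11 12 11.5 13 12 10.5] ;    % hypothetical toy prices, not from the post
stop_mult = 1.5 ;                        % assumed stop multiplier
stop_dist = stop_mult * std( diff(prices) ) ;

long_stop = zeros( size(prices) ) ;
long_stop(1) = prices(1) - stop_dist ;

for jj = 2:numel(prices)
  % a stop for a long position only ever moves up, never down
  long_stop(jj) = max( long_stop(jj-1) , prices(jj) - stop_dist ) ;
end

% first bar whose price is at or below the stop carried over from the bar before
stopped_out = find( prices(2:end) <= long_stop(1:end-1) , 1 ) + 1 ;

fprintf('Stop distance %f, stopped out at bar %d\n', stop_dist, stopped_out) ;
```

A short position would mirror this with min() and a stop placed above the price.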

Discussion
As my above reply was concerned with the OP's application of random entries on trending data, I will start with the trending market output of the code. For the purposes of a "trending market" the code creates an upwardly trending market that has retracements to an imagined 50% Fibonacci retracement level of the immediately preceding upmove. The "buy and hold" log return of this trending market for a selected period of 20 is 2.036665. Below can be seen four histograms representing 5000 MC runs with the stop multiplier being set at 1, 2, 3 and 4 standard deviations, running horizontally from the top left to the bottom right.

From the summary statistics output (readers are invited to run the code themselves to see) not one shows an average log return that is greater, to a statistically significant degree, than the "buy and hold" return. In fact, for standard deviations 2, 3, and 4 the returns are less than "buy and hold" to a statistically significant degree. What this perhaps shows is that there are many more opportunities to get stop choices wrong than there are to get them right! And even if you do get it right (standard deviation 1), well, what's the point if there is no significant improvement over buy and hold? This is the crux of my original assertion that the OP's random entry tests don't prove anything.
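To make the benchmark and the summary statistic concrete, here is a minimal sketch. It assumes a period of 20 and, purely as a cheap stand-in for the stop-based runs, uses independent random +1/-1 positions on every bar as in the first script; the seed and iteration count are also my assumptions. The "buy and hold" figure is just the sum of log differences, which telescopes to the log of the last price over the first.

```octave
rand('state', 42) ;                      % fixed seed, an assumption for repeatability
period = 20 ; iter = 5000 ;              % assumed values

sideways_market = sind(0:(360.0/period):3600) + 2 ;
trending_market = (0:numel(sideways_market)-1) .* 1.333 / period + sideways_market ;
trending_returns = diff(log(trending_market)) ;

% buy and hold: the sum of log differences telescopes to log(last/first)
buy_and_hold = sum(trending_returns) ;
buy_and_hold_check = log( trending_market(end) / trending_market(1) ) ;

% stand-in Monte Carlo runs: random +1/-1 positions on every bar
ix = 2 * ( rand( numel(trending_returns) , iter ) >= 0.5 ) - 1 ;
trend_iter = trending_returns * ix ;

% same statistic as trend_dist in the script: distance of the Monte Carlo
% mean from buy and hold, in units of the run-return standard deviation
trend_dist = ( mean(trend_iter) - buy_and_hold ) / std(trend_iter) ;

fprintf('Buy and hold %f (check %f), trend_dist %f\n', buy_and_hold, buy_and_hold_check, trend_dist) ;
```

A negative trend_dist means the average random entry run returned less than buy and hold; substituting the stop-based trend_iter from the script gives the figures discussed here.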

However, the above relates only to random entries in a trend following context, with a trailing stop being set wide enough to "allow the trend to breathe" or "avoid the noise" or other such sayings common in the trend following world. What happens if we tighten the stop? Several runs of 5000 MC iterations with a standard deviation stop multiplier of 0.6 show significant improvement, with the "buy and hold" return being in the left tail, more than 2 standard deviations away from the average random entry return. To achieve this a fairly tight stop is required, as can be seen below.

Now rather than being a "trend following stop" this could be characterised as a "swing trading stop" where one is trying to get in and out of the market at swing highs and lows. But what if the market is not making swings but is in fact strongly trending as in

which can be achieved by altering a code line thus:

% and create a trending market
trending_market = ((0:1:size(sideways_market,2)-1).*4/period) + sideways_market ;

The same stops as above on this market all show average positive returns, but none so great as to be significantly better than this market's "buy and hold" return of 3.04452. A typical histogram for these tests is

which is quite interesting. The large bar at the right represents those MC runs where the first random position is a long which is not stopped out by the wide stops, hence resulting in a log return for the run equal to the "buy and hold" log return. The other bars show those MC runs which start with one or more short trades and obviously incur some initial losses prior to the first long trade. Drastically tightening the stop for this market doesn't really help things; the result is still an extreme right hand bar at the "buy and hold" level with a varying left tail.

What all this shows is that the best random entries can do is capture what I called the "bias" in my above quoted thread response, and even doing this well is highly dependent on using a suitable trailing stop; as also mentioned above, there are many more opportunities to choose an unsuitable trailing stop. I also suggested that "At a bare minimum, the returns of the tradeable should be detrended to remove this bias before any tests are conducted." The point of this is that by removing the bias it immediately becomes obvious that there is no value in random entries: testing on detrended returns will effectively produce the results that the above code gives for the sideways market, shown next.

Standard deviations 2 and 3 are what one might expect: histograms centred around a net log return of zero. Standard deviation 4 provides a great opportunity to shoot yourself in the foot by choosing a terrible stop for this market, and standard deviation 1 shows what is possible by choosing a good stop. In fact, by playing around with various values for the stop multiplier I have been able to get values as high as 14 for the net average log return on this market, but such tight stops as these cannot really be considered trailing stops as they more or less act as take profit stops at the upper and lower levels of this sideways market.
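The detrending suggested in my quoted reply can be sketched as follows. Subtracting the mean log return from every bar is only one simple way of detrending, and it is my assumption here rather than a prescribed method; the period of 20 is also assumed.

```octave
period = 20 ;                            % assumed period
sideways_market = sind(0:(360.0/period):3600) + 2 ;
trending_market = (0:numel(sideways_market)-1) .* 1.333 / period + sideways_market ;
trending_returns = diff(log(trending_market)) ;

% remove the average per-bar log return, i.e. the "bias"
detrended_returns = trending_returns - mean(trending_returns) ;

% rebuild a price path from the detrended returns, starting at the original first price
detrended_market = trending_market(1) * exp( [0 cumsum(detrended_returns)] ) ;

fprintf('Total log return before detrending: %f\n', sum(trending_returns) ) ;
fprintf('Total log return after detrending:  %f\n', sum(detrended_returns) ) ;
```

After detrending, the total log return is zero and the rebuilt series ends where it starts, so a 50/50 long or short choice can no longer profit merely from being on the right side of the drift.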

So, what use are random entries? In live trading I believe there is no use whatsoever, but as a "research tool" similar to the approach outlined here there may be some uses. They could be used to set up null hypotheses for statistical testing and benchmarking, or they could be used to see what random looks like in the context of some trading idea. However, the big caveat to this is that if one uses real data, how can the randomness in the data, plus any possible non-linear effects of parameter choices, be distinguished from the effects of the "injected randomness" supplied by the random entries? All the foregoing discussion has been based on the clean, simple and predictable data of a fully-known, simulated market, with the intent of illustrating my belief about random entries. When applied portfolio-wide on real data, with portfolio heat restrictions, position sizing choices etc., the whole test routine may simply have too many moving parts to draw any useful conclusions about the efficacy of random entries, given that, statistically speaking, even on the idealised market data used in this post, random entry trend following returns are indistinguishable from buy and hold returns.
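As an illustration of the "null hypothesis" use, the sketch below compares a candidate rule against a distribution of random position vectors on the sideways market. The candidate rule (hold the sign of the previous bar's return) is hypothetical and chosen only because it is trivially good on a sine wave; the seed, period and iteration count are likewise my assumptions.

```octave
rand('state', 1) ;                       % fixed seed, an assumption for repeatability
period = 20 ; iter = 2000 ;              % assumed values

market = sind(0:(360.0/period):3600) + 2 ;
returns = diff(log(market)) ;
n = numel(returns) ;

% hypothetical candidate rule: hold the sign of the previous bar's return
candidate_positions = [ 0 ; sign( returns(1:end-1) )' ] ;
candidate_return = returns * candidate_positions ;

% null distribution: returns of purely random +1/-1 position vectors
random_positions = 2 * ( rand(n, iter) >= 0.5 ) - 1 ;
random_returns = returns * random_positions ;

% empirical p-value: fraction of random runs doing at least as well as the candidate
p_value = mean( random_returns >= candidate_return ) ;

fprintf('Candidate return %f, empirical p-value %f\n', candidate_return, p_value) ;
```

On real data the same comparison is only as clean as the data and parameter choices allow, which is exactly the caveat raised above.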

Wednesday, 27 February 2013

Restricted Boltzmann Machine

In an earlier post I said that I would write about Restricted Boltzmann machines and now that I have begun adapting the Geoffrey Hinton course code I have, this is the first post of possibly several on this topic.

Essentially, I am going to use the RBM to conduct unsupervised learning on unlabelled real market data, using some of the indicators I have developed, to extract relevant features to initialise the input to hidden layer weights of my market classifying neural net, and then conduct backpropagation training of this feedforward neural network using the labelled data from my usual, idealised market types.

Readers may well ask, "What's the point of doing this?" Well, taken from my course assignment notes, and edited by me for relevance to this post, we have:-

In the previous assignment we tried to reduce overfitting by learning less (early stopping, fewer hidden units etc.). RBMs, on the other hand, reduce overfitting by learning more: the RBM is being trained unsupervised so it's working to discover a lot of relevant regularity in the input features, and that learning distracts the model from excessively focusing on class labels. This is much more constructive distraction: instead of early stopping the model after a little learning we instead give the model something much more meaningful to do. ...it works great for regularisation, as well as training speed. ... In the previous assignment we did a lot of work selecting the right number of training iterations, the right number of hidden units, and the right weight decay. ... Now we don't need to do that at all, ... the unsupervised training of the RBM provides all the regularisation we need. If we select a decent learning rate, that will be enough, and we'll use lots of hidden units because we're much less worried about overfitting now.
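For readers unfamiliar with the idea, here is a minimal sketch of unsupervised RBM pre-training using one step of contrastive divergence (CD-1) on binary units. It is a generic textbook-style version and not the course code; the random binary data, layer sizes, learning rate and epoch count are all hypothetical stand-ins for my real indicator features and settings.

```octave
rand('state', 7) ; randn('state', 7) ;   % fixed seeds, an assumption for repeatability
sigmoid = @(z) 1 ./ (1 + exp(-z)) ;

n_visible = 6 ; n_hidden = 3 ; n_cases = 100 ;   % hypothetical sizes
learning_rate = 0.1 ; n_epochs = 50 ;            % hypothetical settings

% hypothetical unlabelled binary data standing in for real input features
data = double( rand(n_cases, n_visible) > 0.5 ) ;

W = 0.01 * randn(n_visible, n_hidden) ;  % small random starting weights
vis_bias = zeros(1, n_visible) ;
hid_bias = zeros(1, n_hidden) ;

for epoch = 1:n_epochs
  % positive phase: hidden probabilities and sampled states driven by the data
  pos_hid_prob  = sigmoid( data * W + hid_bias ) ;
  pos_hid_state = double( pos_hid_prob > rand(n_cases, n_hidden) ) ;

  % negative phase: one reconstruction of the visibles, then the hiddens again
  neg_vis_prob = sigmoid( pos_hid_state * W' + vis_bias ) ;
  neg_hid_prob = sigmoid( neg_vis_prob * W + hid_bias ) ;

  % CD-1 update: data-driven statistics minus reconstruction-driven statistics
  W = W + learning_rate * ( data' * pos_hid_prob - neg_vis_prob' * neg_hid_prob ) / n_cases ;
  vis_bias = vis_bias + learning_rate * mean( data - neg_vis_prob , 1 ) ;
  hid_bias = hid_bias + learning_rate * mean( pos_hid_prob - neg_hid_prob , 1 ) ;
end

% these weights would initialise the input to hidden layer of the feedforward net
initial_hidden_weights = W ;
```

No class labels appear anywhere in this loop; the labelled data only come into play afterwards, during backpropagation training.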

Of course, a picture is worth a thousand words, so below are a 2D and a 3D picture

These two pictures show the weights of the input to hidden layer after only two iterations of RBM training, and effectively represent a "typical" random initialisation of weights prior to backpropagation training. It is from this type of random start that the class labelled data would normally be used to train the NN.

These next two pictures tell a different story

These show weights after 50,000 iterations of RBM training. Quite a difference, and it is from this sort of start that I will now train my market classifier NN using the class labelled data.

Some features are easily seen. Firstly, the six columns on the "left" sides of these pictures result from the cyclic period features in the real data, expressed in binary form, and effectively form the weights that will attach to the NN bias units. Secondly, the "right" side shows the most recent data in the look back window applied to the real market data. The weights here have greater magnitude than those further back, reflecting the fact that shorter periods are more prevalent than longer periods and that, intuitively obvious perhaps, more recent data has greater importance than older data. Finally, the colour mapping shows that across the entire weight matrix the magnitude of the values has been decreased by the RBM training, showing its regularisation effect.

Saturday, 23 February 2013

Regime Switching Article

Readers might be interested in this article about Regime Switching, from the IFTA journal, which in intent somewhat mirrors my attempts at market classification via neural net modelling.

Sunday, 27 January 2013

Softmax Neural Net Classifier "Half" Complete

Over the last few weeks I have been busy working on the Softmax activation function output neural net classifier and have now reached a point where I have trained it over enough of my usual training data that approximately half of the real data I have would be classified by it, rather than by my previously and incompletely trained "reserve" neural net.

It has taken this long to get this far for a few reasons: the necessity of substantially adapting the code from the Geoff Hinton neural net course, conducting grid searches over the hyper-parameter space for the "optimum" learning rate and number of neurons in the hidden layer, and incorporating some changes to the feature set used as input to the classifier. At this halfway stage I thought I would subject the classifier to the cross validation test of my recent post of 5th December 2012 and the results, which speak for themselves, are shown in the box below.

Random NN
Complete Accuracy percentage: 99.610000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 0.000000
Predicted = unr & actual = uwr: 0.062000
Predicted = dwr & actual = dnr: 0.008000
Predicted = dnr & actual = dwr: 0.004000
Predicted = uwr & actual = cyc: 0.082000
Predicted = dwr & actual = cyc: 0.004000
Predicted = cyc & actual = uwr: 0.058000
Predicted = cyc & actual = dwr: 0.098000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 0.000000
Predicted = unr & actual = dwr: 0.000000
Predicted = dwr & actual = uwr: 0.000000
Predicted = dnr & actual = uwr: 0.000000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.000000
Predicted = dnr & actual = unr: 0.000000

End NN
Complete Accuracy percentage: 98.518000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 0.002000
Predicted = unr & actual = uwr: 0.310000
Predicted = dwr & actual = dnr: 0.006000
Predicted = dnr & actual = dwr: 0.036000
Predicted = uwr & actual = cyc: 0.272000
Predicted = dwr & actual = cyc: 0.036000
Predicted = cyc & actual = uwr: 0.344000
Predicted = cyc & actual = dwr: 0.210000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 0.000000
Predicted = unr & actual = dwr: 0.000000
Predicted = dwr & actual = uwr: 0.000000
Predicted = dnr & actual = uwr: 0.000000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.000000
Predicted = dnr & actual = unr: 0.000000

This classifier has 3 sigmoid logistic neurons in its single hidden layer and during training early stopping was employed. I also tried adding L2 regularization but this didn't really seem to have any appreciable effect, so after a while I dropped this. All in all, I'm very pleased with my efforts and the classifier's performance so far. Over the next few weeks I shall continue with the training and when this is complete I shall post again.

On a related note, I have recently added another blog to the blogroll because I was impressed with a series of posts over the last couple of years concerning that particular blogger's investigations into neural nets for trading, especially the last two posts here and here. The ideas covered in these last two posts ring a bell with my post here, where I first talked about using a neural net as a market classifier based on the work I did in Andrew Ng's course on recognising handwritten digits from pixel values. I shall follow this new blogroll addition with interest!

Thursday, 3 January 2013

The Coin Toss Experiment

" 'The coin toss experiment' provides an indication that when one comes across a process that generates many system alternatives with many equity curves, some acceptable and some unacceptable, one may get fooled by randomness. Minimizing data-mining and selection bias is a very involved process for the most part outside the capabilities of the average user of such processes," taken from a recent addition to the blogroll. Interesting!

Wednesday, 5 December 2012

Neural Net Market Classifier to Replace Bayesian Market Classifier

I have now completed the cross validation test I wanted to run, which compares my current Bayesian classifier with the recently retrained "reserve neural net," the results of which are shown in the code box below. The test consists of 50,000 random iterations of my usual "ideal" 5 market types with the market classifications from both of the above classifiers being compared with the actual, known market type. There are 2 points of comparison in each iteration: the last price bar in the sequence, identified as "End," and a randomly picked price bar from the 4 immediately preceding the last bar, identified as "Random."

Number of times to loop: 50000
Elapsed time is 1804.46 seconds.

Random NN
Complete Accuracy percentage: 50.354000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.288000
Predicted = unr & actual = uwr: 6.950000
Predicted = dwr & actual = dnr: 1.268000
Predicted = dnr & actual = dwr: 6.668000
Predicted = uwr & actual = cyc: 3.750000
Predicted = dwr & actual = cyc: 6.668000
Predicted = cyc & actual = uwr: 2.242000
Predicted = cyc & actual = dwr: 2.032000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.140000
Predicted = unr & actual = dwr: 2.140000
Predicted = dwr & actual = uwr: 2.500000
Predicted = dnr & actual = uwr: 2.500000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.838000
Predicted = dnr & actual = unr: 0.716000

End NN
Complete Accuracy percentage: 48.280000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.248000
Predicted = unr & actual = uwr: 7.630000
Predicted = dwr & actual = dnr: 0.990000
Predicted = dnr & actual = dwr: 7.392000
Predicted = uwr & actual = cyc: 3.634000
Predicted = dwr & actual = cyc: 7.392000
Predicted = cyc & actual = uwr: 1.974000
Predicted = cyc & actual = dwr: 1.718000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.170000
Predicted = unr & actual = dwr: 2.170000
Predicted = dwr & actual = uwr: 2.578000
Predicted = dnr & actual = uwr: 2.578000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 1.050000
Predicted = dnr & actual = unr: 0.886000

Random Bayes
Complete Accuracy percentage: 19.450000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 7.554000
Predicted = unr & actual = uwr: 2.902000
Predicted = dwr & actual = dnr: 7.488000
Predicted = dnr & actual = dwr: 2.712000
Predicted = uwr & actual = cyc: 5.278000
Predicted = dwr & actual = cyc: 2.712000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.730000
Predicted = unr & actual = dwr: 5.730000
Predicted = dwr & actual = uwr: 5.642000
Predicted = dnr & actual = uwr: 5.642000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.162000
Predicted = dnr & actual = unr: 0.128000

End Bayes
Complete Accuracy percentage: 24.212000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 8.400000
Predicted = unr & actual = uwr: 2.236000
Predicted = dwr & actual = dnr: 7.866000
Predicted = dnr & actual = dwr: 1.960000
Predicted = uwr & actual = cyc: 6.142000
Predicted = dwr & actual = cyc: 1.960000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.110000
Predicted = unr & actual = dwr: 5.110000
Predicted = dwr & actual = uwr: 4.842000
Predicted = dnr & actual = uwr: 4.842000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.048000
Predicted = dnr & actual = unr: 0.040000

A Quick Analysis

Looking at the figures for complete accuracy it can be seen that the Bayesian classifier is not much better than randomly guessing, with 19.45% and 24.21% for "Random Bayes" and "End Bayes" respectively. The corresponding accuracy figures for the NN are 50.35% and 48.28%.
In the "dubious, difficult to trade" mis-classification category Bayes gets approx. 22% and 20% wrong, whilst for the NN these figures more than halve to approx. 9.3% and 9.5%.
In the "acceptable" mis-classification category Bayes gets approx. 29% and 29%, with the NN being more or less the same.
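The category totals quoted in this quick analysis can be reproduced by summing the relevant lines of the output box. The sketch below does just that; the matrix layout and variable names are mine, and the figures are copied from the box above.

```octave
% rows: Random NN, End NN, Random Bayes, End Bayes (figures copied from the output box)
dubious = [ 2.140 2.140 2.500 2.500 ;
            2.170 2.170 2.578 2.578 ;
            5.730 5.730 5.642 5.642 ;
            5.110 5.110 4.842 4.842 ] ;

acceptable = [ 1.288 6.950 1.268 6.668 3.750 6.668 2.242 2.032 ;
               1.248 7.630 0.990 7.392 3.634 7.392 1.974 1.718 ;
               7.554 2.902 7.488 2.712 5.278 2.712 0.000 0.000 ;
               8.400 2.236 7.866 1.960 6.142 1.960 0.000 0.000 ] ;

% one total per classifier and comparison point
dubious_totals    = sum(dubious, 2) ;
acceptable_totals = sum(acceptable, 2) ;

labels = {'Random NN', 'End NN', 'Random Bayes', 'End Bayes'} ;
for kk = 1:4
  fprintf('%-13s dubious %6.3f  acceptable %6.3f\n', labels{kk}, dubious_totals(kk), acceptable_totals(kk)) ;
end
```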

Although this is not a completely rigorous test, I am satisfied that the NN has shown its superiority over the Bayesian classifier. Also, I believe that there is significant scope to improve the NN even more by adding additional features, changes in architecture and use of the Softmax unit etc. As a result, I have decided to gracefully retire the Bayesian classifier and deploy the NN classifier in its place.

Wednesday, 28 November 2012

Geoff Hinton's Coursera Course Almost Ended

I am now in the final week of the course (see previous post) and just have the final exam to complete. The course has been very intensive, very interesting and much more difficult than the first machine learning course I took. Personally, the big takeaways from this course for the things that I want to do are:

Softmax activation function for output layers. I intend to replace my current use of the Sigmoid function in the output layer of my standby neural net with this Softmax function. The Softmax is far more suitable for my intended classification purposes.
Octave code for using momentum to speed up the training of a neural net.
Restricted Boltzmann machines, the stacking thereof and deep learning, and unsupervised learning. I shall talk more about this in a future post.
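As a reminder of what the Softmax function does, here is a minimal sketch. The example inputs are hypothetical and this is not the course code. Unlike independent Sigmoid outputs, the Softmax outputs for each case are positive and sum to one, so they can be read as probabilities over mutually exclusive market types.

```octave
% row-wise Softmax; subtracting each row's maximum avoids overflow in exp()
softmax = @(z) exp( z - max(z, [], 2) ) ./ sum( exp( z - max(z, [], 2) ) , 2 ) ;

% hypothetical output-layer inputs for 5 classes, one case per row;
% the second row is deliberately huge to show the overflow guard working
logits = [   2.0    1.0    0.1   -1.0    0.5 ;
          1000   1001   1002   1003   1004 ] ;

probs = softmax(logits) ;

% the predicted class is the column with the highest probability in each row
[~, predicted_class] = max(probs, [], 2) ;

disp(probs) ;
disp(predicted_class') ;
```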

With regard to the training of my standby neural net, it is mostly completed. I say mostly because as soon as I learned about the above mentioned items I stopped training it once I had trained it on sufficient data to cover almost 99% of the dominant cycle periods to be found in the data. It seemed pointless to continue training it with increasing training times and diminishing returns, particularly since it is destined to be remodelled and retrained using what I've just learned. For now I will subject it to cross validation testing and if it passes this, I shall deploy it for a short period until such time as it is replaced by the neural net I have in mind following on from the course.
