Wednesday, 5 December 2012

Neural Net Market Classifier to Replace Bayesian Market Classifier

I have now completed the cross validation test I wanted to run, which compares my current Bayesian classifier with the recently retrained "reserve neural net," the results of which are shown in the code box below. The test consists of 50,000 random iterations of my usual "ideal" 5 market types with the market classifications from both of the above classifiers being compared with the actual, known market type. There are 2 points of comparison in each iteration: the last price bar in the sequence, identified as "End," and a randomly picked price bar from the 4 immediately preceding the last bar, identified as "Random."
Number of times to loop: 50000
Elapsed time is 1804.46 seconds.

Random NN
Complete Accuracy percentage: 50.354000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.288000
Predicted = unr & actual = uwr: 6.950000
Predicted = dwr & actual = dnr: 1.268000
Predicted = dnr & actual = dwr: 6.668000
Predicted = uwr & actual = cyc: 3.750000
Predicted = dwr & actual = cyc: 6.668000
Predicted = cyc & actual = uwr: 2.242000
Predicted = cyc & actual = dwr: 2.032000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.140000
Predicted = unr & actual = dwr: 2.140000
Predicted = dwr & actual = uwr: 2.500000
Predicted = dnr & actual = uwr: 2.500000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.838000
Predicted = dnr & actual = unr: 0.716000

End NN
Complete Accuracy percentage: 48.280000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 1.248000
Predicted = unr & actual = uwr: 7.630000
Predicted = dwr & actual = dnr: 0.990000
Predicted = dnr & actual = dwr: 7.392000
Predicted = uwr & actual = cyc: 3.634000
Predicted = dwr & actual = cyc: 7.392000
Predicted = cyc & actual = uwr: 1.974000
Predicted = cyc & actual = dwr: 1.718000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 2.170000
Predicted = unr & actual = dwr: 2.170000
Predicted = dwr & actual = uwr: 2.578000
Predicted = dnr & actual = uwr: 2.578000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 1.050000
Predicted = dnr & actual = unr: 0.886000

Random Bayes
Complete Accuracy percentage: 19.450000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 7.554000
Predicted = unr & actual = uwr: 2.902000
Predicted = dwr & actual = dnr: 7.488000
Predicted = dnr & actual = dwr: 2.712000
Predicted = uwr & actual = cyc: 5.278000
Predicted = dwr & actual = cyc: 2.712000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.730000
Predicted = unr & actual = dwr: 5.730000
Predicted = dwr & actual = uwr: 5.642000
Predicted = dnr & actual = uwr: 5.642000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.162000
Predicted = dnr & actual = unr: 0.128000

End Bayes
Complete Accuracy percentage: 24.212000

"Acceptable" Mis-classifications percentages 
Predicted = uwr & actual = unr: 8.400000
Predicted = unr & actual = uwr: 2.236000
Predicted = dwr & actual = dnr: 7.866000
Predicted = dnr & actual = dwr: 1.960000
Predicted = uwr & actual = cyc: 6.142000
Predicted = dwr & actual = cyc: 1.960000
Predicted = cyc & actual = uwr: 0.000000
Predicted = cyc & actual = dwr: 0.000000

Dubious, difficult to trade mis-classification percentages 
Predicted = uwr & actual = dwr: 5.110000
Predicted = unr & actual = dwr: 5.110000
Predicted = dwr & actual = uwr: 4.842000
Predicted = dnr & actual = uwr: 4.842000

Completely wrong classifications percentages 
Predicted = unr & actual = dnr: 0.048000
Predicted = dnr & actual = unr: 0.040000
A Quick Analysis
  • Looking at the figures for complete accuracy it can be seen that the Bayesian classifier is not much better than randomly guessing, with 19.45% and 24.21% for "Random Bayes" and "End Bayes" respectively. The corresponding accuracy figures for the NN are 50.35% and 48.28%.
  • In the "dubious, difficult to trade" mis-classification category Bayes gets approx. 22% and 20% this wrong, whilst for the NN these figures halve to approx. 9.5% and 9.5%.
  • In the "acceptable" mis-classification category Bayes gets approx. 29% and 29%, with the NN being more or less the same.
Although this is not a completely rigorous test, I am satisfied that the NN has shown its superiority over the Bayesian classifier. Also, I believe that there is significant scope to improve the NN even more by adding additional features, changes in architecture and use of the Softmax unit etc. As a result, I have decided to gracefully retire the Bayesian classifier and deploy the NN classifier in its place.

Wednesday, 28 November 2012

Geoff Hinton's Coursera Course Almost Ended

I am now in the final week of the course (see previous post) and just have the final exam to complete. The course has been very intensive, very interesting and much more difficult than the first machine learning course I took. Personally, the big take aways from this course for the things that I want to do are:
  • Softmax activation function for output layers. I intend to replace my current use of the Sigmoid function in the output layer of my standby neural net with this Softmax function. The Softmax is far more suitable for my intended classification purposes.
  • Octave code for using momentum to speed up the training of a neural net.
  • Restricted Boltzmann machines, the stacking thereof and deep learning, and unsupervised learning. I shall talk more about this in a future post.
With regard to the training of my standby neural net, it is mostly completed. I say mostly because as soon as I learned about the above mentioned items I stopped training it once I had trained it on sufficient data to cover almost 99% of the dominant cycle periods to be found in the data. It seemed pointless to continue training it with increasing training times and diminishing returns, particularly since it is destined to be remodelled and retrained using what I've just learned. For now I will subject it to cross validation testing and if it passes this, I shall deploy it for a short period until such time as it is replaced by the neural net I have in mind following on from the course.

Thursday, 4 October 2012

Change in Neural Net Training Plans

After having spent the last few days training my NNs and seeing how long it is taking on my new data I have decided to change my training plans. I had been simultaneously training (on two separate computers) my decision tree idea alongside a more "normal" multi-class NN in the hope of eventually comparing the two. However, I anticipate that if I continued with this two pronged approach it would take about a month to finish, and I'd like quicker results than that. Also my attempt to use the hyperbolic tangent activation function hasn't been too successful and I'm not sure whether it's my coding or some deeper theoretical reason why it isn't working satisfactorily. Another reason is that the Coursera Neural Nets for Machine Learning course has just started, the syllabus for which is shown below:-

Lecture 1: Introduction
Lecture 2: The Perceptron learning procedure
Lecture 3: The backpropagation learning procedure
Lecture 4: Learning feature vectors for words
Lecture 5: Object recognition with neural nets
Lecture 6: Optimisation: How to make the learning go faster
Lecture 7: Recurrent neural networks and advanced optimisation
Lecture 8: How to make neural networks generalise better
Lecture 9: Combining multiple neural networks to improve generalisation
TOPICS TO BE COVERED IN LECTURES 10-16
Deep Autoencoders (including semantic hashing and image search with binary codes)
Hopfield Nets and Simulated Annealing
Boltzmann machines and the general learning algorithm
Restricted Boltzmann machines and contrastive divergence learning
Applications of Restricted Boltzmann machines to collaborative filtering and document modelling.
Stacking restricted Boltzmann machines or shallow autoencoders to make deep nets.
The wake-sleep algorithm and its contrastive version
Recent applications of generatively pre-trained deep nets
Deep Boltzmann machines and how to pre-train them
Modelling hierarchical structure with neural nets

I think that rather than ploughing on with the training of my decision tree NN it would perhaps be better to finish this course before I get too carried away with myself with new NN ideas; for example, lecture 9, or the "stacking of Boltzman machines," might give me much better insight to the issues involved.

For these reasons I have decided to retrain my "reserve NN" on my enlarged data set with my new feature set, using both computers available to me, whilst I work through the above course. I expect that this reserve NN will be fully trained before the course ends, so then I will be free to experiment with my newly acquired knowledge.

Monday, 1 October 2012

Dominant Cycle Periods in End of Day Data

As part of the training of my neural net classifier it has become necessary, in order to speed up the training process, for me to have knowledge of the relative frequency of occurrences of specific dominant cycle periods. This is due to my having increased the size of my training data to almost five and a half million training/cross validation examples of my feature set. Given the amount of data now involved in training I have to subset it by period and train my neural nets on one period at a time. To decide the period training order I created a histogram of historical, dominant cycle period occurrences in all the commodity contracts/forex pairs I follow for all the data I have. The histogram is shown below.
The bin range is from 10 to 50, with a maximum occurring, the cyan line, at 18. The NN training schedule will follow this order:- 18, 19, 17, 20 etc... working downwards along those bins with the remaining outstanding maximum occurrences.

Looking at this histogram I'm struck by a couple of things: although all the indicators I use are adaptive to the current measured period I'm sure many people use "classical" TA indicators with their recommended default values, for example the 14 day RSI or 20 day Bollinger Band. The above histogram shows how "hit and miss" this can be. The 20 day Bollinger Band would seem to be OK with its look back parameter approximately covering the full cyclic period of the most frequently occurring periods in the data, but the RSI? The theoretical optimum for the RSI is half a period, which at 14 is a cyclic period of 28 days, the red line in the histogram. Not so good! And what about the Commodity Channel Index (CCI)? Its creator originally recommended a one third period look back, which based on the histogram above would imply a default setting of 6 or 7. Just goes to show what value there is in some very basic statistical analysis.

Sunday, 23 September 2012

NN Training Update

The next round of cross validation tests has shown that 600 to 1000 epochs per NN is the optimum number for training purposes. Also, the NN architecture has slightly evolved into having two output nodes in the output layer, one for each class in the binary NN classifier.

In an earlier post I said that I would use a hyperbolic tangent function as the activation functions, as per Y. LeCun (1998). The actual function from this paper is$$f(x) = 1.7159  tanh( (2/3) x )$$and its derivative is
$$f'(x) = (2*1.7159/3) ( 1 - tanh^2 ( (2/3) x) )$$I have written Octave functions for this and they are now being used in the final CV test to determine the optimum regularisation term to be used during NN training.

Thursday, 20 September 2012

NN Architecture CV Tests Complete

The cross validation test results are shown below, with the x-axis being the number of nodes in the hidden layer and the y-axis being the mean squared error of the trained NN (500 epochs) on the validation data set
and an enlarged view of x-axis values 20 to 30.

Basically what these charts say is that having more than 26 nodes in the hidden layer does not appreciably improve the performance of the NN. As a result, the final architecture of my proposed NNs will be 48 nodes in the input layer, 26 nodes in the hidden layer and 1 node in the output layer.

As an aside to all the above, I have just read an interesting paper entitled "Using Trading Dynamics to Boost Quantitative Strategy Performance," available from here. Some very interesting concepts that I shall probably get around to investigating in due course.

Wednesday, 19 September 2012

Neural Net Classifier Architecture Testing

Having successfully integrated the FANN library I've changed my mind as to the design of my neural net classifier. Previously I had planned to train a series of classifiers, each "tuned" to a specific measured period in the data, and each to discriminate between 5 distinct market types - my normal cyclic, uwr, unr, dwr and dnr as I've talked about in earlier posts.

Now, however, I'm working on a revised design that incorporates elements of a decision tree, where neural nets sit at interior nodes of the tree. A simple schematic is shown below
The advantage of this model is that I only have to train 5 neural nets, represented by the 5 colours in the above schematic, and each net is a binary classifier (-1 and 1) with only one node in its output layer, with a decision rule of > 0 or < 0 for the activation function. Rather than have one neural net per period, the period is now one of the features in the features input vector.

At the moment the features vector has a length of 48, and as I write this my computer is churning through a cross validation test (using the FANN library) to determine the optimum number of nodes for the single hidden layer(s) of the neural nets. Once this is complete, I plan to run another series of cross validation tests to determine the optimum number of epochs to run during training. When these are complete I shall then run one final cross validation test to determine the optimum lambda for a regularisation term. This last set of tests will use the Octave code previously used because, as far as I can see, FANN library functions do not appear to have this capability.

One slight change to this Octave code will be to use the Hyberbolic tangent as the activation function (see Y. LeCun 1998, available as paper 86 here). I may also try to implement some other tricks recommended in this paper.

Wednesday, 5 September 2012

Successful Integration of FANN

I am pleased to say that all my recent work seems to have borne fruit, and I have now managed to code up a training and testing routine in Octave that uses the FANN library and its Octave bindings. I think that this has been some of my most challenging coding work up to now, and required many hours of research on the web and forum help to complete.

I find that one frustration with using open source software is the sparse and sometimes non-existent documentation and this blog post is partly intended as a guide for those readers who may also wish to use FANN in Octave. The code in the code box below is roughly divided into these sections
  • Octave code to index into and extract the relevant data from previously saved files
  • a section that uses Perl to format this data
  • the Octave binding code that actually implements the FANN library functions to set up and train a NN
  • a short bit of code to save and then test the NN on the training data
As the code itself is heavily commented no further comment is required.
% load training_data_1.mat on command line before running this script.

clear exclusive -X -accurate_period -y

yy = eye(5)(y,:) ; % using training labels y, create an output vector suitable for NN training

period = input('Enter period of interest: ') ;

%for period = 10:50

fprintf('\nTraining for ANN period: %f\n', period ) ;

% This first switch control block creates the training data by indexing, by period, into them
% data loaded from training_data_1.mat
switch (period)

case 10

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

case 50

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

otherwise

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test_1 = X( [i_X] , : ) ;
% and take every other value
X_test_1 = X_test_1( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_1 = yy( [i_X] , : ) ;
% and take every other value
y_test_1 = y_test_1( (2:2:9000) , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test_2 = X( [i_X] , : ) ;
% and take every other value
X_test_2 = X_test_2( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_2 = yy( [i_X] , : ) ;
% and take every other value
y_test_2 = y_test_2( (2:2:9000) , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ [ X_test_1 y_test_1 ] ; [ X_test_2 y_test_2 ] ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

endswitch % end of training data indexing switch

% now write this selected period data to -ascii files
save data_for_training -ascii train_data
save data_for_testing -ascii test_data
save detect_optima -ascii detect_optima % for use in Fanntool software

%************************************************************************
% Now the FANN training code !                                          *
%************************************************************************

% First set the parameters for the FANN structure
No_of_input_layer_nodes = 102 
No_of_hidden_layer_nodes = 102 
No_of_output_layer_nodes = 5 
Total_no_of_layers = length( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] )

% save and write this FANN structure info and length of training data file into an -ascii file - "train_nn_from_this_file"
fid = fopen( 'train_nn_from_this_file' , 'w' ) ;
fprintf( fid , ' %i %i %i\n ' , length(train_data) , No_of_input_layer_nodes , No_of_output_layer_nodes ) ;
fclose(fid) ;

% now create the FANN formatted training file - "train_nn_from_this_file"
system( "perl perl_file_manipulate.pl >train_nn_from_this_file" ) ;

%{
The above call to "system" interupts, or pauses, Octave at this point. Now the "shell" or "bash"
takes over and calls a Perl script, "perl_file_manipulate.pl", with the command line arguments
">train_nn_from_this_file", where < indicates that the file "data_for_training"
is to be read by the Perl script and >> indicates that the file "train_nn_from_this_file" is to be 
appended by the Perl script. From the fopen and fclose operations above the file to be appended contains only 
FANN structure info, e.g. 9000 102 5 on one line, and the file that is to be read is the training data of NN features 
and outputs extracted by the switch control structure above and written to -ascii files. The contents of the Perl
script file are:

#!/usr/bin/env perl

while (<>) { 
   my @f = split ;
   print("@f[0..$#f-5]\n@f[-5..-1]\n") ;
}

After these Perl operations the file "train_nn_from_this_file" is correctly formatted for the FANN library calls that
are to come
e.g. the file looks like this:-

9000 102 5
-2.50350699e-09 -2.52301858e-09 -2.50273727e-09 -2.44301942e-09 -2.34482961e-09 -2.20974520e-09
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
etc.

When all this Perl script stuff is finished control returns to Octave.
%}

%***************************************************************************
% Begin FANN training ! Hurrah !                                           *
%***************************************************************************
% create the FANN
ANN = fann_create( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] ) ;

% create the parameters for training the FANN in an Octave "struct." All parameters are explicitly stated and set to the
% the default values. If not explicitly stated they would be these values anyway, but are explicitly stated just to show 
% how this is done 
NN_PARAMS = struct( "TrainingAlgorithm", 'rprop', "LearningRate", 0.7, "ActivationHidden", 'Sigmoid', "ActivationOutput", 'Sigmoid',...
"ActivationSteepnessHidden", 0.5, "ActivationSteepnessOutput", 0.5, "TrainErrorFunction", 'TanH', "QuickPropDecay", -0.0001,...
"QuickPropMu", 1.75, "RPropIncreaseFactor", 1.2, "RPropDecreaseFactor", 0.5, "RPropDeltaMin", 0.0, "RPropDeltaMax", 50.0 )

% and then set the parameters
fann_set_parameters( ANN , NN_PARAMS ) ;

% now train the FANN on data contained in file "train_nn_from_this_file"
fann_train( ANN, 'train_nn_from_this_file', 'MaxIterations', 200, 'DesiredError', 0.001, 'IterationsBetweenReports', 10 )

% save the trained FANN in a file e.g. "ann_25.net"
fann_save( ANN , [ "ann_" num2str(period) ".net" ] )

% Now test the ANN on the test_data set
% create ANN from saved fann_save file
ANN = fann_create( [ "ann_" num2str(period) ".net" ] ) ;

% run the trained ANN on the original feature training set, X_train
X_train_FANN_results = fann_run( ANN , X_train ) ;

% convert the X_train_FANN_results matrix to a single prediction vector
[dummy, prediction] = max( X_train_FANN_results, [], 2 ) ;

% compare accuracy of this NN prediction vector with the known labels in y for this period and display 
[i_X j_X] = find( accurate_period(:,1) == period ) ;
fprintf('\nTraining Set Accuracy: %f\n', mean( double( prediction == y([i_X],:) ) ) * 100 ) ;
fprintf('End of training for ANN period: %f\n', period ) ;

%end % end of period for loop
Typical terminal output during the running of this code looks like this:

octave:143> net_train_octave
Enter period of interest: 25
Max epochs      200. Desired error: 0.0010000000.
Epochs            1. Current error: 0.2537834346. Bit fail 45000.
Epochs           10. Current error: 0.1802092344. Bit fail 20947.
Epochs           20. Current error: 0.0793143436. Bit fail 7380.
Epochs           30. Current error: 0.0403240845. Bit fail 5215.
Epochs           40. Current error: 0.0254898760. Bit fail 2853.
Epochs           50. Current error: 0.0180807728. Bit fail 1611.
Epochs           60. Current error: 0.0150692556. Bit fail 1414.
Epochs           70. Current error: 0.0119200321. Bit fail 1187.
Epochs           80. Current error: 0.0091521516. Bit fail 937.
Epochs           90. Current error: 0.0073408978. Bit fail 670.
Epochs          100. Current error: 0.0060765576. Bit fail 492.
Epochs          110. Current error: 0.0051601632. Bit fail 446.
Epochs          120. Current error: 0.0041675218. Bit fail 386.
Epochs          130. Current error: 0.0036309268. Bit fail 374.
Epochs          140. Current error: 0.0032380833. Bit fail 343.
Epochs          150. Current error: 0.0028855132. Bit fail 302.
Epochs          160. Current error: 0.0025165526. Bit fail 280.
Epochs          170. Current error: 0.0022868335. Bit fail 253.
Epochs          180. Current error: 0.0021089041. Bit fail 220.
Epochs          190. Current error: 0.0019043182. Bit fail 197.
Epochs          200. Current error: 0.0017739790. Bit fail 169.

Training for ANN period: 25.000000
No_of_input_layer_nodes =  102
No_of_hidden_layer_nodes =  102
No_of_output_layer_nodes =  5
Total_no_of_layers =  3
NN_PARAMS =

  scalar structure containing the fields:

    TrainingAlgorithm = rprop
    LearningRate =  0.70000
    ActivationHidden = Sigmoid
    ActivationOutput = Sigmoid
    ActivationSteepnessHidden =  0.50000
    ActivationSteepnessOutput =  0.50000
    TrainErrorFunction = TanH
    QuickPropDecay = -1.0000e-04
    QuickPropMu =  1.7500
    RPropIncreaseFactor =  1.2000
    RPropDecreaseFactor =  0.50000
    RPropDeltaMin = 0
    RPropDeltaMax =  50

Training Set Accuracy: 100.000000
End of training for ANN period: 25.000000

The accuracy obtained on all periods from 10 to 50 is at least 98%, with about two thirds being 100%. However, the point of this post is not to show results of any one set of NN features or training parameters, but rather that I can now be more productive by using the speed and flexibility of FANN in the development of my NN market classifier.

Wednesday, 15 August 2012

Progress Report on Neural Net

Well, reducing the number of nodes in the hidden layer didn't help much; if anything it made things look slightly more erratic. As a result I decided to increase the number of input features to 102, which gave much more pleasing results. A screen shot of this newer NN, in the bottom pane, is shown below
Comparing this with the earlier version shown in my previous post, for example by looking at the smooth uptrend in the middle, it can be seen that there are far fewer "false" market types indicated - a definite improvement. The moral seems to be that adding more informative features is the way to go.

However, this raises the problem of training time - it took about 30 hours to train this model using my current Octave scripts - which is far too long for me. Due to this I have decided to use the FANN library, fanntool and the octave-fann bindings for my future development of NNs. I've recently been playing around with these and I think that, in the long run, a lot of time will be saved, even though I will have to write a certain amount of "glue code" to achieve what I want. The above 102 input feature NN will be my reserve NN in the event that I can't get the FANN library, fanntool and octave-fann to work to my satisfaction.

Thursday, 2 August 2012

Results of Comparative Cross Validation Tests

As expected the NN achieved 100 % accuracy and my prediction of 20 % to 30 % accuracy for my current Naive Bayesian Classifier was more or less right - in various runs of sample sizes up to 50,000 it achieved accuracy rates of 30 % to 33 %. A screen shot of both classifiers applied to the last 200 days worth of S & P futures prices is shown below, with the Naive Bayesian in the upper pane and the NN in the lower pane.
However, despite it vastly superior performance in the tests, I don't really like the look of the NN on real data - it appears to be more erratic or noisier than the Bayesian classifier. I suspect that the NN may be overly complex, with 54 nodes in its one hidden layer. I shall try to improve the NN by reducing the number of hidden layer nodes to 25, and then seeing how that looks on real data.

Tuesday, 31 July 2012

Successful Completion of Neural Net Cross Validation Tests

In my last post I suggested that I was unsure of my coding of the cross validation test I had written so what I have done is take a new coding approach and completely rewritten the test, which I'm happy to say has been very successful. Using this newly coded implementation the out of sample accuracy of the trained neural nets is 100 %. As before, these tests were run overnight, but this time for a total of 2,400,000 separate test examples due to increased code efficiency.

The next test I'm going to code, more out of curiosity than anything else, is a concurrent cross validation test to test both my new neural net classifier algorithm and my Naive Bayesian Classifier together. I expect the NN to again obtain results similar to the above, but anticipate that the Naive Bayesian Classifier will perform quite poorly, achieving between 20 % to 30 % accuracy. I expect such low performance simply because the Naive Bayesian Classifier was developed using just 5 exemplar market type examples compared to 25 for the NN.

Friday, 20 July 2012

Neural Net Cross Validation Tests Completed

These tests were conducted by looping over a series of replicated "idealised" market types; in each iteration cyclic component amplitudes were randomly chosen to range between 1 and 25 and phase shifts were randomly chosen such that the phase shifts that appear in the training set markets do not also appear in these cross validation sets of markets. For each combination of the above one of 25 possible market type changes was also randomly applied and then the relevant feature vector for each iteration was extracted. These tests were run overnight for a total of 1,200,000 separate, iterated test examples. The results are shown below.

Complete Accuracy percentage: 33.574500

"Acceptable" mis-classifications percentages
Predicted = uwr & actual = unr: 5.083417
Predicted = unr & actual = uwr: 7.230083
Predicted = dwr & actual = dnr: 5.170667
Predicted = dnr & actual = dwr: 7.180167
Predicted = uwr & actual = cyc: 3.287917
Predicted = dwr & actual = cyc: 7.180167
Predicted = cyc & actual = uwr: 3.623167
Predicted = cyc & actual = dwr: 3.554333

Dubious, difficult to trade mis-classification percentages
Predicted = uwr & actual = dwr: 2.432667
Predicted = unr & actual = dwr: 2.432667
Predicted = dwr & actual = uwr: 2.351500
Predicted = dnr & actual = uwr: 2.351500

Completely wrong classifications percentages
Predicted = unr & actual = dnr: 0.210083
Predicted = dnr & actual = unr: 0.207333

The complete accuracy percentage requires no comment. The "acceptable" mis-classifications are situations in which the erroneous prediction would not have one trading in a manner that would be inconsistent with the actual state of the market i.e. a predicted uwr and actual cyc is a situation where the market is predicted to be trending upwards with 50% retracements, but in actual fact is trending sideways in a cyclic manner. In either case one might be tempted to trade the swings of the market, so the mis-classification is acceptable because the erroneous prediction would still have you trading in a manner suitable to the "true" situation.

The "Dubious, difficult to trade" mis-classifications are where the above does not apply, i.e. attempting to swing trade in a bullish manner when in fact the market is trending down. One might get lucky and extract some profit, but in all probability the net expectation would be to make a loss. The completely wrong classifications again require no comment. The above totals of percentages do not add up to 100 because some combinations of mis-classifications are not included in this summary.

I'm not overwhelmed by these results, and so I shall continue to extend the features vector with more informative features to hopefully improve future cross validation test results. Also, I'm not 100% sure that my test implementation code is doing what I think it is doing, so that needs checking too.

On a related note, I've just enrolled in another online course, this time devoted entirely to neural nets.

Wednesday, 18 July 2012

Neural Net Training Completed

I am pleased to say that I have now completed the training of my NN market type classifier.

In an earlier post I mentioned that I had constructed a training set of 324,000 training examples to train the NN on. However, my first attempt at using this in its entirety wasn't successful, with an accuracy on the training set of between 52 % to 58 %. What's more, one training "session" lasted approximately 24 hours, with only 50 calls to the fmincg.m function ( a Java implementation is available from here ), and this would need to be repeated many times. This wasn't a practical proposition and I began to think about ways in which I could speed up the training process. One possible solution was to use other software and in my search of the internet I discovered the FANN library and the Fanntool GUI. After a close reading of the manuals I decided that for my purposes this wasn't the route I wanted to take, but in the future I may come back to this, particularly since the library has bindings to Octave.

After some consideration I decided to split the training set into smaller sets, with the intention of training numerous NNs, each trained to classify a market with a given period, and then to index into the relevant NN in a manner similar to that used in my brute force similarity classifier. The code for this training session is shown below.
% first, training data "training_data.mat" should be loaded in command line

clear -exclusive X y accurate_period % clear everything except y and X, previously loaded from the command line

% ************************************************************************
% Comment out the non relevant preprocessing step for the test in question
% ************************************************************************
% use X as it is for X_train
X_train = X ;
% ************************************************************************
% change zeros in X into -1 for X_train
%X_train = X ;
%change = X_train(:,4:end) ;
%change( change == 0 ) = -1 ;
%X_train(:,4:end) = change ;
%*************************************************************************
% train on just one period's features in X
% index into training set based on period measurement

% create final matrices for storing all unrolled Theta1 and Theta2 and cost record
all_ur_Theta1 = zeros(2862,288) ;
all_ur_Theta2 = zeros(270,288) ;
cost_record = zeros(288,4) ;
col_count = 1 ;

for period = 15:50 
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , 2:54 ) ;
% and same for market labels vector y
y_train = y( [i_X] , 1 ) ;
% ************************************************************************

%% Setup the parameter sizes 
input_layer_size = size(X_train,2) ;   % the number of features ( columns ) in X_train
hidden_layer_size = size(X_train,2) ;  % original was 25 hidden units
num_labels = 5 ;                 % 5 labels, from 1 to 5  
                                 % 1=uwr 2=unr 3=dwr 4=dnr 5=cyc

for lambda = [ 0.01 0.03 0.1 0.3 1 3 10 30 ]

% Initializing Neural Network Parameters
initial_Theta1 = randInitializeWeights( input_layer_size , hidden_layer_size ) ;
initial_Theta2 = randInitializeWeights( hidden_layer_size , num_labels ) ;

% Unroll parameters
initial_nn_params = [ initial_Theta1(:) ; initial_Theta2(:) ] ;

%% =================== Training NN ===================
%  To train the neural network, we will now use "fmincg", which
%  is a function which works similarly to "fminunc". Recall that these
%  advanced optimizers are able to train our cost functions efficiently as
%  long as we provide them with the gradient computations.
%
fprintf( '\nTraining Neural Network... \n' )

%  After you have completed the assignment, change the MaxIter to a larger
%  value to see how more training helps.
options = optimset( 'MaxIter' , 200 ) ; % original was 50

% try different values of lambda
%lambda = 0.1 ;

% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction( p, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, X_train, y_train, lambda ) ;

% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[ nn_params , cost ] = fmincg( costFunction , initial_nn_params , options ) ;

% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape( nn_params( 1:hidden_layer_size * (input_layer_size + 1) ) , ...
                 hidden_layer_size , (input_layer_size + 1) ) ;

Theta2 = reshape( nn_params( (1 + (hidden_layer_size * (input_layer_size + 1))):end ) , ...
                 num_labels , (hidden_layer_size + 1) ) ;

%% ================= Implement Predict =================
%  After training the neural network, we would like to use it to predict
%  the labels. You will now implement the "predict" function to use the
%  neural network to predict the labels of the training set. This lets
%  you compute the training set accuracy.

pred = predict( Theta1 , Theta2 , X_train ) ;
training_set_accuracy = mean( double(pred == y_train) ) * 100.0 ;

fprintf( 'Training Set Accuracy: %f\n' , training_set_accuracy ) ;
fprintf( 'for lambda value of: %f\n' , lambda ) ;
fprintf( 'and period: %f\n' , period ) ;

% write to all_ur_Theta1 & all_ur_Theta2 & cost record
all_ur_Theta1(:,col_count) = Theta1(:) ;
all_ur_Theta2(:,col_count) = Theta2(:) ;
cost_record(col_count,1) = period ;
cost_record(col_count,2) = lambda ;
cost_record(col_count,3) = training_set_accuracy ;
cost_record(col_count,4) = cost(end) ;
col_count = col_count + 1 ;

end % lambda loop

end % period loop

save -binary all_ur_Thetas.mat all_ur_Theta1 all_ur_Theta2 cost_record
With 200 calls to the fmincg.m function this took an overnight run to complete, but in the morning I had extremely good results. For every period there was a trained NN that obtained 100 % accuracy. In fact for most periods there were several values for lambda ( a regularisation term to avoid over-fitting ) that gave 100 % accuracy, in which case I took the NN that had the lowest cost for 100 % accuracy.

So now I have a set of trained NNs, and the next step will be to test them on a cross validation set of my normal "ideal" market types, which will be the subject of my next post.

Monday, 16 July 2012

Jack Schwagger on Youtube

I have just watched a very interesting Youtube video of Jack Schwagger, of Market Wizards fame, giving a presentation. Well worth watching.

Update on Neural Network:- as I write this I have a NN training session running in Octave, which looks very promising. More in a new post in a day or so.

Thursday, 12 July 2012

Brute Force Classifier in Action

As an update to my recent post, here is a short video of the brute force similarity search classifier in action.

Non-embedded view here.
The coloured coded candlestick bars are coloured thus: purple = a cyclic market classification; green = up with retracement; blue = up with no retracement; yellow = down with retracement; red = down with no retracement. The upper price series is the classification as per the brute force algorithm and the lower is the classification as per my Naive Bayesian classifier, shown for comparative purposes.  The cyan trend line is my implementation of a Kalman filter, and where this trend line extends out at the hard right hand edge of the chart it changes to become the prediction of the Kalman filter for the next 10 bars, this prediction based on extending the pattern that was selected during the run of the brute force algorithm.

I will leave it up to readers to judge for themselves the efficacy of this new indicator, but I think it shows some promise, and I have some ideas about how it can be improved. This, however, is work for the future. For now I intend to crack on with working on my neural net classification algorithm. 

Saturday, 7 July 2012

A Possible Brute Force Similarity Classifier in Octave Code

As part of the development of my neural net classifier it has been necessary to use training data and as usual I have been using my model market types. To increase the amount of such training data I have extended the range of the data to include a change in market type half way through the cycle of one measured cyclic period. I have done this in increments of 1 degree from 1 degree to 360 degrees of a sine wave, for periods 15 to 50, for all possible combinations of market type, for a total database of 324,000 possible market model patterns. However, it struck me after reading this pdf that I could use this database as the basis of what is called in the pdf a "brute force similarity search" classifier. Below is my proof of concept Octave code implementation of such a classifier,
% first, training data "training_data.mat" should be loaded in command line

clear -exclusive X y % clear everything except y and X, previously loaded from the command line

lookup_value = input( 'Enter a number from 1 to 324,000 to choose a lookup candidate row from X: ' ) ;
fprintf( 'Based on this choice the market type to look up is :- ' ) ;
y( lookup_value , 1 )

tic() ;

% index into training set based on period measurement
[i_X j_X] = find( X(:,1) == X( lookup_value , 1 ) ) ;

% keep a record of all i_X indexes
all_i_X = i_X ;

% extract the relevant part of X using above index
X_look_up_matrix = X( [i_X] , 4:54 ) ;

% and same for market labels vector y
y_look_up_vector = y( [i_X] , 1 ) ;

% find pattern in X_look_up_matrix that minimises Euclidean distance between itself and the training example randomly taken from X
[ euc_dist_min i_euc_dist_min ] = min( sum( ( repmat( X(lookup_value,4:54), size(X_look_up_matrix,1), 1) .- X_look_up_matrix ) .^ 2.0 , 2 , 'extra' ) ) ;

fprintf( 'and the algo returns a market type of :- ' ) ;
% take this minimum distance vector index to get predicted market type
y_look_up_vector( i_euc_dist_min , 1 )

fprintf( '\nwith a calculated Euclidean distance of :- ' ) ;
double(euc_dist_min)
fprintf( 'which ideally should be 0.0 on this X test set.\n' ) ;

fprintf( '\nOriginal lookup row check.\n' ) ;
original_i_X_check = all_i_X( i_euc_dist_min , 1 )
fprintf( 'which ideally should be the same as row choice entered.\n' ) ;

fprintf( '\nTime for algo to run.\n' ) ;

toc() ;
where X is the database already mentioned and y is a vector containing the market type labels. Typical terminal output of this code is

octave:1> bf_pattern_recognition
Enter a number from 1 to 324,000 to choose a lookup candidate row from X: 100235
Based on this choice the market type to look up is :- ans =  3
and the algo returns a market type of :- ans =  3

with a calculated Euclidean distance of :- ans = 0
which ideally should be 0.0 on this X test set.

Original lookup row check.
original_i_X_check =  100235
which ideally should be the same as row choice entered.

Time for algo to run.
Elapsed time is 0.1130519 seconds.
octave:2>

Of course it obtains 100 % accuracy on the test set X because the original choice of pattern to be matched comes from X so there is always an exact match to be found. The important thing is that this is a workable algorithm which, making allowances for all the print statements included in the above code, runs in hundredths of a second.

This speed, despite having such a large database to search through, is achieved by indexing into the database by the measured period of the pattern to be matched, which is the first entry on each line. This reduces the search base down to a more manageable 9000 row matrix, and then one line of vectorised code is used to perform the actual Euclidean distance search and classification.

Another possible advantage of this approach on real market data is that, having hopefully accurately classified the data, the matched pattern in the database can be extrapolated under the assumption that the market model will persist for the next 5 to 10 bars, to make a prediction of near future prices. I shall certainly be doing more work will this classifying algorithm!


Saturday, 30 June 2012

Machine Learning Course Completed

I'm pleased to say that I have now completed Andrew Ng's machine learning course, which is offered through Coursera. This post is not intended to be a review of the course, which in my opinion is extremely good and very useful, but more of a reflection of my thoughts and what I think will be useful for me personally.

Firstly, I was pleasantly surprised that the software/programming language of instruction was Octave, which regular readers of this blog will know is my main software of choice. Apart from learning the concepts of ML, I also picked up some handy tips for Octave programming, and more importantly for me I now have a set of working Octave ML functions that I can use immediately in my system development.

In my previous post I mentioned that my first attempt at using ML will be to use a Neural Net to classify market types. As background to this, readers might be interested in a pdf file of the video lectures, available from here, which was put together and posted on the course discussion forum by another student - I think this is very good and all credit to said student, José Soares Augusto.

Due to the honour code ( or honor code for American readers ) of the course I will be unable to post the code that I wrote for the programming assignments. However, I do feel that I can post the code shown in the code box below, as the copyright notice allows it. A few slight changes I made are noted in the copyright notice. This is a minimisation function that was used in the training of the Neural Net assignment and was provided in the assignment download.
function [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
% Minimize a continuous differentialble multivariate function. Starting point
% is given by "X" (D by 1), and the function named in the string "f", must
% return a function value and a vector of partial derivatives. The Polack-
% Ribiere flavour of conjugate gradients is used to compute search directions,
% and a line search using quadratic and cubic polynomial approximations and the
% Wolfe-Powell stopping criteria is used together with the slope ratio method
% for guessing initial step sizes. Additionally a bunch of checks are made to
% make sure that exploration is taking place and that extrapolation will not
% be unboundedly large. The "length" gives the length of the run: if it is
% positive, it gives the maximum number of line searches, if negative its
% absolute gives the maximum allowed number of function evaluations. You can
% (optionally) give "length" a second component, which will indicate the
% reduction in function value to be expected in the first line-search (defaults
% to 1.0). The function returns when either its length is up, or if no further
% progress can be made (ie, we are at a minimum, or so close that due to
% numerical problems, we cannot get any closer). If the function terminates
% within a few iterations, it could be an indication that the function value
% and derivatives are not consistent (ie, there may be a bug in the
% implementation of your "f" function). The function returns the found
% solution "X", a vector of function values "fX" indicating the progress made
% and "i" the number of iterations (line searches or function evaluations,
% depending on the sign of "length") used.
%
% Usage: [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
%
% See also: checkgrad 
%
% Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13
%
% (C) Copyright 1999, 2000 & 2001, Carl Edward Rasmussen
% 
% Permission is granted for anyone to copy, use, or modify these
% programs and accompanying documents for purposes of research or
% education, provided this copyright notice is retained, and note is
% made of any changes that have been made.
% 
% These programs and documents are distributed without any warranty,
% express or implied.  As the programs were written for research
% purposes only, they have not been tested to the degree that would be
% advisable in any important application.  All use of these programs is
% entirely at the user's own risk.
%
% [ml-class] Changes Made:
% 1) Function name and argument specifications
% 2) Output display
%
% Dekalog Changes Made:
% Some lines have been altered, changing | to || and & to &&.
% This is to avoid "possible Matlab-style short-circuit operator" warnings
% being given when code is run under Octave. The lines where these changes
% have been made are indicated by comments at the end of each respective line. 

% Read options
if exist('options', 'var') && ~isempty(options) && isfield(options, 'MaxIter')
    length = options.MaxIter;
else
    length = 100;
end


RHO = 0.01;                            % a bunch of constants for line searches
SIG = 0.5;       % RHO and SIG are the constants in the Wolfe-Powell conditions
INT = 0.1;    % don't reevaluate within 0.1 of the limit of the current bracket
EXT = 3.0;                    % extrapolate maximum 3 times the current bracket
MAX = 20;                         % max 20 function evaluations per line search
RATIO = 100;                                      % maximum allowed slope ratio

argstr = ['feval(f, X'];                      % compose string used to call function
for i = 1:(nargin - 3)
  argstr = [argstr, ',P', int2str(i)];
end
argstr = [argstr, ')'];

if max(size(length)) == 2, red=length(2); length=length(1); else red=1; end
S=['Iteration '];

i = 0;                                            % zero the run length counter
ls_failed = 0;                             % no previous line search has failed
fX = [];
[f1 df1] = eval(argstr);                      % get function value and gradient
i = i + (length<0);                                            % count epochs?!
s = -df1;                                        % search direction is steepest
d1 = -s'*s;                                                 % this is the slope
z1 = red/(1-d1);                                  % initial step is red/(|s|+1)

while i < abs(length)                                      % while not finished
  i = i + (length>0);                                      % count iterations?!

  X0 = X; f0 = f1; df0 = df1;                   % make a copy of current values
  X = X + z1*s;                                             % begin line search
  [f2 df2] = eval(argstr);
  i = i + (length<0);                                          % count epochs?!
  d2 = df2'*s;
  f3 = f1; d3 = d1; z3 = -z1;             % initialize point 3 equal to point 1
  if length>0, M = MAX; else M = min(MAX, -length-i); end
  success = 0; limit = -1;                     % initialize quanteties
  while 1
    while ((f2 > f1+z1*RHO*d1) || (d2 > -SIG*d1)) && (M > 0) % | and & changed to || and && to avoid "possible Matlab-style short-circuit operator" warning 
      limit = z1;                                         % tighten the bracket
      if f2 > f1
        z2 = z3 - (0.5*d3*z3*z3)/(d3*z3+f2-f3);                 % quadratic fit
      else
        A = 6*(f2-f3)/z3+3*(d2+d3);                                 % cubic fit
        B = 3*(f3-f2)-z3*(d3+2*d2);
        z2 = (sqrt(B*B-A*d2*z3*z3)-B)/A;       % numerical error possible - ok!
      end
      if isnan(z2) || isinf(z2)     % | changed to || to avoid "possible Matlab-style short-circuit operator" warning 
        z2 = z3/2;                  % if we had a numerical problem then bisect
      end
      z2 = max(min(z2, INT*z3),(1-INT)*z3);  % don't accept too close to limits
      z1 = z1 + z2;                                           % update the step
      X = X + z2*s;
      [f2 df2] = eval(argstr);
      M = M - 1; i = i + (length<0);                           % count epochs?!
      d2 = df2'*s;
      z3 = z3-z2;                    % z3 is now relative to the location of z2
    end
    if f2 > f1+z1*RHO*d1 || d2 > -SIG*d1                    % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      break;                                                % this is a failure
    elseif d2 > SIG*d1
      success = 1; break;                                             % success
    elseif M == 0
      break;                                                          % failure
    end
    A = 6*(f2-f3)/z3+3*(d2+d3);                      % make cubic extrapolation
    B = 3*(f3-f2)-z3*(d3+2*d2);
    z2 = -d2*z3*z3/(B+sqrt(B*B-A*d2*z3*z3));        % num. error possible - ok!
    if ~isreal(z2) || isnan(z2) || isinf(z2) || z2 < 0   % num prob or wrong sign? % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      if limit < -0.5                               % if we have no upper limit
        z2 = z1 * (EXT-1);                 % the extrapolate the maximum amount
      else
        z2 = (limit-z1)/2;                                   % otherwise bisect
      end
    elseif (limit > -0.5) && (z2+z1 > limit)       % extraplation beyond max?   % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = (limit-z1)/2;                                               % bisect
    elseif (limit < -0.5) && (z2+z1 > z1*EXT)      % extrapolation beyond limit % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = z1*(EXT-1.0);                           % set to extrapolation limit
    elseif z2 < -z3*INT
      z2 = -z3*INT;
    elseif (limit > -0.5) && (z2 < (limit-z1)*(1.0-INT))   % too close to limit? % & changed to && to avoid "possible Matlab-style short-circuit operator" warning
      z2 = (limit-z1)*(1.0-INT);
    end
    f3 = f2; d3 = d2; z3 = -z2;                  % set point 3 equal to point 2
    z1 = z1 + z2; X = X + z2*s;                      % update current estimates
    [f2 df2] = eval(argstr);
    M = M - 1; i = i + (length<0);                             % count epochs?!
    d2 = df2'*s;
  end                                                      % end of line search

  if success                                         % if line search succeeded
    f1 = f2; fX = [fX' f1]';
    fprintf('%s %4i | Cost: %4.6e\r', S, i, f1);
    s = (df2'*df2-df1'*df2)/(df1'*df1)*s - df2;      % Polack-Ribiere direction
    tmp = df1; df1 = df2; df2 = tmp;                         % swap derivatives
    d2 = df1'*s;
    if d2 > 0                                      % new slope must be negative
      s = -df1;                              % otherwise use steepest direction
      d2 = -s'*s;    
    end
    z1 = z1 * min(RATIO, d1/(d2-realmin));          % slope ratio but max RATIO
    d1 = d2;
    ls_failed = 0;                              % this line search did not fail
  else
    X = X0; f1 = f0; df1 = df0;  % restore point from before failed line search
    if ls_failed || i > abs(length)         % line search failed twice in a row % | changed to || to avoid "possible Matlab-style short-circuit operator" warning
      break;                             % or we ran out of time, so we give up
    end
    tmp = df1; df1 = df2; df2 = tmp;                         % swap derivatives
    s = -df1;                                                    % try steepest
    d1 = -s'*s;
    z1 = 1/(1-d1);                     
    ls_failed = 1;                                    % this line search failed
  end
  if exist('OCTAVE_VERSION')
    fflush(stdout);
  end
end
fprintf('\n');

Finally, the last set of videos talked about "Artificial Data Synthesis," otherwise known as creating your own data for training purposes. This is basically what I had planned to do anyway ( see previous post ), but it is nice to learn that it is standard, accepted practice in the ML world. The first such way of creating data, in the context of Photo OCR, is shown below
where various font libraries are used against random backgrounds. I think this very much mirrors my planned approach of training on repeated sets of my "ideal time series" construct. However, another approach which could be used is "data distortion," shown in this next image
which is an approach that my creating synthetic data using FFT might be useful for, or alternatively a correlation and cointegration approach as shown in R code in this Quantitative Finance thread.

All in all, I'm quite excited by the possibilities of my new found knowledge, and I fully expect that in time, after development and testing, any Neural Net I develop will in fact replace my current Naive Bayesian classifier.

Friday, 22 June 2012

Neural Net to Replace my Bayesian Classifier?

Having more or less completed the machine learning course alluded to in my last post I thought I would have a go at programming a neural net as a possible replacement for my Naive Bayesian Classifier. In said machine learning course there was a task to programme a neural net to recognise the digits 0 to 9 inclusive, as shown below.
It struck me whilst completing this task that I could use the code I was writing to recognise price patterns as a way of classifying the market into one of my 5 states. Quickly writing a pre-processing script in Octave I have been able to produce scaled "snapshots" of my "ideal" markets types thus:-
As can be seen, different market types and periods produce distinctive looking plots, and it is my hope that a neural net can be trained to identify them and thus act as a market classifier. More in a future post.

Wednesday, 6 June 2012

Creation of a Simple Benchmark Suite

For some time now I have been toying with the idea of creating a simple benchmark suite to compare my own back test system performance with that of some public domain trading systems. I decided to select a few examples from the Trading Blox Users' Guide, specifically:
  • exponential moving average crossovers of periods 10-20 and 20-50
  • triple moving average crossover system with periods 10-20-50
  • Bollinger band breakouts of periods 20 and 50 with 1 & 2 standard deviations for exits and entries
  • donchian channel breakouts with periods 20-10 and 50-25
This is a total of 7 systems, and in the Rcpp code below these form a sort of "committee" to vote to be either long/short 1 contract, or neutral.

# This function takes as inputs vectors of opening and closing prices
# and creates a basic benchmark output suite of system equity curves 
# for the following basic trend following systems
# exponential moving average crossovers of 10-20 and 20-50
# triple moving average crossovers of 10-20-50
# bollinger band breakouts of 20 and 50 with 1 & 2 standard deviations
# donchian channel breakouts of 20-10 and 50-25
# The libraries required to compile this function are
# "Rcpp," "inline" and "compiler." The function is compiled by the command
# > source("basic_benchmark_equity.r") in the console/terminal.

library(Rcpp) # load the required library
library(inline) # load the required library
library(compiler) # load the required library

src <- '
#include 
#include 

Rcpp::NumericVector open(a) ;
Rcpp::NumericVector close(b) ;
Rcpp::NumericVector market_mode(c) ;
Rcpp::NumericVector kalman(d) ;
Rcpp::NumericVector tick_size(e) ;
Rcpp::NumericVector tick_value(f) ;
int n = open.size() ;
Rcpp::NumericVector sma_10(n) ;
Rcpp::NumericVector sma_20(n) ;
Rcpp::NumericVector sma_50(n) ;
Rcpp::NumericVector std_20(n) ;
Rcpp::NumericVector std_50(n) ;

// create equity output vectors
Rcpp::NumericVector market_mode_long_eq(n) ;
Rcpp::NumericVector market_mode_short_eq(n) ;
Rcpp::NumericVector market_mode_composite_eq(n) ;
Rcpp::NumericVector sma_10_20_eq(n) ; 
Rcpp::NumericVector sma_20_50_eq(n) ; 
Rcpp::NumericVector tma_eq(n) ; 
Rcpp::NumericVector bbo_20_eq(n) ;
Rcpp::NumericVector bbo_50_eq(n) ;
Rcpp::NumericVector donc_20_eq(n) ;
Rcpp::NumericVector donc_50_eq(n) ;
Rcpp::NumericVector composite_eq(n) ; 

// position vectors for benchmark systems
Rcpp::NumericVector sma_10_20_pv(1) ;
sma_10_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector sma_20_50_pv(1) ;
sma_20_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector tma_pv(1) ;
tma_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_20_pv(1) ;
bbo_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector bbo_50_pv(1) ;
bbo_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_20_pv(1) ;
donc_20_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector donc_50_pv(1) ;
donc_50_pv[0] = 0.0 ; // initialise to zero, no position

Rcpp::NumericVector comp_pv(1) ;
comp_pv[0] = 0.0 ; // initialise to zero, no position

// fill the equity curve vectors with zeros for "burn in" period
// and create the initial values for all indicators

for ( int ii = 0 ; ii < 50 ; ii++ ) {
    
    if ( ii >= 40 ) {
    sma_10[49] += close[ii] ; }

    if ( ii >= 30 ) {
    sma_20[49] += close[ii] ; }

    sma_50[49] += close[ii] ;

    std_20[ii] = 0.0 ;
    std_50[ii] = 0.0 ;

    market_mode_long_eq[ii] = 0.0 ;
    market_mode_short_eq[ii] = 0.0 ;
    market_mode_composite_eq = 0.0 ;
    sma_10_20_eq[ii] = 0.0 ;
    sma_20_50_eq[ii] = 0.0 ; 
    bbo_20_eq[ii] = 0.0 ;
    bbo_50_eq[ii] = 0.0 ;
    tma_eq[ii] = 0.0 ;
    donc_20_eq[ii] = 0.0 ;
    donc_50_eq[ii] = 0.0 ; 
    composite_eq[ii] = 0.0 ; } // end of initialising loop

    sma_10[49] = sma_10[49] / 10.0 ;
    sma_20[49] = sma_20[49] / 20.0 ;
    sma_50[49] = sma_50[49] / 50.0 ;

// the main calculation loop
for ( int ii = 50 ; ii < n-2 ; ii++ ) {

    // calculate the smas
    sma_10[ii] = ( sma_10[ii-1] - sma_10[ii-10] / 10.0 ) + ( close[ii] / 10.0 ) ;
    sma_20[ii] = ( sma_20[ii-1] - sma_20[ii-20] / 20.0 ) + ( close[ii] / 20.0 ) ;
    sma_50[ii] = ( sma_50[ii-1] - sma_50[ii-50] / 50.0 ) + ( close[ii] / 50.0 ) ;

      // calculate the standard deviations
      for ( int jj = 0 ; jj < 50 ; jj++ ) {
    
      if ( jj < 20 ) {
      std_20[ii] += ( close[ii-jj] - sma_20[ii] ) * ( close[ii-jj] - sma_20[ii] )  ; } // end of jj if

      std_50[ii] += ( close[ii-jj] - sma_50[ii] ) * ( close[ii-jj] - sma_50[ii] ) ; } // end of standard deviation loop

    std_20[ii] = sqrt( std_20[ii] / 20.0 ) ;
    std_50[ii] = sqrt( std_50[ii] / 50.0 ) ;

    //-------------------------------------------------------------------------------------------------------------------

    // calculate the equity values of the market modes
    // market_mode uwr and unr long signals
    if ( market_mode[ii] == 1 || market_mode[ii] == 2 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode dwr and dnr short signals
    if ( market_mode[ii] == 3 || market_mode[ii] == 4 ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    // calculate the equity values of the market modes
    // market_mode cyc long signals
    if ( market_mode[ii] == 0 && kalman[ii] > kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; }

    // market_mode cyc short signals
    if ( market_mode[ii] == 0 && kalman[ii] < kalman[ii-1] ) {
    market_mode_long_eq[ii] = market_mode_long_eq[ii-1] ;
    market_mode_short_eq[ii] = market_mode_short_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ;
    market_mode_composite_eq[ii] = market_mode_composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; }

    //----------------------------------------------------------------------------------------------------------------------------

    // calculate the equity values and positions of each benchmark system
    // sma_10_20_eq
    if ( sma_10[ii] > sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] < sma_20[ii] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] > sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_10_20_pv[0] = 1.0 ; } // long

    // sma_10_20_eq
    if ( sma_10[ii] == sma_20[ii] && sma_10[ii-1] < sma_20[ii-1] ) { 
    sma_10_20_eq[ii] = sma_10_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_10_20_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // sma_20_50_eq
    if ( sma_20[ii] > sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] < sma_50[ii] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] > sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
    sma_20_50_pv[0] = 1.0 ; } // long

    // sma_20_50_eq
    if ( sma_20[ii] == sma_50[ii] && sma_20[ii-1] < sma_50[ii-1] ) { 
    sma_20_50_eq[ii] = sma_20_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
    sma_20_50_pv[0] = -1.0 ; } // short

    //-----------------------------------------------------------------------------------------------------------

    // tma_eq
    if ( tma_pv[0] == 0.0 ) {

      // default position
      tma_eq[ii] = tma_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short

    } // end of tma_pv == 0.0 loop

    if ( tma_pv[0] == 1.0 ) {

      // default long position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( sma_10[ii] < sma_20[ii] && sma_10[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit long, go neutral

      if ( sma_10[ii] < sma_20[ii] && sma_20[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      tma_pv[0] = -1.0 ; } // short
    
    } // end of tma_pv == 1.0 loop

    if ( tma_pv[0] == -1.0 ) {

      // default short position
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( sma_10[ii] > sma_20[ii] && sma_10[ii] < sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] ; 
      tma_pv[0] = 0.0 ; } // exit short, go neutral

      if ( sma_10[ii] > sma_20[ii] && sma_20[ii] > sma_50[ii] ) { 
      tma_eq[ii] = tma_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      tma_pv[0] = 1.0 ; } // long

    } // end of tma_pv == -1.0 loop

    //------------------------------------------------------------------------------------------------------------

    // bbo_20_eq
    if ( bbo_20_pv[0] == 0.0 ) {

      // default position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short

    } // end of bbo_20_pv == 0.0 loop

    if ( bbo_20_pv[0] == 1.0 ) {

      // default long position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_20[ii] + std_20[ii] && close[ii] > sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_20[ii] - 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_20_pv[0] = -1.0 ; } // short
    
    } // end of bbo_20_pv == 1.0 loop

    if ( bbo_20_pv[0] == -1.0 ) {

      // default short position
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_20[ii] - std_20[ii] && close[ii] < sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] ; 
      bbo_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_20[ii] + 2.0 * std_20[ii] ) { 
      bbo_20_eq[ii] = bbo_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_20_pv[0] = 1.0 ; } // long

    } // end of bbo_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // bbo_50_eq
    if ( bbo_50_pv[0] == 0.0 ) {

      // default position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short

    } // end of bbo_50_pv == 0.0 loop

    if ( bbo_50_pv[0] == 1.0 ) {

      // default long position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < sma_50[ii] + std_50[ii] && close[ii] > sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < sma_50[ii] - 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      bbo_50_pv[0] = -1.0 ; } // short
    
    } // end of bbo_50_pv == 1.0 loop

    if ( bbo_50_pv[0] == -1.0 ) {

      // default short position
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > sma_50[ii] - std_50[ii] && close[ii] < sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] ; 
      bbo_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > sma_50[ii] + 2.0 * std_50[ii] ) { 
      bbo_50_eq[ii] = bbo_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      bbo_50_pv[0] = 1.0 ; } // long

    } // end of bbo_50_pv == -1.0 loop

    //-----------------------------------------------------------------------------------------------------

    // donc_20_eq
    if ( donc_20_pv[0] == 0.0 ) {

      // default position
      donc_20_eq[ii] = donc_20_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short

    } // end of donc_20_pv == 0.0 loop

    if ( donc_20_pv[0] == 1.0 ) {

      // default long position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-10], &close[ii] ) && close[ii] > *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_20_pv[0] = -1.0 ; } // short
    
    } // end of donc_20_pv == 1.0 loop

    if ( donc_20_pv[0] == -1.0 ) {

      // default short position
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-10], &close[ii] ) && close[ii] < *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] ; 
      donc_20_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-20], &close[ii] ) ) { 
      donc_20_eq[ii] = donc_20_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_20_pv[0] = 1.0 ; } // long

    } // end of donc_20_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // donc_50_eq
    if ( donc_50_pv[0] == 0.0 ) {

      // default position
      donc_50_eq[ii] = donc_50_eq[ii-1] ;

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short

    } // end of donc_50_pv == 0.0 loop

    if ( donc_50_pv[0] == 1.0 ) {

      // default long position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; // long

      // unless one of the two following conditions is true

      if ( close[ii] < *std::min_element( &close[ii-25], &close[ii] ) && close[ii] > *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit long, go neutral

      if ( close[ii] < *std::min_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; 
      donc_50_pv[0] = -1.0 ; } // short
    
    } // end of donc_50_pv == 1.0 loop

    if ( donc_50_pv[0] == -1.0 ) {

      // default short position
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; // short

      // unless one of the two following conditions is true

      if ( close[ii] > *std::max_element( &close[ii-25], &close[ii] ) && close[ii] < *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] ; 
      donc_50_pv[0] = 0.0 ; } // exit short, go neutral

      if ( close[ii] > *std::max_element( &close[ii-50], &close[ii] ) ) { 
      donc_50_eq[ii] = donc_50_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; 
      donc_50_pv[0] = 1.0 ; } // long

    } // end of donc_50_pv == -1.0 loop

    //-------------------------------------------------------------------------------------------------

    // composite_eq
    comp_pv[0] = sma_10_20_pv[0] + sma_20_50_pv[0] + tma_pv[0] + bbo_20_pv[0] + bbo_50_pv[0] + donc_20_pv[0] + donc_50_pv[0] ;
    
    if ( comp_pv[0] > 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+2]-open[ii+1])/tick_size[0] ) ; } // long

    if ( comp_pv[0] < 0 ) {
    composite_eq[ii] = composite_eq[ii-1] + tick_value[0] * ( (open[ii+1]-open[ii+2])/tick_size[0] ) ; } // short 

    if ( comp_pv[0] == 0 ) {
    composite_eq[ii] = composite_eq[ii-1] ; } // neutral 

} // end of main for loop

// Now fill in the last two spaces in the equity vectors
market_mode_long_eq[n-1] = market_mode_long_eq[n-3] ;
market_mode_long_eq[n-2] = market_mode_long_eq[n-3] ;

market_mode_short_eq[n-1] = market_mode_short_eq[n-3] ;
market_mode_short_eq[n-2] = market_mode_short_eq[n-3] ;

market_mode_composite_eq[n-1] = market_mode_composite_eq[n-3] ;
market_mode_composite_eq[n-2] = market_mode_composite_eq[n-3] ;

sma_10_20_eq[n-1] = sma_10_20_eq[n-3] ;
sma_10_20_eq[n-2] = sma_10_20_eq[n-3] ;

sma_20_50_eq[n-1] = sma_20_50_eq[n-3] ;
sma_20_50_eq[n-2] = sma_20_50_eq[n-3] ;

tma_eq[n-1] = tma_eq[n-3] ;
tma_eq[n-2] = tma_eq[n-3] ;

bbo_20_eq[n-1] = bbo_20_eq[n-3] ;
bbo_20_eq[n-2] = bbo_20_eq[n-3] ;

bbo_50_eq[n-1] = bbo_50_eq[n-3] ;
bbo_50_eq[n-2] = bbo_50_eq[n-3] ;

donc_20_eq[n-1] = donc_20_eq[n-3] ;
donc_20_eq[n-2] = donc_20_eq[n-3] ;

donc_50_eq[n-1] = donc_50_eq[n-3] ;
donc_50_eq[n-2] = donc_50_eq[n-3] ;

composite_eq[n-1] = composite_eq[n-3] ;
composite_eq[n-2] = composite_eq[n-3] ;

return List::create(
  _["market_mode_long_eq"] = market_mode_long_eq ,
  _["market_mode_short_eq"] = market_mode_short_eq ,
  _["market_mode_composite_eq"] = market_mode_composite_eq ,
  _["sma_10_20_eq"] = sma_10_20_eq ,
  _["sma_20_50_eq"] = sma_20_50_eq , 
  _["tma_eq"] = tma_eq ,
  _["bbo_20_eq"] = bbo_20_eq ,
  _["bbo_50_eq"] = bbo_50_eq ,
  _["donc_20_eq"] = donc_20_eq ,
  _["donc_50_eq"] = donc_50_eq ,
  _["composite_eq"] = composite_eq ) ; '

basic_benchmark_equity <- cxxfunction(signature(a = "numeric", b = "numeric", c = "numeric",
                                d = "numeric", e = "numeric", f = "numeric"), body=src, 
                                plugin = "Rcpp")

This Rcpp function is then called thus, using Rstudio

library(xts)  # load the required library

# load the "indicator" file 
data <- read.csv(file="usdyenind",head=FALSE,sep=,)
tick_size <- 0.01
tick_value <- 12.50

# extract other vectors of interest
open <- data[,2]
close <- data[,5]
market_mode <- data[,228]
kalman <- data[,283]

results <- basic_benchmark_equity(open,close,market_mode,kalman,tick_size,tick_value)

# coerce the above results list object to a data frame object
results_df <- data.frame( results )
df_max <- max(results_df) # for scaling of results plot
df_min <- min(results_df) # for scaling of results plot

# and now create an xts object for plotting
results_xts <- xts(results_df,as.Date(data[,'V1']))

# a nice plot of the results_xts object
par(col="#0000FF")
plot(results_xts[,'market_mode_long_eq'],main="USDYEN Pair",ylab="$ Equity Value",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#B0171F")
plot(results_xts[,'market_mode_short_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#00FF00")
plot(results_xts[,'market_mode_composite_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="#808080")
plot(results_xts[,'sma_10_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'sma_20_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'tma_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'bbo_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_20_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE)
plot(results_xts[,'donc_50_eq'],main="",ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="black")
plot(results_xts[,'composite_eq'],main="",ylim=c(df_min,df_max),type="l")


to output .png files which I have strung together in this video
Non-embedded view here.
The light grey equity curves are the individual curves for the benchmark systems and the black is the "committee" 1 contract equity curve. Also shown are the long and short 1 contract equity curves ( blue and red respectively ), along with a green combined equity curve for these, for my Naive Bayesian Classifier following the simple rules
  • be long 1 contract if the market type is uwr, unr or cyclic with my Kalman filter pointing upwards
  • be short 1 contract if the market type is dwr, dnr or cyclic with my Kalman filter pointing downwards
Again this is just a toy example of the use of my Bayesian Classifier.  After I have completed Andrew Ng's machine learning course, as mentioned in my previous post, I will have a go at coding a neural net which may end up replacing the Bayesian Classifier.

Monday, 28 May 2012

Update on Gold Delta Solution

Back in February I published a post about the Delta solution for Gold, and this post is a follow up to that earlier post.

Below is an updated chart of Gold showing the next few turning points following on from where the previous chart ended.

My read of this chart is that MTD 1 (solid green line) is a high which came in early March, followed by a low MTD 2 which came in a week late (the second solid red line). The market then struggled to move up to a high MTD 3, indicating a very weak market, and now the market is moving down to a low MTD 4 (the second solid yellow line). Since the most recent MTD 3 high is lower than the previous MTD 1 high and it looks like the MTD 4 low will come in lower that the MTD 2 low, I'd say that on the MTD time frame Gold is now a bear market.

On an unrelated note, it has been more than a month since my last post. During this time I have been busy working through the free, online version of Andrew Ng's Machine Learning course. More on this in a future post.

Tuesday, 24 April 2012

First Use of Rcpp Package

In preparation for the tests I want to conduct, I have finally started to use the Rcpp package. The reason for this decision is straightforward; I cannot begin to imagine how the code for the tests I want to conduct could be vectorised, and since loops in R are slow and not recommended there is no realistic alternative to biting the Rcpp bullet. Another reason is that some R packages, such as the ttrTests package, require an R function as part of the input and using Rcpp will enable me to transfer my .oct C++ function coding directly into R without having to rewrite in R code itself. A simple proof of concept programming exercise I set myself is shown below.
rm(list=ls()) # clear the entire workspace
library(xts)  # load the required library
library(zoo)  # load the required library
library(Rcpp) # load the required library
library(inline) # load the required library
library(compiler) # load the required library

# load the "indicator" file 
data <- read.csv(file="sind",head=FALSE,sep=,)
tick_size <- 0.025
tick_value <- 12.50

# extract other vectors of interest
#open <- data[,2]
market_mode <- data[,228]
kalman <- data[,283]

# source the Rcpp inline file for the C++ code for the test in question
# this file effectively creates a function that takes input and gives test 
# output in the desired form e.g. equity curve, position vector etc. for
# further statistical tests in the R environment, e.g the ttrTests package.

source("basic_market_mode_equity.r")
results <- basic_market_mode_equity(data[,2],market_mode,kalman,tick_size,tick_value)
This is "normal" R code which
  • loads the required libraries
  • loads the required file(s) from disk and takes other inputs for the test in question
  • extracts the required information from the loaded file(s)
  • sources and then calls the test function, named "basic_market_mode_equity"
This second code block contains the actual C++ test function code, which is contained in a file named "basic_market_mode_equity.r"
# This function takes as inputs vectors of opening prices, the "market mode," 
# the kalman filter of vwap, and single values for tick size and tick value.
# The values of "market_mode" are as follows:-
# 0 = cyclic
# 1 = up with retracement
# 2 = up with no retracement
# 3 = down with retracement
# 4 = down with no retracement
# 
# This first test is very simple - be long when market mode is 1 or 2, 
# short when 3 or 4, and when 0 be long or short depending on the direction
# of the kalman filter. This test is just practice in coding using Rcpp:- 
# don't expect any meaningful results. Equity curves for each market mode 
# are the function outputs

src <- '
Rcpp::NumericVector open(a) ;
Rcpp::NumericVector market_mode(b) ;
Rcpp::NumericVector kalman(c) ;
Rcpp::NumericVector tick_size(d) ;
Rcpp::NumericVector tick_value(e) ;
int n = open.size() ;
Rcpp::NumericVector cyc(n) ; // create output vector
Rcpp::NumericVector uwr(n) ; // create output vector
Rcpp::NumericVector unr(n) ; // create output vector
Rcpp::NumericVector dwr(n) ; // create output vector
Rcpp::NumericVector dnr(n) ; // create output vector

// fill the equity_curve with zeros for "burn in" period
for ( int ii = 0 ; ii < 100 ; ii++ ) {
    cyc[ii] = 0.0 ;
    uwr[ii] = 0.0 ;
    unr[ii] = 0.0 ;
    dwr[ii] = 0.0 ;
    dnr[ii] = 0.0 ; }

for ( int ii = 100 ; ii < n ; ii++ ) {
    
    // uwr market type
    if ( market_mode[ii-2] == 1 ) { 
    cyc[ii] = cyc[ii-1] ;
    uwr[ii] = uwr[ii-1] + tick_value[0] * ( (open[ii]-open[ii-1])/tick_size[0] ) ;
    unr[ii] = unr[ii-1] ;
    dwr[ii] = dwr[ii-1] ;
    dnr[ii] = dnr[ii-1] ; }

    // unr market type
    if ( market_mode[ii-2] == 2 ) { 
    cyc[ii] = cyc[ii-1] ;
    uwr[ii] = uwr[ii-1] ; 
    unr[ii] = unr[ii-1] + tick_value[0] * ( (open[ii]-open[ii-1])/tick_size[0] ) ;
    dwr[ii] = dwr[ii-1] ;
    dnr[ii] = dnr[ii-1] ; }

    // dwr
    if ( market_mode[ii-2] == 3 ) { 
    cyc[ii] = cyc[ii-1] ;
    uwr[ii] = uwr[ii-1] ;
    unr[ii] = unr[ii-1] ;
    dwr[ii] = dwr[ii-1] + tick_value[0] * ( (open[ii-1]-open[ii])/tick_size[0] ) ; 
    dnr[ii] = dnr[ii-1] ; }

    // dnr
    if ( market_mode[ii-2] == 4 ) { 
    cyc[ii] = cyc[ii-1] ;
    uwr[ii] = uwr[ii-1] ;
    unr[ii] = unr[ii-1] ;
    dwr[ii] = dwr[ii-1] ; 
    dnr[ii] = dnr[ii-1] + tick_value[0] * ( (open[ii-1]-open[ii])/tick_size[0] ) ; }

    // cyc long
    if ( market_mode[ii-2] == 0 && kalman[ii-2] > kalman[ii-3] ) { 
    cyc[ii] = cyc[ii-1] + tick_value[0] * ( (open[ii]-open[ii-1])/tick_size[0] ) ;
    uwr[ii] = uwr[ii-1] ;
    unr[ii] = unr[ii-1] ;
    dwr[ii] = dwr[ii-1] ; 
    dnr[ii] = dnr[ii-1] ; }

    // cyc short
    if ( market_mode[ii-2] == 0 && kalman[ii-2] < kalman[ii-3] ) { 
    cyc[ii] = cyc[ii-1] + tick_value[0] * ( (open[ii-1]-open[ii])/tick_size[0] ) ;
    uwr[ii] = uwr[ii-1] ;
    unr[ii] = unr[ii-1] ;
    dwr[ii] = dwr[ii-1] ; 
    dnr[ii] = dnr[ii-1] ; }

} // end of main for loop

return List::create(
  _["cyc"] = cyc ,
  _["uwr"] = uwr ,
  _["unr"] = unr ,
  _["dwr"] = dwr ,
  _["dnr"] = dnr ) ; '

basic_market_mode_equity <- cxxfunction(signature(a = "numeric", b = "numeric", c = "numeric",
                                d = "numeric", e = "numeric"), body=src, 
                                plugin = "Rcpp")
which basically just outputs five equity curves corresponding to the identified market mode (see the comments in the code for more details).

This final code block is again "normal" R code
# coerce the above results list object to a data frame object
results_df <- data.frame( results )
df_max <- max(results_df) # for scaling of results plot
df_min <- min(results_df) # for scaling of results plot

# and now create an xts object for plotting
results_xts <- xts(results_df,as.Date(data[,'V1']))

# a nice plot of the results_xts object
plot(results_xts[,'cyc'],ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="cyan")
plot(results_xts[,'uwr'],ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="blue")
plot(results_xts[,'unr'],ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="green")
plot(results_xts[,'dwr'],ylim=c(df_min,df_max),type="l")
par(new=TRUE,col="red")
plot(results_xts[,'dnr'],ylim=c(df_min,df_max),type="l")
which produces this typical plot output.
This test is just a toy test and the results are not really important. The important thing is that I can now easily import the output of my Octave .oct C++ functions into R and conduct a whole range of tests utilising R packages from CRAN or simply write my own test routines in C++ (my intention) to run in R and not have to struggle with vectorising code for speed optimisation purposes.