I am pleased to say that all my recent work seems to have borne fruit: I have now managed to code up a training and testing routine in Octave that uses the FANN library and its Octave bindings. I think this has been some of my most challenging coding work so far, and it required many hours of web research and forum help to complete.
One frustration I find with using open source software is the sparse, and sometimes non-existent, documentation, so this blog post is partly intended as a guide for readers who may also wish to use FANN in Octave. The code in the code box below is roughly divided into these sections:
- Octave code to index into and extract the relevant data from previously saved files
- a section that uses Perl to format this data
- the Octave binding code that actually implements the FANN library functions to set up and train a NN
- a short bit of code to save and then test the NN on the training data
As the code itself is heavily commented, little further explanation is required.
% load training_data_1.mat on command line before running this script.
clear -exclusive X accurate_period y % clear everything except the loaded training data
yy = eye(5)(y,:) ; % using training labels y, create an output vector suitable for NN training
period = input('Enter period of interest: ') ;
%for period = 10:50
fprintf('\nTraining for ANN period: %f\n', period ) ;
% This first switch control block creates the training data by indexing, by period, into the
% data loaded from training_data_1.mat
switch (period)
case 10
% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;
% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;
train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;
case 50
% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;
% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;
train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;
otherwise
% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;
% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test_1 = X( [i_X] , : ) ;
% and take every other value
X_test_1 = X_test_1( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_1 = yy( [i_X] , : ) ;
% and take every other value
y_test_1 = y_test_1( (2:2:9000) , : ) ;
% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test_2 = X( [i_X] , : ) ;
% and take every other value
X_test_2 = X_test_2( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_2 = yy( [i_X] , : ) ;
% and take every other value
y_test_2 = y_test_2( (2:2:9000) , : ) ;
train_data = [ X_train y_train ] ;
test_data = [ [ X_test_1 y_test_1 ] ; [ X_test_2 y_test_2 ] ] ;
detect_optima = train_data( (60:60:9000) , : ) ;
endswitch % end of training data indexing switch
% now write this selected period data to -ascii files
save -ascii data_for_training train_data
save -ascii data_for_testing test_data
save -ascii detect_optima detect_optima % for use in Fanntool software
%************************************************************************
% Now the FANN training code ! *
%************************************************************************
% First set the parameters for the FANN structure
No_of_input_layer_nodes = 102
No_of_hidden_layer_nodes = 102
No_of_output_layer_nodes = 5
Total_no_of_layers = length( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] )
% save and write this FANN structure info and length of training data file into an -ascii file - "train_nn_from_this_file"
fid = fopen( 'train_nn_from_this_file' , 'w' ) ;
fprintf( fid , '%i %i %i\n' , rows(train_data) , No_of_input_layer_nodes , No_of_output_layer_nodes ) ; % rows() = number of training patterns
fclose(fid) ;
% now create the FANN formatted training file - "train_nn_from_this_file"
system( "perl perl_file_manipulate.pl <data_for_training >>train_nn_from_this_file" ) ;
%{
The above call to "system" interrupts, or pauses, Octave at this point. Now the "shell" or "bash"
takes over and calls the Perl script, "perl_file_manipulate.pl", with the redirections
"<data_for_training >>train_nn_from_this_file", where < indicates that the file "data_for_training"
is to be read by the Perl script and >> indicates that the Perl script's output is to be appended to
the file "train_nn_from_this_file". From the fopen and fclose operations above, the file being appended to
contains only the FANN structure info, e.g. 9000 102 5 on one line, while the file being read is the training
data of NN features and outputs extracted by the switch control structure above and written to -ascii files.
The contents of the Perl script file are:
#!/usr/bin/env perl
while (<>) {
my @f = split ;
print("@f[0..$#f-5]\n@f[-5..-1]\n") ;
}
After these Perl operations the file "train_nn_from_this_file" is correctly formatted for the FANN library calls
that are to come; e.g. the file looks like this:-
9000 102 5
-2.50350699e-09 -2.52301858e-09 -2.50273727e-09 -2.44301942e-09 -2.34482961e-09 -2.20974520e-09
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
etc.
When all this Perl processing is finished, control returns to Octave.
%}
%***************************************************************************
% Begin FANN training ! Hurrah ! *
%***************************************************************************
% create the FANN
ANN = fann_create( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] ) ;
% create the parameters for training the FANN in an Octave "struct." All parameters are explicitly stated and set to
% the default values. If not explicitly stated they would take these values anyway; they are given here just to show
% how this is done
NN_PARAMS = struct( "TrainingAlgorithm", 'rprop', "LearningRate", 0.7, "ActivationHidden", 'Sigmoid', "ActivationOutput", 'Sigmoid',...
"ActivationSteepnessHidden", 0.5, "ActivationSteepnessOutput", 0.5, "TrainErrorFunction", 'TanH', "QuickPropDecay", -0.0001,...
"QuickPropMu", 1.75, "RPropIncreaseFactor", 1.2, "RPropDecreaseFactor", 0.5, "RPropDeltaMin", 0.0, "RPropDeltaMax", 50.0 )
% and then set the parameters
fann_set_parameters( ANN , NN_PARAMS ) ;
% now train the FANN on data contained in file "train_nn_from_this_file"
fann_train( ANN, 'train_nn_from_this_file', 'MaxIterations', 200, 'DesiredError', 0.001, 'IterationsBetweenReports', 10 )
% save the trained FANN in a file e.g. "ann_25.net"
fann_save( ANN , [ "ann_" num2str(period) ".net" ] )
% Now test the ANN on the test_data set
% create ANN from saved fann_save file
ANN = fann_create( [ "ann_" num2str(period) ".net" ] ) ;
% run the trained ANN on the original feature training set, X_train
X_train_FANN_results = fann_run( ANN , X_train ) ;
% convert the X_train_FANN_results matrix to a single prediction vector
[dummy, prediction] = max( X_train_FANN_results, [], 2 ) ;
% compare accuracy of this NN prediction vector with the known labels in y for this period and display
[i_X j_X] = find( accurate_period(:,1) == period ) ;
fprintf('\nTraining Set Accuracy: %f\n', mean( double( prediction == y([i_X],:) ) ) * 100 ) ;
fprintf('End of training for ANN period: %f\n', period ) ;
%end % end of period for loop
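As an aside, for readers who would rather avoid Perl, the reformatting step can be sketched in Python. This is a hypothetical stand-in for perl_file_manipulate.pl, not part of my actual workflow, and it assumes (as above) that each data row holds the feature values followed by the 5 one-hot output values:

```python
def fann_format(lines):
    """Split each whitespace-separated data row into two lines:
    the feature values, then the final 5 one-hot output values,
    as the FANN training-file body format expects."""
    out = []
    for line in lines:
        f = line.split()
        out.append(" ".join(f[:-5]))   # all but the last 5 values: the features
        out.append(" ".join(f[-5:]))   # the last 5 values: the one-hot outputs
    return out

# toy row: three feature values followed by a 5-value one-hot output
demo = fann_format(["0.1 0.2 0.3 0 0 0 0 1"])
print("\n".join(demo))
```

Appended after the one-line structure header, this produces the same alternating feature/output layout shown in the example file below.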
Typical terminal output during the running of this code looks like this:
octave:143> net_train_octave
Enter period of interest: 25
Max epochs 200. Desired error: 0.0010000000.
Epochs 1. Current error: 0.2537834346. Bit fail 45000.
Epochs 10. Current error: 0.1802092344. Bit fail 20947.
Epochs 20. Current error: 0.0793143436. Bit fail 7380.
Epochs 30. Current error: 0.0403240845. Bit fail 5215.
Epochs 40. Current error: 0.0254898760. Bit fail 2853.
Epochs 50. Current error: 0.0180807728. Bit fail 1611.
Epochs 60. Current error: 0.0150692556. Bit fail 1414.
Epochs 70. Current error: 0.0119200321. Bit fail 1187.
Epochs 80. Current error: 0.0091521516. Bit fail 937.
Epochs 90. Current error: 0.0073408978. Bit fail 670.
Epochs 100. Current error: 0.0060765576. Bit fail 492.
Epochs 110. Current error: 0.0051601632. Bit fail 446.
Epochs 120. Current error: 0.0041675218. Bit fail 386.
Epochs 130. Current error: 0.0036309268. Bit fail 374.
Epochs 140. Current error: 0.0032380833. Bit fail 343.
Epochs 150. Current error: 0.0028855132. Bit fail 302.
Epochs 160. Current error: 0.0025165526. Bit fail 280.
Epochs 170. Current error: 0.0022868335. Bit fail 253.
Epochs 180. Current error: 0.0021089041. Bit fail 220.
Epochs 190. Current error: 0.0019043182. Bit fail 197.
Epochs 200. Current error: 0.0017739790. Bit fail 169.
Training for ANN period: 25.000000
No_of_input_layer_nodes = 102
No_of_hidden_layer_nodes = 102
No_of_output_layer_nodes = 5
Total_no_of_layers = 3
NN_PARAMS =
scalar structure containing the fields:
TrainingAlgorithm = rprop
LearningRate = 0.70000
ActivationHidden = Sigmoid
ActivationOutput = Sigmoid
ActivationSteepnessHidden = 0.50000
ActivationSteepnessOutput = 0.50000
TrainErrorFunction = TanH
QuickPropDecay = -1.0000e-04
QuickPropMu = 1.7500
RPropIncreaseFactor = 1.2000
RPropDecreaseFactor = 0.50000
RPropDeltaMin = 0
RPropDeltaMax = 50
Training Set Accuracy: 100.000000
End of training for ANN period: 25.000000
The accuracy obtained on all periods from 10 to 50 is at least 98%, with about two thirds being 100%. However, the point of this post is not to show the results of any one set of NN features or training parameters, but rather that I can now be more productive by using the speed and flexibility of FANN in the development of my NN market classifier.
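Finally, for readers more familiar with Python/NumPy, here is a sketch of the label handling used in the Octave script above: yy = eye(5)(y,:) builds one-hot training targets, and max( ... , [], 2 ) recovers a prediction vector from the network's outputs. The toy labels and the stand-in for the network output are illustrative only:

```python
import numpy as np

y = np.array([1, 3, 5, 2])               # class labels 1..5 (Octave is 1-indexed)
yy = np.eye(5)[y - 1]                    # one-hot rows, e.g. label 3 -> [0 0 1 0 0]

results = yy * 0.9 + 0.05                # toy stand-in for the fann_run() output matrix
prediction = results.argmax(axis=1) + 1  # row-wise max column, back to 1-indexed labels
accuracy = (prediction == y).mean() * 100
print(accuracy)                          # 100.0 on this toy data
```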