## Thursday, 9 January 2014

### Neural Net Walk Forward Octave Training Code

Following on from my last post, I have now completed the basic refactoring of my existing NN Octave code to a "Walk Forward" training regime. After some simple housekeeping code to load/extract data and create inputs, the basic nuts and bolts of this new walk forward Octave code are given in the code box below.

```octave
% get the corresponding targets - map the target labels as binary vectors of 1's and 0's for the class labels
n_classes = 3 ; % 3 outputs in softmax layer
targets = eye( n_classes )( l_s_n_pos_vec, : ) ; % there are 3 labels

% get NN features for all data
[ binary_features , scaled_features ] = windowed_nn_input_features( open, high, low, close, tick ) ;

tic ;

% a uniformly distributed randomness source for seeding & reproducibility purposes
global randomness_source

% vector to hold final NN classification results
nn_classification = zeros( length(close) , 2 ) ;

ii_begin = 2000 ;
ii_end = 2500 ;

for ii = ii_begin : ii_end

  % get the look back period
  lookback_period = max( period(ii-2,1)+2, period(ii,1) ) ;

  % extract features and targets from windowed_nn_input_features output
  training_binary_features = binary_features( ii-lookback_period:ii , : ) ;
  training_scaled_features = scaled_features( ii-lookback_period:ii , : ) ;
  training_targets = targets( ii-lookback_period:ii , : ) ;

  % set some default values for NN training
  n_hid = 2.0 * ( size( training_binary_features, 2 ) + size( training_scaled_features, 2 ) ) ; % units in hidden layer
  lr_rbm = 0.02 ;       % default value for learning rate for rbm training
  learning_rate = 0.5 ; % default value for learning rate for back prop training
  n_iterations = 500 ;  % number of training iterations to perform

  % get the rbm trained weights for the input layer to hidden layer for training_binary_features
  rbm_w_binary = train_rbm_binary_features( training_binary_features, n_hid, lr_rbm, n_iterations ) ;

  % and now get the rbm trained weights of the input layer to hidden layer for training_scaled_features
  rbm_w_scaled = train_rbm_scaled_features( training_scaled_features, n_hid, lr_rbm, n_iterations ) ;

  % now stack these (vertically) to form the initial input layer to hidden layer weight matrix
  initial_input_weights = [ rbm_w_binary ; rbm_w_scaled ] ;

  % now feedforward through hidden layer to get the hidden layer output
  hidden_layer_output = logistic( [ training_binary_features training_scaled_features ] * initial_input_weights ) ;

  % now add the hidden layer bias unit to the above output of the hidden layer and RBM train
  hidden_layer_output = [ ones( size(hidden_layer_output,1), 1 ) hidden_layer_output ] ;
  initial_softmax_weights = train_rbm_output_w( hidden_layer_output, n_classes, lr_rbm, n_iterations ) ;

  % now using the above rbm trained weight matrices as initial weights instead of random initialisation,
  % do the backpropagation training with targets by dropping the last two sets of features for which
  % there are no targets (the input layer bias unit is already part of the binary features)
  % First, extract the relevant features
  all_features = [ training_binary_features(1:end-2,:) training_scaled_features(1:end-2,:) ] ;
  training_targets = training_targets(1:end-2,:) ;

  % and do the backprop training
  [ final_input_w, final_softmax_w ] = rbm_backprop_training( all_features, training_targets, initial_input_weights, initial_softmax_weights, n_hid, n_iterations, learning_rate, 0.9, false) ;

  % get the NN classification of the previous bar ( ii-1 ), used as a fallback below
  val_1 = logistic( [ training_binary_features(end-1,:) training_scaled_features(end-1,:) ] * final_input_w ) ;
  val_2 = [ 1 val_1 ] * final_softmax_w ;
  class_normalizer = log_sum_exp_over_cols( val_2 ) ;
  log_class_prob = val_2 - repmat( class_normalizer , [ 1 , size( val_2 , 2 ) ] ) ;
  class_prob = exp( log_class_prob ) ;
  [ dump , choice_1 ] = max( class_prob , [] , 2 ) ;

  if ( choice_1 == 3 )
    choice_1 = ff_pos_vec(ii-2) ;
  end

  % get the NN classification of the ii-th bar for this iteration
  val_1 = logistic( [ training_binary_features(end,:) training_scaled_features(end,:) ] * final_input_w ) ;
  val_2 = [ 1 val_1 ] * final_softmax_w ;
  class_normalizer = log_sum_exp_over_cols( val_2 ) ;
  log_class_prob = val_2 - repmat( class_normalizer , [ 1 , size( val_2 , 2 ) ] ) ;
  class_prob = exp( log_class_prob ) ;
  [ dump , choice ] = max( class_prob , [] , 2 ) ;

  % column 1 holds the raw classification of the ii-th bar
  nn_classification( ii , 1 ) = choice ;

  % column 2 replaces class 3 with the previous bar's choice
  if ( choice == 3 )
    choice = choice_1 ;
  end

  nn_classification( ii , 2 ) = choice ;

end % end of ii loop

toc ;

% write to file for plotting in Gnuplot
% ( x_axis rather than axis, to avoid shadowing the built-in axis function )
x_axis = (ii_begin:1:ii_end)' ;
output_v2 = [ x_axis, open(ii_begin:ii_end,1), high(ii_begin:ii_end,1), low(ii_begin:ii_end,1), close(ii_begin:ii_end,1), nn_classification(ii_begin:ii_end,2), ff_pos_vec(ii_begin:ii_end,:) ] ;
dlmwrite( 'output_v2', output_v2 )
```

This code is quite heavily commented, but to make things clearer, here is what it does:

1. creates a matrix of binary features and a matrix of scaled features ( in the range 0 to 1 ) by calling the C++ .oct function "windowed_nn_input_features". The binary features matrix includes the input layer bias unit
2. RBM trains separately on each of the above features matrices to get weight matrices
3. vertically stacks ( one above the other ) the two weight matrices from step 2 to create a single, initial input layer to hidden layer weight matrix
4. matrix multiplies the input features by the combined RBM trained matrix from step 3 and feeds forward through the logistic function hidden layer to create a hidden layer output
5. adds a bias unit to the hidden layer output from step 4 and then RBM trains to get a Softmax weight matrix
6. uses the initial weight matrices from steps 3 and 5 instead of random initialisation for backpropagation training of a Feedforward neural network via the "rbm_backprop_training" function
7. uses the trained NN from step 6 to predict/classify the most recent candlestick bar and records this NN prediction/classification. Steps 2 to 7 are contained in a for loop which slides a moving window across the input data
8. finally writes output to file

At the moment the features I'm using are very simplistic, just for unit testing purposes. My next post will show the results of the above with a fuller set of features.
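
To make the forward pass in steps 3, 4 and 7 above a little more concrete, here is a minimal sketch on toy data. It is only an illustration under stated assumptions: the random weight matrices stand in for the RBM/backprop trained weights, all sizes are made up, and "logistic_fn" and "lse_over_cols" are hypothetical stand-ins for the "logistic" and "log_sum_exp_over_cols" helpers, whose source is not shown in this post.

```octave
% Toy forward pass: stacked weights -> logistic hidden layer -> softmax.
% All sizes, weights and helper names are illustrative assumptions.
rand( "state", 42 ) ; % fixed seed so the sketch is repeatable

n_cases = 5 ; n_binary = 4 ; n_scaled = 3 ; n_hid = 6 ; n_classes = 3 ;

binary_features = double( rand( n_cases, n_binary ) > 0.5 ) ;
scaled_features = rand( n_cases, n_scaled ) ;

% stand-ins for the two separately trained input to hidden weight matrices
w_binary = 0.1 * ( rand( n_binary, n_hid ) - 0.5 ) ;
w_scaled = 0.1 * ( rand( n_scaled, n_hid ) - 0.5 ) ;

% features sit side by side, so the weights are stacked one above the other
input_w = [ w_binary ; w_scaled ] ;

% hypothetical stand-in for the logistic helper
logistic_fn = @(x) 1 ./ ( 1 + exp( -x ) ) ;
hidden = logistic_fn( [ binary_features scaled_features ] * input_w ) ;

% add the hidden layer bias unit and compute the softmax layer input
softmax_w = 0.1 * ( rand( n_hid + 1, n_classes ) - 0.5 ) ;
class_input = [ ones( n_cases, 1 ) hidden ] * softmax_w ;

% hypothetical stand-in for log_sum_exp_over_cols: one normaliser per row,
% subtracting the row maximum first for numerical stability
lse_over_cols = @(a) max( a, [], 2 ) + log( sum( exp( bsxfun( @minus, a, max( a, [], 2 ) ) ), 2 ) ) ;

log_class_prob = bsxfun( @minus, class_input, lse_over_cols( class_input ) ) ;
class_prob = exp( log_class_prob ) ;   % each row sums to 1
[ ~, choice ] = max( class_prob, [], 2 ) ; % predicted class per row
```

Because the two feature matrices are concatenated column-wise, multiplying by the vertically stacked weights gives the same result as multiplying each feature matrix by its own weight matrix and adding the two products.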

Anonymous said...

as far as I know RBM accepts binary inputs or probabilities as an input.
So your scaled features are converted to probabilities somehow ??

Krzysztof

Dekalog said...

Hi Krzysztof,

The scaled features ( in the range 0 to 1 ) are used as probabilities of an input being on or not in the CD1 training. Uniform random numbers are generated and if scaled features > random numbers then the input is set to 1, otherwise to 0. In this way all the inputs are binary with different probabilities. The code I'm using is from Geoff Hinton's Coursera course material.
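
The sampling step described above can be sketched roughly as follows. This is only an illustration with made-up data and hypothetical variable names; it is not the course code itself.

```octave
% Treat scaled features in the range 0 to 1 as probabilities of an input being "on".
% Toy values and variable names are assumptions for illustration only.
rand( "state", 1 ) ; % fixed seed so the sketch is repeatable

scaled_features = [ 0.0 0.25 0.5 1.0 ] ; % toy probabilities
n_samples = 20000 ;

% one row of probabilities per sample
probs = repmat( scaled_features, n_samples, 1 ) ;

% input is set to 1 where the scaled feature exceeds a uniform random number, else 0
binary_sample = double( probs > rand( size( probs ) ) ) ;

% averaged over many samples, the binary inputs approach the scaled values
sample_mean = mean( binary_sample, 1 ) ;
disp( sample_mean ) ;
```

Each individual sample is strictly binary, but a feature with a scaled value of 0.25 is switched on in roughly a quarter of the samples, which is how the scaled features act as probabilities.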