Wednesday 27 February 2013

Restricted Boltzmann Machine

In an earlier post I said that I would write about Restricted Boltzmann Machines, and now that I have begun adapting the code from the Geoffrey Hinton course I took, this is the first of possibly several posts on the topic.

Essentially, I am going to use the RBM to conduct unsupervised learning on unlabelled, real market data, using some of the indicators I have developed, in order to extract relevant features and initialise the input-to-hidden layer weights of my market classifying neural net. I will then conduct backpropagation training of this feedforward neural network using the labelled data from my usual, idealised market types.
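For readers who haven't seen the mechanics before, the sketch below shows the sort of single-step contrastive divergence (CD-1) update at the heart of RBM training. It is a minimal NumPy illustration under my own assumed names and hyperparameters (train_rbm, the learning rate, etc.), not the actual course code I am adapting.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, n_iters=50000, lr=0.01, rng=None):
    """Train an RBM with single-step contrastive divergence (CD-1).

    data: (n_samples, n_visible) matrix of input features scaled to [0, 1].
    Returns the learned visible-to-hidden weights and the two bias vectors.
    This is an illustrative sketch, not the Hinton course code.
    """
    rng = rng or np.random.default_rng(0)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)

    for _ in range(n_iters):
        # positive phase: clamp the data on the visible units
        v0 = data
        p_h0 = sigmoid(v0 @ W + b_hid)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)  # sample hidden states

        # negative phase: one step of alternating Gibbs sampling
        p_v1 = sigmoid(h0 @ W.T + b_vis)   # "reconstruction" of the visibles
        p_h1 = sigmoid(p_v1 @ W + b_hid)

        # CD-1 update: <v0 h0> - <v1 h1>, averaged over the batch
        n = data.shape[0]
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
        b_vis += lr * (v0 - p_v1).mean(axis=0)
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)

    return W, b_vis, b_hid
```

The idea is that the weight update nudges the model towards reconstructing its own inputs, which is exactly the "discovering regularity in the input features" described in the quote below.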

Readers may well ask, "What's the point of doing this?" Well, taken from my course assignment notes, and edited by me for relevance to this post, we have:-

In the previous assignment we tried to reduce overfitting by learning less (early stopping, fewer hidden units etc.) RBMs, on the other hand, reduce overfitting by learning more: the RBM is being trained unsupervised so it's working to discover a lot of relevant regularity in the input features, and that learning distracts the model from excessively focusing on class labels. This is much more constructive distraction: instead of early stopping the model after a little learning we instead give the model something much more meaningful to do. ...it works great for regularisation, as well as training speed. ... In the previous assignment we did a lot of work selecting the right number of training iterations, the right number of hidden units, and the right weight decay. ... Now we don't need to do that at all, ... the unsupervised training of the RBM provides all the regularisation we need. If we select a decent learning rate, that will be enough, and we'll use lots of hidden units because we're much less worried about overfitting now.

Of course, a picture is worth a thousand words, so below are a 2D and a 3D picture.

These two pictures show the input-to-hidden layer weights after only two iterations of RBM training, and effectively represent a "typical" random initialisation of the weights prior to backpropagation training. It is from this type of random start that the class-labelled data would normally be used to train the NN.
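For anyone wanting to produce similar 2D and 3D views of a weight matrix, something along the lines of the matplotlib sketch below would do. The plot_weights function is my own illustration, not the code that generated the pictures above.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib

def plot_weights(W):
    """Show an input-to-hidden weight matrix as a 2D heatmap and a 3D surface."""
    fig = plt.figure(figsize=(10, 4))

    # 2D view: colour-mapped weight magnitudes
    ax2d = fig.add_subplot(1, 2, 1)
    im = ax2d.imshow(W, aspect="auto", cmap="jet")
    fig.colorbar(im, ax=ax2d)
    ax2d.set_xlabel("hidden unit")
    ax2d.set_ylabel("input feature")

    # 3D view: the same matrix as a surface
    ax3d = fig.add_subplot(1, 2, 2, projection="3d")
    X, Y = np.meshgrid(np.arange(W.shape[1]), np.arange(W.shape[0]))
    ax3d.plot_surface(X, Y, W, cmap="jet")

    plt.show()
```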

These next two pictures tell a different story.

These show the weights after 50,000 iterations of RBM training. Quite a difference, and it is from this sort of start that I will now train my market classifier NN using the class-labelled data.
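For completeness, the snippet below shows one plausible way to carry the RBM-trained weights over as the starting point for the classifier, assuming the train_rbm sketch given earlier; the function and parameter names are illustrative only.

```python
import numpy as np

def init_classifier_from_rbm(W_rbm, b_hid, n_classes, rng=None):
    """Seed a one-hidden-layer classifier with RBM-trained weights.

    The RBM's visible-to-hidden weights become the input-to-hidden layer
    of the feedforward net; the hidden-to-output layer starts small and
    random and is learned by backpropagation on the class-labelled data.
    Illustrative sketch only.
    """
    rng = rng or np.random.default_rng(0)
    n_hidden = W_rbm.shape[1]
    return {
        "W_in_hid": W_rbm.copy(),   # pretrained start, fine-tuned by backprop
        "b_hid": b_hid.copy(),
        "W_hid_out": 0.01 * rng.standard_normal((n_hidden, n_classes)),
        "b_out": np.zeros(n_classes),
    }
```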

Some features are easily seen. Firstly, the six columns on the "left" side of these pictures result from the cyclic period features in the real data, expressed in binary form, and effectively form the weights that will attach to the NN bias units. Secondly, the "right" side shows the most recent data in the lookback window applied to the real market data. The weights here have greater magnitude than those further back in the window, reflecting the fact that shorter periods are more prevalent than longer ones and that, as one might intuitively expect, more recent data has greater importance than older data. Finally, the colour mapping shows that across the entire weight matrix the magnitude of the values has been decreased by the RBM training, showing its regularisation effect.
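As an aside, a six-bit binary encoding of a measured cycle period could be as simple as the hypothetical helper below; the exact encoding scheme is not important to the point being made, so treat this purely as an illustration.

```python
def period_to_bits(period, n_bits=6):
    """Encode a measured cycle period as n_bits binary (0/1) features.

    Six bits cover periods up to 63 bars. Hypothetical illustration only;
    the actual encoding used for the features above may differ.
    """
    return [(period >> i) & 1 for i in reversed(range(n_bits))]

print(period_to_bits(20))  # -> [0, 1, 0, 1, 0, 0]
```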

1 comment:

Unknown said...

Hi,
I'm taking a similar approach to tackle market data using assignments from Hinton's class. I started taking the class in August 2013 and so far I've implemented an adaptation of assignment 2 to predict the next candle on daily market data (using 250 k-means clusters of that data as "words").
I just got to the section on Boltzmann machines and I'm preparing to do the assignment and later adapt it to market data.
How were your results pursuing this path? Care to share any pointers?
thanks.