The next round of cross-validation tests shows that 600 to 1000 epochs per NN is the optimal range for training. The NN architecture has also evolved slightly: the output layer now has two output nodes, one for each class in the binary NN classifier.
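With two output nodes rather than one, each binary class label is typically encoded as a one-hot target vector during training. The author's code is in Octave; the following is a minimal Python sketch of that encoding (the function name `to_one_hot` is my own, not from the original code):

```python
import numpy as np

def to_one_hot(labels):
    """Encode 0/1 class labels as (n, 2) one-hot target rows,
    one column per output node of the binary classifier."""
    labels = np.asarray(labels)
    targets = np.zeros((len(labels), 2))
    targets[np.arange(len(labels)), labels] = 1.0
    return targets
```

During training, the network's two outputs are then compared against these target rows instead of a single scalar label.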

In an earlier post I said that I would use a hyperbolic tangent function as the activation function, as per Y. LeCun (1998). The actual function from this paper is

$$f(x) = 1.7159 \tanh\!\left(\tfrac{2}{3}x\right)$$

and its derivative is

$$f'(x) = \frac{2 \times 1.7159}{3}\left(1 - \tanh^2\!\left(\tfrac{2}{3}x\right)\right)$$

I have written Octave functions for these and they are now being used in the final CV test to determine the optimum regularisation term to be used during NN training.
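The original functions are in Octave; an equivalent Python sketch of the scaled tanh and its derivative looks like this (the constants 1.7159 and 2/3 are from LeCun (1998), chosen so that f(±1) ≈ ±1):

```python
import numpy as np

def scaled_tanh(x):
    # f(x) = 1.7159 * tanh((2/3) * x)
    return 1.7159 * np.tanh((2.0 / 3.0) * x)

def scaled_tanh_deriv(x):
    # f'(x) = (2 * 1.7159 / 3) * (1 - tanh^2((2/3) * x))
    t = np.tanh((2.0 / 3.0) * x)
    return (2.0 * 1.7159 / 3.0) * (1.0 - t * t)
```

A handy sanity check is that the analytic derivative matches a central finite difference of `scaled_tanh` at any point.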
