Dekalog Blog

Wednesday, 24 September 2025

Expressing an Indicator in Neural Net form, Part 2.

Following on from my previous post, I have been experimenting with various tweaks to the basic set-up of expressing an indicator in the form of a neural net, shown again below to avoid the necessity of having readers flip between posts.

The tweaks explored relate to the weight matrices, activation functions, targets and loss functions, the end results of which are now briefly summarized in turn.

The weight matrices are sparse and with shared weights within each separate matrix because they simply calculate the 4 indicator outputs t which represent the indicator and 3 lagged outputs of it, hence sharing the same weights. The input weights matrix is a 40 by 20 matrix with only 10 unique parameters to train/update (expressed during training by averaging across the 40 non zero weights in the matrix). Similarly, the output weights matrix only has 1 unique trainable weight. Together with bias units (not shown in the diagram above) and the 4 decision weights, the whole neural net has a total of 16 trainable weights.

The activation units are exactly as discussed in my previous post and shown in the diagram above, namely tanh on the 2 hidden layers and a sigmoid on the final output layer.

For the targets I used the sign of the slope of a sliding, centred smooth of the OHLC bar opens, on the premise that the first trade opportunity would be acted upon at the opening price following a buy/sell signal calculated on the close of a bar. The smoothing eliminates the whipsaws that would result from a single adverse return in an otherwise smooth set of returns.

The loss is a combination of a weighted cross-entropy loss and a Sortino ratio loss. The weights for the weighted cross-entropy loss are simply the absolute values of the returns, the idea being that it is more important to get the direction right on the big returns. The Sortino loss is implemented by minimizing the negative of the Sortino ratio, i.e. maximizing the ratio. There are 2 versions of this Sortino loss: an analytical gradient that spreads the cost over each training sample per training epoch similar to the cross-entropy loss, and a global loss which uses finite differences over the 4 output decision weights. Both of these Sortino losses were coded with the help of deepseek-talkai. The losses are mixed via a mixing parameter, Lambda, which varies from 0 to 1, with 0 being a pure cross-entropy loss and 1 being pure Sortino loss.

The following .gif shows the training data equity curves for each type of loss (see the titles of each plot) for every hundredth epoch up to epoch 600 over a week's worth of 10 minute forex data limited to the trading hours described in my previous post. The legend in the top left corner identifies each equity curve.

Obviously these plots show that it is possible to improve results over those of the original indicator (shown in red in the above .gif), but of course these are in-sample results. My next post will be about cross validation and out-of sample test results.

Wednesday, 3 September 2025

Expressing an Indicator in Neural Net Form

Recently I started investigating relative rotation graphs with a view to perhaps implementing a version of this for use on forex currency pairs. The underlying idea of a relative rotation graph is to plot an asset's relative strength compared to a benchmark and the momentum of this relative strength and to plot this in a form similar to a Polar coordinate system plot, which rotates around a zero point representing zero relative strength and zero relative strength momentum. After some thought it occurred to me that rather than using relative strength against a benchmark I could use the underlying relative strengths of the individual currencies, as calculated by my currency strength indicator, against each other. Furthermore, these underlying log changes in the individual currencies can be normalised using the ideas of brownian motion and then averaged together over different look back periods to create a unique indicator.

This indicator was coded up in the "traditional" way that will be familiar to anybody who has ever tried coding a trading indicator in any coding language. However, I thought that if the calculation of this indicator could be expressed in the form of a Feedforward neural network then the optimisation opportunities available using backpropagation and regularization could be used to tweek the indicator in more useful ways than just varying a look back length and amount of averaging. After some work I was able to make this work in just these two lines of Octave code:

indicator_middle_layer = tanh( full_feature_matrix * input_weights ) ;
indicator_nn_output = tanh( indicator_middle_layer * output_weights ) ;

Of course, prior to calling these two lines of code, there is some feature engineering to create the input full_feature_matrix, and the input weights and output_weights matrices taken together are mathematically equivalent to the original indicator calculations. Finally, because this is a neural net expression of the indicator, the non-linear tanh activation function is applied to the hidden middle and output layers of the net.

The following plot shows the original indicator in black and the neural net version of it in blue

over the data shown in this plot of 10 minute bars of the EURUSD forex pair.

The red indicator in the plot above is the 1 bar momentum of the neural net indicator plot.

To judge the quality of this indicator I used the entropy measure (measured over 14,000+ 10 minute bars of EURUSD), the results of which are shown next.

entropy_indicator_original = 0.9485
entropy_indicator_nn_output = 0.9933

An entropy reading of 0.9933 is probably as good as any trading indicator could hope to achieve (a perfect reading is 1.0) and so the next thing was to quickly back test the indicator performance. Based on the logic of the indicator the obvious long (short) signals are being above (below) the zeroline, or equivalently the sign of the indicator, and for good measure I also tested the sign of the momentum and some simple moving averages thereof.

The following plot shows the equity curves of this quick test where it is visually clear that the blue equity curves are "the best" when plotted in relation to the black "buy and hold" equivalent equity curve. These represent the equity curves of a 3 bar simple moving average of the 1 bar momentum of both the original formulation of the indicator and the neural net implementation. I would point out that these equity curves represent the theoretical equity resulting from trading the London session after 9:00am BST and stopping at 7:00am EST (usually about noon BST) and then starting trading again at 9:00am EST until 17:00pm EST. This schedule avoids the opening sessions (7:00 to 9:00am) in both London and New York because, from my observations of many OHLC charts such as shown above, there are frequently wild swings where price is being pushed to significant points such as volume profile clusters, accumulations of buy/sell orders etc. and in my opinion no indicator can be expected to perform well in that sort of environment. Also avoided are the hours prior to 7:00am BST, i.e. the Asian session or overnight session.

Although these equity curves might not, at first glance, be that impressive, especially as they do not account for trading costs etc. my intent on doing these tests was to determine the configuration of a final "decision" output layer to be added to the neural net implementation of the indicator. A 3 bar simple moving average of the 1 bar momentum implies the necessity to include 4 consecutive readings of the indicator output as input to a final " decision" layer. The following simple, hand-drawn sketch shows what I mean:

A discussion of this will be the subject of my next post.

Wednesday, 11 June 2025

A Replacement for my PositionBook Charts using Tick Volumes?

At the end of my previous post I said that I would be looking into using tick volume to create a new indicator, and this post is about the work I have done on this idea.

At first I tried creating a more traditional type of indicator using tick volumes separated out into buy and sell volumes, but I quickly felt that this was not a useful investment of my time so I gave up on this idea. Instead, I have come up with a way to plot tick volume levels that are similar to my previously discussed positionbook chart type, which I was forced to give up on because Oanda have discontinued the API endpoint for downloading the required data. An example of the the new, tick volume equivalent is shown below, followed by a brief description of the methodology used to create it.

Starting with the basics, if we imagine a Doji bar, for every up (down) tick there must be a corresponding and opposing down (up) tick for the bar to open and close at the exact same price and therefore we can split the tick volume for the bar equally between buy and sell tick volumes. Similarly, if we imagine a bar that opens on the low (high) and closes on the high (low), the number of ticks within the entire bar tick range can be ascribed to buy (sell) volume and the balance divided between buy and sell.

e.g. a bar opens at the low , closes at the high, with a tick range of 10 and total tick volume of 50 then:

buy tick volume = 10 + ( 50 - 10)/2 = 30
sell tick volume = 50 - buy tick volume = 20

This idea can be generalised to the range of a candlestick body being appropriately allocated to buy or sell tick volume, with the remaining balance of the total bar volume being equally allocated to buy and sell. OK, so far so good, and nothing particularly ground breaking. For the want of explaining it in a more precise manner, using the "geometry" of a bar to allocate buy and sell volumes is something that can be found online in the formulation of more than a few indicators.

The next step step is to "smear" these buy and sell volumes equally across the whole range of the bar and then take the difference:

e.g. "smeared" buy - "smeared" sell = 30 / 10 - 20 / 10 = 1, thus allocate a tick difference value of +1 for each tick level within the 10 tick range of the bar.

Of course, over a large (e.g. 10 minute bar) this wouldn't necessarily be very informative as it is known (my volume profile bars) that the volume is usually unevenly spread across the range of any given bar. The solution to this is to apply the above methodology to the smallest bar possible, and with Oanda the smallest possible bar download is a 5 second bar. Thus what I have done is apply the above to each 5 second bar within a given 10 minute bar period and then accumulate the buy/sell/tick difference values across the individual tick levels within the 10 minute bar. This gives tick differences values that approximate the differences between each bar's separate buy and sell volume profiles.

The final step is to volume normalise the above calculations by using the total 10 minute bar tick volume such that tick differences within bars that have higher total tick volume have a greater weight than those in low tick volume bars. This is simply done by setting the total bar volume as the numerator and the tick difference as the denominator:

e.g. a tick difference of 2 at a tick level within a bar with a total 10 tick volume will get the weight

10 / 2 = 5

whilst the same tick difference in a 50 tick volume bar will get the weight

50 / 2 = 25

Finally, all the above is plotted as the backgound heatmap to a candlestick chart, but with a slight twist - exponential forgetting is applied along each individual tick level within the y-axis range such that if an individual tick level only has one price bar spanning it the colour slowly fades as we move along the x-axis, whilst if this level is revisited, just as with an exponential moving average, the more recent data is accumulatively weighted more. For the above plot the exponential alpha value is set at the equivalent of a 144 bar exponential moving average, i.e. the number of 10 minute bars in a 24 hour period. Shorter moving average equivalents just increase the speed at which the forgetting takes place, leading to shorter lines extending to the right, but accentuate the differences between levels; e.g. the following is the same plot as above with the equivalent of a 14 bar exponential moving average alpha value.

Earlier in this post I alluded to the possibility of this type of tick difference chart being a replacement for my unwillingly and forcefully retired PositionBook chart type. The similarities/equivalences between the two chart types I now discuss:

With the old PositionBook (PB) chart, traders' net positions at any given level and at 20 minute snapshot frequency were explicitly given by API data download and changes between snapshots were inferred by an optimisation routine. With these new TickDifference (TD) charts, traders' net positions are inferred via the methodology described above, i.e. higher normalised tick volumes at different tick levels imply a higher, net trader positioning at these levels, and changes over time in this positioning are approximated by the exponential forgetting factor.

In terms of plotting, both in the PB and TD charts, the intensities of the colours (blue for longs and red for shorts) reflect the relative importances of long/short positioning at different levels: the greater the intensity, the greater the difference between long and short positioning.

I shall now enter into a period of observational study of the usefulness of this chart type because, as the chart is inherently visual, I can't imagine how I could effectively test it in a more traditional, back testing manner. If any reader could suggest how this more traditional approach might be done, I'm all ears.

Saturday, 5 April 2025

Use of HDF5 Format, and Some Charting Improvements

Over the last few weeks/months I have found it necessary to revisit the basic infrastructure of my trading/computing set up due to increasing slowness of the various computing routines I have running.

The first issue I am now addressing is how I store my data on disc. When I first started I opted for csv text files, mostly due to my ignorance of other possibilities at the time and the fact that I could visually inspect the files to manually check things. However, for my data retrieval and use needs this is now becoming too slow and burdensome and so I have made the decision to switch over to the HDF5 format and use the hdf5oct package for Octave for my data storage and file I/O needs. This dramatically speeds up data loading and will enable me to consolidate my disparate csv text files into individual tradable instrument HDF5 files, where all the data for the said tradable instrument is contained in a single HDF5 file. This data migration is an ongoing process that will continue for a few months, with the associated changes in my workflow, rewriting of some scripts and functions, cronjobs, etc.

The next thing I have done is slightly improved the calculation methodology and plotting of my volume profile bars, and the following chart shows the new version volume profile bars for the last three, 10 minute bars on the EUR-USD forex pair for trading on Friday, 4th April 2025,

whilst this second chart shows the equivalent time period with 5 second OHLC candles and the associated tick volumes for each bar.

The vertical green lines on this second chart delineate the 5 second bars into the corresponding 10 minute bars in the first chart. I think it is quite easy to visually see the correspondence between the 10 minute volume profile bars and the 5 second OHLC bars.

Another thing I also plan to do in the forthcoming weeks is to use the tick volume (as shown in the second chart above) to create a new type of indicator, but more on that in due course.

Monday, 30 December 2024

A "New" Way to Smooth Price

Below is code for an Octave compiled C++ .oct function to smooth price data.

#include "octave oct.h"
#include "octave dcolvector.h"
#include "octave dmatrix.h"
#include "GenFact.h"
#include "GramPoly.h"
#include "Weight.h"

DEFUN_DLD ( double_smooth_proj_2_5, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} double_smooth_proj_2_5 (@var{input_vector})\n\
This function takes an input series and smooths it by projecting a 5 bar rolling linear fit\n\
3 bars into the future and using these 3 bars plus the last 3 bars of the rolling input\n\
to fit a FIR filter with a 2.5 bar lag from the last projected point, i.e. a 0.5 bar\n\
lead over the last actual rolling point in the series. This is averaged with the previously\n\
calculated such smoothed point for a theoretical zero-lagged smooth. This smooth is\n\
again smoothed as above for a smooth of the smooth, i.e. a double-smooth of the\n\
original input series. The double_smooth and smooth are the function outputs.\n\
@end deftypefn" )

{
octave_value_list retval_list ;
int nargin = args.length () ;

// check the input arguments
if ( nargin > 1 ) // there must be a single, input price vector
   {
   error ("Invalid arguments. Inputs are a single, input price vector.") ;
   return retval_list ;
   }

if ( args(0).length () < 5 )
   {
   error ("Invalid 1st argument length. Input is a price vector of length >= 5.") ;
   return retval_list ;
   }
// end of input checking

ColumnVector input = args(0).column_vector_value () ;
ColumnVector smooth = args(0).column_vector_value () ;
ColumnVector double_smooth = args(0).column_vector_value () ;

// create the fit coefficients matrix
int p = 5 ;             // the number of points in calculations
int m = ( p - 1 ) / 2 ; // value to be passed to call_GenPoly_routine_from_octfile
int n = 1 ;             // value to be passed to call_GenPoly_routine_from_octfile
int s = 0 ;             // value to be passed to call_GenPoly_routine_from_octfile

  // create matrix for fit coefficients
  Matrix fit_coefficients_matrix ( 2 * m + 1 , 2 * m + 1 ) ;
  // and assign values in loop using the Weight.h recursive Gram Polynomial C++ headers
  for ( int tt = -m ; tt < (m+1) ; tt ++ )
  {
        for ( int ii = -m ; ii < (m+1) ; ii ++ )
        {
        fit_coefficients_matrix ( ii + m , tt + m ) = Weight( ii , tt , m , n , s ) ;
        }
  }

  // create matrix for slope coefficients
  Matrix slope_coefficients_matrix ( 2 * m + 1 , 2 * m + 1 ) ;
  s = 1 ;
  // and assign values in loop using the Weight.h recursive Gram Polynomial C++ headers
  for ( int tt = -m ; tt < (m+1) ; tt ++ )
  {
        for ( int ii = -m ; ii < (m+1) ; ii ++ )
        {
        slope_coefficients_matrix ( ii + m , tt + m ) = Weight( ii , tt , m , n , s ) ;
        }
  }

  Matrix smooth_coefficients ( 1 , 5 ) ;
  // fill the smooth_coefficients matrix, smooth_coeffs = ( 9/24 ) .* fit_coeffs + ( 7/12 ) .* slope_coeffs + [ 0 ; 1/24 ; 1/8 ; 5/24 ; 1/4 ] ;
  smooth_coefficients( 0 , 0 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 0 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 0 , 4 ) ;
  smooth_coefficients( 0 , 1 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 1 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 1 , 4 ) + ( 1.0 / 24.0 ) ;
  smooth_coefficients( 0 , 2 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 2 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 2 , 4 ) + ( 1.0 / 8.0 ) ;
  smooth_coefficients( 0 , 3 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 3 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 3 , 4 ) + ( 5.0 / 24.0 ) ;
  smooth_coefficients( 0 , 4 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 4 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 4 , 4 ) + ( 1.0 / 4.0 ) ;

    for ( octave_idx_type ii (4) ; ii < args(0).length () ; ii++ )
        {

         smooth( ii ) = smooth_coefficients( 0 , 0 ) * input( ii - 4 ) + smooth_coefficients( 0 , 1 ) * input( ii - 3 ) + smooth_coefficients( 0 , 2 ) * input( ii - 2 ) +
                        smooth_coefficients( 0 , 3 ) * input( ii - 1 ) + smooth_coefficients( 0 , 4 ) * input( ii ) ;

         double_smooth( ii ) = smooth_coefficients( 0 , 0 ) * smooth( ii - 4 ) + smooth_coefficients( 0 , 1 ) * smooth( ii - 3 ) + smooth_coefficients( 0 , 2 ) * smooth( ii - 2 ) +
                               smooth_coefficients( 0 , 3 ) * smooth( ii - 1 ) + smooth_coefficients( 0 , 4 ) * smooth( ii ) ;

        }

retval_list( 0 ) = double_smooth ;
retval_list( 1 ) = smooth ;

return retval_list ;

} // end of function

Rather than describe it, I'll just paste the "help" description below:-

"This function takes an input series and smooths it by projecting a 5 bar rolling linear fit 3 bars into the future and using these 3 bars plus the last 3 bars of the rolling input to fit a FIR filter with a 2.5 bar lag from the last projected point, i.e. a 0.5 bar lead over the last actual rolling point in the series. This is averaged with the previously calculated such smoothed point for a theoretical zero-lagged smooth. This smooth is again smoothed as above for a smooth of the smooth, i.e. a double-smooth of the original input series. The double_smooth and smooth are the function outputs."

The above function calls previous code of mine to calculate savitzky golay filter convolution coefficients for internal calculations, but for the benefit of readers I will simply post the final coefficients for a 5 tap FIR filter.

-0.191667  -0.016667   0.200000   0.416667   0.591667

These coefficients are for a "single" smooth. Run the output from a function using these through the same function to get the "double" smooth.

Testing on a sinewave looks like this:-

where the cyan, green and blue filters are the original FIR filter with 2.5 bar lag, a 5 bar SMA and Ehler's super smooth (see code here) applied to sinewave "price" for comparative purposes. The red filter is my "double_smooth" and the magenta is a Jurik Moving Average using an Octave adaptation of code that is/was freely available on the Tradingview website. The parameters for this Jurik average (length, phase and power) were chosen to match, as closely as possible, those of the double_smooth for an apples to apples comparison.

I will not discuss this any further in this post other than to say I intend to combine this with the ideas contained in my new use for kalman filter post.

Friday, 27 September 2024

Discontinuation of Oanda's OrderBook and PositionBook Endpoints via the V20 Framework

Longtime readers of this blog are almost certainly aware that over the last few years I have posted several times about Oanda's OrderBook and PositionBook data and what can be done with it. My first post was back in February 2022 where I posited the idea of using this data as a sentiment indicator, whilst my most recent post, March 2024, talked about substituting the data into standard, volume based indicators. In between these two dates I blogged about using the data as features for machine learning (here and here), different methods of plotting it (here with example trade and here) and an improved, associated optimisation method here.

Researching and posting about this has been interesting and I was quietly confident that there was some real value to be found doing this. However, I have recently been unpleasantly surprised and disappointed to learn (by way of my API cronjob downloading routines suddenly failing) that Oanda has decided to no longer make available the ability to download this data via their V20 API Framework. So, at a stroke, all of the above work has suddenly become redundant and effectively useless for back testing purposes or for future trading purposes.

Did I say I was disappointed? Well, that understates it somewhat! I have written to Oanda to express my displeasure at this recent change and perhaps, fingers crossed, they will reinstate this V20 functionality.

Tuesday, 18 June 2024

Downloading Dukascopy Tick Data with Node Library

As part of my investigations into forex news trading I have found it necessary to obtain forex tick level data for back testing purposes and below I provide code to achieve this using Dukascopy's Node library, being called from Octave and using some system calls. A useful youtube video about the Dukascopy Node library will give readers some background information.

function [ first_days , last_days ] = first_and_last_weekday_of_month( y )
  t1 = datenum( [ y , 1 , 1 , 0 , 0 , 0 ] ) ;
  t2 = datenum( [ y , 12 , 31 , 0 , 0 , 0 ] ) ;
  t  = datevec( t1 : t2 ) ;
  delete_ix = strcmp( 'Saturday' , datestr( t , 'dddd' ) ) ; % find all Saturdays
  t( delete_ix , : ) = [] ;
  delete_ix = strcmp( 'Sunday' , datestr( t , 'dddd' ) ) ; % find all Sundays
  t( delete_ix , : ) = [] ;
  first_day_ix = find( diff( [ 1 ; t( : , 2 ) ] ) > 0 ) ;
  first_days = [ t( 1 , : ) ; t( first_day_ix , : ) ] ;
  last_day_ix = first_day_ix - 1 ; last_day_ix( last_day_ix <= 0 ) = [] ;
  last_days = [ t( last_day_ix , : ) ; t( end , : ) ] ;
endfunction

ii_vec = [ 2020 , 2021 , 2022 , 2023 , 2024 ] ;

for ii = ii_vec

[ first_days , last_days ] = first_and_last_weekday_of_month( ii ) ;

  for jj = 1 : 12
  cd /path/to/folder ;
  command = [ 'npx dukascopy-node -i eurusd -from ' , datestr( first_days( jj , : ) , 29 ) , ' -to ' , datestr( last_days( jj , : ) , 29 ) , ...
               ' --timeframe tick --format csv --retries 5 --directory eurusd --date-format "YYYY-MM-DD HH:mm:ss:SSS" ' ] ;
  system( command ) ;

  cd /path/to/folder/eurusd ;

  ## delete header
  command = [ "sed -i '/timestamp,askPrice,bidPrice/d' eurusd-tick-" , datestr( first_days( jj , : ) , 29 ) , "-" , datestr( last_days( jj , : ) , 29 ) , ".csv" ] ;
  system( command ) ;

  FID = fopen( 'eurusd-tick-2015-10-02-2015-10-03.csv' , 'r' ) ;
  FID = fopen( [ 'eurusd-tick-' , datestr( first_days( jj , : ) , 29 ) , '-' , datestr( last_days( jj , : ) , 29 ) , '.csv' ] , 'r' ) ;
  sizeM = [ 9 , Inf ] ;
  M = fscanf( FID , "%4d-%2d-%2d %2d:%2d:%2d:%3d,%f,%f" , sizeM )' ;
  fclose( FID ) ;

  save( '-binary' , [ 'eurusd-' , num2str( ii ) , '-' , num2str( first_days( jj , 2 ) ) , '.bin' ] , 'M' ) ;
  delete( 'eurusd-tick-*' ) ;

  clear -x ii_vec ii jj first_days last_days first_and_last_weekday_of_month

  endfor ## jj loop

clear -x ii_vec ii first_and_last_weekday_of_month

endfor ## ii_vec

The result of running the above code results in a folder full of tick data saved in Octave's native binary format, one file per month, with each file's name being descriptive of the data contained within.

I hope readers will find this useful.

Pages