
Friday, 27 August 2021

Another Iterative Improvement of my Volume/Market Profile Charts

Below is a screenshot of this new chart version, of today's (Friday's) price action at a 10 minute bar scale:

Just by looking at the chart it might not be obvious to readers what has changed, so the changes are detailed below.

The first change is in how the volume profile (the horizontal histogram on the left) is calculated. The "old" version of the chart calculates the profile by assuming the "model" that tick volume for each 10 minute bar is normally distributed across the high/low range of the bar, and then the profile histogram is the accumulation of these individual, 10 minute, normally distributed "mini profiles." A more complete description of this is given in my Market Profile Chart in Octave blog post, with code.
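In outline, the old calculation looks something like this minimal sketch (not the exact code from that post; the assumption that a bar's high/low range spans four standard deviations is mine):

% a minimal sketch of the old, model based profile
% high, low and tick_volume are column vectors of 10 minute bar values
pip = 0.0001 ;
levels = ( min( low ) : pip : max( high ) )' ; % price levels for the histogram
profile = zeros( size( levels ) ) ;
for ii = 1 : numel( high )
  mu = ( high(ii) + low(ii) ) / 2 ; % centre the normal curve mid range
  sigma = max( ( high(ii) - low(ii) ) / 4 , pip ) ; % guard against zero range bars
  weights = exp( -0.5 .* ( ( levels - mu ) ./ sigma ) .^ 2 ) ;
  weights( levels < low(ii) | levels > high(ii) ) = 0 ; % truncate to the bar's range
  if ( sum( weights ) > 0 )
    profile += tick_volume(ii) .* weights ./ sum( weights ) ; % distribute the bar's tick volume
  end
end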

The new approach is data centric rather than model based. Every 10 minutes, instead of downloading the 10 minute OHLC and tick volume, the last 10 minutes' worth of 5 second OHLC and tick volume is downloaded. The whole tick volume of each 5 second period is assigned to a price level equivalent to the Typical price (rounded to the nearest pip) of said 5 second period, and the volume profile is then the accumulation of this tick volume per price level. I think this is a much more accurate reflection of the price levels at which tick volume actually occurred than the old, model based charts give. This second screenshot is of the old chart over the exact same price data as the first, improved version of the chart.
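In outline, the new accumulation looks something like this hedged sketch (column vectors of 5 second high, low, close and tick_volume are assumed):

% a minimal sketch of the new, data centric profile
pip = 0.0001 ;
typical = round( ( ( high + low + close ) ./ 3 ) ./ pip ) .* pip ; % typical price per 5 second period, rounded to the nearest pip
levels = ( min( typical ) : pip : max( typical ) )' ; % price levels for the histogram
profile = zeros( size( levels ) ) ;
for ii = 1 : numel( typical )
  ix = round( ( typical(ii) - levels(1) ) ./ pip ) + 1 ; % the level this period's volume belongs to
  profile( ix ) += tick_volume(ii) ; % the whole tick volume is assigned to that one level
end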

It can be seen that the two volume profile histograms of the respective charts differ from each other in terms of their overall shape and the number and price levels of peaks (Points of Control) and troughs (Low Volume Nodes).

The second change in the new chart is in how the background heatmap is plotted. The heatmap is a different presentation of the volume profile whereby higher volume price levels are shown by the brighter yellow colours. The old chart only displays the heatmap associated with the latest calculated volume profile histogram, which is projected back in time. This is, of course, a form of lookahead bias when plotting past prices over the latest heatmap. The new chart solves this by plotting a "rolling" version of the heatmap which reflects the volume profile that was in force at the time each 10 minute OHLC candle formed. It is easy to see how the Points of Control and Low Volume Nodes price levels ebb and flow throughout the trading day.
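A hedged sketch of this rolling construction, reusing the names from the sketch above plus an assumed bar_ix vector that maps each 5 second period to its parent 10 minute bar:

% a minimal sketch of the rolling heatmap: column jj holds the profile that
% was in force when 10 minute bar jj completed
heatmap_mat = zeros( numel( levels ) , max( bar_ix ) ) ;
running_profile = zeros( size( levels ) ) ;
for jj = 1 : max( bar_ix )
  for kk = ( find( bar_ix == jj ) )' % the 5 second periods within 10 minute bar jj
    running_profile( round( ( typical(kk) - levels(1) ) ./ pip ) + 1 ) += tick_volume(kk) ;
  end
  heatmap_mat( : , jj ) = running_profile ;
end
imagesc( 1 : columns( heatmap_mat ) , levels , heatmap_mat ) ; axis( 'xy' ) ; colormap( 'hot' ) ;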

The third change, which naturally followed on from the downloading of 5 second data, is in the plotting of the candlesticks. Rather than having a normal, open to close candlestick body, the candlesticks show the "mini volume profiles" of the tick volume within each bar, plotted via Octave's patch function. The white candlestick wicks indicate the usual high/low range, and the open and close levels are shown by grey and black dots respectively. This is more clearly seen in the zoomed in screenshot below.
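For the curious, one such mini profile can be drawn along these lines (a hypothetical sketch in which all prices and volumes are illustrative):

% a hypothetical sketch of one mini profile candlestick drawn with patch
x = 10 ; % the bar's position on the time axis
bar_levels = ( 1.1750 : 0.0001 : 1.1760 )' ; % price levels within the bar
bar_vol = [ 1 3 7 12 9 5 8 14 6 2 1 ]' ; % tick volume per level
half_width = 0.4 .* bar_vol ./ max( bar_vol ) ; % the widest level spans 0.8 of a bar slot
for ii = 1 : numel( bar_levels )
  patch( x + half_width(ii) .* [ -1 1 1 -1 ] , ...
         bar_levels(ii) + 0.00005 .* [ -1 -1 1 1 ] , [ 0.6 0.6 0.6 ] ) ;
end
% the high/low wick, drawn in black here for visibility on a default white background
line( [ x x ] , [ min( bar_levels ) max( bar_levels ) ] , 'color' , 'k' ) ;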

I wanted to plot these types of bars because recently I have watched some trading webcasts, which talked about "P", "b" and "D" shaped bar profiles at "areas of interest." The upshot of these webcasts is that, in general, a "P" bar is bullish, a "b" is bearish and a "D" is "in balance" when they intersect an "area of interest" such as Point of Control, Low Volume Node, support and resistance etc. This is supposed to be indicative of future price direction over the immediate short term. With this new version of chart, I shall be in a position to investigate these claims for myself.

Wednesday, 5 November 2014

A New MFE/MAE Indicator

Since stopping my investigation of tools for spectral analysis a few weeks ago I have been doing another MOOC, this time Learning from Data, and also working on the idea from one of my earlier posts.

In the above linked post there is a video showing the idea as a "paint bar" study. However, I thought it would be a good idea to render it as an indicator, the C++ Octave .oct code for which is shown in the code box below.
DEFUN_DLD ( adjustable_mfe_mae_from_open_indicator, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} adjustable_mfe_mae_from_open_indicator (@var{open,high,low,close,lookback_length})\n\
This function takes four input series for the OHLC and a value for lookback length. The main outputs are\n\
two indicators, long and short, that show the ratio of the MFE over the MAE from the open of the specified\n\
lookback in the past. The indicators are normalised to the range 0 to 1 by a sigmoid function and an MFE/MAE\n\
ratio of 1:1 is shifted in the sigmoid function to give a 'neutral' indicator reading of 0.5. A third output\n\
is the max high - min low range over the lookback_length normalised by the range of the daily support and\n\
resistance levels S1 and R1 calculated for the first bar of the lookback period. This is also normalised to\n\
give a reading of 0.5 in the sigmoid function if the ratio is 1:1. The point of this third output is to give\n\
some relative scale to the unitless MFE/MAE ratio and to act as a measure of strength or importance of the\n\
MFE/MAE ratio.\n\
@end deftypefn" )

{
octave_value_list retval_list ;
int nargin = args.length () ;

// check the input arguments
if ( nargin != 5 )
   {
   error ( "Invalid arguments. Arguments are price series for open, high, low and close and value for lookback length." ) ;
   return retval_list ;
   }

if ( args(4).length () != 1 )
   {
   error ( "Invalid argument. Argument 5 is a scalar value for the lookback length." ) ;
   return retval_list ;
   }
   
int lookback_length = args(4).int_value() ;

if ( args(0).length () < lookback_length )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be >= lookback length." ) ;
   return retval_list ;
   }
   
if ( args(1).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }
   
if ( args(2).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }
   
if ( args(3).length () != args(0).length () )
   {
   error ( "Invalid argument lengths. Argument lengths for open, high, low and close vectors should be equal." ) ;
   return retval_list ;
   }   

if (error_state)
   {
   error ( "Invalid arguments. Arguments are price series for open, high, low and close and value for lookback length." ) ;
   return retval_list ;
   }
// end of input checking
  
// inputs
ColumnVector open = args(0).column_vector_value () ;
ColumnVector high = args(1).column_vector_value () ;
ColumnVector low = args(2).column_vector_value () ;
ColumnVector close = args(3).column_vector_value () ;
// outputs, initialised as copies of the open vector purely to get the correct length
ColumnVector long_mfe_mae = args(0).column_vector_value () ;
ColumnVector short_mfe_mae = args(0).column_vector_value () ;
ColumnVector range = args(0).column_vector_value () ;

// variables
double max_high = *std::max_element( &high(0), &high( lookback_length ) ) ;
double min_low = *std::min_element( &low(0), &low( lookback_length ) ) ;
double pivot_point = ( high(0) + low(0) + close(0) ) / 3.0 ;
double s1 = 2.0 * pivot_point - high(0) ;
double r1 = 2.0 * pivot_point - low(0) ;

for ( octave_idx_type ii (0) ; ii < lookback_length ; ii++ ) // initial ii loop
    {
    // long_mfe_mae
    if ( open(0) > min_low ) // the "normal" situation
       {
       long_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - open(0) ) / ( open(0) - min_low ) - 1.0 ) ) ) ;
       }
    else if ( open(0) == min_low )
       {
       long_mfe_mae(ii) = 1.0 ;
       }
    else
       {
       long_mfe_mae(ii) = 0.5 ;
       }

    // short_mfe_mae
    if ( open(0) < max_high ) // the "normal" situation
       {
       short_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( open(0) - min_low ) / ( max_high - open(0) ) - 1.0 ) ) ) ;
       }
    else if ( open(0) == max_high )
       {
       short_mfe_mae(ii) = 1.0 ;
       }
    else
       {
       short_mfe_mae(ii) = 0.5 ;
       }

    range(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - min_low ) / ( r1 - s1 ) - 1.0 ) ) ) ;
    } // end of initial ii loop

for ( octave_idx_type ii ( lookback_length ) ; ii < args(0).length() ; ii++ ) // main ii loop
    {
    // assign variable values
    max_high = *std::max_element( &high( ii - lookback_length + 1 ), &high( ii + 1 ) ) ;
    min_low = *std::min_element( &low( ii - lookback_length + 1 ), &low( ii + 1 ) ) ;
    pivot_point = ( high(ii-lookback_length) + low(ii-lookback_length) + close(ii-lookback_length) ) / 3.0 ;
    s1 = 2.0 * pivot_point - high(ii-lookback_length) ;
    r1 = 2.0 * pivot_point - low(ii-lookback_length) ;

    // long_mfe_mae
    if ( open( ii - lookback_length + 1 ) > min_low && open( ii - lookback_length + 1 ) < max_high ) // the "normal" situation
       {
       long_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - open( ii - lookback_length + 1 ) ) / ( open( ii - lookback_length + 1 ) - min_low ) - 1.0 ) ) ) ;
       }
    else if ( open( ii - lookback_length + 1 ) == min_low )
       {
       long_mfe_mae(ii) = 1.0 ;
       }
    else
       {
       long_mfe_mae(ii) = 0.0 ;
       }

    // short_mfe_mae
    if ( open( ii - lookback_length + 1 ) > min_low && open( ii - lookback_length + 1 ) < max_high ) // the "normal" situation
       {
       short_mfe_mae(ii) = 1.0 / ( 1.0 + exp( -( ( open( ii - lookback_length + 1 ) - min_low ) / ( max_high - open( ii - lookback_length + 1 ) ) - 1.0 ) ) ) ;
       }
    else if ( open( ii - lookback_length + 1 ) == max_high )
       {
       short_mfe_mae(ii) = 1.0 ;
       }
    else
       {
       short_mfe_mae(ii) = 0.0 ;
       }

    range(ii) = 1.0 / ( 1.0 + exp( -( ( max_high - min_low ) / ( r1 - s1 ) - 1.0 ) ) ) ;
    } // end of main ii loop

retval_list(2) = range ;    
retval_list(1) = short_mfe_mae ;
retval_list(0) = long_mfe_mae ;

return retval_list ;
  
} // end of function
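Once compiled with mkoctfile, a typical call from the Octave prompt might look like the following sketch, where the OHLC vectors and the lookback value of 20 are purely illustrative:

% compile once from the shell: mkoctfile adjustable_mfe_mae_from_open_indicator.cc
% open, high, low and close are assumed to be equal length column vectors
lookback = 20 ; % illustrative lookback length
[ long_ind , short_ind , range_ind ] = adjustable_mfe_mae_from_open_indicator( open , high , low , close , lookback ) ;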
The way to interpret this is as follows:
  • if the "long" indicator reading is above 0.5, go long
  • if the "short" is above 0.5, go short
  • if both are below 0.5, go flat
An alternative, when a flat position is indicated, is to maintain any previous non-flat position. I won't show a chart of the indicator itself, as it just looks like a very noisy oscillator, but the equity curve(s) of it, without the benefit of foresight, on the EURUSD forex pair are shown below.
The yellow equity curve is the cumulative, close to close, tick returns of a buy and hold strategy, the blue is the return from going flat when indicated, and the red from maintaining the previous position when flat is indicated. Not much to write home about. However, this second chart shows the return when one has the benefit of the "peek into the future" discussed in my earlier post.
The colours of the curves are as before, except for the addition of the green equity curve, which is the cumulative, VWAP value to VWAP value, tick returns: a simple representation of what an equity curve with realistic slippage might look like. This second set of equity curves shows the promise of what could be achievable if a neural net can be trained to accurately predict future values of the above indicator. More in an upcoming post.


Friday, 30 March 2012

Kalman Filter Octave Coding Completed

I am pleased to say that the first phase of my Kalman filter coding, namely writing Octave code, is now complete. In doing so I have used/adapted code from the MATLAB toolbox available here. The second phase of coding, at some future date, will be to convert this code into a C++ .oct function. My code is a stripped down version of the 2D CWPA demo, which models price as a moving object with position and velocity, and which is described in detail with my model assumptions below.

The first thing I had to decide was what to actually model, and I decided on VWAP. The framework of the Kalman filter is that it tracks an underlying process that is not necessarily directly observable but for which measurements are available. VWAP calculated from OHLC bars fits this framework nicely. If one had access to high frequency daily tick data the VWAP could be calculated exactly, but since the only information available for my purposes is the daily OHLC, the daily OHLC approximation of VWAP is the observable measurement of the "unobservable" exact VWAP.

The next thing I considered was the measurement noise of the filter. Some algebraic manipulation of the VWAP approximation formula (see here) led me to choose two thirds (or 0.666) of the Hi-Lo range of the bar as the measurement noise associated with any single VWAP approximation, this being the maximum possible range of values that the VWAP can take given a bar's OHLC values.
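My reconstruction of that manipulation (a sketch, using the approximation formula that appears later in this post) runs as follows. With the approximation

$$v = \frac{O + C + \frac{H+L}{2}}{3}, \qquad L \le O, C \le H,$$

the extreme values of $v$ are attained at $O = C = L$ and at $O = C = H$:

$$v_{\min} = \frac{2L + \frac{H+L}{2}}{3} = \frac{5L + H}{6}, \qquad v_{\max} = \frac{2H + \frac{H+L}{2}}{3} = \frac{5H + L}{6},$$

so that

$$v_{\max} - v_{\min} = \frac{4(H - L)}{6} = \frac{2}{3}(H - L).$$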

Finally, for the process noise I employed a simple heuristic of the noise being half the bar to bar variation in successive VWAPs, the other half in this assumption being attributable to the process itself.
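In symbols, with $v_t$ denoting bar $t$'s VWAP approximation, each process noise sample is taken to be

$$q_t = \frac{v_t - v_{t-1}}{2},$$

which is exactly the vwap_process_noise line in the code below.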

Having decided on the above, the next step was to initialise the filter covariances. To do this I decided to use the Median Absolute Deviation (MAD) of the noise processes as a consistent estimator of the standard deviation, with the scale factor of 1.4826 for normally distributed data (the Kalman filter assumes Gaussian noise) to calculate the noise variances (see this wiki for more details). However, I had a concern about "look ahead bias" with this approach, but a simple test dispelled those fears. This code box

   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9   1279.9
   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4   1284.4
   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0   1284.0
   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3   1283.3
   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2   1288.2
   1298.8   1298.7   1298.7   1298.8   1298.7   1298.7   1298.7   1298.7
   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0   1305.0
   1306.1   1306.2   1306.2   1306.1   1306.2   1306.2   1306.2   1306.2
   1304.9   1305.0   1305.0   1304.9   1305.0   1305.0   1305.0   1305.0
   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3   1308.3
   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0   1312.0
   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1   1309.1
   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3   1304.3
   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3   1302.3
   1306.5   1306.5   1306.5   1306.5   1306.4   1306.4   1306.4   1306.4
   1314.6   1314.5   1314.5   1314.6   1314.5   1314.5   1314.5   1314.5
   1325.1   1325.0   1325.0   1325.1   1325.0   1325.0   1325.0   1325.0
   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7   1332.7
   1336.7   1336.8   1336.8   1336.7   1336.8   1336.8   1336.8   1336.8
   1339.7   1339.8   1339.8   1339.7   1339.8   1339.8   1339.8   1339.8
   1341.6   1341.7   1341.7   1341.6   1341.7   1341.7   1341.7   1341.7
   1338.3   1338.4   1338.4   1338.3   1338.4   1338.4   1338.4   1338.4
   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6   1340.6
   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1   1341.1
   1340.4   1340.4   1340.4   1340.4   1340.3   1340.3   1340.3   1340.3
   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3   1341.3
   1349.7   1349.7   1349.7   1349.7   1349.6   1349.6   1349.6   1349.6
   1357.6   1357.6   1357.6   1357.6   1357.5   1357.5   1357.5   1357.5
   1355.2   1355.3   1355.3   1355.2   1355.3   1355.3   1355.3   1355.3
   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6   1353.6
   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6   1356.6
   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2   1358.2
   1362.8   1362.7   1362.7   1362.8   1362.7   1362.7   1362.7   1362.7
   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7   1362.7
   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6   1362.6
   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1   1365.1
   1360.8   1360.9   1360.9   1360.8   1360.9   1360.9   1360.9   1360.9
   1348.8   1348.9   1348.9   1348.8   1348.9   1348.9   1348.9   1348.9
   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8   1340.8
   1349.0   1348.9   1348.9   1349.0   1348.9   1348.9   1348.9   1348.9
   1361.7   1361.6   1361.6   1361.7   1361.5   1361.5   1361.5   1361.5
   1368.0   1368.0   1368.0   1368.0   1367.9   1367.9   1367.9   1367.9
   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2   1379.2
   1390.3   1390.4   1390.4   1390.3   1390.4   1390.4   1390.4   1390.4
   1394.1   1394.2   1394.2   1394.1   1394.2   1394.2   1394.2   1394.2
   1397.7   1397.8   1397.8   1397.7   1397.8   1397.8   1397.8   1397.8
   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6   1400.6
   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8   1400.8
   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2   1399.2
   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2   1393.2
   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3   1389.3
shows the last 50 values of the Kalman filter with different amounts of data used for the initialisation of the filter. The leftmost column shows filter values using all available data for initialisation, the next all data except the most recent 50 values, then all data except the most recent 100 values, etc., with the rightmost column being calculated using all data except the most recent 350 values. This last column is akin to using only the data through to the end of 2010 and nothing after that date. Comparison between the leftmost and rightmost columns shows virtually insignificant differences. If one were to begin trading at the right hand edge of the chart today, initialisation would be done using all available data. If one then traded for the next one and a half years and re-initialised the filter using all this "new" data, there would be no practical difference in the filter values over this one and a half year period. So, although there may be "look ahead bias," frankly it doesn't matter. Such is the power of robust statistics and the recursive calculations of the Kalman filter combined!

This next code box shows my Octave code for the Kalman filter
data = load("-ascii","esmatrix") ;
tick = 0.25 ;

n = length(data(:,4))
finish = input('enter finish, no greater than n  ')

if ( finish > length(data(:,4)) )
   finish = 0 % i.e. all available data is used
end

open = data(:,4) ;
high = data(:,5) ;
low = data(:,6) ;
close = data(:,7) ;
market_type = data(:,230) ;

clear data

vwap = round( ( ( open + close + ( (high + low) ./ 2 ) ) ./ 3 ) ./ tick ) .* tick ;
vwap_process_noise = ( vwap - shift(vwap,1) ) ./ 2.0 ;
median_vwap_process_noise = median( vwap_process_noise(2:end-finish,1) ) ;
vwap_process_noise_deviations = vwap_process_noise(2:end-finish,1) - median_vwap_process_noise ;
MAD_process_noise = median( abs( vwap_process_noise_deviations ) ) ;

% convert this to variance under the assumption of a normal distribution
std_vwap_noise = 1.4826 * MAD_process_noise ;
process_noise_variance = std_vwap_noise * std_vwap_noise 

measurement_noise = 0.666 .* ( high - low ) ;
median_measurement_noise = median( measurement_noise(1:end-finish,1) ) ;
measurement_noise_deviations = measurement_noise(1:end-finish,1) - median_measurement_noise ;
MAD_measurement_noise = median( abs( measurement_noise_deviations ) ) ;

% convert this to variance under the assumption of a normal distribution
std_measurement_noise = 1.4826 * MAD_measurement_noise ;
measurement_noise_variance = std_measurement_noise * std_measurement_noise

% Transition matrix for the continuous-time system.
F = [0 0 1 0 0 0;
     0 0 0 1 0 0;
     0 0 0 0 1 0;
     0 0 0 0 0 1;
     0 0 0 0 0 0;
     0 0 0 0 0 0];

% Noise effect matrix for the continuous-time system.
L = [0 0;
     0 0;
     0 0;
     0 0;
     1 0;
     0 1];

% Process noise variance
q = process_noise_variance ;
Qc = diag([q q]);

% Discretisation of the continuous-time system.
[A,Q] = lti_disc(F,L,Qc,1); % last item is dt stepsize set to 1

% Measurement model.
H = [1 0 0 0 0 0;
     0 1 0 0 0 0];

% Variance in the measurements.
r1 = measurement_noise_variance ;
R = diag([r1 r1]);

% Initial guesses for the state mean and covariance.
m = [0 vwap(1,1) 0 0 0 0]';
P = diag([0.1 0.1 0.1 0.1 0.5 0.5]) ;

% Space for the estimates.
MM = zeros(size(m,1), length(vwap));

% create vectors for eventual plotting
predict_plot = zeros(length(vwap),1) ;
MM_plot = zeros(length(vwap),1) ;
sigmaP_plus = zeros(length(vwap),1) ;
sigmaP_minus = zeros(length(vwap),1) ;

% Filtering steps.
for ii = 1:length(vwap)
   [m,P] = kf_predict(m,P,A,Q);

   predict_plot(ii,1) = m(2,1) ;

   [m,P] = kf_update(m,P,vwap(ii,:),H,R);
   MM(:,ii) = m;

   MM_plot(ii,1) = m(2,1) ;

   % sigmaP is for storing the current error covariance for plotting purposes
   sigmaP = sqrt(diag(P)) ; 
   sigmaP_plus(ii,1) = MM_plot(ii,1) + 2 * sigmaP(1) ;
   sigmaP_minus(ii,1) = MM_plot(ii,1) - 2 * sigmaP(1) ;
end

% output in terminal for checking purposes: kalman_last_50 accumulates one
% column per script run, so it must be initialised as [] before the first run
if ( ! exist( "kalman_last_50" , "var" ) )
   kalman_last_50 = [] ;
end
kalman_last_50 = [ kalman_last_50 , MM_plot(end-50:end,1) ]

% output for plotting in Gnuplot
x_axis = ( 1:length(vwap) )' ;
A = [x_axis,open,high,low,close,vwap,MM_plot,sigmaP_plus,sigmaP_minus,predict_plot,market_type] ;
dlmwrite('my_cosy_kalman_plot',A)
Note that this code calls three functions, lti_disc, kf_predict and kf_update, which are part of the above mentioned MATLAB toolbox. If readers wish to replicate my results, they will have to download said toolbox and put these functions where they can be called by this script.
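For readers who just want the gist of those three functions, this is a sketch of the standard equations they implement (see the toolbox documentation for the exact details). lti_disc discretises the continuous-time model,

$$A = e^{F\,\Delta t}, \qquad Q = \int_0^{\Delta t} e^{F\tau}\, L\, Q_c\, L^{\mathsf{T}}\, e^{F^{\mathsf{T}}\tau}\, d\tau,$$

while kf_predict and kf_update are the usual Kalman recursions

$$m^{-} = A m, \qquad P^{-} = A P A^{\mathsf{T}} + Q,$$

$$K = P^{-} H^{\mathsf{T}} ( H P^{-} H^{\mathsf{T}} + R )^{-1}, \qquad m = m^{-} + K ( y - H m^{-} ), \qquad P = ( I - K H ) P^{-}.$$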

Below is a screen shot of my Kalman filter in action.
This shows the S & P E-mini contract (daily bars) up to a week or so ago. The white line is the Kalman filter, the dotted white lines are the plus and minus 2 sigma levels taken from the covariance matrix, and the red and light blue triangles show the output of the kf_predict function, prior to being updated by the kf_update function, plotted only when above (red) or below (blue) the 2 sigma levels. As can be seen, while price is obviously trending, most points are within these levels. The colour coding of the bars is based upon the market type as determined by my Naive Bayesian Classifier, Mark 2.

This next screenshot shows price bars immediately prior to those in the first screenshot, where price is certainly not trending, and it is interesting to note that the kf_predict triangles now appear at the turns in price. This suggests that the kf_predict function might be a complementary indicator to my Perfect Oscillator function and Delta, along with my stable of other turn indicators. The next thing I will have to do is come up with a robust rule set that combines all these disparate indicators into a coherent whole. Also, I am now going to use the Kalman filter output as the input to all my other indicators. Up till now I have been using the typical price, (High+Low+Close)/3, as my input, but I think the Kalman filtered VWAP of "today's" price action is a much more meaningful price input than "tomorrow's" pivot point!

Sunday, 11 March 2012

Kalman Filter

Over the years, on and off, I have tried to find code for, or to write for myself, a Kalman filter, but unfortunately I have never really found what I want; the best I have at the moment is an implementation available from the technical papers and seminars section of the MESA Software web page. However, I recently read this R-Bloggers post, which inspired me to look again for code on the web, and this time I found this, which is exactly what I want: accessible, Octave-like code that will enable me to fully understand (I hope!) the theory behind the Kalman filter and to code my own Kalman filter function. After a little tinkering with the code (mostly plotting and inputs) a typical script run produces this plot:
which is a plot of a sine wave where
  • red is the underlying price (sine wave plus noise); e.g. typical price, vwap, close etc.
  • the yellow dots are "measurement noise;" e.g. high-low range
  • cyan is the Kalman filter itself
  • green are the 2 Sigma confidence levels for the filter
  • magenta is my current "MESA" implementation
I particularly like this example script as it mirrors the approach I have taken in the past with regard to creating my "idealised" sine wave time series for development purposes. I think the screen shot speaks for itself; the Kalman filter seems uncannily accurate in filtering out the noise to get the "true" underlying signal, with almost no lag at all! I shall definitely be doing some work with this in the very near future.
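To give a flavour without the toolbox, below is a minimal, self contained Octave sketch of the same idea; it is not the linked example's code, and all the noise parameters are illustrative:

% a constant velocity Kalman filter tracking a noisy sine wave
N = 200 ;
truth = sin( 2 .* pi .* ( 1 : N )' ./ 50 ) ; % the "true" underlying signal
y = truth + 0.5 .* randn( N , 1 ) ; % noisy measurements of it
A = [ 1 1 ; 0 1 ] ; % position/velocity transition matrix
H = [ 1 0 ] ; % only position is measured
Q = 0.001 .* eye( 2 ) ; % process noise covariance (illustrative)
R = 0.25 ; % measurement noise variance (illustrative)
m = [ y(1) ; 0 ] ; P = eye( 2 ) ; % initial state mean and covariance
filtered = zeros( N , 1 ) ;
for ii = 1 : N
  m = A * m ; P = A * P * A' + Q ; % predict
  K = P * H' / ( H * P * H' + R ) ; % Kalman gain
  m = m + K * ( y(ii) - H * m ) ; P = ( eye( 2 ) - K * H ) * P ; % update
  filtered(ii) = m(1) ;
end
plot( 1 : N , y , '.y' , 1 : N , truth , 'r' , 1 : N , filtered , 'c' ) ;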

Tuesday, 10 January 2012

Formula for Approximating VWAP from OHLC Data

An answer to another question on the Quantitative Finance forum here gives a formula that approximates the volume weighted average price (VWAP) from OHLC data only, with an exhortation in a comment to do one's own research before using it. This blog post is my quick attempt at such research.

My simple test uses the relationship between the VWAP and the pivot point calculated as (H+L+C)/3. This wiki entry on pivot points states "Trading above or below the pivot point indicates the overall market sentiment," and my test of being above or below is the VWAP being above or below the pivot point: i.e. if today's VWAP is above today's pivot point (calculated from yesterday's H, L & C) then this is a bullish sign, and therefore we can expect tomorrow's VWAP to be higher than today's. The converse is true for bearish sentiment. In Octave it is trivially simple to write a script to test whether this hypothesis is true, and the terminal output of the results is shown in the box below.
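Before the results, here is a hedged sketch of the core test logic, not my exact script (vwap and pivot are assumed to be daily column vectors, with pivot(t) calculated from day t-1's high, low and close):

% does today's VWAP vs. pivot point predict the direction of tomorrow's VWAP?
bullish = vwap( 2 : end-1 ) > pivot( 2 : end-1 ) ; % today's VWAP above today's pivot
up_next = vwap( 3 : end ) > vwap( 2 : end-1 ) ; % tomorrow's VWAP higher than today's
sign_correct = 100 * mean( bullish == up_next ) % the quoted % accuracy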

The first line identifies the market being tested (a forex pair or commodity); the second, "sign_correct", is the percentage of the time the hypothesis appears to be true: i.e. for the first entry, auscad, the figure 65.695 means that 65.695% of the time the following day's VWAP is actually higher/lower than today's, in accordance with the "prediction" based on the categorisation of today's bar as bullish or bearish. The third, "result", is the p-value of a 5000 repetition Monte Carlo permutation test of the statistical significance of the given % accuracy. The permutation test was the "sign" test described in Evidence Based Technical Analysis and in the PDF available from its companion website.

Given that all the p-values are zero, it can be stated that the VWAP being above or below the pivot point predicts whether the next day's VWAP will be higher or lower with a degree of accuracy that is non-random and statistically significant.
auscad 
sign_correct =  65.695
result = 0
------------------ 
aususd 
sign_correct =  67.228
result = 0
------------------ 
ausyen 
sign_correct =  67.100
result = 0
------------------ 
bo 
sign_correct =  63.989
result = 0
------------------ 
c 
sign_correct =  63.530
result = 0
------------------ 
cc 
sign_correct =  60.565
result = 0
------------------ 
cl 
sign_correct =  63.318
result = 0
------------------ 
ct 
sign_correct =  60.273
result = 0
------------------ 
dx 
sign_correct =  63.811
result = 0
------------------ 
ed 
sign_correct =  63.945
result = 0
------------------ 
euraus 
sign_correct =  68.024
result = 0
------------------ 
eurcad 
sign_correct =  66.854
result = 0
------------------ 
eurchf 
sign_correct =  65.707
result = 0
------------------ 
eurgbp 
sign_correct =  66.760
result = 0
------------------ 
eurusd 
sign_correct =  65.544
result = 0
------------------ 
euryen 
sign_correct =  66.444
result = 0
------------------ 
fc 
sign_correct =  61.905
result = 0
------------------ 
gbpchf 
sign_correct =  67.497
result = 0
------------------ 
gbpusd 
sign_correct =  66.936
result = 0
------------------ 
gbpyen 
sign_correct =  66.936
result = 0
------------------ 
gc 
sign_correct =  60.667
result = 0
------------------ 
hg 
sign_correct =  58.554
result = 0
------------------ 
ho 
sign_correct =  62.685
result = 0
------------------ 
kc 
sign_correct =  61.732
result = 0
------------------ 
lb 
sign_correct =  61.765
result = 0
------------------ 
lc 
sign_correct =  62.372
result = 0
------------------ 
lh 
sign_correct =  61.601
result = 0
------------------ 
ng 
sign_correct =  62.356
result = 0
------------------ 
o 
sign_correct =  60.705
result = 0
------------------ 
oj 
sign_correct =  61.848
result = 0
------------------ 
pa 
sign_correct =  62.497
result = 0
------------------ 
pb 
sign_correct =  59.116
result = 0
------------------ 
pl 
sign_correct =  60.737
result = 0
------------------ 
rb 
sign_correct =  63.107
result = 0
------------------ 
s 
sign_correct =  64.091
result = 0
------------------ 
sb 
sign_correct =  61.106
result = 0
------------------ 
si 
sign_correct =  59.563
result = 0
------------------ 
sm 
sign_correct =  63.810
result = 0
------------------ 
sp 
sign_correct =  66.954
result = 0
------------------ 
es 
sign_correct =  66.744
result = 0
------------------ 
nd 
sign_correct =  66.221
result = 0
------------------ 
ty 
sign_correct =  65.260
result = 0
------------------ 
us 
sign_correct =  65.893
result = 0
------------------ 
usdcad 
sign_correct =  67.357
result = 0
------------------ 
usdchf 
sign_correct =  67.088
result = 0
------------------ 
usdyen 
sign_correct =  66.947
result = 0
------------------ 
w 
sign_correct =  65.118
result = 0
At the moment I use the value given by the pivot point calculation as my "price input" for the day but, based on these test results and the future results of some other tests I can think of, I may change to the VWAP approximation as my price input. More in a future post.