Showing posts with label Digital Signal Processing. Show all posts
Showing posts with label Digital Signal Processing. Show all posts

Monday, 30 December 2024

A "New" Way to Smooth Price

Below is code for an Octave compiled C++ .oct function to smooth price data.
#include "octave oct.h"
#include "octave dcolvector.h"
#include "octave dmatrix.h"
#include "GenFact.h"
#include "GramPoly.h"
#include "Weight.h"

DEFUN_DLD ( double_smooth_proj_2_5, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} double_smooth_proj_2_5 (@var{input_vector})\n\
This function takes an input series and smooths it by projecting a 5 bar rolling linear fit\n\
3 bars into the future and using these 3 bars plus the last 3 bars of the rolling input\n\
to fit a FIR filter with a 2.5 bar lag from the last projected point, i.e. a 0.5 bar\n\
lead over the last actual rolling point in the series. This is averaged with the previously\n\
calculated such smoothed point for a theoretical zero-lagged smooth. This smooth is\n\
again smoothed as above for a smooth of the smooth, i.e. a double-smooth of the\n\
original input series. The double_smooth and smooth are the function outputs.\n\
@end deftypefn" )

{
octave_value_list retval_list ;
int nargin = args.length () ;

// check the input arguments
if ( nargin > 1 ) // there must be a single, input price vector
   {
   error ("Invalid arguments. Inputs are a single, input price vector.") ;
   return retval_list ;
   }

if ( args(0).length () < 5 )
   {
   error ("Invalid 1st argument length. Input is a price vector of length >= 5.") ;
   return retval_list ;
   }
// end of input checking

ColumnVector input = args(0).column_vector_value () ;
ColumnVector smooth = args(0).column_vector_value () ;
ColumnVector double_smooth = args(0).column_vector_value () ;

// create the fit coefficients matrix
int p = 5 ;             // the number of points in calculations
int m = ( p - 1 ) / 2 ; // value to be passed to call_GenPoly_routine_from_octfile
int n = 1 ;             // value to be passed to call_GenPoly_routine_from_octfile
int s = 0 ;             // value to be passed to call_GenPoly_routine_from_octfile

  // create matrix for fit coefficients
  Matrix fit_coefficients_matrix ( 2 * m + 1 , 2 * m + 1 ) ;
  // and assign values in loop using the Weight.h recursive Gram Polynomial C++ headers
  for ( int tt = -m ; tt < (m+1) ; tt ++ )
  {
        for ( int ii = -m ; ii < (m+1) ; ii ++ )
        {
        fit_coefficients_matrix ( ii + m , tt + m ) = Weight( ii , tt , m , n , s ) ;
        }
  }

  // create matrix for slope coefficients
  Matrix slope_coefficients_matrix ( 2 * m + 1 , 2 * m + 1 ) ;
  s = 1 ;
  // and assign values in loop using the Weight.h recursive Gram Polynomial C++ headers
  for ( int tt = -m ; tt < (m+1) ; tt ++ )
  {
        for ( int ii = -m ; ii < (m+1) ; ii ++ )
        {
        slope_coefficients_matrix ( ii + m , tt + m ) = Weight( ii , tt , m , n , s ) ;
        }
  }

  Matrix smooth_coefficients ( 1 , 5 ) ;
  // fill the smooth_coefficients matrix, smooth_coeffs = ( 9/24 ) .* fit_coeffs + ( 7/12 ) .* slope_coeffs + [ 0 ; 1/24 ; 1/8 ; 5/24 ; 1/4 ] ;
  smooth_coefficients( 0 , 0 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 0 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 0 , 4 ) ;
  smooth_coefficients( 0 , 1 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 1 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 1 , 4 ) + ( 1.0 / 24.0 ) ;
  smooth_coefficients( 0 , 2 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 2 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 2 , 4 ) + ( 1.0 / 8.0 ) ;
  smooth_coefficients( 0 , 3 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 3 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 3 , 4 ) + ( 5.0 / 24.0 ) ;
  smooth_coefficients( 0 , 4 ) = ( 9.0 / 24.0 ) * fit_coefficients_matrix( 4 , 4 ) + ( 7.0 / 12.0 ) * slope_coefficients_matrix( 4 , 4 ) + ( 1.0 / 4.0 ) ;

    for ( octave_idx_type ii (4) ; ii < args(0).length () ; ii++ )
        {

         smooth( ii ) = smooth_coefficients( 0 , 0 ) * input( ii - 4 ) + smooth_coefficients( 0 , 1 ) * input( ii - 3 ) + smooth_coefficients( 0 , 2 ) * input( ii - 2 ) +
                        smooth_coefficients( 0 , 3 ) * input( ii - 1 ) + smooth_coefficients( 0 , 4 ) * input( ii ) ;

         double_smooth( ii ) = smooth_coefficients( 0 , 0 ) * smooth( ii - 4 ) + smooth_coefficients( 0 , 1 ) * smooth( ii - 3 ) + smooth_coefficients( 0 , 2 ) * smooth( ii - 2 ) +
                               smooth_coefficients( 0 , 3 ) * smooth( ii - 1 ) + smooth_coefficients( 0 , 4 ) * smooth( ii ) ;

        }

retval_list( 0 ) = double_smooth ;
retval_list( 1 ) = smooth ;

return retval_list ;

} // end of function

Rather than describe it, I'll just paste the "help" description below:-

"This function takes an input series and smooths it by projecting a 5 bar rolling linear fit 3 bars into the future and using these 3 bars plus the last 3 bars of the rolling input to fit a FIR filter with a 2.5 bar lag from the last projected point, i.e.  a 0.5 bar lead over the last actual rolling point in the series.  This is averaged with the previously calculated such smoothed point for a theoretical zero-lagged smooth.  This smooth is again smoothed as above for a smooth of the smooth, i.e.  a double-smooth of the original input series.  The double_smooth and smooth are the function outputs."

The above function calls previous code of mine to calculate savitzky golay filter convolution coefficients for internal calculations, but for the benefit of readers I will simply post the final coefficients for a 5 tap FIR filter.

-0.191667  -0.016667   0.200000   0.416667   0.591667

These coefficients are for a "single" smooth. Run the output from a function using these through the same function to get the "double" smooth.

Testing on a sinewave looks like this:-

where the cyan, green and blue filters are the original FIR filter with 2.5 bar lag, a 5 bar SMA and Ehler's super smooth (see code here) applied to sinewave "price" for comparative purposes. The red filter is my "double_smooth" and the magenta is a Jurik Moving Average using an Octave adaptation of code that is/was freely available on the Tradingview website. The parameters for this Jurik average (length, phase and power) were chosen to match, as closely as possible, those of the double_smooth for an apples to apples comparison.

I will not discuss this any further in this post other than to say I intend to combine this with the ideas contained in my new use for kalman filter post.

More in due course.

Monday, 21 December 2015

John Ehler's Sinewave Indicator Code

A reader recently inquired about my use of this indicator and so below I provide my Octave C++ .oct  version that I have been using for the past few years.
DEFUN_DLD ( sinewave_indicator, args, nargout )
{
octave_value_list retval_list ;
int nargin = args.length () ;
int vec_length = args(0).length () ;

// check the input argument
if ( nargin != 1 )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }

if ( vec_length < 50 )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }

if ( error_state )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }
// end of input checking

// inputs
ColumnVector price = args(0).column_vector_value () ;

// outputs
ColumnVector sinewave( vec_length ) ;
ColumnVector sinewave_lead_1( vec_length ) ;
ColumnVector smoothperiod_out( vec_length ) ;
ColumnVector dcphase_vec( vec_length ) ;
ColumnVector sumperiod( vec_length ) ;
ColumnVector sum_period( vec_length ) ;
ColumnVector deltaphase( vec_length ) ;

// Declarations for calculations of period, phase & sine wave measurements
ColumnVector smooth( vec_length ) ;
ColumnVector period( vec_length ) ;          
ColumnVector smoothperiod( vec_length ) ; 
ColumnVector detrender( vec_length ) ; 
ColumnVector Q1( vec_length ) ; 
ColumnVector I1( vec_length ) ; 
ColumnVector jI( vec_length ) ; 
ColumnVector jQ( vec_length ) ; 
ColumnVector I2( vec_length ) ;  
ColumnVector Q2( vec_length ) ;  
ColumnVector sI2( vec_length ) ; 
ColumnVector sQ2( vec_length ) ; 
ColumnVector Re( vec_length ) ;
ColumnVector Im( vec_length ) ; 
ColumnVector sRe( vec_length ) ; 
ColumnVector sIm( vec_length ) ;  
int dcperiod ; 
double realpart ;
double imagpart ;  
double dcphase ;
double sum_deltaphase ;
int count ;

// unrolled loop to fill the first 5 elements of above calculation vectors ( unrolled for speed optimisation )
sinewave(0) = 0.0 ; sinewave(1) = 0.0 ; sinewave(2) = 0.0 ; sinewave(3) = 0.0 ; sinewave(4) = 0.0 ;
sinewave_lead_1(0) = 0.0 ; sinewave_lead_1(1) = 0.0 ; sinewave_lead_1(2) = 0.0 ; sinewave_lead_1(3) = 0.0 ; sinewave_lead_1(4) = 0.0 ;
smoothperiod_out(0) = 0.0 ; smoothperiod_out(1) = 0.0 ; smoothperiod_out(2) = 0.0 ; smoothperiod_out(3) = 0.0 ; smoothperiod_out(4) = 0.0 ;
dcphase_vec(0) = 0.0 ; dcphase_vec(1) = 0.0 ; dcphase_vec(2) = 0.0 ; dcphase_vec(3) = 0.0 ; dcphase_vec(4) = 0.0 ;

smooth(0) = 0.0 ; smooth(1) = 0.0 ; smooth(2) = 0.0 ; smooth(3) = 0.0 ; smooth(4) = 0.0 ;
period(0) = 0.0 ; period(1) = 0.0 ; period(2) = 0.0 ; period(3) = 0.0 ; period(4) = 0.0 ;            
smoothperiod(0) = 0.0 ; smoothperiod(1) = 0.0 ; smoothperiod(2) = 0.0 ; smoothperiod(3) = 0.0 ; smoothperiod(4) = 0.0 ;
detrender(0) = 0.0 ; detrender(1) = 0.0 ; detrender(2) = 0.0 ; detrender(3) = 0.0 ; detrender(4) = 0.0 ; 
Q1(0) = 0.0 ; Q1(1) = 0.0 ; Q1(2) = 0.0 ; Q1(3) = 0.0 ; Q1(4) = 0.0 ;
I1(0) = 0.0 ; I1(1) = 0.0 ; I1(2) = 0.0 ; I1(3) = 0.0 ; I1(4) = 0.0 ; 
jI(0) = 0.0 ; jI(1) = 0.0 ; jI(2) = 0.0 ; jI(3) = 0.0 ; jI(4) = 0.0 ;
jQ(0) = 0.0 ; jQ(1) = 0.0 ; jQ(2) = 0.0 ; jQ(3) = 0.0 ; jQ(4) = 0.0 ;
I2(0) = 0.0 ; I2(1) = 0.0 ; I2(2) = 0.0 ; I2(3) = 0.0 ; I2(4) = 0.0 ; 
Q2(0) = 0.0 ; Q2(1) = 0.0 ; Q2(2) = 0.0 ; Q2(3) = 0.0 ; Q2(4) = 0.0 ;
sI2(0) = 0.0 ; sI2(1) = 0.0 ; sI2(2) = 0.0 ; sI2(3) = 0.0 ; sI2(4) = 0.0 ;
sQ2(0) = 0.0 ; sQ2(1) = 0.0 ; sQ2(2) = 0.0 ; sQ2(3) = 0.0 ; sQ2(4) = 0.0 ;
Re(0) = 0.0 ; Re(1) = 0.0 ; Re(2) = 0.0 ; Re(3) = 0.0 ; Re(4) = 0.0 ;
Im(0) = 0.0 ; Im(1) = 0.0 ; Im(2) = 0.0 ; Im(3) = 0.0 ; Im(4) = 0.0 ;
sRe(0) = 0.0 ; sRe(1) = 0.0 ; sRe(2) = 0.0 ; sRe(3) = 0.0 ; sRe(4) = 0.0 ;
sIm(0) = 0.0 ; sIm(1) = 0.0 ; sIm(2) = 0.0 ; sIm(3) = 0.0 ; sIm(4) = 0.0 ;
            
 for ( octave_idx_type ii (5) ; ii < vec_length ; ii++ ) // Start the main loop
     {
       
     // smooth the price for hilbert calculations
     smooth(ii) = (4.0 * price(ii) + 3.0 * price(ii-1) + 2.0 * price(ii-2) + price(ii-3) ) / 10.0 ; 
     
     // Detrend the input
     detrender(ii) = (0.0962 * smooth(ii) + 0.5769 * smooth(ii-2) - 0.5769 * smooth(ii-4) - 0.0962 * smooth(ii-6)) * (0.075 * period(ii-1) + 0.54) ;

     // Compute  InPhase and Quadrature components 
     Q1(ii) = (0.0962 * detrender(ii) + 0.5769 * detrender(ii-2) - 0.5769 * detrender(ii-4) - 0.0962 * detrender(ii-6)) * (0.075 * period(ii-1) + 0.54) ;
     I1(ii) = detrender(ii-3) ;

     // Advance the phase of  I1 and Q1 by 90 degrees
     jI(ii) = (0.0962 * I1(ii) + 0.5769 * I1(ii-2) - 0.5769 * I1(ii-4) - 0.0962 * I1(ii-6)) * (0.075 * period(ii-1) + 0.54) ;
     jQ(ii) = (0.0962 * Q1(ii) + 0.5769 * Q1(ii-2) - 0.5769 * Q1(ii-4) - 0.0962 * Q1(ii-6)) * (0.075 * period(ii-1) + 0.54) ;

     // Phasor addition for 3 bar averaging
     I2(ii) = I1(ii) - jQ(ii) ;
     Q2(ii) = Q1(ii) + jI(ii) ;

     // Smooth the  I and Q components before applying the discriminator
     sI2(ii) = 0.2 * I2(ii) + 0.8 * sI2(ii-1) ;
     sQ2(ii) = 0.2 * Q2(ii) + 0.8 * sQ2(ii-1) ;

     // Homodyne Discriminator
     Re(ii) = sI2(ii) * sI2(ii-1) + sQ2(ii) * sQ2(ii-1) ;
     Im(ii) = sI2(ii) * sQ2(ii-1) - sQ2(ii) * sI2(ii-1) ;
     sRe(ii) = 0.2 * Re(ii) + 0.8 * sRe(ii-1) ;
     sIm(ii) = 0.2 * Im(ii) + 0.8 * sIm(ii-1) ; 

       if ( (sIm(ii) > 0.0 || sIm(ii) < 0.0) && (sRe(ii) > 0.0 || sRe(ii) < 0.0) )
       { 
       period(ii) = 360.0 / ( ((atan(sIm(ii) / sRe(ii))) * 180.0) / PI ) ;
       }
       else
       {
       period(ii) = period(ii-1) ;
       }

       if ( period(ii) > 1.5 * period(ii-1) )
       {
       period(ii) = 1.5 * period(ii-1) ;
       }

       if ( period(ii) < 0.67 * period(ii-1) )
       {
       period(ii) = 0.67 * period(ii-1) ;
       }

       if ( period(ii) < 6.0 )
       {
       period(ii) = 6.0 ;
       }

       if ( period(ii) > 50.0 )
       {
       period(ii) = 50.0 ;
       }
 
     period(ii) = 0.2 * period(ii) + 0.8 * period(ii-1) ;
     smoothperiod(ii) = 0.33 * period(ii) + 0.67 * smoothperiod(ii-1) ;

     // Compute Dominant Cycle
     dcperiod = int ( smoothperiod(ii) + 0.5 ) ;
     realpart = 0.0 ;
     imagpart = 0.0 ;
     dcphase = 0.0 ;

      for ( octave_idx_type jj (0) ; jj <= ( dcperiod - 1 ) ; jj++ )
          {
          realpart += sin( PI/180.0 * 360.0 * jj / dcperiod ) * ( smooth(ii-jj) ) ;
          imagpart += cos( PI/180.0 * 360.0 * jj / dcperiod ) * ( smooth(ii-jj) ) ;
          }

      if ( fabs( imagpart ) > 0.0 )
         {
         dcphase = atan( realpart / imagpart ) * 180.0 / PI ;
         }
      else if ( fabs( imagpart ) < 0.001 )
              {
              if ( realpart < 0.0 )
                 {
                 dcphase -= 90.0 ;
                 }
              else if ( realpart > 0.0 )
                      {
                      dcphase += 90.0 ;
                      }
              }
         dcphase += 90.0 ;

      // Compensate for one bar lag of the 4 bar weighted moving average
      dcphase += 360.0 / smoothperiod(ii) ;

      if ( imagpart < 0.0 )
         dcphase += 180.0 ;

      if ( dcphase > 315.0 )
         dcphase -= 360.0 ;
     
     // phase output 
     dcphase_vec(ii) = dcphase ;
     
     //Now compute a differential phase, resolve phase wraparound, and limit delta phase errors
     deltaphase(ii) = dcphase_vec(ii) - dcphase_vec(ii-1) ;
     
     if ( dcphase_vec(ii-1) > 270.0 && dcphase_vec(ii) < 90.0 )
        {
 deltaphase(ii) = 360.0 - dcphase_vec(ii-1) + dcphase_vec(ii) ;
 }
 
     if ( deltaphase(ii) < 1.0 )
        { 
 deltaphase(ii) = 1.0 ;
 }
 
     if ( deltaphase(ii) > 60.0 )
        {
 deltaphase(ii) = 60.0 ;
 }

     // Sum Deltaphases to reach 360 degrees. The sum is the instantaneous period.
     sum_period(ii) = 0.0 ;
     sum_deltaphase = 0.0 ;
     count = 0 ;
     
     while ( sum_deltaphase < 360.0 ) 
           {
           sum_deltaphase += deltaphase(ii-count) ;
           count ++ ;
    sum_period(ii) = count ;
           } 

      // Resolve Instantaneous Period errors and smooth
      if ( sum_period(ii) == 0.0 )
         {
  sum_period(ii) = sum_period(ii-1) ;
  }
  
      sumperiod(ii) = 0.25 * sum_period(ii) + 0.75 * sum_period(ii-1) ;

     // sinewave output
     sinewave(ii) = sin( dcphase * PI / 180.0 ) ;
     
     // one bar leading function
     sinewave_lead_1(ii) = sin( ( dcphase + 360.0 / smoothperiod(ii) ) * PI / 180.0 ) ;
     
     // period output
     smoothperiod_out(ii) = floor ( smoothperiod(ii) + 0.5 ) ;
        
     } // end of main ii loop
      
 retval_list(3) = dcphase_vec ;
 retval_list(2) = smoothperiod_out ;
 retval_list(1) = sinewave_lead_1 ;                                                                  
 retval_list(0) = sinewave ;

return retval_list ; 
                                                                       
} // end of function
This is a straightforward conversion of the code available from here. A nice intro to how it can be used is here and Ehler's own website can be found here

Tuesday, 13 October 2015

Giving up on Runge-Kutta Methods (for now?)

Over the last few weeks I have been looking at using Runge-Kutta methods for the creation of features, but I have decided to give up on this for now simply because I think I have found a better way to accomplish what I want. I was alerted to this possible approach by this post over at http://dsp.stackexchange.com/ and following up on this I remembered that a few years ago I coded up John Ehler's Linear Prediction Filter (the white paper for which might still be available here) and my crudely transcribed code given below:
 DEFUN_DLD (linearpredict, args, , "Help String")
{
octave_value retval;

 ColumnVector a = args(0).column_vector_value ();                                   // The input price                          
 ColumnVector b(a);                                                                 // This will be the output column vector returned to Octave by "retval"
 const int n = args(1).int_value();
 int lngth = 10;                                                                 
 
 double g[30];
 int zz;
 for (zz = 0; zz < 30; zz++)
     g[zz] = 0.0;

 double sigPredict[30];
 for (zz = 0; zz < 30; zz++)
     sigPredict[zz] = 0.0;

 double sigPower = 0.0;
 double mu = 0.0;
 double xBar = 0.0;
 int jj = 0;                                                          
 for (octave_idx_type ii (10); ii < a.length="" 10="" a="" average="" factor="" for="" href="https://draft.blogger.com/null" if="" ii="" jj="" lngth="" loop="" normalization="" ompute="" onvergence="" power="" sigpower="" start="" the=""> 0)
         mu = 0.25 / (sigPower * 10);

      //Compute signal estimate
      xBar = 0;
      for (jj = 1; jj <= lngth; jj++)
          xBar = xBar + a(ii - jj) * g[jj];

     //Compute gain coefficients
     for (jj = 1; jj <= lngth; jj++)
         g[jj] = g[jj] + (mu * (a(ii) - xBar) * a(ii - jj));

     //Compute signal prediction waveform
     for (jj = 0; jj <= lngth; jj++)
         sigPredict[jj] = a(ii - (10 - jj));

     //Extend signal prediction into the future
     int kk = 0;
     for (jj = lngth + 1; jj <= lngth + 5; jj++)
     {
         sigPredict[jj] = 0;

         for (kk = 1; kk <= lngth; kk++)
	     sigPredict[jj] = sigPredict[jj] + sigPredict[jj - kk] * g[kk];
     }

     b(ii) = sigPredict[lngth + n];

     }
 
retval = b;                                                                         // Assign the output column vector to the return value

return retval;                                                                       // Return the output to Octave
}
which is very similar in concept to Burg's method. I think some application of these methods show more promise than concentrating on my naive implementation of Runge-Kutta. 

Tuesday, 23 September 2014

High Resolution Tools for Spectral Analysis - Update

Following on from my initial enthusiasm for the code on the High Resolution Tools for Spectral Analysis page, I have say that I have been unable to get the code performing as I would like it for my intended application to price time series.

My original intent was to use the zero crossing period estimation function, the subject of my last few posts, to get a rough idea of the dominant cycle period and then use the most recent data in a rolling window of this length as input to the high resolution code. This approach, however, ran into problems.

Firstly, windows of just the dominant cycle length (approximately 10 to 30 data points only) would lead to all sorts of errors being thrown from the toolkit functions as well as core Octave functions, such as divide by zero warnings and cryptic error messages that even now I don't understand. My best guess here is that the amount of data available in such short windows is simply insufficient for the algorithm to work, in much the same way as the Fast Fourier transform may fail to work if given too little data that is not a power of 2 in length. It might be possible to substantially rewrite the relevant functions, but my understanding of the algorithm and the inner workings of Octave means this is well beyond my pay grade.

My second approach was to simply extend the amount of data available by using the Octave repmat function on the windowed data so that all the above errors disappeared. This had very hit and miss results - sometimes they were very accurate, other times just fair to middling, and on occasion just way off target. I suspect here that the problem is the introduction of signal artifacts via the use of the repmat function, which results in Aliasing of the underlying signal.

As a result I shall not continue with investigating this toolbox for now. Although I only investigated the use of the me.m function (there are other functions available) I feel that my time at the moment can be used more productively.

Friday, 19 September 2014

Inputs For Zero Crossing Function

Continuing with the work on the zero crossing function, for now I have decided to use a "roofing filter" to detrend and smooth the raw price data and then apply some simple cycle extractor code
DEFUN_DLD ( cycle_extractor, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} cycle_extractor (@var{price})\n\
This function takes a single price vector input and outputs\n\
a vector of the cycle extracted from the input.\n\
@end deftypefn" )

{
octave_value_list retval_list ;
int nargin = args.length () ;
int vec_length = args(0).length () ;

// check the input argument
if ( nargin != 1 )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }

if ( vec_length &lt; 50 )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }

if ( error_state )
   {
   error ("Invalid argument. Input is a single price vector.") ;
   return retval_list ;
   }
// end of input checking

// inputs
ColumnVector price = args(0).column_vector_value () ;
ColumnVector cycle = args(0).column_vector_value () ; cycle(0) = 0.0 ; cycle(1) = 0.0 ;
double alpha = 0.07 ;

 for ( octave_idx_type ii (2) ; ii &lt; vec_length ; ii ++ ) // Start the main loop
     {
     cycle(ii) = ( 1.0 - 0.5 * alpha ) * ( 1.0 - 0.5 * alpha ) * ( price(ii) - 2.0 * price(ii-1) + price(ii-2) ) + 2.0 * ( 1.0 - alpha ) * cycle(ii-1) - ( 1.0 - alpha ) * ( 1.0 - alpha ) * cycle(ii-2) ;
     }
                                                                 
 retval_list(0) = cycle ;

return retval_list ; 
                                                                       
} // end of function
to create the inphase input for the zero cross function. For the imaginary or quadrature input I have decided to use the simple trigonometric identity of the derivative of a sine wave being the cosine (i.e. 90 degree phase lead), easily implemented using the bar to bar difference of the sine wave, or in our case the above simple cycle. I might yet change this, but for now it seems to work. The upper pane in the screen shot below shows the raw price in blue, the extracted cycle in black and the cycle derivative in red.
It can be seen that most of the time the cycle (inphase) either leads or is approximately in sync with the price, whilst the quadrature is nicely spaced between the zero crossings of the cycle.

The lower pane shows the measured periods as in the previous posts. The zero crossing measured periods are now a bit more erratic than before, but some simple tests show that, on average, the zero crossing measurement is consistently closer to the real period than the sine wave indicator period; however, this improvement cannot said to be statistically significant at any p-value.

Now I would like to introduce some other work I've been doing recently. For many years I have wanted to measure the instantaneous period using Maximum entropy spectral estimation, which has been popularised over the years by John Ehlers. Unfortunately I had never found any code in the public domain which I could use or adapt, until now this is. My discovery is High Resolution Tools For Spectral Analysis. This might not actually be the same as Ehlers' MESA, but it certainly covers the same general area and, joy of joys, it has freely downloadable MATLAB code along with an accessible description of the theoretical background along with some academic papers.

As is usual in situations like this, I have had to refactor some of the MATLAB code so that it can run in Octave without error messages; specifically this non Octave function
function [msg,A,B,C,D] = abcdchk(A,B,C,D)
%ABCDCHK Checks dimensional consistency of A,B,C,D matrices.
%   ERROR(ABCDCHK(A,B,C,D)) checks that the dimensions of A,B,C,D
%   are consistent for a linear, time-invariant system model.
%   An error occurs if the nonzero dimensions are not consistent.
%
%   [MSG,A,B,C,D] = ABCDCHK(A,B,C,D) also alters the dimensions
%   any 0-by-0 empty matrices to make them consistent with the others.

%   Copyright 1984-2001 The MathWorks, Inc.
%   $Revision: 1.21 $  $Date: 2001/04/15 11:59:11 $

if nargin &lt; 4, D = []; end
if nargin &lt; 3, C = []; end
if nargin &lt; 2, B = []; end
if nargin &lt; 1, A = []; end

[ma,na] = size(A);
[mb,nb] = size(B);
[mc,nc] = size(C);
[md,nd] = size(D);

if mc==0 & nc==0 & (md==0 | na==0)
   mc = md; nc = na; C = zeros(mc,nc);
end
if mb==0 & nb==0 & (ma==0 | nd==0)
   mb = ma; nb = nd; B = zeros(mb,nb);
end
if md==0 & nd==0 & (mc==0 | nb==0)
   md = mc; nd = nb; D = zeros(md,nd);
end
if ma==0 & na==0 & (mb==0 | nc==0)
   ma = mb; na = nc; A = zeros(ma,na);
end

if ma~=na & nargin>=1
   msg = 'The A matrix must be square';
elseif ma~=mb & nargin>=2
   msg = 'The A and B matrices must have the same number of rows.';
elseif na~=nc & nargin>=3
   msg = 'The A and C matrices must have the same number of columns.';
elseif md~=mc & nargin>=4
   msg = 'The C and D matrices must have the same number of rows.';
elseif nd~=nb & nargin>=4
   msg = 'The B and D matrices must have the same number of columns.';
else
   msg = '';
end
should be in the loadpath, and in the function file dlsim.m, on line 71, the call to the internal MATLAB function

x = ltitr( a , b , u , x0 ) ;

should be replaced by

x = lsimss  (a , b , [ ] , [ ] , -1 ) , u , [ ] , x0 ) ;
(thanks to Lukas for this)

Having made these changes, this much simplified demo.m script
%% SIGNAL = 2 x sinusoids + noise
%  Setting up the signal parameters and time history
N = 100 ;

mag0 = 1.8 ; mag1 = 1.5 ; o1 = 1.3 ; mag2 = 2 ; o2 = 1.35 ;

t = 0 : N - 1 ; t = t(:) ;

y0 = mag0 * randn(N,1) ;

y1 = mag1 * exp( 1i * ( o1 * t + 2 * pi * rand) ) ;

y2 = mag2 * exp( 1i * ( o2 * t + 2 * pi * rand )) ;

y = real( y0 .+ y1 .+ y2 ) ;

NN = 2048 ; th = linspace( 0, 2 * pi, NN ) ;

%% setting up filter parameters and the svd of the input-to-state response
thetamid = 1.325 ; 

[ A , B ] = rjordan( 5, 0.88 * exp( thetamid * 1i ) ) ;

%% obtaining state statistics
R = dlsim_real( A, B, y' ) ;

%% maximum entropy
spectrum = me( R, A, B, th ) ;

spectrum = spectrum / max( spectrum ) * 1.2 ;

figure(1) ; 
subplot(211) ; plot( real(y1), 'r', real(y2), 'm', y, 'b' ) ; legend( 'Underlying Sinewave 1', 'Underlying Sinewave 2', '2 Sinewaves Plus Noise' ) ; title( 'Noisy Price series' ) ;
subplot(212) ; plot( th(400:475), spectrum(400:475) ) ; title( 'The Spectrum - Peaks should be centred at 1.3 and 1.35' ) ;
produces this plot
The upper pane shows two sine waves (red and magenta), very close to each other in period and amplitude, combined with random noise to create price (in blue). Looking at just the blue price, it would seem to be almost impossible that there are in fact two sine waves hidden within noise in this short time series, yet the algorithm clearly picks these out as shown by the two peaks in the spectral plot in the lower pane. Powerful stuff!

It is my intention over the coming days to investigate using the zero crossing function to select the data length prior to using this spectral analysis to determine the instantaneous period of price.

Sunday, 6 April 2014

The Cauchy-Schwarz Inequality

In my previous post I said I was looking into my code for the dominant cycle, mostly with a view to either improving my code or perhaps replacing/augmenting it with some other method of calculating the cycle period. To this end I have recently enrolled on a discrete time signals and systems course offered by edx. One of the lectures was about the Cauchy-Schwarz inequality, which is what this post is about.

The basic use I have in mind is to use the inequality to select sections of price history that are most similar to one another and use these as training cases for neural net training. My initial Octave code is given in the code box below:-
clear all

% load price file of interest
filename = 'eurusdmatrix' ; %input( 'Enter filename for prices, e.g. es or esmatrix: ' , 's' ) ;
data = load( "-ascii" , filename ) ;

% get tick size
switch filename

case { "cc" }
tick = 1 ;

case { "gc" "lb" "pl" "sm" "sp" }
tick = 0.1 ;

case { "ausyen" "bo" "cl" "ct" "dx" "euryen" "gbpyen" "sb" "usdyen" }
tick = 0.01 ;

case { "c" "ng" }
tick = 0.001 ;

case { "auscad" "aususd" "euraus" "eurcad" "eurchf" "eurgbp" "eurusd" "gbpchf" "gbpusd" "ho" "rb" "usdcad" "usdchf" }
tick = 0.0001 ;

case { "c" "o" "s" "es" "nd" "w" }
tick = 0.25 ;

case { "fc" "lc" "lh" "pb" }
tick = 0.025 ;

case { "ed" }
tick = 0.0025 ;

case { "si" }
tick = 0.5 ;

case { "hg" "kc" "oj" "pa" }
tick = 0.05 ;

case { "ty" "us" }
tick = 0.015625 ;

case { "ccmatrix" }
tick = 1 ;

case { "gcmatrix" "lbmatrix" "plmatrix" "smmatrix" "spmatrix" }
tick = 0.1 ;

case { "ausyenmatrix" "bomatrix" "clmatrix" "ctmatrix" "dxmatrix" "euryenmatrix" "gbpyenmatrix" "sbmatrix" "usdyenmatrix" }
tick = 0.01 ;

case { "cmatrix" "ngmatrix" }
tick = 0.001 ;

case { "auscadmatrix" "aususdmatrix" "eurausmatrix" "eurcadmatrix" "eurchfmatrix" "eurgbpmatrix" "eurusdmatrix" "gbpchfmatrix" "gbpusdmatrix" "homatrix" "rbmatrix" "usdcadmatrix" "usdchfmatrix" }
tick = 0.0001 ;

case { "cmatrix" "omatrix" "smatrix" "esmatrix" "ndmatrix" "wmatrix" }
tick = 0.25 ;

case { "fcmatrix" "lcmatrix" "lhmatrix" "pbmatrix" }
tick = 0.025 ;

case { "edmatrix" }
tick = 0.0025 ;

case { "simatrix" }
tick = 0.5 ;

case { "hgmatrix" "kcmatrix" "ojmatrix" "pamatrix" }
tick = 0.05 ;

case { "tymatrix" "usmatrix" }
tick = 0.015625 ;

endswitch

open = data( : , 4 ) ;
high = data( : , 5 ) ;
low = data( : , 6 ) ;
close = data( : , 7 ) ;
period = data( : , 12 ) ;
price = vwap( open, high, low, close, tick ) ;
[ max_adj, min_adj, channel_price ] = adaptive_lookback_max_min( price, period, tick ) ;
smooth_price = smooth_2_5( price ) ;
[ max_adj, min_adj, smooth_channel_price ] = adaptive_lookback_max_min( smooth_price, period, tick ) ;

clear -exclusive channel_price smooth_channel_price period price

% randomly choose vwap prices to match
sample_index = randperm( size(channel_price,1), 1 )
lookback = period( sample_index )
sample_to_match = channel_price( sample_index-lookback : sample_index )' ;
sample_to_match_smooth = smooth_channel_price( sample_index-lookback : sample_index )' ;

cauchy_schwarz_values = zeros( size(channel_price,1) , 1 ) ;
cauchy_schwarz_values_smooth = zeros( size(channel_price,1) , 1 ) ;

for ii = 50 : size(channel_price,1)

% match_size = size( channel_price( ii-lookback : ii ) )

cauchy_schwarz_values(ii) = abs( sample_to_match * channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match) * norm( channel_price( ii-lookback : ii , 1 ) ) ) ;
cauchy_schwarz_values_smooth(ii) = abs( sample_to_match_smooth * smooth_channel_price( ii-lookback : ii ) ) / ( norm(sample_to_match_smooth) * norm( smooth_channel_price( ii-lookback : ii , 1 ) ) ) ;

end

% now set the values for sample_to_match +/- 2 to zero to avoid matching with itself
cauchy_schwarz_values( sample_index-2 : sample_index+2 ) = 0.0 ;
cauchy_schwarz_values_smooth( sample_index-2 : sample_index+2 ) = 0.0 ;

N = 10 ; % must be >= 10

% get index values of the top N matches
matches = zeros( N, 1 ) ;
matches_smooth = zeros( N, 1 ) ;

% record these values
matches_values = zeros( N, 1 ) ;
matches_smooth_values = zeros( N, 1 ) ;

for ii = 1: N

[ max_val, ix ] = max( cauchy_schwarz_values ) ;
matches(ii) = ix ;
matches_values(ii) = cauchy_schwarz_values(ix) ;
cauchy_schwarz_values( ix-2 : ix+2 ) = 0.0 ;

[ max_val, ix ] = max( cauchy_schwarz_values_smooth ) ;
matches_smooth(ii) = ix ;
matches_smooth_values(ii) = cauchy_schwarz_values_smooth(ix) ;
cauchy_schwarz_values_smooth( ix-2 : ix+2 ) = 0.0 ;

end

% Plot for visual inspection
clf ;

% the matched index values
figure(1) ;
subplot(2,1,1) ; plot( cauchy_schwarz_values, 'c' ) ;
subplot(2,1,2) ; plot( cauchy_schwarz_values_smooth, 'c' ) ;
set( gcf() ,  'color' , [0 0 0] )

% the top N matched price sequences
figure(2) ;
subplot(5, 2, 1) ; plot( sample_to_match, 'c', channel_price( matches(1)-lookback : matches(1) ), 'r', channel_price( matches_smooth(1)-lookback : matches_smooth(1) ), 'y' ) ;
subplot(5, 2, 2) ; plot( sample_to_match, 'c', channel_price( matches(2)-lookback : matches(2) ), 'r', channel_price( matches_smooth(2)-lookback : matches_smooth(2) ), 'y' ) ;
subplot(5, 2, 3) ; plot( sample_to_match, 'c', channel_price( matches(3)-lookback : matches(3) ), 'r', channel_price( matches_smooth(3)-lookback : matches_smooth(3) ), 'y' ) ;
subplot(5, 2, 4) ; plot( sample_to_match, 'c', channel_price( matches(4)-lookback : matches(4) ), 'r', channel_price( matches_smooth(4)-lookback : matches_smooth(4) ), 'y' ) ;
subplot(5, 2, 5) ; plot( sample_to_match, 'c', channel_price( matches(5)-lookback : matches(5) ), 'r', channel_price( matches_smooth(5)-lookback : matches_smooth(5) ), 'y' ) ;
subplot(5, 2, 6) ; plot( sample_to_match, 'c', channel_price( matches(6)-lookback : matches(6) ), 'r', channel_price( matches_smooth(6)-lookback : matches_smooth(6) ), 'y' ) ;
subplot(5, 2, 7) ; plot( sample_to_match, 'c', channel_price( matches(7)-lookback : matches(7) ), 'r', channel_price( matches_smooth(7)-lookback : matches_smooth(7) ), 'y' ) ;
subplot(5, 2, 8) ; plot( sample_to_match, 'c', channel_price( matches(8)-lookback : matches(8) ), 'r', channel_price( matches_smooth(8)-lookback : matches_smooth(8) ), 'y' ) ;
subplot(5, 2, 9) ; plot( sample_to_match, 'c', channel_price( matches(9)-lookback : matches(9) ), 'r', channel_price( matches_smooth(9)-lookback : matches_smooth(9) ), 'y' ) ;
subplot(5, 2, 10) ; plot( sample_to_match, 'c', channel_price( matches(10)-lookback : matches(10) ), 'r', channel_price( matches_smooth(10)-lookback : matches_smooth(10) ), 'y' ) ;
set( gcf() ,  'color' , [0 0 0] )

figure(3)
subplot(5, 2, 1) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(1)-lookback : matches(1) ) ) ;
subplot(5, 2, 2) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(2)-lookback : matches(2) ) ) ;
subplot(5, 2, 3) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(3)-lookback : matches(3) ) ) ;
subplot(5, 2, 4) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(4)-lookback : matches(4) ) ) ;
subplot(5, 2, 5) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(5)-lookback : matches(5) ) ) ;
subplot(5, 2, 6) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(6)-lookback : matches(6) ) ) ;
subplot(5, 2, 7) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(7)-lookback : matches(7) ) ) ;
subplot(5, 2, 8) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(8)-lookback : matches(8) ) ) ;
subplot(5, 2, 9) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(9)-lookback : matches(9) ) ) ;
subplot(5, 2, 10) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches(10)-lookback : matches(10) ) ) ;

figure(4)
subplot(5, 2, 1) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(1)-lookback : matches_smooth(1) ) ) ;
subplot(5, 2, 2) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(2)-lookback : matches_smooth(2) ) ) ;
subplot(5, 2, 3) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(3)-lookback : matches_smooth(3) ) ) ;
subplot(5, 2, 4) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(4)-lookback : matches_smooth(4) ) ) ;
subplot(5, 2, 5) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(5)-lookback : matches_smooth(5) ) ) ;
subplot(5, 2, 6) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(6)-lookback : matches_smooth(6) ) ) ;
subplot(5, 2, 7) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(7)-lookback : matches_smooth(7) ) ) ;
subplot(5, 2, 8) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(8)-lookback : matches_smooth(8) ) ) ;
subplot(5, 2, 9) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(9)-lookback : matches_smooth(9) ) ) ;
subplot(5, 2, 10) ; plotyy( (1:1:length(price( sample_index-lookback : sample_index ))) , price( sample_index-lookback : sample_index ) , (1:1:length(price( sample_index-lookback : sample_index ))) , price( matches_smooth(10)-lookback : matches_smooth(10) ) ) ;

% Print results to terminal
results = zeros( N , 2 ) ;

for ii = 1 : N
results(ii,1) = distcorr( sample_to_match', channel_price( matches(ii)-lookback : matches(ii) ) ) ;
results(ii,2) = distcorr( sample_to_match', channel_price( matches_smooth(ii)-lookback : matches_smooth(ii) ) ) ;
end

results = [ matches_values matches_smooth_values results ]
After some basic "housekeeping" code to load the price file of interest and normalise the prices, a random section of the price history is selected and then, in a loop, the top N matches in the history are found using the inequality as the metric for matching. A value of 0 means that the price series being compared are orthogonal, and hence as dissimilar to each other as possible, whilst a value of 1 means the opposite. There are two types of matching; the raw price matched with raw price, and a smoothed price matched with smoothed price.

First off, although the above code randomly selects a section of price history to match, I deliberately hand chose a section to match for illustrative purposes in this post. Below is the section
where the section ends at the point where the vertical cursor crosses the price and begins at the high just below the horizontal cursor, for a look back period of 16 bars. For context, here is a zoomed out view.
I chose this section because it represents a "difficult" set of prices, i.e. moving sideways at the end of a retracement and perhaps reacting to a previous low acting as resistance, as well as being in a Fibonacci retracement zone.  

The first set of code outputs is this chart
which shows the Cauchy-Schwarz values for the whole range of the price series, with the upper pane being values for the raw price matching and the lower pane being the smoothed price matching. Note that in the code the values are set to zero after the max function has selected the best match and so the spikes down to zero show the points in time where the top N, in this case 10, matches were taken from.

The next chart output shows the the normalised prices that the matching is done against, with the cyan being the original sample (the same in all subplots), the red being the raw price matches and the yellow being the smoothed price matches.
The closest match is the top left subplot, and then reading horizontally and down to the 10th best in the bottom right subplot.

The next plot shows the price matches un-normalised, for the raw price matching, with the original sample being blue,
and next for the smoothed matching,
and finally, side by side for easy visual comparison.
N.b. For all the smoothed plots above, although the matching is done on smoothed prices, the unsmoothed, raw prices for these matches are plotted.

After plotting all the above, the code prints to terminal some details thus:

lookback =  16
results =

   0.95859   0.98856   0.89367   0.86361
   0.95733   0.98753   0.93175   0.86839
   0.95589   0.98697   0.87398   0.67945
   0.95533   0.98538   0.85346   0.83079
   0.95428   0.98293   0.91212   0.77225
   0.94390   0.98292   0.79350   0.66563
   0.93908   0.98150   0.71753   0.77458
   0.93894   0.97992   0.86839   0.72492
   0.93345   0.97969   0.74456   0.79060
   0.93286   0.97940   0.86361   0.61103

which, column wise, are the Cauchy-Schwarz values for the raw price matching and the smoothed price matching, and the Distance correlation values for the raw price matching and the smoothed price matching respectively.

The code used to calculate the Distance correlation is given below.
% Copyright (c) 2013, Shen Liu
% All rights reserved.

% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are
% met:

%    * Redistributions of source code must retain the above copyright
%      notice, this list of conditions and the following disclaimer.
%    * Redistributions in binary form must reproduce the above copyright
%      notice, this list of conditions and the following disclaimer in
%      the documentation and/or other materials provided with the distribution

% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.

function dcor = distcorr(x,y)

% This function calculates the distance correlation between x and y.
% Reference: http://en.wikipedia.org/wiki/Distance_correlation 
% Date: 18 Jan, 2013
% Author: Shen Liu (shen.liu@hotmail.com.au)

% Check if the sizes of the inputs match
if size(x,1) ~= size(y,1) ;
    error('Inputs must have the same number of rows')
end

% Delete rows containing unobserved values
N = any([isnan(x) isnan(y)],2) ;
x(N,:) = [] ;
y(N,:) = [] ;

% Calculate doubly centered distance matrices for x and y
a = pdist([x,x]) ; % original MATLAB call is to pdist2( x, x )
mcol = mean(a) ;
mrow = mean(a,2) ;
ajbar = ones(size(mrow))*mcol ;
akbar = mrow*ones(size(mcol)) ;
abar = mean(mean(a))*ones(size(a)) ;
A = a - ajbar - akbar + abar ;

b = pdist([y,y]) ;
mcol = mean(b) ;
mrow = mean(b,2) ;
bjbar = ones(size(mrow))*mcol ;
bkbar = mrow*ones(size(mcol)) ;
bbar = mean(mean(b))*ones(size(b)) ;
B = b - bjbar - bkbar + bbar ;

% Calculate squared sample distance covariance and variances
dcov = sum(sum(A.*B))/(size(mrow,1)^2) ;

dvarx = sum(sum(A.*A))/(size(mrow,1)^2) ;
dvary = sum(sum(B.*B))/(size(mrow,1)^2) ;

% Calculate the distance correlation
dcor = sqrt(dcov/sqrt(dvarx*dvary)) ;
These results show promise, and I intend to apply a more rigorous test to them for the subject of a future post.

Friday, 7 October 2011

The Theoretically Perfect Moving Average and Oscillator

Readers of this blog will probably have noticed that I am partial to using concepts from digital signal processing in the development of my trading system. Recently I "rediscovered" this PDF, written by Tim Tillson, on Google Docs, which has a great opening introduction:

" 'Digital filtering includes the process of smoothing, predicting, differentiating,
integrating, separation of signals, and removal of noise from a signal. Thus many people
who do such things are actually using digital filters without realizing that they are; being
unacquainted with the theory, they neither understand what they have done nor the
possibilities of what they might have done.'

This quote from R. W. Hamming applies to the vast majority of indicators in technical
analysis."

The purpose of this blog post is to outline my recent work in applying some of the principles discussed in the linked PDF.

Long time readers of this blog may remember that back in April this year I abandoned work I was doing on the AFIRMA trend line because I was dissatisfied with the results. The code for this required projecting prices forward using my leading signal code, and I now find that I can reuse the bulk of this code to create a theoretically perfect moving average and oscillator. I have projected prices forward by 10 bars and then taken the average of the forwards and backwards exponential moving averages, as per the linked PDF, to create a series of averages that theoretically are in phase with the underlying price, and then taken the mean of all these averages to produce my "perfect" moving average. Similarly, I have done the same to create a "perfect" oscillator and the results are shown in the images below.
Sideways Market
Trending up Market
Trending down Market

The upper panes show "price" and its perfect moving average, the middle panes show the perfect oscillator, its one bar leading signal, and exponential standard deviation bands set at 1 standard deviation above and below an exponential moving average of the oscillator. The lower panes show the %B indicator of the oscillator within the bands, offset so that the exponential standard deviation bands are at 1 and -1 levels.

Even cursory examination of many charts such as these shows the efficacy of the signals generated by the crossovers of price with its moving average, the oscillator with its leading signal and the %B indicator crossing the 1 and -1 levels, when applied to idealised data. My next post will discuss the application to real price series.