Sunday 15 September 2013

Savitzky-Golay Filter Convolution Coefficients

My investigation so far has had me reading General Least Squares Smoothing And Differentiation By The Convolution Method, a paper listed in the reference section of the Wikipedia page on Savitzky-Golay filters. The paper gives Pascal code for calculating the convolution coefficients of a Savitzky-Golay filter, and in the code box below is my C++ translation of this Pascal code.
// Calculate Savitzky-Golay Convolution Coefficients
#include <iostream>
using namespace std;

double GenFact( int a , int b )
{ // calculates the generalised factorial (a)(a-1)...(a-b+1)

  double gf = 1.0 ;

  for ( int jj = ( a - b + 1 ) ; jj < a + 1 ; jj++ )
  {
    gf = gf * jj ;
  }

  return ( gf ) ;

} // end of GenFact function

double GramPoly( int i , int m , int k , int s )
{ // Calculates the Gram Polynomial ( s = 0 ), or its s'th
  // derivative evaluated at i, order k, over 2m + 1 points

  double gp_val ;

  if ( k > 0 )
  {
    gp_val = ( 4.0 * k - 2.0 ) / ( k * ( 2.0 * m - k + 1.0 ) ) *
             ( i * GramPoly( i , m , k - 1 , s ) + s * GramPoly( i , m , k - 1 , s - 1 ) )
             - ( ( k - 1.0 ) * ( 2.0 * m + k ) ) / ( k * ( 2.0 * m - k + 1.0 ) ) *
             GramPoly( i , m , k - 2 , s ) ;
  }
  else
  {
    if ( ( k == 0 ) && ( s == 0 ) )
    {
      gp_val = 1.0 ;
    }
    else
    {
      gp_val = 0.0 ;
    } // end of if k == 0 && s == 0
  } // end of if k > 0

  return ( gp_val ) ;

} // end of GramPoly function

double Weight( int i , int t , int m , int n , int s )
{ // calculates the weight of the i'th data point for the t'th least-squares
  // point of the s'th derivative, over 2m + 1 points, order n

  double sum = 0.0 ;

  for ( int k = 0 ; k < n + 1 ; k++ )
  {
    sum += ( 2.0 * k + 1.0 ) * ( GenFact( 2 * m , k ) / GenFact( 2 * m + k + 1 , k + 1 ) ) *
           GramPoly( i , m , k , 0 ) * GramPoly( t , m , k , s ) ;
  } // end of for loop

  return ( sum ) ;

} // end of Weight function

int main ()
{ // calculates the weight of the i'th data point (1st argument) for the t'th
  // least-squares point (2nd argument), over 2m + 1 points (3rd argument),
  // order n (4th argument), of the s'th derivative (5th argument)

  double z ;

  z = Weight( 4 , 4 , 4 , 2 , 0 ) ; // adjust input arguments for required output

  cout << "The result is " << z << endl ;

  return 0 ;
}
Whilst I don't claim to understand all the mathematics behind this, using the output of this code is conceptually not much more difficult than applying a FIR filter, i.e. each value in a look-back window is multiplied by a coefficient, except that a polynomial of a given order is least-squares fitted to all the points in the window rather than, for example, a simple average of the points being taken. The code gives the user the ability to calculate the value of the fitted polynomial at any point in the window, or its derivative(s) at that point, as sketched in the code below.
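By way of illustration, here is a minimal sketch of that convolution step. It assumes the GenFact, GramPoly and Weight functions above are in scope (it would replace the demonstration main above), and the window values are invented purely for illustration:

// Sketch only: smoothing the newest point of a look-back window. Assumes
// GenFact, GramPoly and Weight above are in scope; this main replaces the
// demonstration main above. The window values are invented for illustration.
#include <vector>

int main ()
{
  const int m = 4 ; // half-width: the window holds 2m + 1 = 9 points
  const int n = 2 ; // order of the fitted polynomial
  const int t = m ; // evaluate the fit at the newest ( right-hand ) point
  const int s = 0 ; // s = 0 gives the smoothed value, not a derivative

  // invented example data: the last 2m + 1 values of some series
  vector<double> window = { 1.0 , 1.2 , 1.1 , 1.4 , 1.3 , 1.5 , 1.6 , 1.4 , 1.7 } ;

  // the convolution: multiply each point by its weight and sum. Weight
  // indexes the window as i = -m ... m , so shift by m into the vector.
  double smoothed = 0.0 ;
  for ( int i = -m ; i <= m ; i++ )
  {
    smoothed += Weight( i , t , m , n , s ) * window[ i + m ] ;
  }

  cout << "Smoothed value at the newest point: " << smoothed << endl ;

  return 0 ;
}

Calculating the weights once for a given combination of m, n, t and s and storing them would obviously be more efficient than calling Weight inside the loop, but the sketch keeps the connection to the functions above explicit.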

I would particularly draw readers' attention to the derivative(s) link above. On that page there is an animation of the slope (1st derivative) of a waveform. If this were applied to a price time series the possible applications become apparent: a slope of zero will pick out local highs and lows, identifying local support and resistance levels; an oscillator zero crossing will give signals to go long or short; oscillator peaks will give signals to tighten stops or place profit-taking orders. And why stop there? Why not use the second derivative as a measure of the acceleration of a price move? And how about the jerk and the jounce of a price series (see the sketch below)? I don't know about readers, but I have never seen a trading article, forum post or blog talk about the jerk and jounce of a price series. Of course, if one were to plot all of these as indicators I'm sure it would just be confusing and nigh impossible to come up with a cogent rule set to trade with. However, one doesn't have to do this oneself, and this is what I was alluding to in my earlier post when I said that the Savitzky-Golay framework might provide unique and informative inputs to my neural net trading system. Provided enough informative data is available for training, the rule set will be encapsulated within the weights of any trained neural net.
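To make the derivative idea concrete, here is a hedged sketch, again assuming the functions above are in scope and a window of the same form as in the earlier sketch; it evaluates the s'th derivative of the fitted polynomial at the newest point of the window:

// Sketch only: the s'th derivative of the fitted polynomial, evaluated at the
// newest ( right-hand , t = m ) point of a 2m + 1 point look-back window.
// Assumes GenFact, GramPoly and Weight above are in scope; unit spacing
// between points is assumed, so no further scaling of the weights is applied.
#include <vector>

double SavGolDerivative( const vector<double> &window , int m , int n , int s )
{
  double result = 0.0 ;

  for ( int i = -m ; i <= m ; i++ )
  {
    result += Weight( i , m , m , n , s ) * window[ i + m ] ;
  }

  return ( result ) ;
}

// possible usage, with m = 4 and n = 2 as in the earlier sketch:
//   double slope = SavGolDerivative( window , 4 , 2 , 1 ) ; // 1st derivative
//   double accel = SavGolDerivative( window , 4 , 2 , 2 ) ; // 2nd derivative
// a change of sign in slope from one bar to the next marks a local high or low.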

2 comments:

Anonymous said...

Do you think the values at the extremes are reliable (including derivatives)? Hmmm.

Dekalog said...

Anonymous,

Would you care to explain why you think the "unreliableness" of extreme values is a problem?