Sunday, 16 May 2010

Application of Tukey Chart


Above is a screen shot of current work in progress regarding Tukey Charts.

This is a shot of the continuous back-adjusted ES mini contract from early January 2010 to mid May 2010, including the 1 day, 1000 point drop! The top chart shows price bars with the optimised Tukey Chart applied directly to prices (the cyan lines) and also the Tukey Chart applied to the Cybercycle*, scaled and superimposed on the price bars (the yellow lines). The price bars are blue if the price is above both outer ucls, red if it is below both outer lcls, and white otherwise. The middle chart is the Cybercycle* with its Tukey Chart, and the lower chart is the Fisher Transformed Cybercycle* with its Tukey Chart. The ucl and lcl multipliers used for the Cybercycle* and Fisher Transformed Cybercycle* are the same as those derived from the testing detailed below. These multipliers could obviously be MC optimised in their own right, and doing so is on my "to do" list.

*see the technical papers section at http://www.mesasoftware.com for more information.
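
The bar colouring rule described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the screen shot: the function name, its arguments, and the use of the closing price as "the price" are all assumptions.

```python
def bar_colour(close, price_ucl, cycle_ucl, price_lcl, cycle_lcl):
    """Return the colour of one price bar (hypothetical helper).

    price_ucl / price_lcl: outer limits of the Tukey Chart applied to prices.
    cycle_ucl / cycle_lcl: outer limits of the Cybercycle Tukey Chart after
    scaling onto the price axis.
    Using the close as "the price" is an assumption.
    """
    if close > max(price_ucl, cycle_ucl):   # above both outer ucls
        return "blue"
    if close < min(price_lcl, cycle_lcl):   # below both outer lcls
        return "red"
    return "white"                          # anything in between
```

A bar that clears only one of the two upper limits stays white, which is what "above both" implies.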

Monday, 10 May 2010

Monte Carlo Optimisation of Control Limits #8

The third, repeated MC test for periods 25 to 50 gives these results: only at periods 26 and 30 do the peaks and troughs occur between the ucl and lcl zones without actually entering the zones, and in each case the number of times this happened represents less than 0.01% of the total runs. At all other periods the peaks and troughs of the sine waves occur within their respective zones.

I think that continuing with MC testing on theoretical sine waves would bring diminishing returns, so this third MC test will be the last on such theoretical wave forms. These last few tests have resulted in different optimised ucl and lcl multipliers for 3 "ranges" of periods: periods 14 and less, 15 to 24, and 25 and greater.

The next step will be to rewrite the indicator with these new optimised multipliers and test it on actual market prices.
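
Selecting the multiplier set by period range could look something like the sketch below. The function name and the range labels are hypothetical, and the multiplier values themselves are not listed in these posts, so none are filled in here.

```python
def multiplier_range(period):
    """Map a measured period to one of the three tested ranges.

    The labels are hypothetical; the rewritten indicator would look up its
    optimised ucl and lcl multipliers under the returned key.
    """
    if period <= 14:
        return "periods_14_and_less"
    if period <= 24:
        return "periods_15_to_24"
    return "periods_25_and_greater"
```

Keeping the range test in one small function means the rest of the indicator does not need its own chain of if statements.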

Sunday, 9 May 2010

Monte Carlo Optimisation of Control Limits #7

The second, repeated MC test has been run on periods 15 to 50 inclusive, giving these results: 0.2% of total runs peak/trough outside the ucl and lcl zones at period 17, 0.2% peak/trough between the zones without touching them at period 18, and 0.17% do the same at period 22. For the same reasons as outlined in post #6, these results lead me to decide that the optimised multipliers used in this test shall apply to periods 15 to 24 inclusive in the final indicator. A third repeated MC test will be run on periods 25 to 50 inclusive.

Saturday, 8 May 2010

Monte Carlo Optimisation of Control Limits #6

I have now run the Monte Carlo tests of the boundary limits. The test consisted of 3000 phase-randomised sine waves per period for constant periods 10 to 50 inclusive. The amplitudes of the sine waves were also randomised, up to a maximum of 20 (20*rand). This makes a total of 123,000 random sine waves (41 periods x 3000 waves). Any time the peak or trough of a sine wave occurred above the resistance zone or below the support zone this was recorded. Similarly, if the peak or trough of a sine wave occurred between these zones without penetrating or touching them this was separately recorded. There were 683 instances (0.56% of total runs) that peaked/troughed outside the zones, and 379 instances (0.31%) that peaked/troughed between the zones without actually touching them. All of the above 683 instances occurred at period 13, whilst the 379 instances were split 265/114 between periods 10/14.
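
The structure of such a test can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code that produced the figures above: all function names are hypothetical, each wave is sampled for two full periods, only peaks are tallied (troughs would mirror this), and `zone_fn` is a caller-supplied stand-in for the control chart that returns the lower and upper edge of the resistance zone.

```python
import math
import random

def random_sine(period, n_bars, max_amp=20.0, rng=random):
    """One sine wave with random phase and random amplitude (max_amp * rand)."""
    amp = max_amp * rng.random()
    phase = 2.0 * math.pi * rng.random()
    return [amp * math.sin(2.0 * math.pi * i / period + phase)
            for i in range(n_bars)]

def classify_peak(peak, zone_low, zone_high):
    """Classify a peak against a resistance zone [zone_low, zone_high]."""
    if peak > zone_high:
        return "outside"   # peak overshoots the zone
    if peak < zone_low:
        return "short"     # peak stays between the zones, never touching
    return "in_zone"

def run_test(periods, runs_per_period, zone_fn, seed=0):
    """Tally peak classifications per period over randomised sine waves."""
    rng = random.Random(seed)
    tally = {p: {"outside": 0, "short": 0, "in_zone": 0} for p in periods}
    for p in periods:
        for _ in range(runs_per_period):
            wave = random_sine(p, 2 * p, rng=rng)   # two periods: an assumption
            zone_low, zone_high = zone_fn(wave)      # stand-in for the chart
            tally[p][classify_peak(max(wave), zone_low, zone_high)] += 1
    return tally
```

Calling `run_test(range(10, 51), 3000, zone_fn)` would generate the 123,000 waves described above, and the per-period tallies show directly at which periods any failures are concentrated.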

Looking at the plot from post #4 it is obvious why all the failures are concentrated at the low periods, as this is the area where the plot shows most variation in the range of the multipliers. I have therefore decided that the optimised ucl and lcl multipliers used in this test shall apply only to periods less than 15 in the final indicator. I shall repeat the MC test with new optimised ucl and lcl boundary multipliers for periods 15 to 50 inclusive.

Wednesday, 5 May 2010

Monte Carlo Optimisation of Control Limits #5

Writing the function was a relatively simple matter of reusing some of the code that had already been written to perform the tests detailed so far. A graphical plot of the appearance of the indicator is shown above.

To reiterate so far, the indicator:
  • is designed to show when prices are moving sideways or moving in a sideways channel by containing prices between a support and a resistance zone
  • models prices as a sine wave to represent this sideways price action
  • bases the upper and lower boundaries of each zone upon the application of a control chart directly to prices
  • has zone boundaries optimised by Monte Carlo techniques and statistical tests such that approximately 95% of the time the peaks and troughs of the sine wave should occur within their respective zone boundaries
The next step will be to write a Monte Carlo routine to test whether this indicator performs as expected on a new "out of sample" set of randomised sine waves, and this will be the subject of the next post.

Monte Carlo Optimisation of Control Limits #4


This post is about the selection of the ucl and lcl multipliers. From the tests performed so far it has been established that all the data for any single sine wave period can be aggregated into one overall distribution for that period. The nature of each distribution, as seen from looking at the density plots, suggests to me that taking the extreme values of each distribution would be acceptable as the sharp drop offs in the tails mean that these values would not be far removed from those that would chop off the tails but still encompass the vast majority of the data in said distribution (we are talking here about differences of only a few hundredths of a decimal point in the multiplier values). So that is exactly what I have done: aggregated all the distributions and taken the maximum and minimum values and plotted them against the sine wave period on the x-axis (plot shown above).

The thing that strikes me when looking at this plot is that there seems to be a natural upper and lower boundary that is consistent across all periods. This is quite fortuitous, as it means it will not be necessary to write a complicated function with numerous if statements to check the period and apply a unique multiplier value for that period; it will simply suffice to apply the upper and lower boundary values regardless of period. For the time being at least, I have decided to set these boundary values at the 0.025 and 0.975 quantile levels because
  • "Quantiles are useful measures because they are less susceptible to long-tailed distributions and outliers. Empirically, if the data you are analyzing are not actually distributed according to your assumed distribution, or if you have other potential sources for outliers that are far removed from the mean, then quantiles may be more useful descriptive statistics than means and other moment-related statistics."
It can be expected that only approximately 5% of the peaks and troughs of any sine wave will fall outside these levels. The quantile function in R was used to calculate these levels, with type 8 (R-8) as the type used (http://en.wikipedia.org/wiki/Quantile).
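
For reference, the R-8 estimate can be written out by hand. The sketch below is a plain Python version of the type 8 rule as given on the Wikipedia page linked above (position h = (N + 1/3)p + 1/3, with linear interpolation between neighbouring order statistics); the function name is hypothetical and this is not the R code actually used.

```python
import math

def quantile_r8(data, p):
    """Sample quantile using the type 8 (R-8) rule, for 0 <= p <= 1."""
    x = sorted(data)
    n = len(x)
    if n == 0:
        raise ValueError("data must not be empty")
    h = (n + 1.0 / 3.0) * p + 1.0 / 3.0   # 1-based fractional position
    if h <= 1.0:                           # clamp to the smallest value
        return x[0]
    if h >= n:                             # clamp to the largest value
        return x[-1]
    lo = math.floor(h)                     # lower neighbour, 1-based
    return x[lo - 1] + (h - lo) * (x[lo] - x[lo - 1])
```

Given a list of multiplier values, the two levels would then be `quantile_r8(values, 0.025)` and `quantile_r8(values, 0.975)`, where `values` is whatever sample the limits are computed from.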

The next thing to do is write this function and test it.

Monday, 3 May 2010

Monte Carlo Optimisation of Control Limits #3



Following on from the previous blog post, further investigation of the Kolmogorov-Smirnov test failures for periods 16, 24, 32, 40 and 48, using the density plots for both amplitude 1 and variable amplitude, shows that in each case the distributions "collapse to a unique value": there is essentially no variation, as the max and min values in each distribution are the same, 0.2071068. This is of no concern regarding the result of the Kolmogorov-Smirnov test; it just means that for these periods there will not be two separate multipliers for the ucl and lcl. In effect there will just be a line of support/resistance instead of a zone, although I have an idea to address this which will be outlined later.

For the Kolmogorov-Smirnov test failures for periods 27, 38, 45 and 49 refer to the density plot above, which is similar for all four period failures. It can be seen that the tails are almost identical and that the "plateau" area shows some variation: I surmise that it is this variation in the plateau that leads to the rejection of the null hypothesis. However, for the purposes of setting the ucl and lcl multipliers it is the tails that are of interest, and since the tails are so similar to each other these test failures are also of no consequence.
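
The reason plateau variation alone can trigger a rejection is that the two-sample Kolmogorov-Smirnov statistic is the largest gap between the two empirical distribution functions anywhere along the axis, not just in the tails. The Python sketch below computes only that statistic (no p-value); the function name is hypothetical and this is not the test code used for these posts.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])          # next distinct value in either sample
        while i < n and a[i] == v:   # step sample a's CDF past v
            i += 1
        while j < m and b[j] == v:   # step sample b's CDF past v
            j += 1
        d = max(d, abs(i / n - j / m))
    return d
```

Two samples with identical tails but a shifted middle can therefore produce a large statistic, while the extreme values used for the multipliers barely differ.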

The conclusion to be drawn from these tests is that
  • when choosing the optimum values for the ucl and lcl multipliers for a theoretical sine wave of any single, constant period, the phase and amplitude of the sine wave have no noticeable effect on the optimum choice, as the distributions for any single period multiplier are, at least in the tails of interest, statistically the same.
  • as a result of this the separate distributions for each single period sine wave can be combined into one larger distribution for that period for the actual determination of the multipliers.
The actual determination of these multipliers for each period will be the subject of the next post.