Tuesday, 16 August 2011

Creation of Synthetic Data

Some time ago (the file was last edited in July 2010) I wrote an Octave .oct function to create synthetic data for testing and optimisation purposes. I was inspired to do so by the December 2005 issue of The Breakout Bulletin, and it has recently come to mind again because of a posting on the StackExchange Quantitative Finance forum. The code for my .oct function is posted below.

In writing this function I wanted to extend the ideas presented in the Breakout Bulletin and make them more applicable for the purposes I had/have in mind. By randomly scrambling the data any bar to bar dependency is destroyed (by design of course), but what if you want to preserve some bar to bar dependencies? This .oct function is my solution to preserving this dependency and a brief discussion of the theory behind it follows.

Firstly there is an assumption that any single bar and the market forces that caused the bar to be formed the way it did (up bar, down bar, doji etc.) are dependent on the immediately preceding market activity and the "current mode" of the market. Implicit in this assumption is that certain "types" of bars are more likely to be seen depending on market "mode," i.e. the types of bar in an up trend are likely to be distinctly different from those in a down trending or sideways trending market, so what is needed is some way to bin the bars which reflects this.

My solution is to apply a 21 bar moving median of the close, with multiples of the median absolute deviation (MAD) from this median as bands above and below it, similar to Bollinger Bands. There are three bands above the median (1 x MAD, 2 x MAD and 3 x MAD) and three below, which divide the price range into a total of 8 "zones," as they are called in the code. Furthermore, a 21 bar moving median of the True Range and a 4 bar WMA of the True Range are also calculated. The first part of the code ("Code Block A Loop"), after all the required declarations, loops over the input time series data calculating all the above and assigning each bar to a specific bin. The bin depends firstly on the "zone" in which the previous bar resides and secondly on whether the previous bar is a high or low volatility bar, which is decided by the True Range 4 bar WMA being above or below the True Range 21 bar moving median. This gives a total of 16 different bins to which a bar can be assigned. On assignment to a bin, the open, high, low and close are recorded in that bin by their relation to the previous close thus: log10(open/previous_close), log10(high/previous_close), log10(low/previous_close) and log10(close/previous_close).
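As a rough illustration of the zone idea, separate from the .oct function itself, the following standalone sketch computes a window median and MAD with std::nth_element and maps a close to one of the 8 zones. The names window_median, window_mad, zone_of and bin_index are hypothetical helpers invented for this example, and the flat 0 to 15 bin numbering is an assumption; the .oct function uses separately named vectors instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of an odd-length window. The window is taken by value because
// std::nth_element reorders its input.
double window_median (std::vector<double> window)
{
    assert (!window.empty () && window.size () % 2 == 1);
    const std::size_t mid = window.size () / 2;
    std::nth_element (window.begin (), window.begin () + mid, window.end ());
    return window[mid];
}

// Median absolute deviation of the window about a given median.
double window_mad (std::vector<double> window, double median)
{
    for (double &x : window) x = std::fabs (x - median);
    return window_median (window);
}

// Zones 1..4 lie at or above the median, zones 5..8 below it,
// following the zone comments in the code listing.
int zone_of (double close, double median, double mad)
{
    if (close >= median)
    {
        if (close < median + mad)     return 1;
        if (close < median + 2 * mad) return 2;
        if (close < median + 3 * mad) return 3;
        return 4;
    }
    if (close >= median - mad)     return 5;
    if (close >= median - 2 * mad) return 6;
    if (close >= median - 3 * mad) return 7;
    return 8;
}

// Hypothetical flat bin index 0..15 (an assumption for this sketch):
// high volatility bins first, then the low volatility ones.
int bin_index (int zone, bool high_volatility)
{
    assert (zone >= 1 && zone <= 8);
    return (zone - 1) + (high_volatility ? 0 : 8);
}
```

With a median of 3 and a MAD of 1, for instance, a close of 3.5 falls in zone 1 and a close of 6 falls in zone 4.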

The next part of the code ("Code Block B Loop") actually creates the synthetic data by randomly drawing a bar's relationships to its previous close from the "relevant bin" and calculating a "new" bar based upon these relationships. This "relevant bin" is determined by the "zone" position and volatility of the most recently calculated synthetic bar. After each new synthetic bar has been created, the median, MAD and True Range calculations are updated to include it, and it becomes the previous bar on the next iteration of the Code Block B loop.
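The arithmetic of storing a bar as log10 ratios and rebuilding a price bar from them can be sketched in isolation as follows. The Bar struct and the function names are hypothetical, introduced only for this example; the rounding to the nearest tick follows the floor(x / tick + 0.5) * tick form used in the code listing.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for this sketch: either four prices or four log10 ratios.
struct Bar { double open, high, low, close; };

// Round a price to the nearest multiple of the tick size.
double round_to_tick (double price, double tick)
{
    assert (tick > 0.0);
    return std::floor (price / tick + 0.5) * tick;
}

// Express a bar as log10 ratios to the previous close (what a bin stores).
Bar to_log_ratios (const Bar &bar, double previous_close)
{
    assert (previous_close > 0.0);
    return { std::log10 (bar.open  / previous_close),
             std::log10 (bar.high  / previous_close),
             std::log10 (bar.low   / previous_close),
             std::log10 (bar.close / previous_close) };
}

// Rebuild a price bar from drawn ratios and the previous synthetic close.
Bar from_log_ratios (const Bar &ratios, double previous_close, double tick)
{
    return { round_to_tick (previous_close * std::pow (10.0, ratios.open),  tick),
             round_to_tick (previous_close * std::pow (10.0, ratios.high),  tick),
             round_to_tick (previous_close * std::pow (10.0, ratios.low),   tick),
             round_to_tick (previous_close * std::pow (10.0, ratios.close), tick) };
}
```

Because every ratio is taken against the same previous close, a drawn bar keeps its internal open/high/low/close proportions when it is attached to a different synthetic close, apart from the final rounding to the tick size.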

Finally, a small part of the code adjusts the input data in the case of negative values due to the possible use of continuous back-adjusted futures contracts as the input data. This is necessary to avoid errors in trying to calculate the log10 of a negative number.
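A minimal standalone sketch of this adjustment, using std::vector rather than Octave's ColumnVector, might look like the following. The function names are hypothetical; the offset is the absolute value of the lowest low plus one tick, as in the code listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Offset needed so that the lowest low sits one tick above zero.
// Returns 0.0 when the series is already strictly positive.
double correction_factor (const std::vector<double> &low, double tick)
{
    assert (!low.empty () && tick > 0.0);
    const double lowest_low = *std::min_element (low.begin (), low.end ());
    return (lowest_low <= 0.0) ? std::fabs (lowest_low) + tick : 0.0;
}

// Apply the same offset to every price in a series (open, high, low or close).
void shift_prices (std::vector<double> &prices, double offset)
{
    for (double &p : prices) p += offset;
}
```

The same offset has to be applied to all four price series so that the bars keep their shape; only the ratios to the previous close change.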

The above method of binning the input data and subsequent randomisation is my attempt to ensure that dependencies/characteristics of the original data are preserved. For example, assume a bar is above the upper 3 x MAD level and is determined to be a high volatility bar. The next synthetically created bar will then be drawn only from the binned distribution of bars that, in the real data, also follow a high volatility bar above the upper 3 x MAD level.
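The conditional draw described above could be sketched along these lines. This is not the .oct function's own mechanism: it substitutes the standard library's std::mt19937 for the MTRand class from MersenneTwister.h, and the Bins type, the flat bin numbering and the draw_from_bin name are assumptions made for the example. Treating an empty bin as a caller error is also a choice of this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One bin per (zone, volatility) pair: 8 zones x 2 volatility states.
// Indices 0..7 hold the high volatility bins, 8..15 the low volatility ones.
using Bins = std::array<std::vector<double>, 16>;

// Draw one stored value uniformly at random from the bin that matches
// the state (zone 1..8, volatility) of the previous bar.
double draw_from_bin (const Bins &bins, int zone, bool high_volatility,
                      std::mt19937 &rng)
{
    assert (zone >= 1 && zone <= 8);
    const std::vector<double> &bin = bins[(zone - 1) + (high_volatility ? 0 : 8)];
    assert (!bin.empty ());   // assumption: the caller never asks for an empty bin
    std::uniform_int_distribution<std::size_t> pick (0, bin.size () - 1);
    return bin[pick (rng)];
}
```

A bar that follows a high volatility zone 4 bar is therefore only ever drawn from index 3, i.e. from values recorded after high volatility zone 4 bars in the real data.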

This code is offered as is and comes with no warranty whatsoever. However, if you like it and use it I would be interested to hear from you. In particular, if you have any suggestions for the code's improvement, extension, optimisation etc. or see any errors in the code, I would really appreciate your feedback. 

#include <octave/oct.h>
#include <octave/dColVector.h>
#include <algorithm>
#include <math.h>
#include "MersenneTwister.h"

DEFUN_DLD (syn_3, args, , "Inputs are OHLC column vectors and tick size. Output is MC generated synthetic OHLC")
{
octave_value_list retval_list;

    if (args.length () < 5 || args(0).length () < 21) // need 5 input arguments and at least 21 bars for the 21 bar windows
    {
        error ("Invalid arguments");
        return retval_list;
    }

    ColumnVector open = args(0).column_vector_value (); // open vector
    ColumnVector high = args(1).column_vector_value (); // high vector
    ColumnVector low = args(2).column_vector_value (); // low vector
    ColumnVector close = args(3).column_vector_value (); // close vector
    double tick = args(4).double_value(); // Tick size

    if (error_state)
    {
        error ("Invalid arguments");
        return retval_list;
    }

// Check for negative or zero price values due to continuous back-adjusting of price series & compensate if necessary
// Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
    double lowest_low = *std::min_element( &low(0), &low(low.length()) );
    double correction_factor = 0.0;
    if ( lowest_low <= 0.0 )
    {
    correction_factor = fabs(lowest_low) + tick;
	for (octave_idx_type ii (0); ii < args(0).length (); ii++)
	{
	open (ii) = open (ii) + correction_factor;
	high (ii) = high (ii) + correction_factor;
	low (ii) = low (ii) + correction_factor;
	close (ii) = close (ii) + correction_factor;
	}
    }

    ColumnVector moving_median_window (21);
    ColumnVector moving_MAD_window (21);
    ColumnVector moving_true_range_window (21);
    ColumnVector moving_median = args(0).column_vector_value ();
    ColumnVector moving_MAD = args(0).column_vector_value ();
    ColumnVector true_range = args(0).column_vector_value ();
//  declare and "pre-reserve" enough space for the various categorised bins for the MC procedure
//  first, bins where current_4_wma_true_range >= current_median_true_range
    ColumnVector zone_1_open ( args(0).length () ); // zone_1 >= median & < median + MAD 
    ColumnVector zone_1_high ( args(0).length () );
    ColumnVector zone_1_low ( args(0).length () );
    ColumnVector zone_1_close ( args(0).length () ); 
    ColumnVector zone_2_open ( args(0).length () ); // >= median + MAD & < median + 2*MAD
    ColumnVector zone_2_high ( args(0).length () );
    ColumnVector zone_2_low ( args(0).length () );
    ColumnVector zone_2_close ( args(0).length () );
    ColumnVector zone_3_open ( args(0).length () ); // >= median + 2*MAD & < median + 3*MAD
    ColumnVector zone_3_high ( args(0).length () );
    ColumnVector zone_3_low ( args(0).length () );
    ColumnVector zone_3_close ( args(0).length () );
    ColumnVector zone_4_open ( args(0).length () ); // >= median + 3*MAD
    ColumnVector zone_4_high ( args(0).length () );
    ColumnVector zone_4_low ( args(0).length () );
    ColumnVector zone_4_close ( args(0).length () );
    ColumnVector zone_5_open ( args(0).length () ); // < median & >= median - MAD 
    ColumnVector zone_5_high ( args(0).length () );
    ColumnVector zone_5_low ( args(0).length () );
    ColumnVector zone_5_close ( args(0).length () );
    ColumnVector zone_6_open ( args(0).length () ); // < median - MAD & >= median - 2*MAD
    ColumnVector zone_6_high ( args(0).length () );
    ColumnVector zone_6_low ( args(0).length () );
    ColumnVector zone_6_close ( args(0).length () );
    ColumnVector zone_7_open ( args(0).length () ); // < median - 2*MAD & >= median - 3*MAD
    ColumnVector zone_7_high ( args(0).length () );
    ColumnVector zone_7_low ( args(0).length () );
    ColumnVector zone_7_close ( args(0).length () );
    ColumnVector zone_8_open ( args(0).length () ); // < median - 3*MAD
    ColumnVector zone_8_high ( args(0).length () );
    ColumnVector zone_8_low ( args(0).length () );
    ColumnVector zone_8_close ( args(0).length () );
    int zone_1_access_int = 0;
    int zone_2_access_int = 0;
    int zone_3_access_int = 0;
    int zone_4_access_int = 0;
    int zone_5_access_int = 0;
    int zone_6_access_int = 0;
    int zone_7_access_int = 0;
    int zone_8_access_int = 0;
//  second, bins where current_4_wma_true_range < current_median_true_range
    ColumnVector zone_1_lv_open ( args(0).length () ); // zone_1 >= median & < median + MAD 
    ColumnVector zone_1_lv_high ( args(0).length () );
    ColumnVector zone_1_lv_low ( args(0).length () );
    ColumnVector zone_1_lv_close ( args(0).length () ); 
    ColumnVector zone_2_lv_open ( args(0).length () ); // >= median + MAD & < median + 2*MAD
    ColumnVector zone_2_lv_high ( args(0).length () );
    ColumnVector zone_2_lv_low ( args(0).length () );
    ColumnVector zone_2_lv_close ( args(0).length () );
    ColumnVector zone_3_lv_open ( args(0).length () ); // >= median + 2*MAD & < median + 3*MAD
    ColumnVector zone_3_lv_high ( args(0).length () );
    ColumnVector zone_3_lv_low ( args(0).length () );
    ColumnVector zone_3_lv_close ( args(0).length () );
    ColumnVector zone_4_lv_open ( args(0).length () ); // >= median + 3*MAD
    ColumnVector zone_4_lv_high ( args(0).length () );
    ColumnVector zone_4_lv_low ( args(0).length () );
    ColumnVector zone_4_lv_close ( args(0).length () );
    ColumnVector zone_5_lv_open ( args(0).length () ); // < median & >= median - MAD 
    ColumnVector zone_5_lv_high ( args(0).length () );
    ColumnVector zone_5_lv_low ( args(0).length () );
    ColumnVector zone_5_lv_close ( args(0).length () );
    ColumnVector zone_6_lv_open ( args(0).length () ); // < median - MAD & >= median - 2*MAD
    ColumnVector zone_6_lv_high ( args(0).length () );
    ColumnVector zone_6_lv_low ( args(0).length () );
    ColumnVector zone_6_lv_close ( args(0).length () );
    ColumnVector zone_7_lv_open ( args(0).length () ); // < median - 2*MAD & >= median - 3*MAD
    ColumnVector zone_7_lv_high ( args(0).length () );
    ColumnVector zone_7_lv_low ( args(0).length () );
    ColumnVector zone_7_lv_close ( args(0).length () );
    ColumnVector zone_8_lv_open ( args(0).length () ); // < median - 3*MAD
    ColumnVector zone_8_lv_high ( args(0).length () );
    ColumnVector zone_8_lv_low ( args(0).length () );
    ColumnVector zone_8_lv_close ( args(0).length () );
    int zone_1_lv_access_int = 0;
    int zone_2_lv_access_int = 0;
    int zone_3_lv_access_int = 0;
    int zone_4_lv_access_int = 0;
    int zone_5_lv_access_int = 0;
    int zone_6_lv_access_int = 0;
    int zone_7_lv_access_int = 0;
    int zone_8_lv_access_int = 0;
    double current_median_true_range;
    double current_4_wma_true_range;

// loop to fill the first 20 spaces of true_range vector
    true_range (0) = high (0) - low (0);
    for (octave_idx_type ii (1); ii < 20; ii++)
    {
    true_range (ii) = fmax ( fmax(high(ii)-low(ii),fabs(high(ii)-close(ii-1))) , fabs(low(ii)-close(ii-1)) );
    }

// following code loops to create a 21 period moving median and a 21 period moving median MAD (Median Absolute Deviation), based upon closing price
// At the same time as these are calculated, each previous bar's close is inspected to place that close in relation to the previous values of the moving 
// median and moving MAD. Also calculated is the 21 period median true range and 4 bar WMA of the true range. Based upon this, the current bar's stats 
// are binned into the appropriate zone vector column for later MC use.

    for (octave_idx_type ii (20); ii < args(0).length (); ii++) // Code Block A loop
    {
	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_median_window
	{
        moving_median_window (jj) = close (ii-jj);
   	} // end of loop to fill the moving_median_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_median_window(0), &moving_median_window(10), &moving_median_window(21) );
        moving_median (ii) = moving_median_window(10);

	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_MAD_window
	{
        moving_MAD_window (jj) = fabs( close (ii-jj) - moving_median (ii) );
   	} // end of loop to fill the moving_MAD_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_MAD_window(0), &moving_MAD_window(10), &moving_MAD_window(21) );
        moving_MAD (ii) = moving_MAD_window(10);

        // true range calculations
        true_range (ii) = fmax ( fmax(high(ii)-low(ii),fabs(high(ii)-close(ii-1))) , fabs(low(ii)-close(ii-1)) );
	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_true_range_window
	{
        moving_true_range_window (jj) = true_range (ii-jj);
   	} // end of loop to fill the moving_true_range_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_true_range_window(0), &moving_true_range_window(10), &moving_true_range_window(21) );
        current_median_true_range = moving_true_range_window(10);
        current_4_wma_true_range = ( 4*true_range(ii) + 3*true_range(ii-1) + 2*true_range(ii-2) + true_range(ii-3) ) / 10 ;

// now analyse the positions of the bar closes in relation to the moving median and moving MAD, fill the relevant MC bins and adjust bin counts

	if ( ii >= 21 & current_4_wma_true_range >= current_median_true_range ) // bin selection based on volatility loop
	{
		if ( close (ii-1) >= moving_median (ii-1) & close (ii-1) < (moving_median (ii-1) + moving_MAD (ii-1)) ) // zone_1
		{
		zone_1_open (zone_1_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_1_high (zone_1_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_1_low (zone_1_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_1_close (zone_1_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_1_access_int = zone_1_access_int + 1;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 2*moving_MAD (ii-1)) ) // zone_2
		{
		zone_2_open (zone_2_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_2_high (zone_2_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_2_low (zone_2_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_2_close (zone_2_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_2_access_int = zone_2_access_int + 1;	
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 2*moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_3
		{
		zone_3_open (zone_3_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_3_high (zone_3_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_3_low (zone_3_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_3_close (zone_3_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_3_access_int = zone_3_access_int + 1;	
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_4
		{
		zone_4_open (zone_4_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_4_high (zone_4_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_4_low (zone_4_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_4_close (zone_4_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_4_access_int = zone_4_access_int + 1;	
		}

		else if ( close (ii-1) < moving_median (ii-1) & close (ii-1) >= (moving_median (ii-1) - moving_MAD (ii-1)) ) // zone_5
		{
		zone_5_open (zone_5_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_5_high (zone_5_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_5_low (zone_5_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_5_close (zone_5_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_5_access_int = zone_5_access_int + 1;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 2*moving_MAD (ii-1)) ) // zone_6
		{
		zone_6_open (zone_6_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_6_high (zone_6_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_6_low (zone_6_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_6_close (zone_6_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_6_access_int = zone_6_access_int + 1;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - 2*moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 3*moving_MAD (ii-1)) ) // zone_7
		{
		zone_7_open (zone_7_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_7_high (zone_7_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_7_low (zone_7_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_7_close (zone_7_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_7_access_int = zone_7_access_int + 1;	
		}

		else // zone_8
		{
		zone_8_open (zone_8_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_8_high (zone_8_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_8_low (zone_8_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_8_close (zone_8_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_8_access_int = zone_8_access_int + 1;	
		}
	}
	else
	{
		if ( close (ii-1) >= moving_median (ii-1) & close (ii-1) < (moving_median (ii-1) + moving_MAD (ii-1)) ) // zone_1
		{
		zone_1_lv_open (zone_1_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_1_lv_high (zone_1_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_1_lv_low (zone_1_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_1_lv_close (zone_1_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_1_lv_access_int = zone_1_lv_access_int + 1;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 2*moving_MAD (ii-1)) ) // zone_2
		{
		zone_2_lv_open (zone_2_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_2_lv_high (zone_2_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_2_lv_low (zone_2_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_2_lv_close (zone_2_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_2_lv_access_int = zone_2_lv_access_int + 1;	
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 2*moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_3
		{
		zone_3_lv_open (zone_3_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_3_lv_high (zone_3_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_3_lv_low (zone_3_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_3_lv_close (zone_3_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_3_lv_access_int = zone_3_lv_access_int + 1;	
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_4
		{
		zone_4_lv_open (zone_4_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_4_lv_high (zone_4_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_4_lv_low (zone_4_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_4_lv_close (zone_4_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_4_lv_access_int = zone_4_lv_access_int + 1;	
		}

		else if ( close (ii-1) < moving_median (ii-1) & close (ii-1) >= (moving_median (ii-1) - moving_MAD (ii-1)) ) // zone_5
		{
		zone_5_lv_open (zone_5_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_5_lv_high (zone_5_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_5_lv_low (zone_5_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_5_lv_close (zone_5_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_5_lv_access_int = zone_5_lv_access_int + 1;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 2*moving_MAD (ii-1)) ) // zone_6
		{
		zone_6_lv_open (zone_6_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_6_lv_high (zone_6_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_6_lv_low (zone_6_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_6_lv_close (zone_6_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_6_lv_access_int = zone_6_lv_access_int + 1;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - 2*moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 3*moving_MAD (ii-1)) ) // zone_7
		{
		zone_7_lv_open (zone_7_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_7_lv_high (zone_7_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_7_lv_low (zone_7_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_7_lv_close (zone_7_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_7_lv_access_int = zone_7_lv_access_int + 1;	
		}

		else // zone_8
		{
		zone_8_lv_open (zone_8_lv_access_int) = log10 ( open (ii) / close (ii-1) );
		zone_8_lv_high (zone_8_lv_access_int) = log10 ( high (ii) / close (ii-1) );
		zone_8_lv_low (zone_8_lv_access_int) = log10 ( low (ii) / close (ii-1) );
		zone_8_lv_close (zone_8_lv_access_int) = log10 ( close (ii) / close (ii-1) );
		zone_8_lv_access_int = zone_8_lv_access_int + 1;	
		}

	} // end of ( ii >= 21 & current_4_wma_true_range >= current_median_true_range ) if conditional for bin selection based on volatility

    } // end of Code Block A loop

// the next Code Block B performs the MC randomisation routine
    MTRand mtrand1; // Declare the Mersenne Twister Class - will seed from system time
    int random_zone_access_int;

// first, reset the volatility calculations to those at the beginning of the original series
    for (octave_idx_type ii (0); ii < 21; ii++) // loop to fill the moving_true_range_window
    {
    moving_true_range_window (ii) = true_range (ii);
    } // end of loop to fill the moving_true_range_window
    // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
    std::nth_element( &moving_true_range_window(0), &moving_true_range_window(10), &moving_true_range_window(21) );
    current_median_true_range = moving_true_range_window(10);
    current_4_wma_true_range = ( 4*true_range(20) + 3*true_range(19) + 2*true_range(18) + true_range(17) ) / 10 ;

// now the MC synthetic routine
    for (octave_idx_type ii (21); ii < args(0).length (); ii++) // Code Block B loop
    {

// identify where previous close is vis a vis previous median and MAD and create a new "current bar" by MC selection from relevant zone bin
	if ( current_4_wma_true_range >= current_median_true_range )
	{
		if ( close (ii-1) >= moving_median (ii-1) & close (ii-1) < (moving_median (ii-1) + moving_MAD (ii-1)) ) // zone_1
		{
		random_zone_access_int = mtrand1.randInt( zone_1_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_1_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_1_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 2*moving_MAD (ii-1)) ) // zone_2
		{
		random_zone_access_int = mtrand1.randInt( zone_2_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_2_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_2_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 2*moving_MAD (ii-1)) & close (ii-1) < (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_3
		{
		random_zone_access_int = mtrand1.randInt( zone_3_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_3_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_3_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_4
		{
		random_zone_access_int = mtrand1.randInt( zone_4_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_4_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_4_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < moving_median (ii-1) & close (ii-1) >= (moving_median (ii-1) - moving_MAD (ii-1)) ) // zone_5
		{
		random_zone_access_int = mtrand1.randInt( zone_5_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_5_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_5_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 2*moving_MAD (ii-1)) ) // zone_6
		{
		random_zone_access_int = mtrand1.randInt( zone_6_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_6_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_6_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - 2*moving_MAD (ii-1)) & close (ii-1) >= (moving_median (ii-1) - 3*moving_MAD (ii-1)) ) // zone_7
		{
		random_zone_access_int = mtrand1.randInt( zone_7_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_7_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_7_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;		
		}

		else // zone_8
		{
		random_zone_access_int = mtrand1.randInt( zone_8_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_8_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_8_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;		
		}
	}
	else
	{
		if ( close (ii-1) >= moving_median (ii-1) & close (ii-1) < (moving_median (ii-1) + moving_MAD (ii-1)) ) // zone_1
		{
		random_zone_access_int = mtrand1.randInt( zone_1_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_1_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_1_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_1_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + moving_MAD (ii-1)) && close (ii-1) < (moving_median (ii-1) + 2*moving_MAD (ii-1)) ) // zone_2
		{
		random_zone_access_int = mtrand1.randInt( zone_2_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_2_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_2_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_2_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 2*moving_MAD (ii-1)) && close (ii-1) < (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_3
		{
		random_zone_access_int = mtrand1.randInt( zone_3_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_3_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_3_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_3_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		}

		else if ( close (ii-1) >= (moving_median (ii-1) + 3*moving_MAD (ii-1)) ) // zone_4
		{
		random_zone_access_int = mtrand1.randInt( zone_4_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_4_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_4_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_4_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < moving_median (ii-1) && close (ii-1) >= (moving_median (ii-1) - moving_MAD (ii-1)) ) // zone_5
		{
		random_zone_access_int = mtrand1.randInt( zone_5_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_5_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_5_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_5_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - moving_MAD (ii-1)) && close (ii-1) >= (moving_median (ii-1) - 2*moving_MAD (ii-1)) ) // zone_6
		{
		random_zone_access_int = mtrand1.randInt( zone_6_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_6_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_6_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_6_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;	
		}

		else if ( close (ii-1) < (moving_median (ii-1) - 2*moving_MAD (ii-1)) && close (ii-1) >= (moving_median (ii-1) - 3*moving_MAD (ii-1)) ) // zone_7
		{
		random_zone_access_int = mtrand1.randInt( zone_7_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_7_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_7_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_7_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;		
		}

		else // zone_8
		{
		random_zone_access_int = mtrand1.randInt( zone_8_lv_access_int - 1 ); // generate random access int

			if ( random_zone_access_int < 0 ) // check random access int doesn't exceed lower boundary for zone vector column  
			{
			random_zone_access_int = 0; 
			}
			if ( random_zone_access_int > zone_8_lv_access_int - 1 ) // check random access int doesn't exceed upper boundary for zone vector column  
			{
			random_zone_access_int = zone_8_lv_access_int - 1; 
			}

		open (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_lv_open (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		high (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_lv_high (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		low (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_lv_low (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;
		close (ii) = ( floor( ( close (ii-1) * pow(10, zone_8_lv_close (random_zone_access_int)) ) / tick + 0.5 ) ) * tick;		
		}

	} // end of ( current_4_wma_true_range >= current_median_true_range ) if conditional 

// create new "current bar" moving_median, moving MAD values and volatility values, overloading previous code 

	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_median_window
	{
        moving_median_window (jj) = close (ii-jj);
   	} // end of loop to fill the moving_median_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_median_window(0), &moving_median_window(10), &moving_median_window(21) );
        moving_median (ii) = moving_median_window(10);

	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_MAD_window
	{
        moving_MAD_window (jj) = fabs( close (ii-jj) - moving_median (ii) );
   	} // end of loop to fill the moving_MAD_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_MAD_window(0), &moving_MAD_window(10), &moving_MAD_window(21) );
        moving_MAD (ii) = moving_MAD_window(10);

        // true range calculations
        true_range (ii) = fmax ( fmax(high(ii)-low(ii),fabs(high(ii)-close(ii-1))) , fabs(low(ii)-close(ii-1)) );
	for (octave_idx_type jj (0); jj < 21; jj++) // loop to fill the moving_true_range_window
	{
        moving_true_range_window (jj) = true_range (ii-jj);
   	} // end of loop to fill the moving_true_range_window
        // Note: the "&" below acts as Address-of operator: p = &x; Read: Assign to p (a pointer) the address of x.
        std::nth_element( &moving_true_range_window(0), &moving_true_range_window(10), &moving_true_range_window(21) );
        current_median_true_range = moving_true_range_window(10);
        current_4_wma_true_range = ( 4*true_range(ii) + 3*true_range(ii-1) + 2*true_range(ii-2) + true_range(ii-3) ) / 10 ;

    } // end of Code Block B loop

// if the original price values were shifted to compensate for negative or zero prices caused by continuous back-adjusting of the price series, undo the shift
    if ( correction_factor != 0.0 )
    {
	for (octave_idx_type ii (0); ii < args(0).length (); ii++)
	{
	open (ii) = open (ii) - correction_factor;
	high (ii) = high (ii) - correction_factor;
	low (ii) = low (ii) - correction_factor;
	close (ii) = close (ii) - correction_factor;
	moving_median (ii) = moving_median (ii) - correction_factor;
	// moving_MAD is a deviation (spread) measure: the additive correction cancels out in close - median, so it needs no re-adjustment
	}
    }

retval_list(5) = moving_MAD;
retval_list(4) = moving_median;
retval_list(3) = close;
retval_list(2) = low;
retval_list(1) = high;
retval_list(0) = open;
return retval_list;
} 
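
The sixteen branches of Code Block B all repeat the same two steps: decide which zone the previous close falls in, then rebuild each price from a stored log10 ratio and round it to the nearest tick. As a rough sketch only, those two steps could be factored out as below. The names zone_of and price_from_log_ratio are hypothetical and do not appear in the function above, and the sketch uses plain doubles rather than Octave's ColumnVector type.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: return the zone number (1..8) of the previous close,
// mirroring the if / else-if chain in the listing above.
// Zones 1-4 lie at or above the moving median, zones 5-8 below it.
int zone_of(double prev_close, double median, double mad)
{
    assert(mad >= 0.0); // a median absolute deviation cannot be negative

    if (prev_close >= median) {
        if (prev_close < median + mad)       return 1;
        if (prev_close < median + 2.0 * mad) return 2;
        if (prev_close < median + 3.0 * mad) return 3;
        return 4; // at or beyond median + 3 x MAD
    }
    if (prev_close >= median - mad)       return 5;
    if (prev_close >= median - 2.0 * mad) return 6;
    if (prev_close >= median - 3.0 * mad) return 7;
    return 8; // below median - 3 x MAD
}

// Hypothetical helper: rebuild a price from the previous close and a stored
// log10 ratio, then round to the nearest multiple of the tick size, as in
// floor( ( close(ii-1) * pow(10, ratio) ) / tick + 0.5 ) * tick.
double price_from_log_ratio(double prev_close, double log10_ratio, double tick)
{
    assert(tick > 0.0); // a tick size of zero would divide by zero

    const double raw = prev_close * std::pow(10.0, log10_ratio);
    return std::floor(raw / tick + 0.5) * tick;
}
```

With helpers like these, each branch would reduce to looking up the bin for zone_of(...) and calling price_from_log_ratio four times, once each for the open, high, low and close.
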
A final thought: although not implemented in the above code, it would be possible to apply some form of "quality control" to the output. Statistical measures of the input time series could be taken and thresholds established, so that only those synthetic outputs that fall within these thresholds are accepted as valid synthetic time series.
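
To make the "quality control" idea a little more concrete, here is a minimal sketch of one possible acceptance test. It is not part of the function above. The choice of statistic (the standard deviation of log10 close to close returns), the relative tolerance and the names log_return_stddev and passes_quality_control are all assumptions made purely for illustration, and the sketch assumes all prices are strictly positive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Population standard deviation of log10 close-to-close returns.
// Assumes at least two closes and strictly positive prices.
double log_return_stddev(const std::vector<double>& close)
{
    assert(close.size() >= 2);

    const std::size_t n = close.size() - 1; // number of returns
    std::vector<double> returns(n);
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        returns[i] = std::log10(close[i + 1] / close[i]);
        mean += returns[i];
    }
    mean /= static_cast<double>(n);

    double sum_sq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = returns[i] - mean;
        sum_sq += d * d;
    }
    return std::sqrt(sum_sq / static_cast<double>(n));
}

// Accept the synthetic series only if its statistic lies within
// +/- rel_tol (e.g. 0.25 for 25%) of the original series' statistic.
bool passes_quality_control(const std::vector<double>& original,
                            const std::vector<double>& synthetic,
                            double rel_tol)
{
    assert(rel_tol >= 0.0);

    const double ref = log_return_stddev(original);
    const double syn = log_return_stddev(synthetic);
    return std::fabs(syn - ref) <= rel_tol * ref;
}
```

In use, the generator would be called repeatedly and any output for which passes_quality_control returns false would simply be discarded. Other measures (autocorrelation of returns, average true range and so on) could be checked in the same way.
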

Below is a screenshot of a time series and synthetic data generated from it using the above function code. For the moment I won't say which is the original and which is the synthetic data - perhaps readers would like to post their guesses as comments?
