Wednesday 15 April 2020

Generic Octave_Oanda_API Function

My last two posts have shown Octave functions that use the Oanda API to access and download data. In the first of these posts I said that I would post more code for further functions as and when I write them. However, on further reflection this would be unnecessary as the generic form of any such function is:

1) create the required headers
## set up the headers
query = [ 'curl -s --compressed -H "Content-Type: application/json"' ] ; ## -s is silent mode for Curl for no paging to terminal
query = [ query , ' -H "Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"' ] ;
2) create the required API call
## construct the actual API call for instrument, year, month, day, hour and min
query = [ query , ' "https://api-fxtrade.oanda.com/v3/instruments/' , toupper( cross ) , "/orderBook?time=" , num2str( year ) , '-' , ...
num2str( month , "%02d" ) , '-' , num2str( day , "%02d" ) , 'T' , num2str( hour , "%02d" ) , '%3A' , num2str( min , "%02d" ) , '%3A00.000Z"' ] ;
3) extract the data from the returned JSON object
## call to use external Unix systems/Curl and return result
[ ~ , ret_JSON ] = system( query , RETURN_OUTPUT = 'TRUE' ) ;

## convert the returned JSON object to Octave structure
ret_JSON = load_json( ret_JSON ) ;
Providing the required libraries are installed on your system (see first post) the above is all that is needed for any API call - just change the details in the second code box - in the example above the code is to download the historical orderbook for a given forex currency cross/tradable at a specific date/time, given as function input variables.

I have found that the main difficulty lies in parsing the returned data structure, which can consist of nested Structures/Cell-Arrays. Below is code, liberally commented, that shows how to parse the above orderbook API call and add the 20 orderbook levels above/below the orderbook price to historical orderbook files already on disc
clear all ;
cd /home/dekalog/Documents/octave/oanda_data/20m ;

orderbooks = glob( '*_historical_orderbook_snapshots' ) ;
## {
##   [1,1] = aud_jpy_historical_orderbook_snapshots
##   [2,1] = aud_usd_historical_orderbook_snapshots
##   [3,1] = eur_aud_historical_orderbook_snapshots
##   [4,1] = eur_chf_historical_orderbook_snapshots
##   [5,1] = eur_gbp_historical_orderbook_snapshots
##   [6,1] = eur_jpy_historical_orderbook_snapshots
##   [7,1] = eur_usd_historical_orderbook_snapshots
##   [8,1] = gbp_chf_historical_orderbook_snapshots
##   [9,1] = gbp_jpy_historical_orderbook_snapshots
##   [10,1] = gbp_usd_historical_orderbook_snapshots
##   [11,1] = nzd_usd_historical_orderbook_snapshots
##   [12,1] = usd_cad_historical_orderbook_snapshots
##   [13,1] = usd_chf_historical_orderbook_snapshots
##   [14,1] = usd_jpy_historical_orderbook_snapshots
##   [15,1] = xag_usd_historical_orderbook_snapshots
##   [16,1] = xau_usd_historical_orderbook_snapshots
## }
data_20m = glob( '*_ohlc_20m' ) ;

for ii = 1 : 1 ## begin the instrument loop

str_split = strsplit( data_20m{ ii } , "_" ) ;
cross = strjoin( str_split( 1 : 2 ) , "_" ) ; ## get the tradable, e.g. 'eur_usd'

## create a unix system command to read the last row of 20 min ohlc of filename
unix_command = [ "tail -1" , " " , data_20m{ ii } ] ; 
[ ~ , data ] = system( unix_command ) ;
data = strsplit( data , { "," , "\n" } ) ; ## gives a cell arrayfun
## covert last date/time to numeric format
data_20m_last = [ str2num(data{1}) , str2num(data{2}) , str2num(data{3}) , str2num(data{4}) ,str2num(data{5}) ] ;

## create a unix system command to read the last row of historical_orderbook_snapshots of filename
unix_command = [ "tail -1" , " " , orderbooks{ ii } ] ; 
[ ~ , data ] = system( unix_command ) ;
data = strsplit( data , { "," , "\n" } ) ; ## gives a cell arrayfun
## covert last date/time to numeric format
data_ordbk_last = [ str2num(data{1}) , str2num(data{2}) , str2num(data{3}) , str2num(data{4}) ,str2num(data{5}) ] ;

time_diff = datenum( data_20m_last ) - datenum( data_ordbk_last ) ;
## only run following code if there is more orderbook data to download to match 20min data already on file
 if ( time_diff > 0 )

 no_rows_diff = ceil( time_diff * 72 ) ; ## there are 72 x 20 minute bars per day
 ## create a unix system command to read the last no_rows_diff of 20 min ohlc of filename
 unix_command = [ "tail -" , num2str( no_rows_diff ) , " " , data_20m{ ii } ] ; 
 [ ~ , data ] = system( unix_command ) ;
 data = strsplit( data , { "," , "\n" } ) ; ## gives a cell arrayfun
 data = data( 1 : size( data , 2 ) - 1 ) ; ## get rid of last empty cell
 data = reshape( data , 22 , no_rows_diff )' ;
 
  for jj = 1 : no_rows_diff
   if ( str2double(data{jj,1})==data_ordbk_last(1) && str2double(data{jj,2})==data_ordbk_last(2) && ...
        str2double(data{jj,3})==data_ordbk_last(3) && str2double(data{jj,4})==data_ordbk_last(4) && ...
        str2double(data{jj,5})==data_ordbk_last(5) )
    begin_ix = jj + 1 ;
    break ;
   endif
  endfor ## end of jj = 1 : no_rows_diff loop

 new_orderbook_data = zeros( no_rows_diff - ( begin_ix - 1 ) , 88 ) ;

 kk = 0 ; ## initialise kk counter to loop over new_orderbook_data rows
  for jj = begin_ix : no_rows_diff ## loop over structure S from begin_ix to no_rows_diff
   kk = kk + 1 ; ## increment counter
   
   ## write dates and times to new_orderbook_data
   new_orderbook_data( kk , 1 ) = str2double( data{ jj , 1 } ) ;
   new_orderbook_data( kk , 2 ) = str2double( data{ jj , 2 } ) ;
   new_orderbook_data( kk , 3 ) = str2double( data{ jj , 3 } ) ;
   new_orderbook_data( kk , 4 ) = str2double( data{ jj , 4 } ) ;
   new_orderbook_data( kk , 5 ) = str2double( data{ jj , 5 } ) ;
   
   ## download the orderbook structure, S
   S = get_historical_orderbook( cross , new_orderbook_data( kk , 1 ) , new_orderbook_data( kk , 2 ) , new_orderbook_data( kk , 3 ) , ...
                                  new_orderbook_data( kk , 4 ) , new_orderbook_data( kk , 5 ) ) ;

   ## write the orderBook Price to new_orderbook_data
   new_orderbook_data( kk , 6 ) = str2double( S.orderBook.price ) ;

    ######## find where str2double( S.orderBook.price ) is within S ########
    for ix = 1 : size( S.orderBook.buckets , 2 )
     if ( str2double( S.orderBook.buckets{ ix }.price ) >= str2double( S.orderBook.price ) )
      mid_ix = ix ;  
      break ;
     endif
    endfor ## end ix loop

    if ( ( str2double( S.orderBook.price ) - str2double( S.orderBook.buckets{ mid_ix - 1 }.price ) ) < ...
         ( str2double( S.orderBook.buckets{ mid_ix }.price ) ) - str2double( S.orderBook.price ) ) ## refine accuracy of mid_ix
      mid_ix = mid_ix - 1 ;
    endif
    ########## index for str2double( S.orderBook.price ) found #############

  ## actual writing to file: +/- 20 lines around mid_ix, the orderbook_price
  orderbook_begin_ix = mid_ix - 20 ; orderbook_end_ix = mid_ix + 20 ;
  
  ## format of file to write is:
  ## year month day hour min orderbook_price long% short%
   xx = 7 ; ## initialise column counter
   for zz = orderbook_begin_ix : orderbook_end_ix
    new_orderbook_data( kk , xx ) = str2double( S.orderBook.buckets{ zz }.longCountPercent ) ;
    new_orderbook_data( kk , xx + 41 ) = str2double( S.orderBook.buckets{ zz }.shortCountPercent ) ;
    xx = xx + 1 ; ## increment column counter
   endfor ## end of zz filling loop
   
  endfor ## end of jj for loop to fill one line of new_orderbook_data

 endif ## end of ( time_diff > 0 ) if statement

dlmwrite( orderbooks{ ii } , new_orderbook_data , '-append' ) ;

endfor ## end of ii instrument for loop

## The downloaded Structure S looks like
## parse the character structure S and write to new_orderbook_data
##  isstruct(S)
##  ans = 1
##  fieldnames(S)
##  ans =
##  {
##    [1,1] = orderBook
##  }
##
##  >> fieldnames(S.orderBook)
##  ans =
##  {
##    [1,1] = instrument
##    [2,1] = time
##    [3,1] = unixTime
##    [4,1] = price
##    [5,1] = bucketWidth
##    [6,1] = buckets
##  }
##
##  S.orderBook.instrument
##  ans = AUD_JPY
##
##  S.orderBook.time
##  ans = 2020-04-05T21:00:00Z
##
##  S.orderBook.unixTime
##  ans = 1586120400
##
##  S.orderBook.price
##  ans = 65.080
##
##  S.orderBook.bucketWidth
##  ans = 0.050
##
##  iscell( S.orderBook.buckets )
##  ans = 1
The code uses loops, which is usually frowned upon in favour of vectorised code, but I am not aware of how to vectorise the parsing of the structure. This code also shows the use of the unix_command to use -tail to read just the last 'n' lines of the files on disc and thus avoid loading complete, and perhaps very large, files.

No comments: