Fetching Price Data, Tracking Algo State

Blueshift callback functions are usually called with one or two arguments: context and data. These are internally maintained objects that provide useful functionality to the algo. A user strategy can use the context object to query the current state of the algo, including its profit and loss, positions, leverage and other related information. The data object is used to fetch price data. Both are described below.

Note

The context and data variables are maintained by the platform and are made available to most callback APIs automatically. You need not instantiate these objects on your own. Do not override their internal methods either; doing so will crash the algo with an error.

Context Object

The Blueshift engine uses the internal context object to track and keep the various state metrics of the running strategy up-to-date.

The context object is an instance of the internal class shown below. User strategy code never instantiates this object. It is created and maintained by the platform core engine and provides interfaces to the current context of the running algorithm, including querying the current order status and portfolio status. The context object is the first (and sometimes the only) argument to all platform callback functions.

class blueshift.core.algorithm.context.AlgoContext

The algorithm context encapsulates the context of a running algorithm. This includes tracking internal objects like blotter, broker interface etc, as well as account, portfolio and positions details (see below). The context object is also useful to store user-defined variables for access anywhere in the strategy code.

Warning

Once the context is initialised, its core attributes (i.e. non-user defined attributes) are read-only. Attempting to overwrite them will throw AttributeError and will crash the algo.
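As a minimal sketch of storing user-defined variables on the context: the attribute names lookback and sessions_seen below are illustrative, and the SimpleNamespace object is only a stand-in for the platform-supplied context so that the snippet can run outside Blueshift.

```python
from types import SimpleNamespace

def initialize(context):
    # User-defined attributes (illustrative names, not platform attributes).
    context.lookback = 20
    context.sessions_seen = 0

def before_trading_start(context, data):
    # State stored on the context persists between callback invocations.
    context.sessions_seen += 1

# Stand-in for the platform-managed context (assumption: used here only so
# the sketch runs outside Blueshift, where the engine supplies the object).
ctx = SimpleNamespace()
initialize(ctx)
before_trading_start(ctx, None)
before_trading_start(ctx, None)
```

Choosing names that do not collide with the core attributes listed below avoids the AttributeError described in the warning.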

Context Attributes

AlgoContext.name

Returns the name (str) of the current algo run.

AlgoContext.mode

Returns the run mode (enum) of the current run.

See also

See Algo Modes and Other Constants for allowed values and interpretation.

AlgoContext.execution_mode

Returns the execution mode (enum) of the current run.

See also

See Algo Modes and Other Constants for allowed values and interpretation.

AlgoContext.trading_calendar

Returns the current trading calendar object.

See also

See documentation for Trading Calendar.

AlgoContext.intraday_cutoff

Returns the intraday cutoff time for accepting orders with rolling asset definition. Order placement using rolling asset will be refused after this cut-off time.

AlgoContext.record_vars

The recorded var dataframe (pandas.DataFrame) as generated by a call to the API function record. The column names are recorded variable names. Variables are recorded on a per-session (i.e. daily) basis. A maximum of 10 recorded variables are allowed.

Warning

Adding recorded variables may slow down the speed of a backtest run.

AlgoContext.pnls

Returns historical (daily) profit-and-loss information since inception. This is a pandas.DataFrame with the following columns:

  • algo_returns: daily returns of the strategy

  • algo_cum_returns: cumulative returns of the strategy

  • algo_volatility: annualised daily volatility of the strategy

  • drawdown: current drawdown of the strategy (percentage)

Note

Each row is timestamped at end-of-day, except the row for the current day, which carries the timestamp of the most recent computation.

AlgoContext.orders

Returns all open and closed orders for the current blotter session. This is a dict with order IDs (str) as keys and order objects as values.

AlgoContext.open_orders

Returns all orders currently open for the algorithm. This is a dict with order IDs (str) as keys and order objects as values.
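The two order dicts can be combined to get a quick count of open versus closed orders. The sketch below relies only on the dict structure described above; the helper name order_summary and the stand-in context with placeholder order objects are hypothetical.

```python
from types import SimpleNamespace

def order_summary(context):
    """Count open and closed orders using the two dicts on the context."""
    open_ids = sorted(context.open_orders)   # keys are order IDs (str)
    total = len(context.orders)              # open and closed orders
    return {
        "open": len(open_ids),
        "closed": total - len(open_ids),
        "open_ids": open_ids,
    }

# Stand-in context with placeholder order objects (hypothetical data).
ctx = SimpleNamespace(
    orders={"a1": object(), "a2": object(), "a3": object()},
    open_orders={"a2": object()},
)
summary = order_summary(ctx)
```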

AlgoContext.open_orders_by_asset

Portfolio and Account

The context object provides interfaces to the algo account and portfolio through the attributes context.account and context.portfolio, both accessible from the user strategy.

AlgoContext.account

Return the account object (a view of the underlying trading account).

The account object has the following structure. All these attributes are read-only.

Attribute                 Type   Description
margin                    float  Total margin posted with the broker
leverage                  float  Gross leverage (gross exposure / liquid asset value)
gross_leverage            float  Gross leverage (gross exposure / liquid asset value)
net_leverage              float  Net leverage (net exposure / liquid asset value)
gross_exposure            float  Gross (unsigned sum) exposure across all assets at last updated prices
long_exposure             float  Total exposures in long positions
short_exposure            float  Total exposures in short positions
long_count                int    Total assets count in long positions
short_count               int    Total assets count in short positions
net_exposure              float  Net (signed sum) exposure across all assets at last updated prices
net_liquidation           float  Sum of cash and margin
commissions               float  Net commissions paid (if available)
charges                   float  Net trading charges paid (if available)
total_positions_exposure  float  Gross (unsigned sum) exposure across all assets at last updated prices
available_funds           float  Net cash available on the account
total_positions_value     float  Total value of all holdings

Warning

Running multiple strategies in the same account may lead to misleading values of these attributes.
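A common use of the account attributes is a simple risk check before sending new orders. The sketch below reads only gross_leverage and available_funds from the account object; the helper name can_add_exposure, the limit of 2.0 and the stand-in account objects are assumptions made for illustration.

```python
from types import SimpleNamespace

MAX_GROSS_LEVERAGE = 2.0  # hypothetical limit chosen for illustration

def can_add_exposure(account, limit=MAX_GROSS_LEVERAGE):
    """True while gross leverage is below the limit and cash remains."""
    return account.gross_leverage < limit and account.available_funds > 0

# Stand-ins for context.account (hypothetical values).
acct_ok = SimpleNamespace(gross_leverage=1.2, available_funds=5000.0)
acct_full = SimpleNamespace(gross_leverage=2.5, available_funds=5000.0)
acct_no_cash = SimpleNamespace(gross_leverage=0.5, available_funds=0.0)
```

Inside a callback this would be called as can_add_exposure(context.account).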

AlgoContext.portfolio

Return the current portfolio object. Portfolio is a view of the current state of the algorithm, including positions.

The attributes (read-only) of the portfolio object are as below:

Attribute           Type       Description
portfolio_value     float      Current portfolio net value
positions_exposure  float      Present gross exposure
cash                float      Total undeployed cash
starting_cash       float      Starting capital
returns             float      Cumulative Algo returns
positions_value     float      Total value of holdings
pnl                 float      Total profit or loss
mtm                 float      Unrealized profit or loss
start_date          Timestamp  Start date of the algo
positions           dict       Positions dict (see below)

The positions attribute is a dictionary with the current positions. The keys of the dictionary are Asset objects. The values are Position objects.

The following example shows how to access account and positions data within the strategy code.

def print_report(context):
    account = context.account
    portfolio = context.portfolio
    positions = portfolio.positions

    for asset in positions:
        position = positions[asset]
        print(f'position for {asset}:{position.quantity}')

    print(f'total portfolio {portfolio.portfolio_value}')
    print(f'exposure:{account.net_exposure}')

def before_trading_start(context, data):
    print_report(context)
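Building on the example above, the portfolio attributes can also be condensed into a small snapshot. This is a sketch only: the helper name portfolio_snapshot is hypothetical, and plain strings stand in for the Asset keys of the positions dict so that the snippet runs outside the platform.

```python
from types import SimpleNamespace

def portfolio_snapshot(portfolio):
    """List assets with a non-zero quantity and the fraction held as cash."""
    held = [asset for asset, position in portfolio.positions.items()
            if position.quantity != 0]
    value = portfolio.portfolio_value
    # Avoid dividing by zero if the portfolio value is zero.
    cash_fraction = portfolio.cash / value if value else None
    return {"held": held, "cash_fraction": cash_fraction}

# Stand-in for context.portfolio (hypothetical values; real keys are Assets).
pf = SimpleNamespace(
    portfolio_value=10000.0,
    cash=2500.0,
    positions={
        "AAA": SimpleNamespace(quantity=10),
        "BBB": SimpleNamespace(quantity=0),
    },
)
snapshot = portfolio_snapshot(pf)
```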

Data Object

The data object is the second argument to platform callback functions (where applicable). This provides an interface to the user strategy to query and fetch data.

Fetching Current Data

class blueshift.data.readers.data_portal.DataPortal

DataPortal class defines the interface for the data object in the callback functions. It defines two basic methods - current and history. User strategy should use these methods to query and fetch data from within a running algo.

abstract current(assets, columns, **kwargs)

Returns the last available price. If either assets or columns is multiple, a series is returned, indexed by assets or fields, respectively. If both are multiple, a dataframe is returned. Otherwise, a scalar is returned. Only OHLCV column names are supported in general. However, for futures and options, open_interest is supported as well.

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to fetch data.

  • columns (str or a list.) – A field name or a list of OHLCV columns.

Returns:

current price of the asset(s).

Return type:

float (int in case of volume), pandas.Series or pandas.DataFrame.

Warning

The data returned can be a NaN value, an empty series or an empty DataFrame if data is missing for the asset(s) or column(s). Also, the returned series or frame may not contain all the asset(s) or column(s) if such an asset or column has missing data. User strategy must always check the returned data before further processing.
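One way to apply the check recommended above for a single asset and a single column is sketched below. The helpers is_usable_price and latest_close are hypothetical, the requirement that a price be positive is an added assumption, and FakeData is only a stand-in for the platform data object.

```python
import math

def is_usable_price(value):
    """True only for a finite, positive scalar (rejects NaN, None, empties)."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return math.isfinite(number) and number > 0

def latest_close(data, asset):
    """Return the current close as a float, or None if it is not usable."""
    price = data.current(asset, "close")
    return float(price) if is_usable_price(price) else None

class FakeData:
    """Stand-in for the data object; returns NaN for unknown assets."""
    def __init__(self, prices):
        self._prices = prices

    def current(self, assets, columns, **kwargs):
        return self._prices.get(assets, float("nan"))

fake = FakeData({"AAA": 101.5})
```

A strategy would typically skip the asset for the current bar when latest_close returns None, rather than trade on a missing price.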

Querying Historical Data

abstract history(assets, columns, nbars, frequency, **kwargs)

Returns the given number of bars for the assets. If more than one asset or more than one column is supplied, returns a dataframe, with assets or fields as column names. If both assets and columns are multiple, returns a multi-index dataframe with columns as the column names and asset as the second index level. For a single asset and a single field, returns a series. Only OHLCV column names are supported. However, for futures and options, open_interest is also supported.

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to fetch data.

  • columns (str or a list.) – A field name or a list of OHLCV columns.

  • nbars (int) – Number of bars to fetch.

  • frequency (str) – Frequency of bars (either ‘1m’ or ‘1d’).

Returns:

historical bars for the asset(s).

Return type:

pandas.Series or pandas.DataFrame.

Warning

The data returned can be an empty series or an empty DataFrame if data is missing for the asset(s) or column(s). Also, the returned series or frame may not contain all the asset(s) or column(s) if such an asset or column has missing data. In addition, for a multi-indexed DataFrame, user strategy code must not assume aligned data with the same timestamps for different assets (however, columns will always be aligned for a given asset). User strategy must always check the returned data before further processing.
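As a sketch of such a check for a single-asset, single-column query: the hypothetical helper mean_of_valid below drops NaN values and returns None when fewer than the required number of bars remain. It iterates over plain values, so it accepts a list as shown here, and is expected to work the same way on the series returned by history.

```python
import math

def mean_of_valid(values, min_count):
    """Average the non-NaN values, or return None if too few are available."""
    valid = [float(v) for v in values if not math.isnan(float(v))]
    if len(valid) < min_count:
        return None
    return sum(valid) / len(valid)

# Hypothetical closes with one missing bar.
closes = [10.0, float("nan"), 12.0, 14.0]
average = mean_of_valid(closes, 3)
too_few = mean_of_valid(closes, 4)
```

In a strategy this might be used as mean_of_valid(data.history(asset, "close", 20, "1d"), 20), skipping the signal when the result is None.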

Changed in version 2.1.0: The frequency parameter now supports extended specifications in addition to 1d and 1m. These are given in Pandas frequency format, e.g. 5T, 30T or 1H for 5-minute, 30-minute and 1-hour candles respectively. This is primarily designed for live trading; using it in a backtest can be very slow, as the underlying data is stored only in minute or daily format and other frequencies are resampled on-the-fly. The frequencies allowed in live trading depend on broker support, i.e. a query will fail if the broker does not support the particular candle frequency. Also, this is available only for data queries with a single asset; a query with multiple assets will raise an error.
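The single-asset restriction on extended frequencies can be checked in strategy code before the query is made. The helper below is a hypothetical sketch, not part of the platform API; treating a list of length greater than one as "multiple assets" is an assumption.

```python
BASE_FREQUENCIES = {"1m", "1d"}  # the two base frequencies named above

def check_history_request(assets, frequency):
    """Raise ValueError for an extended frequency with several assets."""
    multiple = isinstance(assets, (list, tuple)) and len(assets) > 1
    if multiple and frequency not in BASE_FREQUENCIES:
        raise ValueError(
            f"frequency {frequency!r} is supported for a single asset only"
        )

# A single asset with an extended frequency passes the check.
check_history_request("AAA", "30T")
# Multiple assets with a base frequency pass as well.
check_history_request(["AAA", "BBB"], "1m")
```

Failing early in strategy code gives a clearer message than an error raised from inside the data query.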

Important

  • These methods support the OHLCV (“open”, “high”, “low”, “close” and “volume”) columns for all assets (except options), for backtests as well as live trading. For non-tradable assets (e.g. market index), “volume” may be zeros or missing. For options, open, high, low and volume fields are not available (see below for extra fields).

  • For backtest, apart from OHLCV columns, “open_interest” is also available for futures and options assets. This may not be available in live trading if the broker supports streaming data but does not support open interest in streaming data.

  • For backtests with options asset(s), greeks, implied volatility and the ATMF forward can also be specified as field names. Use “implied_vol” for the implied volatility levels and “atmf” for the prevailing at-the-money futures level. The supported greeks are “delta”, “vega”, “gamma” and “theta”. The greeks are computed on-the-fly using the Black 76 model and may not be stable near expiry. These fields may not be available in live trading (but a user strategy can compute them on-the-fly).

  • These methods handle a rolling asset specification by picking the correct asset at each timestamp for which data is returned. For example, querying symbol('ABC-ICE+100') for the last 20 minutes returns the 100 out ATM call as applicable at each of those minutes (although the underlying level, and hence the actual strike, may vary).

The following example shows how to access current and historical data, and the expected data type of the return value in each case.

from blueshift.api import symbol

def initialize(context):
    context.universe = [symbol("AAPL"), symbol("MSFT")]

def before_trading_start(context, data):
    # this returns a float value
    px = data.current(context.universe[0], 'close')

    # this returns an int value
    px = data.current(context.universe[0], 'volume')

    # this returns a pandas.Series with the columns in the index
    px = data.current(context.universe[0], ['open','close'])

    # this returns a pandas.Series with the assets in the index
    px = data.current(context.universe, 'close')

    # this returns a pandas.DataFrame with assets in the index
    px = data.current(context.universe, ['open','close'])

    # px1 is a Series with timestamps as index
    px1 = data.history(context.universe[0], "close", 3, "1m")

    # px2 is DataFrame with timestamp index and field names as columns
    px2 = data.history(context.universe[0], ['open','close'], 3, "1m")

    # px3 is a DataFrame with timestamp index and assets as columns
    px3 = data.history(context.universe, "close", 3, "1m")

    # px4 is a multi-index Frame with field names as columns, asset as the second index level
    px4 = data.history(context.universe, ["open","close"], 3, "1m")

    # to fetch all fields for an asset, use `xs`
    # this returns a regular DataFrame with columns as field names
    asset_o_price = px4.xs(context.universe[0])

    # to fetch a field for all assets, use subsetting
    # this returns a regular DataFrame with columns as assets
    close_prices = px4['close']

Subscribing Streaming Data

subscribe(assets, level=1, *args, **kwargs)

Subscribe to price data feed. The level parameter determines the subscription type.

Important

Typically level=1 means trade price and level=2 means full quotes subscriptions. However, this is implementation dependent: a level greater than 1 may not be supported, or may mean different things. A level of 1 is always supported (provided the underlying source supports streaming data in the first place).

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to subscribe.

  • level (int) – The level of data subscription.

unsubscribe(assets, level=1, *args, **kwargs)

Unsubscribe from price data feed. This will also remove any temporary data stored for the asset(s).

Important

See the note from subscribe method above.

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to unsubscribe.

  • level (int) – The level of data subscription.

Note

Usually you do not need to subscribe (or unsubscribe) explicitly in your strategy code. A data query (via the current or the history methods) automatically initiates a data subscription (for faster data fetch the next time).

Important

Subscribing to or unsubscribing from streaming data only works in live trading for sources/ brokers that support streaming data.

Blueshift manages data in multiple layers. Actual raw data is stored internally, through a class named DataStore. The core implementation of the DataStore class uses Apache Arrow. The DataStore class provides low-level APIs to read and write data (from disk or a streaming source such as a websocket, or from an in-memory object). A high-level class Library handles (potentially multiple) DataStore instances with defined dispatch functionalities (to route a data query to appropriate store, among many). This Library class implements the DataPortal interface above. The algo simulation queries this Library instance to fetch current data or data history.

For backtests, the efficiency of a data query depends on the type of the underlying DataStore. Two formats are usually used: one optimized for reading wide data (many assets but a single column) and another optimized for reading long data (a single asset but multiple columns). Typically, equity assets are stored in the wide format and derivatives are stored in the long format.