Jupyter Notebook on Blueshift

Blueshift includes a Jupyter Notebook interface for open-ended research. It enables exploratory analysis to form trade ideas before moving on to testing (backtesting or paper trading) and deployment on the platform.

Blueshift works differently in a Jupyter Notebook than in regular strategy code. The major difference is that there is no running algo and no associated algo event loop. This means none of the event handler functions are available. Also, since there is no running algo, none of the algo-specific APIs - including context, data or anything imported from blueshift.api - are available either. Instead, we rely on a set of functions imported from the blueshift.research module to query the built-in datasets and build our exploratory models using the Python packages available on the platform.

Notebook Workflow

The Blueshift research workflow using the Jupyter notebook usually starts with selecting a dataset on which to run further analysis.

from blueshift.research import list_datasets, use_dataset

# list the available datasets
list_datasets()

# select dataset. You can change it at any point in time
use_dataset('nse')

Once a dataset is selected, we can use the other API functions from the research module.

from blueshift.research import use_dataset, symbol, history

# select dataset. You can change it at any point in time
use_dataset('nse')

# get a reference to the asset object
asset = symbol('ACC')
prices = history(asset, 'close', 20, '1m')
prices.plot()

Research (Notebook) APIs

The following API functions are available in the Blueshift notebook environment.

blueshift.research.list_datasets()

List available dataset names.

blueshift.research.use_dataset(name)

Set the current dataset by name.

blueshift.research.symbol(sym, dt=None, *args, **kwargs)

Get the asset for a given instrument symbol.
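
For instance, with a dataset already selected, we can look up an asset as below. The use of dt for a point-in-time lookup is our reading of the signature and may depend on the dataset:

from blueshift.research import symbol

# get a reference to the asset object
asset = symbol('ACC')

# the optional dt argument presumably resolves the symbol
# as of a given date (our assumption from the signature)
asset_then = symbol('ACC', dt='2022-01-03')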

blueshift.research.sid(sec_id)

Get the asset for a given security ID from the pipeline store.
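
A minimal sketch of looking up an asset by security ID. The ID below is a placeholder; real IDs come from the pipeline store (e.g. from run_pipeline results):

from blueshift.research import sid

# 12345 is a placeholder security ID, not a real one
asset = sid(12345)
print(asset)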

blueshift.research.current(assets, columns='close', dt=None, last_known=True)

Returns the last available price. If either assets or columns is a list, a series is returned, indexed by assets or fields, respectively. If both are lists, a dataframe is returned. Otherwise, a scalar is returned. Only OHLCV column names are supported in general. However, for futures and options, open_interest, implied_vol and greeks are supported as well.

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to fetch data.

  • columns (str or a list.) – A field name or a list of OHLCV columns.

  • dt (pd.Timestamp or string that can be converted to Timestamp.) – The timestamp at which to fetch the data.

  • last_known (bool) – If True, return the last known good value when data is missing (instead of NaN).

Returns:

current price of the asset(s).

Return type:

float (int in case of volume), pandas.Series or pandas.DataFrame.
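
For example, the return type follows the shape of the inputs (tickers below are illustrative and assume a dataset is already selected):

from blueshift.research import symbol, current

asset1 = symbol('ACC')
asset2 = symbol('ABC')  # illustrative second ticker

# single asset and single column: returns a scalar
last_close = current(asset1, 'close')

# a list of assets and a single column: returns a series
# indexed by the assets
closes = current([asset1, asset2], 'close')

# lists for both: returns a dataframe
ohlc = current([asset1, asset2], ['open', 'close'])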

blueshift.research.history(assets, columns, nbars, frequency, dt=None, adjusted=True)

Returns the given number of bars for the assets. If more than one asset or more than one column is supplied, returns a dataframe, with assets or fields as the column names. If both assets and columns are multiple, returns a multi-index dataframe with columns as the column names and assets as the second index level. For a single asset and a single field, returns a series. Only OHLCV column names are supported. However, for futures and options, open_interest, implied_vol and greeks are also supported.

Parameters:
  • assets (asset object or a list of assets.) – An asset or a list for which to fetch data.

  • columns (str or a list.) – A field name or a list of OHLCV columns.

  • nbars (int) – Number of bars to fetch.

  • frequency (str) – Frequency of bars (either ‘1m’ or ‘1d’).

  • dt (pd.Timestamp or string that can be converted to Timestamp.) – The timestamp at which to fetch the data.

  • adjusted (bool) – Whether to apply adjustments.

Returns:

historical bars for the asset(s).

Return type:

pandas.Series or pandas.DataFrame.

# this assumes we have already selected the dataset
from blueshift.research import symbol, current, history

# fetch an asset by symbol "ABC"
asset = symbol('ABC')

# fetch historical data as of a given date
df = history(asset, ['close','high'], 10, '1m', dt="2023-05-05 14:30:00")
df.close.plot()

blueshift.research.run_pipeline(pipeline, start_date, end_date)

Run a pipeline between given dates. This will return a pandas multi-index dataframe with timestamp as the first index, the selected assets the second index and the pipeline factor(s) as the column(s).

Important

This function works only in the research environment; using it in a strategy will throw an error.

# import Pipeline constructor and built-in factors/filters
# this assumes we have already selected the dataset

from blueshift.research import run_pipeline
from blueshift.pipeline import Pipeline
from blueshift.library.pipelines import select_universe, period_returns

# create the pipeline
def create_pipeline():
    pipe = Pipeline()
    liquidity = select_universe(200, 200) # a built-in liquidity screener
    mom = period_returns(20) # return-over-period factor
    mom_screener = mom > 0 # positive momentum

    pipe.add(mom, 'momentum')
    pipe.set_screen(liquidity & mom_screener)

    return pipe

# run the pipeline
pipe = create_pipeline()
results = run_pipeline(pipe, '2022-05-04', '2023-05-05')

# returns multi-index df with dates as the first level and 
# filtered assets as the second level indices, with factors 
# added using pipe.add() as columns
print(results.describe())
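
Because the result is a multi-index dataframe, the screened universe for any single day can be extracted with ordinary pandas indexing. A sketch, assuming the date below is a trading day within the run:

# assets passing the screen on a given day, with the
# momentum factor values added via pipe.add()
day = results.loc['2023-05-04']
print(day.index.tolist()[:10])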

blueshift.research.get_data_portal()

Get the current data portal object for the selected dataset.
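
The portal object's own interface is not documented here; a minimal sketch simply fetches it for inspection:

from blueshift.research import use_dataset, get_data_portal

# select a dataset first, then fetch the portal backing
# the current/history calls above
use_dataset('nse')
portal = get_data_portal()
print(type(portal))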

Example Notebooks