Jupyter Notebook on Blueshift
Blueshift includes a Jupyter Notebook interface for open-ended research.
The notebook interface supports exploratory analysis for forming trade ideas
before moving on to testing (backtesting or paper trading) and deployment on
the platform.
Code in a Blueshift notebook works differently from regular strategy code.
The major difference is that there is no running algo and no associated algo
event loop, so none of the event handler functions are available. For the same
reason, none of the algo-specific APIs are available either, including context,
data and anything imported from blueshift.api. Instead, we rely on a set of
functions imported from the blueshift.research module to query the built-in
datasets, and build our exploratory models using the Python packages available
on the platform.
Notebook Workflow
The Blueshift research workflow using the Jupyter notebook usually starts with selecting a dataset on which to run further analysis.
from blueshift.research import list_datasets, use_dataset
# list the available datasets
list_datasets()
# select dataset. You can change it at any point in time
use_dataset('nse')
Once a dataset is selected, we can use the other API functions from the research module.
from blueshift.research import use_dataset, symbol, history
# select dataset. You can change it at any point in time
use_dataset('nse')
# get a reference to the asset object
asset = symbol('ACC')
prices = history(asset, 'close', 20, '1m')
prices.plot()
Important
The blueshift.api functions are not usable in the Blueshift notebook environment. Similarly, strategy code should not import from blueshift.research. In notebooks, additional Python packages for plotting (matplotlib and plotly) are also available.
Research (Notebook) APIs
The following are the available API functions in the Blueshift notebook environment:
- blueshift.research.list_datasets()
- List available dataset names.
- blueshift.research.use_dataset(name)
- Set the current dataset by name.
- blueshift.research.symbol(sym, dt=None, *args, **kwargs)
- Get the asset for a given instrument symbol.
- blueshift.research.sid(sec_id)
- Get the asset for a given security ID from the pipeline store.
- blueshift.research.current(assets, columns='close', dt=None, last_known=True)
- Return the last available price. If either assets or columns is a list, a series is returned, indexed by assets or fields, respectively. If both are lists, a dataframe is returned. Otherwise, a scalar is returned. Only OHLCV column names are supported in general. However, for futures and options, open_interest, implied_vol and greeks are supported as well.
- Parameters:
- assets (asset object or a list of assets) – An asset or a list of assets for which to fetch data.
- columns (str or a list) – A field name or a list of OHLCV columns.
- dt (pd.Timestamp or string that can be converted to Timestamp) – The timestamp at which to fetch the data.
- last_known (bool) – If the data is missing, return the last known good value (instead of NaN).
 
- Returns:
- current price of the asset(s).
- Return type:
- float (int in case of volume), pandas.Series or pandas.DataFrame.
 
- blueshift.research.history(assets, columns, nbars, frequency, dt=None, adjusted=True, **kwargs)
- Returns the given number of bars for the assets. If more than one asset or more than one column is supplied, returns a dataframe with assets or fields as column names. If both assets and columns are multiple, returns a multi-index dataframe with columns as the column names and asset as the second index level. For a single asset and a single field, returns a series. Only OHLCV column names are supported. However, for futures and options, open_interest, implied_vol and greeks are also supported.
- Parameters:
- assets (asset object or a list of assets) – An asset or a list of assets for which to fetch data.
- columns (str or a list) – A field name or a list of OHLCV columns.
- nbars (int) – Number of bars to fetch.
- frequency (str) – Frequency of bars (either '1m' or '1d').
- dt (pd.Timestamp or string that can be converted to Timestamp) – The timestamp at which to fetch the data.
- adjusted (bool) – Whether to apply adjustments.
 
- Returns:
- historical bars for the asset(s).
- Return type:
- pandas.Series or pandas.DataFrame.
# this assumes we have already selected the dataset
from blueshift.research import symbol, current, history
# fetch an asset by symbol "ABC"
asset = symbol('ABC')
# fetch historical data as of a given date
df = history(asset, ['close','high'], 10, '1m', dt="2023-05-05 14:30:00")
df.close.plot()
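The container returned by current() and history() depends on whether assets and columns are single items or lists, which is easy to get wrong when writing notebook code. The sketch below only encodes the rules stated in the two descriptions above and does not call Blueshift. It assumes that any list counts as "multiple" regardless of its length, which the descriptions do not spell out. expected_return_kind is a hypothetical helper, not part of blueshift.research, and plain strings stand in for asset objects.

```python
def expected_return_kind(func_name, assets, columns):
    """Name the container that current() or history() is documented to return.

    Assumption: a list argument counts as "multiple" even if it holds one item.
    """
    # Count how many of the two arguments were passed as lists (0, 1 or 2).
    n_lists = int(isinstance(assets, list)) + int(isinstance(columns, list))
    kinds = {
        # current(): scalar -> series -> dataframe
        "current": ("scalar", "pandas.Series", "pandas.DataFrame"),
        # history(): series -> dataframe -> multi-index dataframe
        "history": ("pandas.Series", "pandas.DataFrame",
                    "multi-index pandas.DataFrame"),
    }
    if func_name not in kinds:
        raise ValueError(f"unknown function: {func_name!r}")
    return kinds[func_name][n_lists]


# Plain strings stand in for asset objects here.
print(expected_return_kind("current", "ACC", "close"))
print(expected_return_kind("current", ["ACC", "ABC"], "close"))
print(expected_return_kind("history", "ACC", ["close", "high"]))
print(expected_return_kind("history", ["ACC", "ABC"], ["close", "high"]))
```

In practice this matters most when switching from one asset to a list of assets: the same call then returns a dataframe (or a multi-index dataframe) and attribute access such as df.close may need to change accordingly.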
- blueshift.research.fundamentals(assets, metrics, nbars, frequency, dt=None)
- Returns the given number of bars for the fundamental metrics. This always returns a dict with asset(s) as the key(s), or an empty dict if no data is available.
- Note
- Available only if a fundamental data source is supported. If the source data of the store supports point-in-time, the data returned is point-in-time as well.
- Parameters:
- assets (asset object or a list of assets) – An asset or a list of assets for which to fetch data.
- metrics (str or a list) – A field name or a list of fundamental metrics.
- nbars (int) – Number of records to fetch.
- frequency (str) – Frequency of bars (either 'Q' or 'A').
- dt (pd.Timestamp or string that can be converted to Timestamp) – The timestamp at which to fetch the data.
 
- Returns:
- historical fundamental metrics for the asset(s).
- Return type:
- dict.
# this assumes we have already selected the dataset
from blueshift.research import symbol, fundamentals
from blueshift.protocol import FundamentalColumns
# see available columns
print(FundamentalColumns.Quarterly)
print(FundamentalColumns.Annual)
print(FundamentalColumns.Ratios)
# fetch an asset by symbol "ABC"
asset = symbol('ABC')
# fetch historical data for profit before tax and profit after
# tax as of a given date
metrics = [FundamentalColumns.Quarterly.pbt, FundamentalColumns.Quarterly.pat]
df = fundamentals(asset, metrics, 10, 'Q', dt="2023-05-05 14:30:00")
df[asset].pbt.plot()
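Because fundamentals() returns an empty dict when no data is available, indexing the result directly with an asset can raise a KeyError. A minimal sketch of a guard, using plain dicts as stand-ins for the real return value: metrics_for is a hypothetical helper, and the case where one requested asset is missing from an otherwise non-empty result is an assumption, since the description only mentions the fully empty case.

```python
def metrics_for(result, asset):
    """Return the entry for `asset`, or None when no data came back.

    `result` stands for the dict returned by fundamentals(), which is
    documented as keyed by asset and empty when no data is available.
    """
    if not result:
        # Empty dict: no data available for any requested asset.
        return None
    # Assumption: an asset without data may simply be absent from the dict.
    return result.get(asset)


# Stand-in values; a real call would map each asset to its metrics data.
with_data = {"ABC": "metrics for ABC"}
no_data = {}

print(metrics_for(with_data, "ABC"))
print(metrics_for(with_data, "XYZ"))
print(metrics_for(no_data, "ABC"))
```

Checking the returned value for None before plotting keeps a notebook cell from failing partway through when a symbol has no fundamental data.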
- blueshift.research.list_sectors(frequency)
- List the sectors for a given fundamental data source. Returns a list of strings (sector names).
- Note
- Available only if a fundamental data source is supported.
- Parameters:
- frequency (str) – Frequency of data (either 'Q' or 'A').
- Returns:
- available sectors.
- Return type:
- list.
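The frequency argument takes different values depending on the function: '1m' or '1d' for history(), and 'Q' or 'A' for fundamentals() and list_sectors(). A small sketch of an early check that can run before a call is made, using only the values listed in the parameter descriptions on this page. check_frequency and ALLOWED_FREQUENCIES are hypothetical names, and how Blueshift itself reacts to an unsupported frequency is not stated here.

```python
# Frequencies as listed in the parameter descriptions on this page.
ALLOWED_FREQUENCIES = {
    "history": {"1m", "1d"},
    "fundamentals": {"Q", "A"},
    "list_sectors": {"Q", "A"},
}


def check_frequency(func_name, frequency):
    """Return `frequency` unchanged, or raise ValueError if it is not listed."""
    allowed = ALLOWED_FREQUENCIES[func_name]
    if frequency not in allowed:
        raise ValueError(
            f"{func_name}() expects one of {sorted(allowed)}, got {frequency!r}"
        )
    return frequency


print(check_frequency("history", "1m"))
try:
    # A price-bar frequency passed to a fundamentals call is rejected.
    check_frequency("fundamentals", "1d")
except ValueError as err:
    print(err)
```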
 
- blueshift.research.get_sectors(assets, dt=None)
- Get the sectors for the given assets where available. Returns a pandas dataframe with assets as the index and sector names as a column named 'sector'.
- Note
- Available only if a fundamental data source is supported.
- blueshift.research.run_pipeline(pipeline, start_date, end_date)
- Run a pipeline between given dates. This will return a pandas multi-index dataframe with timestamp as the first index, the selected assets as the second index and the pipeline factor(s) as the column(s).
- Important
- This function works only in the research environment; using it in a strategy will throw an error.
# import Pipeline constructor and built-in factors/filters
# this assumes we have already selected the dataset
from blueshift.research import run_pipeline
from blueshift.pipeline import Pipeline
from blueshift.library.pipelines import select_universe, period_returns
# create the pipeline
def create_pipeline():
    pipe = Pipeline()
    liquidity = select_universe(200, 200) # a built-in liquidity screener
    mom = period_returns(20) # return over period factor
    mom_filter = mom > 0 # positive momentum
    pipe.add(mom,'momentum')
    pipe.set_screen(liquidity & mom_filter)
    return pipe
# run the pipeline
pipe = create_pipeline()
results = run_pipeline(pipe, '2022-05-04', '2023-05-05')
# returns multi-index df with dates as the first level and
# filtered assets as the second level indices, with factors
# added using pipe.add() as columns
print(results.describe())
- blueshift.research.get_data_portal()
- Get the current data portal object for the selected dataset.