
A Crash Course in Python

Note

If you are already familiar with the Python programming language, you can skip this section entirely!

Where do we begin

Python, with its English-like syntax and duck typing, is an easy language for beginners to pick up. Developing strategies on Blueshift® is even easier, as it requires only a small subset of the language. Let's get started. If you already have Python installed locally, that's great. Otherwise, you can use the Blueshift® code editor itself to run some of the examples below.

How does it work

Python is a programming language. This means it is a specification of a grammar that defines how to express your logic in that language, and also how to interpret logic expressed in that language by other people. The most popular implementation of Python is called CPython. In this implementation, you write down a series of statements (called source) in Python and then submit it to a program called a Python interpreter. This is usually a binary program specific to the operating system you are using.

This interpreter program, loosely speaking, carries out a series of steps to run your source code. First, it analyzes the source through a series of complex steps that include checking and understanding your code structure, through processes like building a syntax tree and carrying out semantic analysis. Once satisfied, it generates an intermediate output called bytecode. This bytecode is a translation of your source into a more compact form, so that your program can be run in a fast and efficient manner, and it is usually independent of your operating system platform. Finally, it runs the bytecode through the Python virtual machine (part of the Python program you submitted your source to, and specific to your operating system) to do whatever you intended it to do, or die trying.
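
The compile-then-run steps described above can be observed from within Python itself. The sketch below uses the standard-library functions compile, exec and the dis module on a two-line source string; the exact bytecode listing it prints differs between Python versions, so treat the output as illustrative only.

```python
import dis

source = "a = 12\nb = a + 13\n"

# Step 1: parse and compile the source text into a code object (bytecode).
code = compile(source, "<example>", "exec")

# Step 2: print the bytecode instructions (the listing varies by Python version).
dis.dis(code)

# Step 3: let the Python virtual machine execute the bytecode.
namespace = {}
exec(code, namespace)
print(namespace["b"])  # 25
```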

How Blueshift® runs your code

On Blueshift®, you write your code in the code editor. Blueshift® then extracts the main entry-point functions from your source and stores them internally. As you run the algo, Blueshift® automatically calls those main entry-point functions in a defined way. If any of your main entry-point functions also call other user defined functions or refer to user defined objects, they also get called or referred to in the process.
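
To make the idea of entry-point functions concrete, here is a toy runner written in plain Python. It is only a sketch of the general pattern and not how Blueshift® is actually implemented: the names run_strategy and ENTRY_POINTS are hypothetical, handle_data is assumed here as a second entry point purely for illustration, and a SimpleNamespace stands in for the platform's context object.

```python
from types import SimpleNamespace

# Strategy source as a user might type it in an editor (illustrative only).
strategy_source = '''
def initialize(context):
    context.universe = ["AAPL", "MSFT"]
    context.calls = 0

def handle_data(context, data):
    context.calls += count_assets(context)

def count_assets(context):   # a user-defined helper, not an entry point
    return len(context.universe)
'''

# Hypothetical entry-point names recognised by this toy runner.
ENTRY_POINTS = ("initialize", "handle_data")

def run_strategy(source, n_bars):
    """Toy runner: extract the entry points from source and call them in order."""
    namespace = {}
    exec(source, namespace)  # defines the user's functions in namespace
    entry = {name: namespace[name] for name in ENTRY_POINTS if name in namespace}

    context = SimpleNamespace()
    entry["initialize"](context)           # called once at the start
    for bar in range(n_bars):              # then once per simulated bar
        entry["handle_data"](context, {"bar": bar})
    return context

context = run_strategy(strategy_source, n_bars=3)
print(context.calls)  # 6
```

Note that count_assets is never called by the runner directly. It runs only because handle_data calls it, which mirrors the remark above about user-defined functions.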

Apart from the main entry-point functions, Blueshift® also exposes a list of useful functions (called API functions) that your code can use for different tasks, like getting past data, placing orders and checking positions.

For more details, see how Blueshift works.

Variables, types and classes

Any program usually involves taking some input, storing it in the computer's memory, manipulating it and producing an output. A variable is just a name that refers to a memory location where something is stored for manipulation. In the code snippet below, a and b are variables.

a = 12
b = a + 13
a = "alohomora"
Variables can be of different types. In Python, the type of a variable can even change while the program is running. For example, the variable a first stores the number 12 and then changes type when it stores a string (a series of characters) on the last line!
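
The built-in type function makes this change visible. The short sketch below repeats the same three assignments and prints the type of a before and after it is reassigned.

```python
a = 12
print(type(a).__name__)   # int

b = a + 13
print(b)                  # 25

a = "alohomora"           # the name a now refers to a string
print(type(a).__name__)   # str
print(b)                  # 25 (b was computed before a was reassigned)
```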

Apart from basic built-in types like numbers and strings, Python has many other types that are useful in different scenarios. In addition, it allows you to define a custom type, also called a class. For more details, look here. But for our purpose, below is an example of a class. First we define a type called MyType. Then we create an object of that type.

class MyType():
    pass

x = MyType()
A special thing about objects of such custom types (like x) is that we can add another variable to them as an attribute. For example, below we store a number as an attribute (named some_var) of the object x we just created.

x.some_var = 12
print(x.some_var)
This is the same trick we used to store our stock symbol in the quickstart code example!

context.stock = symbol('AAPL')
Here context is a special object supplied by the Blueshift® platform. We store the variable stock as an attribute of context, so that we can refer to it later from other places.
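
If you want to try this pattern outside the platform, you can imitate it with stand-ins. In the sketch below, the SimpleNamespace object and the symbol function are assumptions made only so that the code runs locally. On Blueshift® both are supplied by the platform, and the report function is a hypothetical helper.

```python
from types import SimpleNamespace

# Stand-in for the platform-supplied context object (assumption for local runs).
context = SimpleNamespace()

def symbol(ticker):
    """Hypothetical stand-in for the platform's symbol function; returns the ticker."""
    return ticker

def initialize(context):
    context.stock = symbol("AAPL")      # store the asset for later use

def report(context):
    return f"Trading {context.stock}"   # read it back from another function

initialize(context)
print(report(context))                                        # Trading AAPL
print(hasattr(context, "stock"), hasattr(context, "other"))   # True False
```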

Python data structures

Apart from variables, Python also provides several very useful ways to structure their storage for various purposes. The most useful ones are list, dictionary and set.

A list is a way of storing references to a bunch of objects (built-ins like numbers or strings, or custom class instances) so that we can refer to them in a defined order (first item, second item etc.). This is useful in strategy development for defining a collection of similar items, like our trading universe.

context.universe = [symbol('AAPL'), symbol('MSFT'), symbol('AMZN')]
print(context.universe[1])
Here we define a list of stocks to trade as a list named universe and store it as an attribute of the context object.
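
A few common list operations are sketched below. Plain strings are used as stand-ins for the objects returned by symbol, so the snippet runs in any Python interpreter. Note that indexing starts at 0, so index 1 refers to the second item.

```python
universe = ["AAPL", "MSFT", "AMZN"]  # strings stand in for symbol(...) objects

print(universe[0])     # AAPL (indexing starts at 0)
print(universe[1])     # MSFT (the second item)
print(universe[-1])    # AMZN (negative indices count from the end)
print(len(universe))   # 3

universe.append("GOOG")       # add an item at the end
print(universe[1:3])          # ['MSFT', 'AMZN'] (slice: items at index 1 and 2)
print("MSFT" in universe)     # True
```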

Lists are useful for accessing a bunch of objects in a given order. But if we want to access them at random, using a key, we need a mapping structure. This is called a dict (dictionary) in Python.

context.universe = [symbol('AAPL'), symbol('MSFT'), symbol('AMZN')]
context.weights = {context.universe[0]:0.25, context.universe[1]:0.25, context.universe[2]:0.5 }
second_asset = context.universe[1]
print(context.weights[second_asset])
Here we store the universe of stocks as a list, and assign a portfolio weight to each one of them in a dictionary named weights (also stored as an attribute of context). This allows us to store the weight linked to a specific asset, and access it directly using the asset as the key.
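
The sketch below shows a few more things you can do with a dictionary, and briefly illustrates a set, the third structure mentioned earlier. Ticker strings again stand in for asset objects, and the traded collection is a made-up example.

```python
universe = ["AAPL", "MSFT", "AMZN"]  # strings stand in for asset objects
weights = {"AAPL": 0.25, "MSFT": 0.25, "AMZN": 0.5}

# Loop over key-value pairs.
for asset, weight in weights.items():
    print(asset, weight)

# get returns a default instead of raising an error for a missing key.
print(weights.get("GOOG", 0.0))   # 0.0

total = sum(weights.values())
print(total)                      # 1.0

# A set keeps only unique items and supports operations like difference.
traded = set(["AAPL", "MSFT", "AAPL"])
print(len(traded))                       # 2
print(sorted(set(universe) - traded))    # ['AMZN']
```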

Conditions and loops

One of the key features of any programming language is its control structures, which allow us to define the manipulations we want to carry out on variables. If the operation we want to perform on a variable depends on one or more conditions, we use a conditional structure. In Python, this is the if..elif..else structure. If we want to carry out some operations repeatedly over a collection of items (or a given number of times), we can use a loop. Here the for loop is the most useful. The example below sets the portfolio weights of our universe based on the value of a signal generated for each asset, looping through the assets in our universe.

signal = 1 # signal set through some other function not shown here
for asset in context.universe:
    if signal > 0.5:
        context.weights[asset] = 1
    elif signal < -0.5:
        context.weights[asset] = -1
    else:
        context.weights[asset] = 0
Here the part of the first line following the # symbol is treated as comment and is ignored by the Python interpreter. For more details on control statements see here.
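
In the snippet above the signal is a single fixed value, so every asset ends up with the same weight. A variation in which each asset has its own signal might look like the sketch below. The signal values are made up for illustration, and ticker strings stand in for asset objects.

```python
universe = ["AAPL", "MSFT", "AMZN"]
signals = {"AAPL": 0.8, "MSFT": -0.7, "AMZN": 0.1}  # made-up signal values

weights = {}
for asset in universe:
    signal = signals[asset]       # look up this asset's own signal
    if signal > 0.5:
        weights[asset] = 1        # strong positive signal: go long
    elif signal < -0.5:
        weights[asset] = -1       # strong negative signal: go short
    else:
        weights[asset] = 0        # weak signal: stay out

print(weights)  # {'AAPL': 1, 'MSFT': -1, 'AMZN': 0}
```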

Defining functions

One of the most important features of any programming language is the function. A function in this context can be loosely defined as a set of instructions that act as a single entity. A function, once defined, can be called from multiple places in your source. Usually a function takes in some variables as input (arguments) and, after its computation, returns a value (the return value). A function in Python has its own namespace, meaning that variables you define within a function are, in general, not available outside it.
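
A minimal sketch of these ideas follows. The function target_weight is a hypothetical example, not part of any platform API. It takes arguments, returns a value, and uses a local variable (direction) that cannot be seen from outside the function.

```python
def target_weight(signal, threshold=0.5):
    """Map a signal to a weight of 1, -1 or 0."""
    direction = 0                  # local variable, lives inside the function
    if signal > threshold:
        direction = 1
    elif signal < -threshold:
        direction = -1
    return direction               # the return value

print(target_weight(0.8))                   # 1
print(target_weight(-0.6))                  # -1
print(target_weight(0.6, threshold=0.7))    # 0

# The local name is not available outside the function.
try:
    print(direction)
except NameError as err:
    print("not visible outside:", err)
```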

On Blueshift®, the main entry-point functions are, obviously, functions. For example, we define the starting function, called initialize as:

def initialize(context):
    context.universe = [symbol('AAPL'), symbol('MSFT'), symbol('AMZN')]
Note that here the function takes in the special variable context as an argument, defines a list of stocks as our trading universe, and sets it as an attribute of the context variable. You can also define your own function, and call it from another, as below:

def initialize(context):
    context.universe = [symbol('AAPL'), symbol('MSFT'), symbol('AMZN')]
    status = my_func(context)

def my_func(context):
    for stock in context.universe:
        # do something here
        pass

    return True
Here we define a function called my_func which takes in a single argument and returns a boolean value. This function is called in the initialize function. For more on functions, see here.

More on classes

On Blueshift®, you are mostly fine even if you are not very familiar with classes in Python, as long as you are comfortable with variables, basic data structures, control structures and functions. But sometimes it is very useful to organize our code in custom classes. As discussed above, classes in Python define custom types. We have already seen how to define a custom class. We can also inherit while defining a class, i.e. start from an already defined class (or type) and add our own customizations (attributes or methods). One use of custom classes on Blueshift® is defining custom filters and factors for pipelines. An example is shown below:

import numpy as np

from blueshift.pipeline import Pipeline, CustomFilter, CustomFactor
from blueshift.pipeline.data import EquityPricing

def average_volume_filter(lookback, amount):
    class AvgDailyDollarVolumeTraded(CustomFilter):
        # data columns passed to compute, in this order
        inputs = [EquityPricing.close, EquityPricing.volume]

        def compute(self, today, assets, out, close_price, volume):
            # average of price * volume along axis 0 (the lookback window)
            dollar_volume = np.mean(close_price * volume, axis=0)
            # True for assets whose average dollar volume exceeds amount
            high_volume = dollar_volume > amount
            out[:] = high_volume

    return AvgDailyDollarVolumeTraded(window_length=lookback)
Here we import the base class CustomFilter and create our own custom class called AvgDailyDollarVolumeTraded, inheriting from CustomFilter. Within the class definition, we define an attribute inputs (which you can easily recognize to be a list). We also define a method (a function that is part of a class definition) called compute. The base class CustomFilter already has a method called compute. By redefining it here, we override the original one. In this method we define our logic for computing the average daily dollar volume of each stock (supplied as assets), as the average of price (supplied as close_price) times traded volume (volume). Also notice that we define this whole class within an outer function called average_volume_filter that takes in two parameters: the lookback period over which to compute the average, and an amount to filter on.
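
The same inherit-and-override pattern can be tried in plain Python without the platform. In the sketch below, BaseFilter, AboveAmount and threshold_filter are hypothetical names invented for illustration. They only mimic the shape of the example above and are not part of the Blueshift® API.

```python
class BaseFilter:
    """Hypothetical base class; stands in for a library-provided one."""

    def __init__(self, window_length):
        self.window_length = window_length

    def compute(self, values):
        # Default behaviour: accept every value.
        return [True for _ in values]


def threshold_filter(lookback, amount):
    class AboveAmount(BaseFilter):
        def compute(self, values):
            # Overrides BaseFilter.compute; amount comes from the outer function.
            return [v > amount for v in values]

    return AboveAmount(window_length=lookback)


base = BaseFilter(window_length=5)
custom = threshold_filter(lookback=5, amount=100)
values = [50, 150, 100]

print(base.compute(values))             # [True, True, True]
print(custom.compute(values))           # [False, True, False]
print(isinstance(custom, BaseFilter))   # True
print(custom.window_length)             # 5
```

Defining the class inside a function lets each call produce a filter with its own amount, without having to pass that value through the base class constructor.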

For more discussion on classes, see here.

Other useful resources

The material presented on this page is a succinct description of various features of the Python language. It may help you get started, but it hardly does justice to the language's vastness and flexibility. There are many free and paid resources for learning Python online. A few of them are listed below. We strongly recommend visiting these links to get familiar with the language.