Compiled by Viraj Bhagat
Python, a programming language conceived in the late 1980s by Guido van Rossum, has grown enormously, especially in recent years, owing to its ease of use, extensive libraries, and elegant syntax.
How did a programming language end up with a name like ‘Python’?
Well, Guido, the creator of Python, needed a short, unique, and slightly mysterious name, and decided on “Python” while watching a comedy series called “Monty Python’s Flying Circus”.
If you are curious about the history of Python, what Python is and its applications, you can always refer to the first chapter of the Python Handbook, which serves as your guide as you start your journey in Python.
We are moving towards a world of automation, and thus there is always a demand for people with programming experience. In the world of algorithmic trading, it is necessary to learn a programming language in order to make your trading algorithms smarter as well as faster.
It is true that you can outsource the coding part of your strategy to a competent programmer but it will be cumbersome later when you have to tweak your strategy according to the changing market scenario.
In this article we cover the following:
- Choosing a Programming Language
- Why use Python for Trading?
- Popularity of Python over the years
- Benefits and Drawbacks of Python in Algorithmic Trading
- Python vs. C++ vs. R
- Applications of Python in Finance
- Coding in Python for Trading
- Installation Guide for Python
- Popular Python Libraries aka Python Packages
- Working with data in Python
- Creating a sample trading strategy and backtesting in Python
- Evaluating the sample trading strategy
- How to get started with Python in Trading
Choosing a Programming Language
Before we understand the core concepts of Python and its application in finance as well as using Python for trading, let us understand the reason we should learn Python.
Having knowledge of a popular programming language is the building block to becoming a professional algorithmic trader. With rapid advancements in technology every day, it is difficult for programmers to learn all the programming languages.
One of the most common questions that we receive at QuantInsti is
“Which programming language should I learn for algorithmic trading?”
The answer to this question is that there is no single “best” language for algorithmic trading. Many important concepts across the entire trading process are taken into consideration before choosing a programming language:
- modularity and
- various other trading strategy parameters
Each programming language has its own pros and cons, and how these balance against the requirements of the trading system will affect which programming language an individual prefers to learn.
Every organization uses a different programming language based on its business and culture.
- What kind of trading system will you use?
- Are you planning to design an execution based trading system?
- Are you in need of a high-performance backtester?
Based on the answers to all these questions, one can decide on which programming language is the best for algorithmic trading.
Why use Python for Trading?
Python has become a preferred choice for trading recently because it is open-source and its packages are free for commercial use.
Python has gained traction in the quant finance community. The availability of scientific libraries makes it easy to build intricate statistical models in Python.
Some popular Python libraries, such as NumPy, Pandas, Matplotlib, TA-Lib and Zipline, are covered later in this article.
Updates to Python trading libraries are a regular occurrence in the developer community. There are countless communities out there.
Some of the frequented Python communities are:
- Python meetups - There are about 1,637 Python user groups worldwide in almost 191 cities and 37 countries, with over 860,333 members.
- Over 1.3 million of the 2 million repositories on GitHub are for Python
- Over 1.7 million questions about Python have been answered on Stack Overflow
And we have not even considered the vast majority of local communities for Python out there via various portals, groups, platforms, forums, etc.
Quant traders require a scripting language to build a prototype of the code. In that regard,
- Python has a huge significance in the overall trading process
- Python finds applications in prototyping quant models particularly in quant trading groups in banks and hedge funds
"Python is fast enough for our site and allows us to produce maintainable features in record times, with a minimum of developers,"
Why do quant traders prefer Python for trading?
Using Python for trading helps them build their own:
- data connectors,
- execution mechanisms,
- backtesting,
- risk management and order management,
- walk forward analysis, and
- optimization testing modules.
Algorithmic trading developers are often confused about whether to choose an open-source technology or a commercial/proprietary technology. Before deciding on this it is important to consider:
- the activity of the community surrounding a particular programming language,
- the ease of maintenance,
- the ease of installation,
- documentation of the language, and
- the maintenance costs.
Part of Python's popularity as a programming language comes from its adoption by some of the giants in the domain.
Here’s a small list of some of the biggies across the globe that use Python:
"Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language."
Python is increasingly finding widespread appeal, and it is already gaining attention in the domain of trading.
Popularity of Python over the years
Python on the TIOBE Index
The TIOBE Programming Community index is an indicator of the popularity of programming languages. TIOBE ratings are calculated by counting hits on the most popular search engines; twenty-five search engines are used to calculate the index. The index started in 2001 and is updated once a month.
According to the TIOBE Index:
Python has gained 2.2 million developers over the past year
The TIOBE Index also ranks Python in the top spot for the fourth time!
Python on the PYPL Index
The PopularitY of Programming Language (PYPL) Index is created by analyzing how often language tutorials are searched on Google. The PYPL Index is updated once a month.
According to PYPL:
- Worldwide, Python is the most popular language
- Python grew the most in the last 5 years (16.6%)
- Python is the top language in five countries (US, India, Germany, United Kingdom, France)
Python on Stack Overflow
Stack Overflow stated that:
“If we look at technologies that developers report that they do not use but want to learn, Python takes the top spot for the fourth year in a row.”
Python on GitHub
GitHub’s 2020 report ranks Python #2 in its list “The languages that dominated - Top languages over the years”.
Benefits and Drawbacks of Python in Algorithmic Trading
Let us first list a few benefits of Python.
- Python's support for parallelization and its computational libraries help a trading portfolio scale.
- Python makes it easier to write and evaluate algo trading structures because it supports a functional programming approach. Python code can be easily extended to dynamic algorithms for trading.
- Python can be used to develop some great trading platforms, whereas using C or C++ is a hassle and a time-consuming job.
- Trading using Python is an ideal choice for people who want to become pioneers with dynamic algo trading platforms.
- For individuals new to algorithmic trading, Python code is easily readable and accessible.
- It is comparatively easy to add new modules to Python and extend its use in trading.
- The existing modules also make it easier for algo traders to share functionality amongst different programs by decomposing them into individual modules which can be applied to various trading architectures.
- Trading strategies written in Python require fewer lines of code due to the availability of extensive Python libraries.
- Python makes coding comparatively easier in trading. Quant traders can skip various steps which other languages like C or C++ might require.
- This also brings down the overall cost of maintaining the trading system.
- With a wide range of scientific libraries in Python, algorithmic traders can perform many kinds of data analysis at an execution speed that is often comparable to compiled languages like C++, since libraries such as NumPy run their core routines in compiled code.
Just like every coin has two faces, there are some drawbacks of using Python for trading.
In Python, every variable refers to an object, so even a simple number carries extra information such as its type, size and reference count alongside its value. When storing millions of values, this overhead can lead to high memory usage and performance bottlenecks if memory is not managed effectively.
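To make the object overhead concrete, here is a small sketch that uses only the standard library. It assumes CPython (exact byte counts vary by version and platform), and the variable names are purely illustrative.

```python
import sys
from array import array

n = 100_000
prices = [float(i) for i in range(n)]   # a list of separate float objects
packed = array("d", prices)             # contiguous C doubles, 8 bytes each

# A list only stores pointers, so each float object is counted separately.
list_bytes = sys.getsizeof(prices) + sum(sys.getsizeof(p) for p in prices)
packed_bytes = sys.getsizeof(packed)

print(f"single float object: {sys.getsizeof(1.0)} bytes")
print(f"list of floats:      {list_bytes / n:.1f} bytes per value")
print(f"array of doubles:    {packed_bytes / n:.1f} bytes per value")
```

Packed storage of this kind is also what NumPy arrays use, which is one reason they are preferred for large price series.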
However, for someone who is starting out in the field of programming, the pros of using Python for trading outweigh the drawbacks, making it a strong choice of programming language for algorithmic trading platforms.
Python vs. C++ vs. R
Python is a newer programming language than C++ and roughly a contemporary of R. However, people often prefer Python due to its ease of use. Let's understand the difference between Python and C++ first.
- A compiled language like C++ is often an ideal programming language choice if the backtesting parameter dimensions are large. However, Python makes use of high-performance libraries like Pandas or NumPy for backtesting to maintain competitiveness with its compiled equivalents.
- Between the two, Python or C++, the language to be used for backtesting and research environments will be decided on the basis of the requirements of the algorithm and the available libraries.
- Choosing C++ or Python will depend on the trading frequency. Python is ideal for 5-minute bars, but when moving down to sub-second time frames Python might not be an ideal choice.
- If speed is a distinctive factor to compete with your competition then using C++ is a better choice than using Python for Trading.
- C++ is a complicated language, unlike Python which even beginners can easily read, write and learn.
We have seen above that Python is preferred to C++ in most of the situations. But what about other programming languages, like R?
Well, the answer is that you can use either based on your requirements but as a beginner Python is preferred as it is easier to grasp and has a cleaner syntax.
Python already offers a myriad of libraries, each consisting of numerous modules that can be used directly in our program without the need to write the code for that functionality ourselves.
Trading systems evolve with time and any programming language choices will evolve along with them. If you want to enjoy the best of both worlds in algorithmic trading i.e. benefits of a general-purpose programming language and powerful tools of the scientific stack - Python would most definitely satisfy all the criteria.
According to SlashData,
- Python has gained 1.6 million developers over the past year
- Python is the fastest-growing language with more than six million developers
- 70% of developers focussed on machine learning (ML) report using Python, likely due to ML libraries like Google-developed TensorFlow, Facebook's PyTorch, and NumPy.
According to Payscale,
The average salary for someone skilled with Python could be around $92K in the United States.
Applications of Python in Finance
Python has huge applications in the field of web and software development. Python is also being extensively used nowadays due to its applications in the field of machine learning, where machines are trained to learn from the historical data and act accordingly on some new data.
Hence, Python finds its use across various domains such as:
- Medicine (to learn and predict diseases),
- Marketing (to understand and predict user behaviour) and
- now even in Trading (to analyze and build strategies based on financial data).
Today, finance professionals are enrolling for Python for trading courses to stay relevant in today’s world of finance. Gone are the days when computer programmers and Finance professionals were in separate divisions.
Python - making the headlines
- The Algorithmic Trading Market size was valued at USD 11.66 Billion in 2020 and is projected to reach USD 26.27 Billion by 2028, growing at a CAGR of 10.7% from 2021 to 2028.
- Companies are hiring computer engineers and training them in the world of finance. Trading algorithmically has become the dominant way of trading in the world.
- Already 70% of the US stock exchange order volume consists of algorithmic trading.
Thus, it makes sense for Equity traders and the like to acquaint themselves with any programming language to better their own trading strategy.
Coding in Python for Trading
After going through the advantages of using Python, let’s understand how you can actually start using it. Let's talk about the various components of Python.
- Anaconda is a distribution of Python, which means that it bundles the tools and libraries required to run our Python code.
- Downloading and installing libraries and tools individually can be a tedious task, which is why we install Anaconda.
- Anaconda includes many popular Python packages, which can be loaded directly in the IDE.
- An IDE, or Integrated Development Environment, is a software platform for Python where we can write and execute our code.
- It basically consists of a code editor to write code and a compiler or interpreter to convert our code into machine-readable language.
- It has a debugger to identify any bugs or errors in your code.
- Spyder IDE can be used to create multiple projects of Python.
- Jupyter is an open-source application for Python that allows us to create, write and run code in a more interactive format.
- It can be used to test small chunks of code, whereas we can use the Spyder IDE to implement bigger projects.
Conda is a package management system for Python which can be used to install, run and update Python libraries.
Note: Spyder IDE and Jupyter Notebook are a part of the Anaconda distribution; hence they need not be installed separately.
Installation Guide for Python
To understand how to install Python packages and resolve frequent issues which users face during installation, you can check out our article on How To Install Python Packages. It also covers pip install, the dir() function and PyPI.
Let us now begin with the installation process of Anaconda. Follow the steps below to install and set up Anaconda on your Windows system:
Visit the Anaconda website to download Anaconda. Click on the version you want to download according to your system specifications (64-bit or 32-bit).
Run the downloaded file and click “Next” and accept the agreement by clicking “I agree”.
In select installation type, choose “Just Me (Recommended)” and choose the location where you wish to save Anaconda and click on Next.
In Advanced Options, checkmark both the boxes and click on Install. Once it is installed, click “Finish”.
Now, you have successfully installed Anaconda on your system and it is ready to run. You can open the Anaconda Navigator and find other tools like Jupyter Notebook and Spyder IDE.
Once we have installed Anaconda, we will now move on to one of the most important components of the Python landscape, i.e. Python Libraries.
Note: Anaconda provides support for Linux as well as macOS. The installation details for the OS are provided on the official website in detail.
Popular Python Libraries aka Python Packages
Libraries are collections of reusable modules or functions which can be used directly in our code to perform a certain task without having to write the code for it ourselves.
As mentioned earlier, Python has a huge collection of libraries which can be used for various functionalities like computing, machine learning, visualizations, etc. However, we will talk about the most relevant Python libraries required for coding trading strategies before actually getting started with Python.
We will be required to:
- import financial data,
- perform numerical analysis,
- build trading strategies,
- plot graphs, and
- perform backtesting on data.
For all these functions, here are a few most widely used Python libraries for trading:
- NumPy, short for Numerical Python, is a Python library mostly used to perform numerical computing on arrays of data.
- An array is a data structure that holds a group of elements.
- One can perform different operations on it using the functions of NumPy.
- Pandas is a Python library built around the DataFrame, a tabular, spreadsheet-like structure where data is stored in rows and columns.
- Pandas can be used to import data from Excel and CSV files directly into the Python code.
- Pandas can also be used to perform data analysis and manipulation of the tabular data.
- Matplotlib is a Python library used to plot 2D graphs like bar charts, scatter plots, histograms etc.
- It consists of various functions to modify the graph according to our requirements too.
- TA-Lib or Technical Analysis Library is an open-source library with a Python wrapper, extensively used to perform technical analysis on financial data using technical indicators such as RSI (Relative Strength Index), Bollinger Bands, MACD etc.
- It works not only with Python but also with other programming languages such as C/C++, Java, Perl etc.
Here are some of the functions available in TA-Lib:
- BBANDS - For Bollinger Bands,
- AROONOSC - For Aroon Oscillator,
- MACD - For Moving Average Convergence/Divergence,
- RSI - For Relative Strength Index.
- Zipline is a Python library for trading applications.
- It is an event-driven system that supports both backtesting and live trading.
- Zipline is well documented, has a great community, and supports Interactive Brokers and Pandas integration.
These are but a few of the libraries which you will be using as you start using Python to perfect your trading strategy.
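As a brief illustration of the first two libraries, the sketch below builds a NumPy array and a Pandas DataFrame from a handful of made-up closing prices. It assumes numpy and pandas are installed (both ship with Anaconda); the column names are our own choice.

```python
import numpy as np
import pandas as pd

# Made-up closing prices, for illustration only.
close = np.array([100.0, 102.0, 101.0, 105.0])

# NumPy applies an operation to the whole array at once.
daily_change = np.diff(close)     # differences between neighbouring prices
mean_price = close.mean()

# A DataFrame stores the same data in labelled rows and columns.
df = pd.DataFrame(
    {"Close": close},
    index=pd.date_range("2021-01-04", periods=4, freq="D"),
)
df["Return"] = df["Close"].pct_change()   # the first row has no previous value

print(daily_change)
print(mean_price)
print(df)
```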
To learn about the many Python libraries in more detail, you can browse through this blog on Popular Python Trading platforms.
Working with data in Python
Knowing how to retrieve, format and use data is an essential part of trading using Python, as without data there is nothing you can go ahead with.
Financial data is available on various websites. This data is also called time-series data as it is indexed by time (the timescale can be monthly, weekly, daily, 5-minute, 1-minute, etc.).
Apart from that, we can also load data directly from spreadsheets saved in CSV format, which stores tabular values and can be imported into other files and code.
Now, we will learn how to import both time-series data and data from CSV files through the examples given below.
Importing data in Python
Here’s an example of how to import time series data from Yahoo Finance, with an explanation of the command in the comments:
Note: In Python, we can add comments by adding a ‘#’ symbol at the start of the line.
To fetch data from Yahoo Finance, you first need to install the yfinance package (pip install yfinance). You can then fetch data using its download method.
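A minimal sketch of such a download is shown below. The helper name fetch_daily_prices, the ticker and the dates are hypothetical choices of ours; only the download method itself comes from yfinance, and running the call requires an internet connection.

```python
def fetch_daily_prices(ticker, start, end):
    """Download daily price data for one ticker from Yahoo Finance."""
    # Imported here so the function can be defined even before
    # yfinance has been installed (pip install yfinance).
    import yfinance as yf

    # download() returns a pandas DataFrame indexed by date.
    return yf.download(ticker, start=start, end=end)


# Example call (placeholder ticker and dates, needs internet access):
# data = fetch_daily_prices("AAPL", "2020-01-01", "2021-01-01")
# print(data.tail())
```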
Now, let’s look at another example where we can import data from an existing CSV file:
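The sketch below reads CSV data with Pandas. To keep it self-contained, the "file" is an in-memory string with made-up rows; with a real file you would pass its path instead (the name prices.csv in the comment is hypothetical).

```python
import io

import pandas as pd

# Stand-in for a file on disk; the rows are made up for illustration.
csv_text = """Date,Open,High,Low,Close,Volume
2021-01-04,100.0,103.0,99.5,102.0,1200
2021-01-05,102.0,104.0,101.0,101.0,900
2021-01-06,101.0,106.0,100.5,105.0,1500
"""

# With a real file: pd.read_csv("prices.csv", index_col="Date", parse_dates=True)
data = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(data.head())
print(data["Close"].mean())
```

Setting index_col and parse_dates turns the Date column into a time index, which is what makes the table behave as time-series data.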
Creating a sample trading strategy and backtesting in Python
One of the simplest trading strategies involves moving averages. But before we dive into the coding part, we shall first discuss how the different types of moving averages are calculated.
Then we move on to one moving average trading strategy, moving average convergence divergence, or MACD for short.
Let’s start with a basic understanding of moving averages.
What are Moving Averages?
A moving average, also called a rolling average, is the mean of the specified data over a given set of consecutive periods. As new data becomes available, the mean is recomputed by dropping the oldest value and adding the latest one.
So, in essence, the mean or average is rolling along with the data, and hence the name ‘Moving Average’.
An example of calculating the simple moving average is as follows:
Let us assume a window of 10, i.e. n = 10
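With a window of 10, the rolling calculation can be sketched in plain Python as follows. The prices are made up, and the function name is our own; the deque simply drops the oldest value each time a new one arrives.

```python
from collections import deque


def simple_moving_average(prices, n=10):
    """Return the n-period average for each point once n values are available."""
    window = deque(maxlen=n)   # the oldest value drops out automatically
    averages = []
    for price in prices:
        window.append(price)
        if len(window) == n:
            averages.append(sum(window) / n)
    return averages


# Made-up closing prices: 1, 2, ..., 12
prices = [float(p) for p in range(1, 13)]
sma = simple_moving_average(prices, n=10)
print(sma)  # [5.5, 6.5, 7.5]
```

The first average covers prices 1 to 10, the second drops 1 and adds 11, and the third drops 2 and adds 12.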
In the financial markets, the prices of securities tend to fluctuate rapidly and, as a result, when we plot the price series it is very difficult to identify the trend or movement in the price.
In such cases a moving average is helpful, as it smooths out the fluctuations, enabling traders to identify the movement more easily.
Slow Moving Averages: Moving averages with longer durations are known as slow-moving averages, as they are slower to respond to a change in trend. They generate smoother curves with fewer fluctuations.
Fast Moving Averages: Moving averages with shorter durations are known as fast-moving averages and are faster to respond to a change in trend.
Consider the chart shown above. It contains:
- the closing price of IBM stock (blue line),
- the 10-day moving average (magenta line),
- the 50-day moving average (red line) and
- the 200-day moving average (green line).
It can be observed that the 200-day moving average is the smoothest and the 10-day moving average has the maximum number of fluctuations. Going further, you can see that the 10-day moving average line is a bit similar to the closing price graph.
Types of Moving Averages
The three most commonly used types of moving averages are the simple, weighted and exponential moving averages. The only noteworthy difference between them is the weights assigned to the data points in the moving average period.
Let’s understand each one in further detail:
Simple Moving Average (SMA)
A simple moving average (SMA) is the average price of a security over a specific period of time. It is the simplest type of moving average and is calculated by adding the elements and dividing by the number of time periods.
All elements in the SMA have the same weightage. If the moving average period is 10, then each element will have a 10% weightage in the SMA.
The formula for the simple moving average is given below:
SMA = Sum of data points in the moving average period / Total number of periods
Exponential Moving Average (EMA)
The logic of exponential moving average is that latest prices have more bearing on the future price than past prices. Thus, more weight is given to the current prices than to the historic prices. With the highest weight to the latest price, the weights reduce exponentially over the past prices.
This makes the exponential moving average quicker to respond to short-term price fluctuations than a simple moving average.
The formula for the exponential moving average is given below:
EMA = (Closing price - EMA(previous day)) x multiplier + EMA(previous day)
Weightage multiplier = 2 / (moving average period + 1)
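The recursion in this formula can be sketched in a few lines of Python. Seeding the series with the first price is an assumption on our part (starting from an SMA is another common choice), and the prices are made up.

```python
def exponential_moving_average(prices, period):
    """Apply the EMA formula above, seeded with the first price."""
    multiplier = 2 / (period + 1)
    ema = [prices[0]]   # seeding choice is an assumption, see the note above
    for price in prices[1:]:
        ema.append((price - ema[-1]) * multiplier + ema[-1])
    return ema


prices = [10.0, 11.0, 12.0, 13.0]   # made-up closing prices
ema3 = exponential_moving_average(prices, period=3)
print(ema3)  # [10.0, 10.5, 11.25, 12.125]
```

With a period of 3 the multiplier is 0.5, so each new EMA value moves halfway from the previous EMA towards the latest price.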
Weighted Moving Average (WMA)
The weighted moving average is the moving average resulting from the multiplication of each component with a predefined weight.
The exponential moving average is a type of weighted moving average where the elements in the moving average period are assigned an exponentially increasing weightage.
A linearly weighted moving average (LWMA), generally referred to as weighted moving average (WMA), is computed by assigning a linearly increasing weightage to the elements in the moving average period.
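A linearly weighted moving average can be sketched as below, with weight 1 on the oldest price in the window and weight n on the newest. The function name and the prices are illustrative.

```python
def weighted_moving_average(prices, n):
    """Linearly weighted MA: the newest price in each window gets weight n."""
    weights = range(1, n + 1)   # 1 for the oldest value, n for the newest
    total_weight = sum(weights)
    result = []
    for end in range(n, len(prices) + 1):
        window = prices[end - n:end]
        weighted_sum = sum(w * p for w, p in zip(weights, window))
        result.append(weighted_sum / total_weight)
    return result


prices = [10.0, 11.0, 12.0, 13.0]   # made-up closing prices
wma3 = weighted_moving_average(prices, n=3)
print(wma3)
```

For the first window the calculation is (1 x 10 + 2 x 11 + 3 x 12) / 6, so the result sits closer to the latest price than the simple average of 11 would.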
Now that we have an understanding of moving averages and their different types, let’s try to create a trading strategy using moving averages.
Moving Average Convergence Divergence (MACD)
Moving Average Convergence Divergence or MACD was developed by Gerald Appel in the late seventies. It is one of the simplest and most effective trend-following momentum indicators.
In the MACD strategy, we use two series: the MACD series, which is the 12-day EMA minus the 26-day EMA, and the signal series, which is the 9-day EMA of the MACD series.
We can trigger the trading signal using MACD series and signal series.
- When the MACD line crosses above the signal line, then it is recommended to buy the underlying security.
- When the MACD line crosses below the signal line, then a signal to sell is triggered.
Implementing the MACD strategy in Python
Import the necessary Python libraries and read the stock market data
Calculate and plot the MACD series, which is the 12-day EMA minus the 26-day EMA, and the signal series, which is the 9-day EMA of the MACD series.
Create a trading signal
When the value of the MACD series is greater than the signal series, buy; otherwise, sell.
Calculate cumulative returns
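The steps above can be sketched with Pandas as follows. This is a hedged illustration rather than the original code: it uses synthetic random prices so that it runs offline, skips the plotting step, and the column names are our own. Lagging the position by one day before applying it to returns is also our assumption, made to avoid using information from the same day's close.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices so the sketch runs offline; replace with real data.
rng = np.random.default_rng(seed=42)
dates = pd.bdate_range("2020-01-01", periods=300)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
    index=dates,
    name="Close",
)
data = close.to_frame()

# MACD series: 12-day EMA minus 26-day EMA; signal series: 9-day EMA of MACD.
ema12 = data["Close"].ewm(span=12, adjust=False).mean()
ema26 = data["Close"].ewm(span=26, adjust=False).mean()
data["macd"] = ema12 - ema26
data["signal_line"] = data["macd"].ewm(span=9, adjust=False).mean()

# Trading signal: +1 (buy) when MACD is above the signal line, else -1 (sell).
data["position"] = np.where(data["macd"] > data["signal_line"], 1, -1)

# Apply yesterday's position to today's return to avoid look-ahead bias.
data["returns"] = data["Close"].pct_change()
data["strategy_returns"] = data["returns"] * data["position"].shift(1)
data["cumulative_returns"] = (1 + data["strategy_returns"].fillna(0)).cumprod()

print(data[["Close", "macd", "signal_line", "position"]].tail())
print("Final cumulative return:", data["cumulative_returns"].iloc[-1])
```

Here ewm(span=n, adjust=False) applies the same recursion as the EMA formula given earlier, with a multiplier of 2 / (n + 1).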
Evaluating the sample trading strategy
So far, we have created a trading strategy using Python as well as backtested it on historical data.
But does this mean it is ready to be deployed in the live markets?
Well, before we make our strategy live, we should understand its effectiveness, or in simpler words, the potential profitability of the strategy.
While there are many ways to evaluate a trading strategy, we will focus on the following,
- Annualised return,
- Annualised volatility, and
- Sharpe ratio.
Let’s understand them in detail as well as try to evaluate our own strategy based on these factors:
Annualised Return or Compound Annual Growth Rate (CAGR)
To put it simply, CAGR is the rate of return of your investment which includes the compounding of your investment. Thus it can be used to compare two strategies and decide which one suits your needs.
CAGR can be easily calculated with the following formula:
CAGR = [(Final value of investment / Initial value of investment)^(1/number of years)] - 1
For example, suppose we invest 2000, which grows to 4000 in the first year but drops to 3000 in the second year. The CAGR of the investment would be as follows:
CAGR = (3000/2000)^(1/2) - 1 = 0.2247 ≈ 22%
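The same arithmetic can be checked with a short function. The function name is our own; the inputs are the figures from the example above.

```python
def cagr(initial_value, final_value, years):
    """Compound annual growth rate as a decimal (0.10 means 10%)."""
    return (final_value / initial_value) ** (1 / years) - 1


# The example above: 2000 grows to 3000 over two years.
rate = cagr(2000, 3000, 2)
print(f"CAGR: {rate:.2%}")  # CAGR: 22.47%
```

Note that the intermediate value of 4000 does not enter the calculation; only the start value, the end value and the number of years matter.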
For our strategy, we will try to calculate the daily returns first and then calculate the CAGR. The code, as well as the output, is given below:
The CAGR is 17.38%
Annualised Volatility
Before we define annualised volatility, let’s understand the meaning of volatility. A stock’s volatility is the variation in the stock price over a period of time.
For the strategy, we are using the following formula:
Annualised Volatility = square root (trading days) * square root (variance of daily returns)
The code, as well as the output, is given below:
The annualised volatility is 29.67%
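As a standalone illustration of the formula (not the code that produced the figure above), the sketch below annualises the volatility of a few made-up daily returns. Using 252 trading days per year is a common convention that we assume here.

```python
import math
import statistics


def annualised_volatility(daily_returns, trading_days=252):
    """sqrt(trading days) * sqrt(variance of daily returns)."""
    variance = statistics.variance(daily_returns)   # sample variance
    return math.sqrt(trading_days) * math.sqrt(variance)


daily_returns = [0.01, -0.02, 0.015, 0.003, -0.007]   # made-up values
vol = annualised_volatility(daily_returns)
print(f"Annualised volatility: {vol:.2%}")
```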
Sharpe Ratio
The Sharpe ratio is used by investors to understand the return earned for the risk taken, in comparison to risk-free investments such as treasury bonds.
The Sharpe ratio can be calculated in the following manner:
Sharpe ratio = [r(x) - r(f)] / σ(x)
r(x) = Annualised return of investment x
r(f) = Annualised risk-free rate
σ(x) = Standard deviation of r(x)
When comparing similar strategies or peers, a higher Sharpe ratio is preferable. The code, as well as the output, is given below:
The Sharpe ratio is 0.62
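For reference, the formula itself translates directly into Python. This is a separate illustration with made-up inputs, not the calculation behind the figure above.

```python
def sharpe_ratio(annualised_return, risk_free_rate, annualised_volatility):
    """(r(x) - r(f)) / standard deviation of r(x), all given as decimals."""
    return (annualised_return - risk_free_rate) / annualised_volatility


# Made-up inputs: 15% return, 3% risk-free rate, 20% volatility.
ratio = sharpe_ratio(0.15, 0.03, 0.20)
print(f"Sharpe ratio: {ratio:.2f}")  # Sharpe ratio: 0.60
```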
How to get started with Python in Trading
Here are some very helpful resources that will guide you about getting started with Python in the domain of Trading.
- WEBINAR RECORDING - How to use Python for Trading and Investment
- BLOGS & TUTORIALS - Python for Trading
- VIDEO PLAYLIST - Python for Trading
- FREE BOOK - Python Basics: With Illustrations From The Financial Markets
- LEARNING TRACK: Algorithmic Trading for Everyone
Python is widely used in the field of machine learning and now trading. In this article, we have covered all that would be required for getting started with Python. It is important to learn Python so that you can code your own trading strategies and test them.
Python's extensive libraries and modules smooth the process of creating machine learning algorithms without the need to write large amounts of code.
To start learning Python and code different types of trading strategies, you can select the “Algorithmic Trading For Everyone” learning track on Quantra.
In case you are interested in an instructor led online classroom format, EPAT by QuantInsti is the algorithmic trading course for you. Get in touch with a course counsellor to know more details about EPAT.
Disclaimer: All data and information provided in this article are for informational purposes only. QuantInsti® makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.