Python, a programming language conceived in the late 1980s by Guido van Rossum, has grown rapidly in recent years thanks to its ease of use, extensive libraries, and elegant syntax.
How did Python get its name?
How did a programming language end up with a name like “Python”? Guido, the creator of Python, thought he needed a name that was short, unique, and slightly mysterious, and decided on “Python” after the comedy series “Monty Python’s Flying Circus”.
Role of Python Today
Apart from its wide use in web and software development, Python is now used extensively in machine learning, where machines are trained on historical data and then act on new data. As a result, it is used across domains such as medicine (to learn and predict diseases), marketing (to understand and predict user behaviour) and now even trading (to analyze financial data and build strategies on it).
However, many people ask how to get started with Python. Some are curious, while others are anxious about taking the first step.
In this article, we will learn how to get started with Python for trading. The topics covered include:
- Why use Python for Trading?
- Installation Guide for Python
- Most used libraries in Python for trading
- How to import data from various sources into Python?
- Plotting the graphs in Python
- Backtesting in Python
- Resources and books for reference
Why use Python for Trading?
Before we look at why Python is a good choice for coding trading strategies, let us first consider the factors that matter most while trading: identifying trends in price movements, the speed of execution of a trade, and using various strategies to generate profitable trades. There is a high chance of human error when these are done manually. This is where algorithmic trading helps: it automates these processes and also lets you test your strategy on historical data, which is called backtesting. This is why Python is used: to code your trading strategies, to predict future price movements and to backtest your strategies on historical data.
Why use Python instead of R?
You may be wondering why Python should be used instead of a similar programming language such as R. You can use either language depending on your requirements, but beginners often prefer Python because it is easier to grasp and has a cleaner syntax. Python also has a large collection of libraries, each made up of modules that can be used directly in a program without having to write the code for those functions yourself.
You can refer to this blog and get a more detailed explanation as to why Python is one of the most preferred languages for Algorithmic Traders.
Components of Python
Now that you understand the advantages of using Python, let’s look at how you can actually start using it. The first step is to have Python on your system, and for that we have provided a step-by-step guide on how to install and run it. Before we move on to that, let’s go through the components we will be installing and using.
- Anaconda – Anaconda is a distribution of Python, which means that it bundles the tools and libraries needed to run our Python code. Downloading and installing libraries and tools one by one can be tedious, which is why we install Anaconda: it includes most of the commonly used Python packages, ready to be imported in the IDE.
- Spyder IDE – An IDE, or Integrated Development Environment, is a software platform where we can write and execute code. It consists of a code editor to write code, a compiler or interpreter to convert the code into a machine-readable form, and a debugger to identify bugs or errors in the code. Spyder IDE can be used to create multiple Python projects.
- Jupyter Notebook – Jupyter is an open-source application that allows us to create, write and run code in a more interactive format. It can be used to test small chunks of code, whereas Spyder IDE suits bigger projects.
- Conda – Conda is a package management system that can be used to install, run and update libraries.
Spyder IDE and Jupyter Notebook are a part of the Anaconda distribution; hence they need not be installed separately.
To learn about IBridgePy, an open-source tool used to connect to the Interactive Brokers C++ API and run Python code in live markets, check out this video.
Installation Guide for Python
Let us now begin with the installation of Anaconda. Follow the steps below to install and set up Anaconda on your Windows system:
Step 1: Go to https://www.anaconda.com/download/ to download Anaconda. Click on the version that matches your system specifications (64-bit or 32-bit).
Step 2: Run the downloaded file, click on Next and accept the agreement.
Step 3: Under Select Installation Type, choose “Just Me (Recommended)”, choose the location where you wish to save Anaconda and click on Next.
Step 4: Under Advanced Options, check both boxes and click on Install. Once it is installed, click on Finish.
Now, you have successfully installed Anaconda on your system and it is ready to run. You can open the Anaconda Navigator and find other tools like Jupyter Notebook and Spyder IDE.
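One quick way to confirm that the installation worked is to ask Python which interpreter is running and whether the libraries used later in this article can be found. The snippet below is only a minimal sketch: the helper name `check_packages` is made up for illustration and is not part of Anaconda. It relies on the standard library alone, so it runs even when a package is missing.

```python
import sys
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


# Show which Python interpreter is actually being used.
print("Python version:", sys.version.split()[0])
print("Interpreter path:", sys.executable)

# Report whether the libraries discussed in this article are available.
for name, found in check_packages(["numpy", "pandas", "matplotlib"]).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If a package is reported as missing, it can usually be added from the Anaconda Navigator or with Conda.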
Libraries in Python
Libraries are collections of reusable modules or functions that can be used directly in our code to perform a task without having to write the code for it ourselves. As mentioned earlier, Python has a huge collection of libraries for purposes such as computing, machine learning and visualization. Here, however, we will cover the libraries most relevant to coding trading strategies. We will need to import financial data, perform numerical analysis, build trading strategies, plot graphs and backtest on data. For these tasks, a few of the most widely used libraries are:
- NumPy – NumPy, or Numerical Python, is mostly used for numerical computing on arrays of data. An array is a collection of elements of the same type, and NumPy’s functions let us perform operations on all of its elements at once.
- Pandas – Pandas is mostly used through the DataFrame, a tabular, spreadsheet-like structure where data is stored in rows and columns. Pandas can import data from Excel and CSV files directly into Python code and be used to analyze and manipulate the tabular data.
- Matplotlib – Matplotlib is used to plot 2D graphs such as bar charts, scatter plots and histograms. It also provides various functions to customize a graph according to our requirements.
- Zipline – Zipline is a Python library for trading applications that was built to power the Quantopian service. It is an event-driven system that supports both backtesting and live trading.
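The roles of NumPy and pandas described above can be illustrated with a small sketch. The prices are made-up numbers chosen for easy arithmetic, not real market data, and the column names `Close` and `Return` are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for five consecutive days (made-up numbers).
closes = np.array([100.0, 102.0, 101.0, 105.0, 107.0])

# NumPy applies arithmetic to the whole array at once, with no explicit loop.
daily_change = np.diff(closes)   # difference between consecutive days
mean_close = closes.mean()       # average closing price

# pandas wraps the same values in a labelled, tabular DataFrame.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
df = pd.DataFrame({"Close": closes}, index=dates)
df["Return"] = df["Close"].pct_change()  # day-over-day fractional change

print(daily_change)
print(mean_close)
print(df.head(3))
```

The first row of the `Return` column is empty (NaN) because there is no earlier day to compare it with.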
How to import data to Python?
This is one of the most important questions to answer before getting started with Python, as without data there is nothing to work with. Financial data is available on websites such as Yahoo Finance and Google Finance. It is also called time series data because it is indexed by time (the timescale can be monthly, weekly, daily, 5-minute, 1-minute, etc.). Apart from that, we can load data from spreadsheets saved in the CSV format, which stores tabular values as plain text and can be imported into other programs and code.
Now, we will learn how to import both time series data and data from CSV files through the examples given below.
Here’s an example on how to import time series data directly from Yahoo Finance:
import pandas as pd
pd.core.common.is_list_like = pd.api.types.is_list_like
from pandas_datareader import data
df = data.get_data_yahoo('AAPL', '2017-01-01', '2018-01-01')
df.head()
Date              Open        High  ...   Adj Close    Volume
2017-01-03  115.800003  116.330002  ...  113.013916  28781900
2017-01-04  115.849998  116.510002  ...  112.887413  21118100
2017-01-05  115.919998  116.860001  ...  113.461502  22193600
2017-01-06  116.779999  118.160004  ...  114.726402  31751900
2017-01-09  117.949997  119.430000  ...  115.777237  33561900
[5 rows x 6 columns]
Now let’s understand this code.
As mentioned earlier, pandas is a library that can be used to import data, so we first load it into our code. We do this in the first line with ‘import pandas as pd’; the ‘as pd’ part means we can refer to the pandas library as pd in our code, as a short form.
The second line, ‘pd.core.common.is_list_like = pd.api.types.is_list_like’, is a compatibility workaround that some versions of pandas-datareader need in order to work.
In the third line we import the ‘data’ module from the ‘pandas_datareader’ package. Its functions return the financial data as a DataFrame, that is, in a tabular format.
Here we import the time series data of Apple, denoted as AAPL, for one year from 1st January 2017 to 1st January 2018 from Yahoo Finance, and store it in ‘df’, using the function:
data.get_data_yahoo('symbol of the stock', 'start date', 'end date')
df.head() is used to display the first ‘n’ rows of the data we imported. When no value is passed, the default of 5 rows is used.
Now, let us see another example where we can import data from an existing CSV file:
import pandas as pd
file_1 = pd.read_csv('C:/Data/filename.csv')
Here, we are importing ‘filename.csv’ and storing it in ‘file_1’ using ‘pd.read_csv()’, to which we pass the path of filename.csv.
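For time series data it often helps to tell pandas which column holds the dates, so that the resulting DataFrame is indexed by time. The sketch below assumes a CSV with columns named `Date`, `Open`, `Close` and `Volume`; both the column names and the values are made up for illustration. An in-memory string stands in for the file so that the example runs without any file on disk.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a file on disk.
csv_text = """Date,Open,Close,Volume
2020-01-02,10.0,10.5,1000
2020-01-03,10.5,10.2,1500
2020-01-06,10.2,11.0,1200
"""

# read_csv accepts a file path or any file-like object.
# parse_dates converts the 'Date' column to timestamps, and
# index_col makes it the row index, giving a time-indexed DataFrame.
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

print(prices.head())
print(type(prices.index).__name__)
```

With a real file, the `io.StringIO(csv_text)` argument would simply be replaced by the path to the CSV file.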
How to plot graphs in Python?
Line charts, bar graphs and candlestick charts are heavily used to represent trading data, so data visualization is worth exploring when getting started with Python for trading. The data we load into our code can be converted into graphical representations such as line graphs and scatter plots. Just as we used the ‘pandas’ library earlier to import data, we will use ‘matplotlib.pyplot’ to plot the data in 2D. The examples below show this in more detail.
Creating a line graph in Python:
import matplotlib.pyplot as plt
# Here 'df' is the financial data which we imported earlier from Yahoo Finance
plt.plot(df['Adj Close'])
# Give the graph a title, which in this case is 'Plotting data'
plt.title('Plotting data')
# Label the x-axis
plt.xlabel('Time')
# Label the y-axis
plt.ylabel('Price')
plt.show()
Output: This displays the line graph for the Adj Close column in the data.
Creating a scatter plot in Python:
import matplotlib.pyplot as plt
import numpy as np
n = 1000
# np.random.rand creates two NumPy arrays, x and y, each holding 1000 random values between 0 and 1
x = np.random.rand(n)
y = np.random.rand(n)
plt.scatter(x, y, marker='o')
plt.show()
Creating a Histogram in Python:
import matplotlib.pyplot as plt
# Values for the histogram
frame = [21,54,32,67,89,42,65,78,54,54,21,67,42,78,21,54,54,89,78,54,42,78,21,67]
# Create the histogram with 20 bins, histogram type 'bar' and each bar drawn at 0.6 of its bin width
plt.hist(frame, bins=20, histtype='bar', rwidth=0.6)
plt.show()
For more detailed usage of functions from pandas, matplotlib and NumPy, you can take this 'Python for Trading' course which will cover all the basics required.
Backtesting in Python
Once your strategy is in place, you need to check how well it performs before using it for trading. For this we can conduct backtesting, a process in which the strategy is tested on historical data to see how it would have performed. A few of the Python trading libraries for backtesting are as follows:
- PyAlgoTrade – This library can be used to perform event-driven backtesting, in which the strategy reacts to events (such as the arrival of a new bar of data) one at a time. The data can be collected from sources like Yahoo Finance and Google Finance or imported from CSV files. It also allows for paper trading, in which you can test your strategies in a simulated environment without risking real money.
- Pybacktest – This can be used for vectorized backtesting, where operations are performed on the entire DataFrame at once, unlike event-driven backtesting, where data is processed event by event. Pybacktest can be used along with Pandas to specify trading strategies.
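To make the difference between the two styles concrete, here is a rough sketch that uses neither PyAlgoTrade nor Pybacktest, only pandas and NumPy. It runs the same toy moving-average crossover rule twice: once with whole-column (vectorized) operations and once bar by bar (event-driven). The price series is a seeded random walk rather than real market data, and the window lengths, the long-or-flat rule and the absence of trading costs are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes (seeded random walk); not real market data.
rng = np.random.default_rng(42)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 250))))

FAST, SLOW = 5, 20  # hypothetical moving-average windows

# --- Vectorized backtest: operate on whole columns, no explicit loop ---
returns = close.pct_change().fillna(0.0)
fast_ma = close.rolling(FAST).mean()
slow_ma = close.rolling(SLOW).mean()
signal = (fast_ma > slow_ma).astype(int)   # 1 = long, 0 = flat
position = signal.shift(1).fillna(0)       # act on the next bar (no look-ahead)
vectorized_equity = (1 + position * returns).cumprod()

# --- Event-driven backtest: process one bar at a time ---
equity, held = 1.0, 0
history, event_equity = [], []
for price in close:
    if history:
        # The position chosen on the previous bar earns this bar's return.
        equity *= 1 + held * (price / history[-1] - 1)
    history.append(price)
    if len(history) >= SLOW:
        fast = sum(history[-FAST:]) / FAST
        slow = sum(history[-SLOW:]) / SLOW
        held = 1 if fast > slow else 0     # takes effect on the next bar
    event_equity.append(equity)

print(f"Vectorized final equity:   {vectorized_equity.iloc[-1]:.4f}")
print(f"Event-driven final equity: {event_equity[-1]:.4f}")
```

Both approaches should end with the same equity here because they apply the same rule. The vectorized version is shorter, while the event-driven loop is closer to how a strategy receives data in live trading.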
Books and References
- A Byte of Python
- A Beginner’s Python Tutorial
- Python Programming for the Absolute Beginner, 3rd Edition
- Python for Data Analysis, by Wes McKinney
Conclusion
Python is widely used in machine learning and now in trading. In this article, we have covered what is required to get started with Python for trading. Learning it is important so that you can code and test your own trading strategies. Its extensive libraries and modules ease the process of creating machine learning algorithms without the need to write large amounts of code.
Next Step
Now that you know how to get started with Python for trading, it’s time to dig deeper and learn. The Executive Programme in Algorithmic Trading (EPAT™) is designed for anyone who wants to start learning algorithmic trading in Python. It covers important concepts from scratch, including financial concepts like market microstructure, trading strategies, statistics in financial markets and how to implement all of these in Python. One can also understand the concepts of data analysis and modelling in Python in detail.
Disclaimer: All investments and trading in the stock market involve risk. Any decision to place trades in the financial markets, including trading in stock or options or other financial instruments, is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article are for informational purposes only.