By Sunil Guglani (Assistant Vice President at NCDEX)
Are you new to algo trading? Confused about how and where to start?
There are plenty of courses and content on the web covering backtesting, technical indicators and Python for finance.
There is also a good amount of material on the benefits of algo trading over manual trading. But if you want to develop your first algo, there is limited information on the stages you have to go through.
Note: These stages are based purely on my experience; they may or may not differ from standard procedures (which I could not find ☺).
I will try to explain them using a real trading strategy. Here are the stages I will cover:
- Formulate the Trading Concept/Logic
- Filtering criteria to choose the scripts
- Verification of Logic (at High Level)
- Backtesting
- Optimization of Parameters
- Paper Trading or Forward Testing or Simulation Trading in the real environment
- Deployment in the real environment
Formulate the Trading Concept/Logic
First things first: algo trading is not rocket science. Yes, it really is not. It is just a puzzle that you need to solve more effectively and efficiently than a few other people. And we know the puzzle: buy low and sell high.
You do not need to follow any existing theory, indicator or strategy, and you don't need to be a market expert. You can make your own if you are comfortable. (My first algo was an arbitrage across exchanges. I was an amateur and didn't even know the meaning of arbitrage at that time. ☺)
You need to answer the following questions in this stage:
- What will be my logic to achieve my goal (Buy at Low and Sell at High)?
- What will be the frequency of my trades?
- Which segment will it work most effectively in (Equity/Commodity/FX/Crypto)?
- How long should the backtesting period be?
- Which automation tools/languages will be most useful for this logic?
Example: I think FMCG (fast-moving consumer goods) stocks are generally less volatile and tend to move within the same price range. This makes me think that they are range bound and therefore mean reverting, so I will try to apply a mean reversion technique in my strategy. Also, intraday prices are highly volatile, so I will use daily closing prices for calculation and trading.
Considering the above idea, my answers to these questions are:
What will be my logic to achieve my goal (Buy at Low and Sell at High)?
I will use a mean reversion trading strategy based on Bollinger Bands. Buy: if the close price goes below the mean minus 2 standard deviations. Square off the position: if the close price goes above the mean plus 2 standard deviations.
What will be the frequency of my trades?
Daily.
Which segment will it work most effectively (Equity, Commodity, FX or Crypto)?
I will use the equity cash segment.
How long should the backtesting period be?
3 years (the longer, the better).
Which automation tools/languages will be most useful for this logic?
I am comfortable with Python. We should use the tools we are most comfortable with.
Filtering criteria to choose the scripts
Great. You have achieved a big milestone. You have penned down the core logic and done the high-level planning for your algo.
If you observe, your logic will work best under specific conditions and on specific scripts. Remember that you have already decided the segment in the first stage.
The selection/filtration of scripts can be done before or during trading hours (in real time). It all depends on your core logic.
Example:
As we have decided to use only FMCG scripts with high liquidity, the filtering criteria are as follows: a. Industry (FMCG). b. Volume > 100000
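The two filtering criteria above can be expressed as a small function. This is only a minimal sketch: the `Instrument` record, its field names, the `filter_universe` function and the sample values are hypothetical placeholders (not market data); only the two criteria themselves come from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instrument:
    """Hypothetical record describing one script in the universe."""
    symbol: str
    industry: str
    avg_daily_volume: int


def filter_universe(instruments, industry="FMCG", min_volume=100_000):
    """Return the symbols that belong to `industry` and trade more than `min_volume`."""
    return [
        inst.symbol
        for inst in instruments
        if inst.industry == industry and inst.avg_daily_volume > min_volume
    ]


# Made-up universe, for illustration only
universe = [
    Instrument("AAA", "FMCG", 250_000),
    Instrument("BBB", "FMCG", 40_000),
    Instrument("CCC", "IT", 900_000),
]

selected = filter_universe(universe)
print(selected)
```

If the filter has to run during trading hours, the same function can be called again whenever fresh volume figures arrive.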
Verification of Logic (at High Level)
Algo development is an expensive process; it requires a lot of effort and time. It is better to verify your core logic at a high level first, using a visualization tool or Excel.
This will save a lot of time later. You can use Tableau, Power BI or just Excel to write and verify your logic. You can also use charting tools like TradingView, Investing.com or something similar.
The chart is a 5-year graph of ITC Ltd at daily frequency, with a 20-day lookback SMA and bands at -2 and +2 standard deviations applied (courtesy: tradingview.com). The visual gives us confirmation of our logic.
Backtesting
Backtesting is a way of verifying your logic using historical price information. This stage helps you understand how well the logic would have worked if you had used it in the past, and so gives an indication of its performance. It also gives you the opportunity to optimise the logic and its parameters.
One should consider the following parameters in this stage:
Various performance parameters
- Returns
- Max drawdown
- Max continuous drawdown
- Max profit
- Max continuous profit
- Number of transactions
- Average returns per transaction
- Transaction charges
- Slippages, etc.
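Some of these performance parameters are easy to get subtly wrong, so here is a minimal sketch of two of them. It rests on assumptions that are mine, not the article's: per-trade returns are given in percent and compounded into an equity curve, "max drawdown" is taken as the largest peak-to-trough fall of that curve, and "max continuous drawdown" is read as the longest run of losing trades. The function names and the sample numbers are made up.

```python
def max_drawdown_pct(trade_returns_pct):
    """Largest peak-to-trough fall (in %) of the compounded equity curve."""
    equity = peak = 1.0
    worst = 0.0
    for r in trade_returns_pct:
        equity *= 1.0 + r / 100.0
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak * 100.0)
    return worst


def longest_losing_streak(trade_returns_pct):
    """Number of consecutive losing trades in the longest losing run."""
    longest = current = 0
    for r in trade_returns_pct:
        current = current + 1 if r < 0 else 0
        longest = max(longest, current)
    return longest


# Made-up per-trade returns in percent
sample_returns = [10.0, -50.0, 20.0, -5.0, -5.0]

print("Max drawdown %:", max_drawdown_pct(sample_returns))
print("Longest losing streak:", longest_losing_streak(sample_returns))
```

Note that this differs from the "Max drawdown for a trx" figure printed by the report function later in the article, which is the worst single trade.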
A few more points to focus on are:
- Stop loss/Trailing stop loss
- Target price
- Entry Criteria
- Exit criteria
A very common mistake algo traders make is using the current price information to generate the trading signals (look-ahead bias). We should always compute the logic on historical data only, and use the next available price to execute the buy and sell signals.
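One way to keep the signal and the execution price apart is to shift the signal by one bar. The sketch below uses a toy signal and made-up prices; the variable names are hypothetical, and filling at the next day's close is an assumption chosen for illustration.

```python
import pandas as pd

# Made-up daily closing prices
close = pd.Series(
    [100.0, 98.0, 95.0, 97.0, 101.0],
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)

# Toy signal, known only once a day has closed: the close fell versus the previous close
raw_signal = close < close.shift(1)

# Act on the NEXT bar: after the shift, the order placed on day t
# depends only on data available up to day t-1
tradable_signal = raw_signal.shift(1, fill_value=False)

# Prices at which the orders would be filled
fills = close[tradable_signal]
print(fills)
```

Using `raw_signal` directly to pick fill prices would mean trading at a price that was part of the signal calculation itself.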
We need to perform the following activities in this stage:
- Import the necessary libraries
- Fetch the historical OHLC data for an instrument.
- Write the supporting functions to achieve our logic
- Generate the buy/sell signals using candlestick patterns
- Visualize the output
- Check the returns of the strategy
Analyse and visualise the backtesting output data to understand the scope for optimisation in your algorithm.
Note: To review steps used in backtesting please refer to my article
Example:
Import the necessary libraries
import pandas as pd
import numpy as np
from nsepy import get_history
import datetime
import seaborn as sns
import matplotlib.pyplot as plt
import plotly
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
Fetch the historical OHLC data for an instrument.
# Function to fetch data from nsepy
def Fetch_Historical_Data_nsepy(script, script_cat, fromdate, todate, mode):
    if mode == 'nsepy':
        if script_cat in ('NSE_EQ', 'BSE_EQ'):
            try:
                df_stock_data = get_history(symbol=script, start=fromdate, end=todate)
            except (RuntimeError, ConnectionError) as err:
                # Report the actual error, then retry once
                print("Fetch_Historical_Data function, nsepy function: ", err)
                df_stock_data = get_history(symbol=script, start=fromdate, end=todate)
            return df_stock_data
        elif script_cat == 'NSE_INDEX':
            try:
                df_index_data = get_history(symbol=script, start=fromdate, end=todate, index=True)
            except (RuntimeError, ConnectionError) as err:
                # Report the actual error, then retry once
                print("Fetch_Historical_Data function, nsepy function: ", err)
                df_index_data = get_history(symbol=script, start=fromdate, end=todate, index=True)
            return df_index_data

# Function to fetch data for the algorithm
def fetch_data(script):
    todate = datetime.datetime.now().date()
    fromdate = todate - datetime.timedelta(days=365*4)
    script_cat = 'NSE_EQ'
    mode = 'nsepy'
    df = Fetch_Historical_Data_nsepy(script, script_cat, fromdate, todate, mode)
    return df
Write the supporting functions to achieve our logic
# Function to backtest any strategy.
# Parameters: df (DataFrame), strategy_type ('long'/'short'),
# lst_entry_criteria, lst_exit_criteria (one boolean per row),
# positional_field (name of the position column), price_field (example: Close price),
# stoploss_pct (stop loss pct), target_pct,
# only_profit (should wait for the trade to be in profit, True/False)
def backtest_strategy_stoploss(df, strategy_type, lst_entry_criteria, lst_exit_criteria,
                               positional_field, price_field, stoploss_pct, target_pct, only_profit):
    df['buy_price'] = 0.0000
    df['sell_price'] = 0.0000
    df['buy_time'] = None
    df['sell_time'] = None
    exit_reason_field = positional_field + '_exit_flag'
    df[positional_field] = 0
    df[exit_reason_field] = ''
    # Column positions, so that df.iat[] writes into df itself and not into a temporary copy
    buy_col = df.columns.get_loc('buy_price')
    sell_col = df.columns.get_loc('sell_price')
    pos_col = df.columns.get_loc(positional_field)
    reason_col = df.columns.get_loc(exit_reason_field)
    pos = 0
    last_buy_price = 0.00
    last_sell_price = 0.00
    for d in range(0, len(df)):
        entry_flag = lst_entry_criteria[d]
        exit_flag = lst_exit_criteria[d]
        curr_price = df[price_field].iloc[d]
        curr_time = df.index[d]
        stoploss_exit = False
        target_exit = False
        only_profit_exit = False
        exit_reason = ''
        if stoploss_pct != 0:
            # NOTE: the comparisons in this stop-loss block were lost when the page was
            # extracted. They are reconstructed here to mirror the target block below;
            # the 'SLM' label and the sign handling of stoploss_pct are assumptions.
            if (strategy_type == 'long') and (last_buy_price > 0):
                if ((curr_price - last_buy_price) * 100 / last_buy_price) < -abs(stoploss_pct):
                    stoploss_exit = True
                    exit_reason = 'SLM'
            elif (strategy_type == 'short') and (last_sell_price > 0):
                if ((last_sell_price - curr_price) * 100 / curr_price) < -abs(stoploss_pct):
                    stoploss_exit = True
                    exit_reason = 'SLM'
        if target_pct != 0:
            if (strategy_type == 'long') and (last_buy_price > 0):
                if ((curr_price - last_buy_price) * 100 / last_buy_price) > target_pct:
                    target_exit = True
                    exit_reason = 'TRM'
            elif (strategy_type == 'short') and (last_sell_price > 0):
                if ((last_sell_price - curr_price) * 100 / curr_price) > target_pct:
                    target_exit = True
                    exit_reason = 'TRM'
        if only_profit == True:
            if (strategy_type == 'long') and (last_buy_price > 0):
                if ((curr_price - last_buy_price) * 100 / last_buy_price) > 0:
                    only_profit_exit = True
            elif (strategy_type == 'short') and (last_sell_price > 0):
                if ((last_sell_price - curr_price) * 100 / curr_price) > 0:
                    only_profit_exit = True
        else:
            only_profit_exit = True
        if exit_flag:
            exit_reason = 'ECM'
        if entry_flag and (pos == 0):
            # Open a position
            if strategy_type == 'long':
                df.iat[d, buy_col] = curr_price
                last_buy_price = curr_price
                df.iat[d, pos_col] = 1
            elif strategy_type == 'short':
                df.iat[d, sell_col] = curr_price
                last_sell_price = curr_price
                df.iat[d, pos_col] = -1
            pos = 1
        elif (exit_flag or stoploss_exit or target_exit) and only_profit_exit and (pos == 1):
            # Close the open position
            df.iat[d, reason_col] = exit_reason
            if strategy_type == 'long':
                df.iat[d, sell_col] = curr_price
                last_sell_price = curr_price
                df.iat[d, pos_col] = -1
            elif strategy_type == 'short':
                df.iat[d, buy_col] = curr_price
                last_buy_price = curr_price
                df.iat[d, pos_col] = 1
            pos = 0
    # Pair up the buy and sell rows into one row per trade
    df_temp = df[df[positional_field] != 0].copy()
    df_temp['buy_time'] = df_temp.index
    df_temp['sell_time'] = df_temp.index
    df_buy = df_temp[df_temp.buy_price > 0][['buy_price', 'buy_time']]
    df_buy.reset_index(drop=True, inplace=True)
    df_sell = df_temp[df_temp.sell_price > 0][['sell_price', 'sell_time']]
    df_sell.reset_index(drop=True, inplace=True)
    long = pd.concat([df_buy, df_sell], axis=1)
    if len(long) > 0:
        # A trade that is still open is valued at the last available price
        if not (long['sell_price'].iloc[-1] > 0):
            long.iat[-1, long.columns.get_loc('sell_price')] = curr_price
            long.iat[-1, long.columns.get_loc('sell_time')] = curr_time
    long["returns"] = (long["sell_price"] - long["buy_price"]) * 100 / long["buy_price"]
    long['cum_returns'] = (long['sell_price'] / long['buy_price']).cumprod()
    long['investment_period'] = (long['sell_time'] - long['buy_time'])
    short = pd.concat([df_buy, df_sell], axis=1)
    if len(short) > 0:
        if not (short['buy_price'].iloc[-1] > 0):
            short.iat[-1, short.columns.get_loc('buy_price')] = curr_price
            short.iat[-1, short.columns.get_loc('buy_time')] = curr_time
    short["returns"] = (short["sell_price"] - short["buy_price"]) * 100 / short["buy_price"]
    short['cum_returns'] = (short['sell_price'] / short['buy_price']).cumprod()
    short['investment_period'] = (short['buy_time'] - short['sell_time'])
    if strategy_type == 'long':
        return df, long
    else:
        return df, short

# Function to generate backtest reports
def backtest_reports_local(df_summary, lot_size, trx_charge):
    #print("investment period", df_summary['investment_period'].sum())
    #print("number of transactions", df_summary['investment_period'].count())
    print("Sum of returns in %", df_summary['returns'].sum())
    print("Average returns per transaction in %", df_summary['returns'].mean())
    print("Absolute returns", df_summary['returns_abs'].sum())
    print("Absolute returns per trx", df_summary['returns_abs'].sum()/df_summary['returns_abs'].count())
    print("Max drawdown for a trx", df_summary[df_summary.returns_abs<0]['returns_abs'].min())
    print("Max returns for a trx", df_summary[df_summary.returns_abs>0]['returns_abs'].max())
    print("Losing trx", df_summary[df_summary.returns_abs<0]['returns_abs'].count())
    print("Winning trx", df_summary[df_summary.returns_abs>0]['returns_abs'].count())
    print("Win/Lose ratio ", (df_summary[df_summary.returns_abs>0]['returns_abs'].count())/(df_summary[df_summary.returns_abs<0]['returns_abs'].count()))
    df_summary.index = df_summary.buy_time
    df_summary['returns2'] = np.round((df_summary['returns']/.5).astype('int64')*.5, 0)
    # Cumulative percentage returns
    df_summary[['returns2']].cumsum().plot(figsize=(12,8))
    #fig.autofmt_xdate()
    #ax.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')
    plt.tight_layout()
    plt.show()  # plt.show() does not take the plot object as an argument
    # Cumulative absolute returns
    df_summary[['returns_abs']].cumsum().plot(figsize=(12,8))
    plt.tight_layout()
    plt.show()
    # Disabled in the original code:
    # df_summary['investment_period2']=(np.int64(df_summary['investment_period']))
    # plt.title("Investment period vs Count of transactions")
    # g2=sns.countplot(x="investment_period2", data=df_summary)
    # plt.show(g2)
    # #print ('Total returns', df_summary[['returns_abs']].cumsum())
    plt.title("Percentage Returns vs Count of transactions")
    sns.countplot(x="returns2", data=df_summary)
    plt.show()
Collecting everything
# Main function
# Note: draw_cndl_chart() and bb() are the author's helper functions; their original code is not included here.
def bb_backtest_positional(script, lot_size, trx_charges, slippages):
    summary_min_grand = pd.DataFrame()
    df = fetch_data(script)
    draw_cndl_chart(df)
    for strat_type in ['long']:
        df_temp = df.copy()
        price_field = 'Close'
        strategy_type = strat_type
        df_temp, boll_entry, boll_exit = bb(df_temp, price_field, strategy_type)
        if strategy_type == 'long':
            final_entry = boll_entry
            final_exit = boll_exit
        elif strategy_type == 'short':
            final_entry = boll_entry
            final_exit = boll_exit
        positional_field = 'pos_bb'
        price_field = 'Close'
        stoploss_pct = 0
        target_pct = 200
        only_profit = True
        df_temp, summary_min = backtest_strategy_stoploss(df_temp, strategy_type,
                                                          list(final_entry), list(final_exit),
                                                          positional_field, price_field,
                                                          stoploss_pct, target_pct, only_profit)
        summary_min['returns_abs'] = (summary_min['sell_price'] - summary_min['buy_price']) * lot_size
        summary_min['returns_abs'] = summary_min['returns_abs'] - trx_charges
        summary_min['strat_type'] = strat_type
        # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
        summary_min_grand = pd.concat([summary_min_grand, summary_min])
    return summary_min_grand
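The main function calls a `bb` helper whose code is not listed. As a rough idea of what such a helper could look like, here is a hypothetical stand-in (not the author's implementation). It assumes the 20-day lookback and the -2/+2 standard deviation bands mentioned earlier; the `lookback` and `num_std` parameters, the `bb_*` column names and the demo prices are my own additions.

```python
import pandas as pd


def bb(df, price_field, strategy_type, lookback=20, num_std=2.0):
    """Hypothetical Bollinger Band helper: returns (df, entry_signals, exit_signals)."""
    out = df.copy()
    mean = out[price_field].rolling(lookback).mean()
    std = out[price_field].rolling(lookback).std()
    out["bb_mid"] = mean
    out["bb_upper"] = mean + num_std * std
    out["bb_lower"] = mean - num_std * std
    # Comparisons against NaN (the first lookback-1 rows) evaluate to False
    below_lower = out[price_field] < out["bb_lower"]
    above_upper = out[price_field] > out["bb_upper"]
    if strategy_type == "long":
        entry, exit_ = below_lower, above_upper
    else:
        entry, exit_ = above_upper, below_lower
    return out, entry, exit_


# Made-up prices: flat at 100, a dip to 90 on row 25 and a spike to 110 on row 30
prices = [100.0] * 35
prices[25] = 90.0
prices[30] = 110.0
demo = pd.DataFrame({"Close": prices})

demo_out, demo_entry, demo_exit = bb(demo, "Close", "long")
print(demo_out.loc[[25, 30], ["Close", "bb_lower", "bb_upper"]])
print("entry on row 25:", bool(demo_entry.iloc[25]), "| exit on row 30:", bool(demo_exit.iloc[30]))
```

The return shape (DataFrame, entry series, exit series) is chosen only to match how the main function unpacks the result.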
Output is as follows:
Candlestick chart
**************BACKTEST RESULTS OF : ITC ************************
Sum of returns in % 12.582204198954024
Average returns per transaction in % 1.7974577427077176
Absolute returns 39.50000000000003
Absolute returns per trx 5.642857142857147
Max drawdown for a trx -36.30000000000001
Max returns for a trx 34.200000000000045
Losing trx 1
Winning trx 6
Win/Lose ratio 6.0
Optimization of Parameters
We should keep optimizing our algo parameters on a regular basis, using historical data and methods such as ML or backtesting performance.
Things to consider are:
Avoid overfitting the parameters. Overfitting means optimizing the logic and parameters to the extent that the program works best only in some specific situations and scenarios, and may perform badly when the scenario changes.
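One common guard against overfitting is to choose the parameters on one part of the history and then check them on a part the optimizer never saw. The sketch below does this with a simple grid search. Everything in it is an assumption made for illustration: the synthetic price series, the 70/30 split, the parameter grid, and the simplified long-only band rule, which (unlike a classic Bollinger Band) computes the bands from strictly past prices.

```python
import itertools
import math
import random
import statistics


def backtest_return(prices, lookback, num_std):
    """Compounded return of a simplified long-only band strategy on `prices`."""
    equity = 1.0
    entry_price = None
    for t in range(lookback, len(prices)):
        window = prices[t - lookback:t]  # strictly past prices
        mean = statistics.fmean(window)
        std = statistics.pstdev(window)
        price = prices[t]
        if entry_price is None and price < mean - num_std * std:
            entry_price = price
        elif entry_price is not None and price > mean + num_std * std:
            equity *= price / entry_price
            entry_price = None
    if entry_price is not None:  # value a still-open trade at the last price
        equity *= prices[-1] / entry_price
    return equity - 1.0


def optimize(prices, grid, train_fraction=0.7):
    """Pick the best parameters in-sample, then report their out-of-sample return."""
    split = int(len(prices) * train_fraction)
    train, test = prices[:split], prices[split:]
    best = max(grid, key=lambda params: backtest_return(train, *params))
    return best, backtest_return(train, *best), backtest_return(test, *best)


# Synthetic, made-up price series (a noisy oscillation around 100)
random.seed(0)
prices = [100 + 5 * math.sin(i / 7) + random.gauss(0, 1) for i in range(600)]
grid = list(itertools.product([10, 20, 30], [1.5, 2.0, 2.5]))

best, in_sample, out_of_sample = optimize(prices, grid)
print("best (lookback, num_std):", best)
print("in-sample return:", in_sample, "| out-of-sample return:", out_of_sample)
```

A large gap between the in-sample and out-of-sample figures is a hint that the chosen parameters fit the past rather than the logic.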
Paper Trading or Forward Testing or Simulation Trading in the real environment
Paper trading is a way of verifying your logic in the real environment. You don't need to invest actual money in this stage, and it gives results that are close to what you can expect in the real environment. However, it is a time-consuming activity. You can do this by using the paper trading feature provided by your broker, or you can develop your own framework to test the same.
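If you build your own framework, its core can be a small simulated broker that records fills without sending real orders. The class below is a hypothetical sketch: the `PaperBroker` name, the percentage slippage model, the flat per-order charge and the sample prices are all assumptions, not any broker's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class PaperBroker:
    """Hypothetical simulated broker: tracks cash, positions and fills only."""
    cash: float
    slippage_pct: float = 0.05       # assumed slippage, in percent of price
    charge_per_order: float = 20.0   # assumed flat charge per order
    positions: dict = field(default_factory=dict)
    trades: list = field(default_factory=list)

    def buy(self, symbol, qty, price):
        fill = price * (1 + self.slippage_pct / 100)  # buys fill slightly higher
        cost = fill * qty + self.charge_per_order
        if cost > self.cash:
            raise ValueError("not enough paper cash for this order")
        self.cash -= cost
        self.positions[symbol] = self.positions.get(symbol, 0) + qty
        self.trades.append(("BUY", symbol, qty, fill))
        return fill

    def sell(self, symbol, qty, price):
        if qty > self.positions.get(symbol, 0):
            raise ValueError("cannot sell more than the paper position")
        fill = price * (1 - self.slippage_pct / 100)  # sells fill slightly lower
        self.cash += fill * qty - self.charge_per_order
        self.positions[symbol] -= qty
        self.trades.append(("SELL", symbol, qty, fill))
        return fill


# Made-up round trip with zero slippage so the arithmetic is easy to follow
broker = PaperBroker(cash=10_000.0, slippage_pct=0.0, charge_per_order=10.0)
broker.buy("XYZ", 10, 100.0)
broker.sell("XYZ", 10, 110.0)
print("cash after round trip:", broker.cash)
print("trades:", broker.trades)
```

Feeding this class live prices during market hours, instead of historical ones, is what turns a backtest into a forward test.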
Deployment in the real environment
Deployment in the real environment requires multiple facets to be managed, which are generally not considered in backtesting.
Functionally, the following aspects need to be managed:
- Order management
- Risk Management
- Money/Fund Management
- Diversification of assets
- Portfolio management
- User Management
- Slippages
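As one small example of the risk and money management items above, position size can be derived from how much capital you are willing to lose on a single trade. This is a sketch of fixed-fractional sizing, a technique I am bringing in for illustration rather than one the article prescribes; the function name, parameters and numbers are hypothetical.

```python
import math


def position_size(capital, risk_pct, entry_price, stop_price):
    """Shares to buy so that hitting the stop loses at most `risk_pct` % of capital."""
    risk_per_share = entry_price - stop_price
    if risk_per_share <= 0:
        raise ValueError("stop_price must be below entry_price for a long trade")
    max_loss = capital * risk_pct / 100
    qty_by_risk = math.floor(max_loss / risk_per_share)
    qty_by_cash = math.floor(capital / entry_price)  # cannot buy more than cash allows
    return min(qty_by_risk, qty_by_cash)


# Made-up numbers: risk 1% of 100,000 on a trade entered at 200 with a stop at 190
qty = position_size(capital=100_000, risk_pct=1.0, entry_price=200.0, stop_price=190.0)
print("quantity:", qty)
```

A tighter stop allows a larger quantity for the same risk, which is why the result is also capped by the available cash.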
Technically, the following aspects need to be managed:
- Establish a connection with the broker API.
- Pass the buy/sell orders using the broker connection.
- Establish a connection with the data API (if the data vendor is different from the broker).
- Access the real-time and historical data using the data API connection.
This concludes the article. If I missed out on anything, feel free to share your thoughts. I will try to incorporate it.
Conclusion:
To summarize, an algo can be considered a software product, and all the stages can be easily mapped to the “Software Product Development Life Cycle”.
All the stages have their own importance and are critical to the success of the algo.
- Formulate the trading logic/idea
- Filtering criteria to choose the scripts
- Verification of Logic (Visualization)
- Backtesting
- Optimization of Parameters
- Paper Trading or Forward Testing or Simulation Trading
- Deployment in the real environment
Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.