Machine Learning In Trading Q&A By Dr. Ernest P. Chan


It is common knowledge that Machine Learning and AI have transformed the world we live in. From series recommendations on Netflix to assisting doctors in cancer research, machine learning has become an indispensable part of the world. And it has already made its presence felt in the world of trading.

Over the years, we have received a lot of queries for our Machine Learning course on Quantra. Recently, we hosted a Q&A session with Dr Ernest P. Chan (Managing Member of QTS Capital Management, LLC) which received a tremendous response from the viewers. While Dr Ernest Chan answered most of the questions, Varun Divakar from our EPAT faculty also helped address several queries.

Today, we bring to you an exhaustive list of Q&As about Machine Learning for Trading, hoping you will be better informed about the role of ML in trading.

50 Questions and Answers: Machine Learning In Trading

We have grouped the common questions into the categories below so that the reader can follow the answers more easily:

  1. Starting with ML for Trading
  2. Career and Jobs
  3. Applications
  4. Data
  5. Models
  6. Trading Strategy
  7. Trading in Markets
  8. Plan, Tools and Factors
  9. Backtesting
  10. Prediction


Starting with ML for Trading

Question #1: What is trading all about and how can it be done profitably?
Dr Ernest Chan: If you check the stock exchanges in your region, you will find that stock prices are always dynamic, i.e. they move up or down on a daily basis. In fact, some have wild swings within an hour too. We call this volatility.

The stock price is the result of a variety of factors, such as the company's revenues and its profits or losses in a financial year; even the industry it operates in matters. Trading is about understanding the influence of these factors, studying divergences in the market, and predicting the outcome to make a profit.

The profitability of your trades increases with your accuracy in predicting the movement of the market. While there are examples of individuals booking a massive profit in one day, these are few and far between. You should always create a trading strategy before entering the market and stick to it if you don't want to be overwhelmed by market sentiment.

Question #2: As someone with little or no knowledge about Algorithmic trading and with no background in programming, how can we start in machine learning for financial trade field?
Can you give us a book recommendation?
Varun Divakar: The great thing about the times you are living in is the presence of the internet. We have a lot of credible websites which provide you with relevant knowledge so that you can start learning Machine Learning and applying it in the trading landscape.

Sites such as investing.com, Forbes and Wall Street will give you general information about the financial world, while Coursera, MIT and even Oxford have machine learning courses for finance professionals. However, you should first start with basic online reading material and, once you are comfortable with ML terminology, go for courses.

There seems to be little difference between the online and offline modes of teaching, so an online course would be good enough. Try to understand your requirements before enrolling for a course from the plethora of e-learning providers. We do have free as well as paid courses on Quantra which you can check out. When it comes to books on algorithmic trading, you can refer to the list on this blog, which is by no means exhaustive.

Question #3: What are the most used programming languages and frameworks in the machine learning for financial trade? Is this really for retail or intraday trader like me?
Varun Divakar: While high-frequency trading firms use C/C++, individual traders prefer R or Python. We are noticing a gradual shift towards Python as more and more people are coding their strategies in Python.

The reason Python is preferred by traders is that it is relatively simple to learn and use. An added advantage of selecting Python is the large global community that is active in various technical forums and websites. In fact, you will find that a solution to most problems you encounter has already been posted online. Thus, Python would be a better tool for trading than R. Of course, it can be used for retail as well as intraday trading.

Question #4: Basics, prerequisites to learn this. How much of Stats and Coding are required?

Varun Divakar: There is this blog which encapsulates the skills required to become an algorithmic trader:

  • Quantitative skills - Mathematical prowess, i.e. knowledge of statistics, mathematical models and statistical research methods, forms the core of your quantitative skills.
  • Programming knowledge - Coding or programming is another key skill that is essential to pick up in order to become a successful algorithmic trader.
  • Financial markets acumen - Any type of trading methodology needs a clear understanding and knowledge of the financial markets.
  • Problem-solving capabilities - Trading is the perfect platform to help you put into practice your problem-solving capabilities.
  • Data management - Getting access to quality data is important for any kind of trader. While there are ample sources of data available for daily price data, access to historical intraday data can be restricted. It is important to understand the patterns of data from across various markets and exchanges across the globe.
  • System architecture - As important as it is to know how to use a trading system, it is also important to know the inside of a trading system.
  • Managing your risks - Risk management is another important component. The strategies will help you make money, but your risk management system is the one that is often responsible for preserving it.
  • Compliance and regulations - Every country and region has its own set of regulations and compliance requirements that you must adhere to in order to trade in the respective trading destinations.

You can, of course, refer to this article for more details.


Career and Jobs

Question #1: What are the major differences between Quant research and Data science? Is there any chance that these two are overlapped to some extent?
Varun Divakar: It is true that they overlap to some extent. Data science casts a wider net than Quant research. To give you a simple example, while the end goal in Data Science is mostly accuracy or some other metric, in Quant research your end goal is profit and the data is mostly time series.

Question #2: How to find a job in the field?
Varun Divakar: ML is enjoying a boom period as different industries are finding novel ways to use ML and solve issues plaguing them. As with any other field which is in demand, you need to demonstrate that you possess the right skills to be fit for the job. Thus, as it is a technical field, once you acquire a thorough knowledge in the field, create models and then publish blogs/code on GitHub and technical forums to gain traction. This, in turn, would help you through the job-seeking process and gain employment.

Question #3: I have been learning Artificial Neural Networks (ANN) and have recently implemented my own Long Short Term Memory Net in Python to evaluate stock market data. I would like to know if I should continue on my path of self-learning, teaching myself AI through the information available online, or if you recommend I go back to school for a PhD in AI. I have already gotten my Master's in Petroleum Engineering. I am 24 years old and want to continue my work in the technology field for many years to come.
Dr Ernest Chan: You do not need a PhD in AI in order to apply AI to trading. My PhD is in physics, yet I joined one of the world's best AI groups at the time as a researcher.

Question #4: What is the difference between Technical analysis & Algo Trading? What is the software required to apply the knowledge during Algo trading training? Are they paid or free? Which companies in INDIA offers trading accounts with Algo trading platform?
Varun Divakar: If I have to describe it in one line, Technical analysis is using indicators to buy or sell, and Algo Trading is automating the process of creating indicators and sending the buy-sell signals to the exchange. Python and R are programming languages used to code your algo trading strategy. They are both free. Zerodha and IB (Interactive Brokers) are companies which offer trading accounts with algo trading platforms in India.

Question #5: I have a good understanding of financial markets. I want to learn ML to trade in stocks and currencies. What should an investor keep in mind before implementing ML in investing? Also, what is the success rate of trades you have experienced using this technology? Which asset class is best suited to implement this technique? I also want to know what are the job prospects for a candidate who has decent financial knowledge and understanding of AI?
Dr Ernest Chan: It depends on the problem you are trying to solve and how well you can frame the problem statement for the algo to solve. In addition to the above, you should also pay attention to the quality of data that you are using to train the model. You can use ML/AI in all asset classes. Currently, the job prospects for AI/ML-based traders are extremely good.


Applications

Question #1: How can we apply Machine Learning/Artificial Intelligence in Stock or F&O option? How to do an algo trading setup?
Dr Ernest Chan: The advantage of Machine Learning is that it continues to learn even after it is deployed. A simple example is the Facebook algorithm that recognises faces in a photo and suggests Facebook profiles to be tagged. In a similar manner, once we train a machine learning algorithm on historical data, it can start to predict the trends or the profit and stop-loss targets for a trade.

Likewise, in F&O you can use Machine Learning to create different types of strategies such as covered calls and protective puts. An algo trading setup can be done by estimating the Highs and Lows for the next few periods and then hedging the risk accordingly. I would like to add here that even if you are a genius at looking at chart patterns and predicting the future price using various indicators, you can always benefit from ML/AI algorithms by coding your strategy in Python and then letting the machine do the work for you.

Question #2: Can an AI beat the stock market like DeepMind can beat Go? How much improvement and impact we will have if we implement AI, machine learning and deep learning?
Varun Divakar: There is one major difference between trading and the board game Go. While the board game has a fixed set of rules, trading has an emotional aspect attached to it, which may or may not be rational.

Thus, while it may prove difficult, one cannot rule out the possibility that AI will, over time, be able to outperform the stock market. You can generate better returns than the stock market as long as your strategy is robust. In terms of impact, it depends on your ability to define the problem statement and provide accurate inputs. It can, in some cases, improve efficiency by 100%.

Question #3: How is the cloud used to support algorithm trading?
Varun Divakar: There can be occasions where executing a trading strategy can be computationally intensive. In such scenarios, the cloud is extensively used to circumvent architectural issues.

Question #4: How different is AI/Machine Learning in financial applications when compared to applications in more traditional areas such as computer vision and image classification?  How important is the input data?
Dr Ernest Chan: The main difference is that the signal-to-noise ratio in finance is much lower than in other applications due to arbitrage activities. Also due to arbitrage, there is "alpha decay", so the predictive accuracy typically declines over time. Financial markets undergo regime changes occasionally, so not all past data are suitable as features for training the ML classifier.

Question #5: Can 'Machine learning' be effectively applied for Trading, as Trading not only depends on the historical data; It also depends on other dynamic factors like government regulations, market conditions etc.
Varun Divakar: Yes, it can be applied together with market sentiment indicators, for example those derived by applying NLP to Twitter data.


Data

Question #1: The market is unpredictable. ML requires Input data to train the algorithm to arrive at better results, so how can an unpredictable market data be the input to the ML algorithm?
Dr Ernest Chan: ML is often used to make predictions in a noisy environment. For example, self-driving cars cannot expect a pedestrian to act in a completely predictable manner!

Question #2: What is the best way to weave in data from social media sites while taking care of fake news?
Varun Divakar: To get data from social media, you can use tweepy, a Python library for the Twitter API. You can also use botometer, a library that estimates whether a Twitter account is likely to be a bot, to filter out bot-generated tweets.

Question #3: What is a good dataset, features to apply ML on? How would you approach order flow patterns recognition using ML? What is a good timeframe to start applying ML on?
Varun Divakar: You can refer to the research paper on using SVM in HFT trading on SSRN.com. This should give you a good idea of how to use such models. You can also try the free course on machine learning in trading on Quantra.

Question #4: How to prepare training data from OHLC, volume, Open Interest data for one very liquid instrument? Suppose we have one-minute candles, then what is the minimum data length needed for a reliable prediction model? How to choose the correct ML model?
Varun Divakar: The previous day's 1-minute candlestick data should be sufficient. Use a classifier if you want to predict the trend and use a CART model if your data is unscaled and huge.

Question #5: How to incorporate multiple bars, i.e. recognize bar patterns, into algo?
Varun Divakar: You can use the Japanese candlestick pattern functions from TA-Lib (Technical Analysis Library) in Python. In fact, the library has a comprehensive set of functions to recognise candlestick patterns.
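To make the idea of a multi-bar pattern concrete, here is a minimal sketch that detects one such pattern, the bullish engulfing, using only the Python standard library. TA-Lib exposes this pattern as CDLENGULFING; the function names, the exact rule and the candle values below are illustrative assumptions, not TA-Lib's implementation.

```python
from typing import List, Tuple

# Each candle is (open, high, low, close). The values used below are made up.
Candle = Tuple[float, float, float, float]


def is_bullish_engulfing(prev: Candle, curr: Candle) -> bool:
    """True when `curr` is an up candle whose body engulfs the body of the down candle `prev`."""
    prev_open, _, _, prev_close = prev
    curr_open, _, _, curr_close = curr
    prev_is_down = prev_close < prev_open
    curr_is_up = curr_close > curr_open
    body_engulfs = curr_open <= prev_close and curr_close >= prev_open
    return prev_is_down and curr_is_up and body_engulfs


def find_bullish_engulfing(candles: List[Candle]) -> List[int]:
    """Return the indices of the bars that complete a bullish engulfing pattern."""
    return [
        i
        for i in range(1, len(candles))
        if is_bullish_engulfing(candles[i - 1], candles[i])
    ]


candles = [
    (100.0, 101.0, 98.5, 99.0),    # down candle
    (98.8, 101.5, 98.6, 100.6),    # up candle whose body engulfs the previous body
    (100.6, 101.0, 100.0, 100.8),  # small up candle, no pattern
]
pattern_bars = find_bullish_engulfing(candles)
print(pattern_bars)
```

The resulting indices can be turned into a 0/1 feature column and fed to an ML model alongside other inputs.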


Models

Question #1: How to use ML/AI for quantity management (number of shares to be bought/sold) so as to maximize profit? How to create an ML model which can work well with the uncertainty of future stock prices based on historical trends? How can a model learn to improve its operations based on trading decisions taken by it earlier?
Varun Divakar: You can always use the prediction's probability to scale the lot sizes to maximise the profit. To make an ML algo that is robust, you need a large volume of data that covers different life cycles of a stock. For a model to learn from its earlier decisions, you need to create another model that takes the first model's predictions as input and predicts the behaviour.
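As a rough illustration of scaling lot sizes by prediction probability, the sketch below maps a classifier's predicted probability of an up move to a signed number of lots. The linear mapping, the function name `scale_lots` and the cap of 10 lots are assumptions made for this example; they are not a rule from the webinar.

```python
def scale_lots(prob_up: float, max_lots: int) -> int:
    """Map the predicted probability of an up move to a signed lot count.

    A probability of 0.5 means no edge and gives 0 lots. A probability of 1.0
    gives +max_lots (long) and 0.0 gives -max_lots (short). The linear scaling
    in between is an assumption chosen for simplicity.
    """
    if not 0.0 <= prob_up <= 1.0:
        raise ValueError("prob_up must be between 0 and 1")
    confidence = abs(prob_up - 0.5) * 2      # 0 = no edge, 1 = fully confident
    lots = round(confidence * max_lots)      # nearest whole lot
    return lots if prob_up >= 0.5 else -lots


for p in (0.5, 0.8, 0.2):
    print(p, scale_lots(p, max_lots=10))
```

In practice the cap on lots would come from your risk limits rather than from the model.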

Question #2: What do you think is the best type of ML or DL model to apply to trade systems? And why?
Dr Ernest Chan: When it comes to Machine learning algorithms, Classification models usually outperform the Regression ones and CART models along with XGBoost are very good in terms of trading.

Question #3: What are the independent and output variables of the machine learning model and how frequently does it run? What are the sub-models, if any? Let's say that we have fit an ML model based on currency pair prices, indicators, etc. for the last 252 days. The model can now predict the next day price. How can we use it in market conditions? We simply put the previous day's data and we buy/long according to the prediction? Do we do that each day and for how long is it valid? Do we update the model(fit) the model for each passing day?
How will this work in real conditions?
Varun Divakar: For example, technical indicators, or any OHLCV data for that matter, can be used as the independent variables for prediction, with the trend as the dependent variable. The model should be run continuously, with new data being appended as fast as possible.
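One way to picture daily refitting is a rolling window: each day the model is fit on the most recent 252 days and used to predict the next one. The sketch below only generates the index schedule; the window length comes from the question, while the rolling (rather than expanding) window and the function name are assumptions for illustration.

```python
from typing import Iterator, Tuple


def rolling_retrain_schedule(n_days: int, window: int = 252) -> Iterator[Tuple[range, int]]:
    """Yield (training_indices, prediction_index) pairs for daily refitting.

    For each day t >= window, the model would be fit on days [t - window, t)
    and then used to predict day t, so it never sees data from the future.
    """
    for t in range(window, n_days):
        yield range(t - window, t), t


schedule = list(rolling_retrain_schedule(n_days=255, window=252))
for train_idx, predict_idx in schedule:
    print(f"fit on days {train_idx.start}..{train_idx.stop - 1}, predict day {predict_idx}")
```

Each pair says which rows to pass to the model's fit step and which row to predict on, so the same loop can drive both a backtest and a live daily job.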

Question #4: What should be the time horizon used to train supervised ML trading models. Should we ensure bullish and bearish periods are covered for training?
Varun Divakar: You should train them as frequently as possible to make sure the algorithm is always up to date.

Question #5: Considering the noise and randomness in the stock market, while making an ML model, what should we be trying to predict - the price, the direction, the magnitude? And what models are best to use - supervised, unsupervised, reinforcement, a combination of two?
Dr. Ernest Chan: Predicting the trend using a classifier is one of the most used techniques. But it depends on the problem you are trying to solve in your trading.

Question #6: Is it possible to present us a simple real ML model and how it's implemented on a trading platform like FXCM or Interactive Brokers?
Varun Divakar: You can refer to the Quantra course on Strategy Implementation using Interactive Brokers.


Trading Strategy

Question #1: How explored is the machine learning area for quantitative trading? Is there still room for more quant firms or is the market getting overcrowded (just like in HFT)? And for statistical arbitrage strategies, specifically?
Dr Ernest Chan: Just like every trader has his own logic for defining the amount of time s/he will stay in the trade, selecting an indicator or defining stop-loss targets, we have different firms trying different approaches to creating an optimum strategy. Thus, there is a lot of room as every firm is trying to do something new, with the same case for arbitrage strategies.

Question #2: Can I apply a continuous improving machine learning algo in my current strategies that already have an alpha, in order to help me update the strategy parameters to keep the alpha alive in the ever-changing market conditions?
Dr Ernest Chan: Yes, absolutely. The advantage of machine learning over conventional algorithms is that depending on the strategy devised, it keeps on learning and improving.

Question #3: Is there a strategy to trade options by VWAP Algos with Entry and trailing Exit.
Now as reinforcement learning gains more traction in other fields how is it applicable in trading?
Varun Divakar: Use Long short-term memory (LSTM) models for entries and exits. You can also use a deep learning model where you simply input the prices and the volume associated with each price, and the model will give you the VWAP. A deep reinforcement learning model can also be used to give the entry and exit points for a trade. For example, if the algo predicts that the market will fall 10 points below the existing price, then you should enter 5-6 points below the existing price.

Similarly, in the next time period, if the market is expected to go as high as 15 points then you put the exit point around that price. Of course, it will depend on your risk appetite as well and the above example is just a generalisation and not a rule as such.
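For reference, VWAP also has a closed-form definition (the sum of price times volume divided by total volume), so it can be computed directly from the same price and volume inputs. The sketch below shows that calculation; the function name and the sample numbers are made up for illustration.

```python
from typing import Sequence


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


prices = [100.0, 101.0, 102.0]   # trade prices (made-up data)
volumes = [200, 100, 100]        # volume traded at each price
print(vwap(prices, volumes))
```

Because most of the volume in this sample traded at the lowest price, the VWAP sits below the simple average of the three prices.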

Question #4: Are there machine learning strategies that are consistently profitable with Sharpe ratio greater than 1.0 in forex and cryptocurrency trading?
Varun Divakar: Yes. There are many traders who have consistently done so.
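To check a strategy against the Sharpe ratio threshold mentioned in the question, you can compute the ratio from its return series. This is a minimal sketch assuming daily returns, a zero risk-free rate and 252 trading periods per year (crypto markets trade every day, so 365 may be more appropriate there); the sample returns are made up.

```python
import math
import statistics
from typing import Sequence


def annualised_sharpe(
    returns: Sequence[float],
    risk_free_per_period: float = 0.0,
    periods_per_year: int = 252,
) -> float:
    """Annualised Sharpe ratio: mean excess return / its standard deviation * sqrt(periods)."""
    excess = [r - risk_free_per_period for r in returns]
    volatility = statistics.stdev(excess)   # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003]   # made-up daily returns
sharpe = annualised_sharpe(daily_returns)
print(round(sharpe, 2))
```

A handful of returns like this gives a very noisy estimate; a meaningful Sharpe ratio needs a much longer track record.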


Trading in Markets

Question #1: How much time to learn setup Algo business if spending 5-10hr/day 5days/wk for novice level trader with little programming experience?
Varun Divakar: If you are putting in the effort, then you can become a certified algo trader in 6 months.

Question #2: What is the best algo for predicting the stock market?
Dr Ernest Chan: There is no “best” algo for predicting the stock market. Every trader has his/her own trading style, which is based on how he/she develops models by leveraging experience and the computing power of the machine. Any algo which performs according to your expectations would be the best algo for you.

Question #3: Is there any proof that algorithmic trading works on markets like futures, stocks, forex, cryptocurrencies and etc.?
Varun Divakar: In terms of acceptance of algo trading, 80% of U.S. stock trades are algo-based, and the share is rising. You can refer to SSRN.com, which has published numerous articles on various facets of algo trading. Thus, I feel this is as good as any other proof that algorithmic trading works in all kinds of markets.

Question #4: Does machine learning work in a volatile market? How?
Varun Divakar: While machine learning works in all types of markets, it depends on the skills of the person creating and coding the trading strategy. Thus, as long as you can optimize the entry and exit points and deploy a strategy which predicts the stock market with a reasonable degree of accuracy, your profitability will increase.

Question #5: I have an algo that does buy low sell higher. Once in the trade, it can unwind due to stop-loss (trade has negative P&L from the beginning) or giveback (trade's making money from day 1). The proportion of winning/losing trades are around 80/20, but the losing trades are very potent on a per-trade basis. I have other signals to weed out the losing portion but it also weeds out some of the winning trades. Will ML or AI help in this?
Varun Divakar: You will need to train an ML model to recognise the winning trades by targeting the precision and recall metrics during Cross-Validation.
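Precision and recall are straightforward to compute once you have the model's predictions for a set of trades. In the sketch below, 1 marks a winning trade and 0 a losing one; the labels and predictions are made-up data with the 80/20 win/loss split from the question.

```python
from typing import Sequence, Tuple


def precision_recall(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (1 = winning trade)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of trades taken, how many won
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of winners, how many were taken
    return precision, recall


actual = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]      # 8 winners, 2 losers
predicted = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]   # model's guess at which trades win
precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

When the losing trades are very costly, precision is usually the metric to favour, at the price of lower recall, i.e. skipping some winners.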

Question #6: I'd like to identify positive news stories regarding publicly traded companies from a couple of market sites to help me determine which stocks may see a short term gain.
I then want to bounce those stocks off some metrics to then feed me signals in real-time. Would NLP be a means to help me ID which news stories may have a positive impact? Or would you recommend another way to filter news for stories which would most likely cause a short term real or sentiment-based bump?
Varun Divakar: You can use NLP techniques on Twitter data pulled with Tweepy to understand the market sentiment and create the model.

Question #7: Should one expect machine learning that best works on relatively static systems like credit fraud be sustainably successful with something so dynamic like markets?
Dr Ernest Chan: It is certainly more challenging to use ML to predict the financial market than to predict credit fraud. But there are people who have succeeded.


Plan, Tools and Factors

Question #1: Can we trade solely based on machine learning? Will machine learning end inefficiencies of the market that can be traded by retail traders?
Dr. Ernest Chan: Machine learning has a variety of applications in different sectors, including finance. The main objective of using machine learning for trading is to remove the emotional component of manual trading as well as finding inefficiencies in the market faster than a normal human. In that respect, to answer the question, yes, we can create a trading strategy using machine learning and once deployed, it can trigger buy or sell orders on its own.

To answer the second question, there is a very good chance that machine learning will end the inefficiencies of the market. One must understand, however, that each trader has his/her own risk appetite and it will be reflected in their trading strategies. Thus, it might be a different scenario in the future with machine learning trading strategies dominating the market, but there will always be different opportunities present.

Question #2: Are there any systematic methods, process or tools to develop alpha?
Varun Divakar: You can try Quantra Blueshift for a start. It is a free and fast backtesting platform with minute-level data covering multiple asset classes and markets.

Question #3: How do you set a contingency plan to pull the plug?
Varun Divakar: The stop-loss criterion and the order placing rate should be considered for killing an algo. Continuous performance monitoring of the trade is one way to go about it. For example, if the trader loses more than 0.5% in a time period, then you cut it. Ultimately, it depends on your risk appetite.
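The loss-based rule can be sketched as a simple check that runs on every performance update. The 0.5% threshold comes from the answer above; the function name, the choice to measure the loss from the equity at the start of the period, and the equity figures are assumptions for illustration.

```python
def breaches_loss_limit(start_equity: float, current_equity: float, max_loss: float = 0.005) -> bool:
    """True when the loss since the start of the period exceeds `max_loss` (0.005 = 0.5%)."""
    if start_equity <= 0:
        raise ValueError("start_equity must be positive")
    loss_fraction = (start_equity - current_equity) / start_equity
    return loss_fraction > max_loss


start = 100_000.0
for equity in (100_200.0, 99_700.0, 99_400.0):   # made-up equity readings
    if breaches_loss_limit(start, equity):
        print(f"equity {equity}: loss limit breached, stop the algo")
    else:
        print(f"equity {equity}: within limits")
```

A production kill switch would typically also cancel open orders and monitor the order placing rate, which this sketch does not cover.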

Question #4: Which of the plethora of different neural net tools would you recommend?
Varun Divakar: Deep reinforcement learning models work well in trading. LSTM, RNN and GAN models can be used for trading too.

Question #5: What are the factors for reliable reinforcement learning trading? 
Varun Divakar: Order book actions along with tick data are some of the factors required for reliable reinforcement trading.

Question #6: Are there any techniques to offset slippage in options trading? Can you customize orders so they react to certain underlying events as instead of the option pricing?
Varun Divakar: Yes, you can use a reinforcement learning model which can be trained to react to certain events, for example buy or sell events. To offset slippage, you can use an algorithm to predict whether you should use a limit order or a market order.

Question #7: What kind of machine learning tools can be used in stock alpha trading?
Varun Divakar: You can use a deep learning model to create indicators/strategies that will help in creating alpha-based models. To optimise the alpha, you can try decision algorithms.

Question #8: Is it possible to use a gradient to check for peaks?
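Varun Divakar: Yes, it is possible to use a gradient to check for peaks: a peak is where the gradient changes sign from positive to negative, and a trough is where it changes from negative to positive. In the Python SciPy library, scipy.signal.find_peaks locates local maxima (apply it to the negated series to find troughs) and scipy.signal.argrelextrema handles both maxima and minima, while scipy.optimize.minimize finds the minimum of a function (to maximise, minimise its negative).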
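The gradient idea can be shown without any library: take the first difference of the series as a discrete gradient and look for sign changes. This is a minimal sketch with made-up prices; it ignores flat plateaus, which a more careful implementation would need to handle.

```python
from typing import List, Sequence, Tuple


def find_peaks_and_troughs(values: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Locate local maxima and minima from sign changes in the first difference."""
    diffs = [b - a for a, b in zip(values, values[1:])]   # discrete gradient
    peaks, troughs = [], []
    for i in range(1, len(diffs)):
        if diffs[i - 1] > 0 and diffs[i] < 0:
            peaks.append(i)      # rising, then falling: local maximum at index i
        elif diffs[i - 1] < 0 and diffs[i] > 0:
            troughs.append(i)    # falling, then rising: local minimum at index i
    return peaks, troughs


prices = [10, 12, 15, 13, 11, 14, 16, 12]   # made-up price series
peaks, troughs = find_peaks_and_troughs(prices)
print("peaks at", peaks, "troughs at", troughs)
```

On noisy market data you would usually smooth the series first, otherwise almost every other bar shows up as a peak or a trough.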

Question #9: How to use Machine Learning for a positive risk-adjusted alpha generation?
Varun Divakar: You can specify the problem statement and then generate a Decision Tree Model that can solve it. Once you are satisfied with the model, use that decision tree to create an alpha.


Backtesting

Question #1: How can we avoid the problem of overfitting and what special points shall we consider while backtesting?
Varun Divakar: You can use regularization or increase the data points. In simple terms, before we start the exercise, we should clean the data and try to make sure that the model focuses more on finding a pattern than memorising the data. Along with this, you should also perform hyperparameter tuning as well as cross-validation.

Question #2: Would you recommend to build own backtester? Or should I use backtesting tools that are already available? What software/ tools for backtesting would you recommend?
Dr Ernest Chan: It is better to use those already available as building one from scratch is a very time-consuming process. You can always try Quantra Blueshift, which is more than a backtesting platform.


Prediction

Question #1: Currently working on a project about time series forecasting, stock market price prediction using Generative Adversarial Network (GANs). Not really with the intention to predict, but more as a proof of concept. Besides the OHLC plus Volume, I'm trying to find good features, technical indicators, other assets, etc. I can use to feed the GAN with. Question is: which ones can I use, or which ones are currently used by traders to get an idea about stock price movements?
Dr Ernest Chan: You can try to add sentiment-based data to improve accuracy. For example, you can get the Twitter feed sentiment using tweepy and then add the rolling sentiment scores as features.
Apart from that, traders do use technical indicators, like moving averages and RSI, along with crossover strategies to get a fair idea of the market movements.
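As an example of a feature built from technical indicators, the sketch below computes two simple moving averages and flags the bars where the fast one crosses the slow one. The window lengths (2 and 3) are deliberately tiny so the made-up price series stays short; they are not a recommendation.

```python
from typing import List, Optional, Sequence, Tuple


def sma(values: Sequence[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window of data is available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def crossover_signals(prices: Sequence[float], fast: int = 2, slow: int = 3) -> List[Tuple[int, str]]:
    """Return (bar index, 'buy' or 'sell') wherever the fast SMA crosses the slow SMA."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    signals = []
    for i in range(1, len(prices)):
        if slow_ma[i - 1] is None:
            continue   # not enough history yet
        prev_diff = fast_ma[i - 1] - slow_ma[i - 1]
        curr_diff = fast_ma[i] - slow_ma[i]
        if prev_diff <= 0 < curr_diff:
            signals.append((i, "buy"))    # fast average moves above slow average
        elif prev_diff >= 0 > curr_diff:
            signals.append((i, "sell"))   # fast average moves below slow average
    return signals


prices = [10, 9, 8, 9, 11, 12, 10, 8]   # made-up closing prices
signals = crossover_signals(prices)
print(signals)
```

The same rolling-window pattern applies to a rolling sentiment score: replace the prices with per-period sentiment values and use the moving average as the feature.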

Question #2: When using neural networks (MLP) to predict stock prices, is it generally helpful to include technical analysis indicators in the data? Or is it sufficient to use specific stock price data?
Dr. Ernest Chan: Yes, technical indicators do help in predicting stock directions. 
No, stock price alone may not be sufficient for profitability.

Question #3: On a classification problem, you want to predict if the stock will go up or down? How do you do that? Please explain how you place the order with your code as well?
Varun Divakar: You can use a classifier model from scikit-learn (sklearn). For example, you can use a Decision Tree Classifier, pass the past OHLC data and some indicators as inputs, and label the Uptrend and Downtrend points in the data as the corresponding outputs to be predicted. While the actual code would be too lengthy, you can refer to any of the machine learning blogs which explain various concepts in detail, along with code.
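The full pipeline is beyond the scope of this article, but the data preparation step can be sketched briefly. The example below builds a feature matrix from past returns and labels each row 1 (up) or 0 (down) according to the next bar's move. The choice of features, the lookback of 3 bars and the closing prices are assumptions for illustration; order placement is not shown.

```python
from typing import List, Sequence, Tuple


def build_dataset(closes: Sequence[float], lookback: int = 3) -> Tuple[List[List[float]], List[int]]:
    """Build features X (the last `lookback` one-bar returns) and labels y (1 = next bar up)."""
    # returns[k] is the return from bar k to bar k + 1
    returns = [(b - a) / a for a, b in zip(closes, closes[1:])]
    X, y = [], []
    for t in range(lookback, len(returns)):
        X.append(returns[t - lookback : t])    # returns already known at bar t
        y.append(1 if returns[t] > 0 else 0)   # direction of the move from bar t to t + 1
    return X, y


closes = [100, 101, 103, 102, 104, 103, 105]   # made-up closing prices
X, y = build_dataset(closes)
print(len(X), "samples, labels:", y)

# X and y can then be passed to a classifier, for example
# sklearn.tree.DecisionTreeClassifier().fit(X, y) in scikit-learn.
```

Note that each row only uses information available before the move it is labelled with, which avoids look-ahead bias in the training data.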


Presentation

You can check out the powerpoint presentation for this webinar here:

Check out all of our 40+ previous webinar powerpoint presentations on Slideshare.


Overview

This Q&A session on Machine Learning in Trading, with Dr. Ernest Chan was the perfect opportunity to ask him any query pertaining to this topic. It was helpful to those who wish to apply their technical skills in AI, Cloud, Machine Learning etc. to Financial Markets or aspire to belong to the algorithmic trading community. 

In our webinars, we try to cater to all the questions/queries sent in by the attendees. Some questions are common to a lot of people while some are exclusive and niche. Both types of questions bring new perspectives to all the participants and shed more clarity on the topic. 

Event Specifications:

  • Tuesday, June 11, 2019 at 11:00 AM ET | 8:30 PM IST | 10:00 PM SGT
  • Questions were shortlisted prior to the webinar.
  • Questions were specific to the following categories: Artificial Intelligence, Machine Learning, Deep Learning, and their application in the Financial Markets. 

About the Authors

About Dr. Ernest Chan

Dr. Ernest P. Chan is the Managing Member of QTS Capital Management, LLC. He has worked for various investment banks (Morgan Stanley, Credit Suisse, Maple) and hedge funds (Mapleridge, Millennium Partners, MANE) since 1997. 

Dr. Chan received his PhD in Physics from Cornell University and was a member of IBM’s Human Language Technologies group before joining the financial industry. He was a co-founder and principal of EXP Capital Management, LLC, a Chicago-based investment firm.

Chan is also the author of:

  • Quantitative Trading: How to Build Your Own Algorithmic Trading Business (Wiley),
  • Algorithmic Trading: Winning Strategies and Their Rationale and
  • Machine Trading: Deploying Computer Algorithms to Conquer the Markets.

He is a popular financial blogger at http://epchan.blogspot.com.

About Varun Divakar

Varun Divakar is a member of the Quantra® Research and Development team at QuantInsti®, and is responsible for creating the content for trading strategies, using Quantitative and Machine Learning techniques. Prior to QuantInsti, Varun worked as an associate commodities trader managing international energy and softs markets at Futures First.

You can enroll for the online machine learning course on Quantra which covers classification algorithms, performance measures in machine learning, hyper-parameters, and building of supervised classifiers.


Disclaimer: All data and information provided in this article are for informational purposes only. QuantInsti® makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.