Autocorrelation in Trading: A Practical Python Approach to Analyzing Time Series Data

By Chainika Thakar

Autocorrelation is a statistical concept that measures the correlation between observations of a time series and its lagged values. It is commonly used in various fields, including trading for technical analysis, to identify patterns, trends, and relationships within data.

Autocorrelation helps analyse the dependence between past and present values and provides insights into the persistence or reversibility of data patterns. This helps the trader learn about the trend of stock prices.

All the concepts covered in this blog are taken from this Quantra learning track on Financial time series analysis for trading. You can take a Free Preview of the course by clicking on the green-coloured Free Preview button.

This blog covers:

What is autocorrelation?
Example of autocorrelation
Why is autocorrelation used?
How does autocorrelation work?
How to compute autocorrelation?
How to use autocorrelation with Python in trading?
Autocorrelation in Technical Analysis
Autocorrelation vs Partial autocorrelation
Pros of using autocorrelation
Cons of using autocorrelation
FAQs

What is autocorrelation?

Autocorrelation refers to the statistical correlation between observations of a time series with their past or future values. In simple terms, It quantifies the similarity or dependence between consecutive data points.

Example of autocorrelation

Let us assume a stock price time series where the closing prices for each day are recorded. Autocorrelation in this context would measure the relationship between the closing price on a given day and the closing prices on previous or future days.

Here, you can see the two observations in autocorrelation. It must be noted that autocorrelation can take lags of more days.

The observations are:

If the closing price today is positively correlated with the closing price of the previous day, it indicates positive autocorrelation and if not it is considered as a negative correlation.
The positive autocorrelation suggests short-term momentum or trend-following behaviour. Traders can utilise this autocorrelation to identify potential trading opportunities based on the persistence of price movements.

You can see how positive and negative autocorrelation are visualised below.

In the image above, the x-axis shows the period of time in years, months etc. Whereas, the y-axis shows the autocorrelation value which we will learn how to compute and utilise ahead in the blog.

Why is autocorrelation used?

Autocorrelation is a powerful tool that can enhance your trading skills by enabling you to understand market dynamics, make predictions, manage risk effectively, and develop smarter strategies for more successful trading decisions.

Let us check out some of the uses of autocorrelation in trading below:

Pattern identification: Autocorrelation enables you to uncover meaningful patterns and relationships in financial data. By comparing past and present market values, you can identify recurring trends and correlations.
Predicting the future price changes: By studying past autocorrelation patterns, you can gain insights into potential price changes and adjust your trading strategies accordingly.
Smart strategy development: Leveraging autocorrelation, you can fine-tune your trading strategies by identifying high or low correlation in the mentioned time periods. This knowledge helps you adapt your strategies to capitalise on highly correlated periods for trend-following approaches or low correlation periods for mean-reversion strategies.
Managing risk: Autocorrelation provides valuable insights into market volatility and stability, allowing you to better manage risk. By assessing the likelihood of price reversals or trend continuations, you can make more informed decisions regarding risk management.

Autocorrelation vs Partial autocorrelation

ACF considers both direct and indirect effects, while PACF concentrates exclusively on the direct effect of lagged prices on the current price. PACF helps enhance our understanding of the specific relationships within the time series.

Autocorrelation	Partial autocorrelation
Autocorrelation measures the correlation between a time series observation and its lagged values. It quantifies the linear relationship between an observation and its previous observations at different lags.	Partial autocorrelation measures the direct correlation between an observation and its lagged values, while removing the indirect correlation through intermediate lags.
ACF measures the overall correlation at each lag without considering the influence of intermediate lags. It helps identify the presence of significant patterns and trends in the data.	PACF helps identify the specific lag(s) that directly influence an observation without the influence of other lags. It provides insights into the unique contribution of each lag to the current observation.
ACF is useful for detecting seasonality, identifying the order of an autoregressive (AR) model, and determining the appropriate lag values for forecasting.	PACF is useful for determining the order of a moving average (MA) model, identifying the presence of significant lags, and building autoregressive integrated moving average (ARIMA) models.

How does autocorrelation work?

Let us see the working of autocorrelation in a step by step manner. By following the steps below, you can effectively apply autocorrelation analysis to gain insights into the relationship and patterns within your time series data, aiding in decision-making and strategy development.

Gather Time Series Data: Collect the time series data you want to analyse. This could be any sequence of observations recorded at regular intervals, such as stock prices, sales figures, etc.
Calculate the Lagged Values: For each data point in your time series, determine the lagged values by selecting the previous observations at specific time intervals. Common intervals include one day, one week, one month, or any relevant time frame based on your data.
Compute the Correlation Coefficients: Calculate the correlation coefficients between the current data point and its corresponding lagged values. The correlation coefficient measures the strength and direction of the relationship.

Common methods for calculating correlation include Pearson correlation or Spearman correlation, depending on the nature of your data.

Create an Autocorrelation Function (ACF) Plot: Plot the correlation coefficients on the y-axis and the lagged values on the x-axis. This visual representation is known as the Autocorrelation Function (ACF) plot. The ACF plot helps you visualise the correlation patterns and identify significant correlations at different lags.
Analyse the ACF Plot: Examine the ACF plot to interpret the autocorrelation patterns. Look for significant correlation coefficients at specific lag values. Positive autocorrelation suggests a similar pattern between consecutive data points, while negative autocorrelation indicates an inverse relationship. The magnitude of the correlation coefficient indicates the strength of the relationship.
Make Informed Decisions: Based on the autocorrelation analysis, make informed decisions regarding your trading or analysis. Positive autocorrelation may indicate a trend-following strategy, while negative autocorrelation may suggest a mean-reversion strategy. Adjust your trading or analytical approach accordingly.

How to compute autocorrelation?

To compute autocorrelation, you can follow these steps:

Preprocess the Data

Ensure that your time series data is properly organised and formatted. Remove any missing or irrelevant data points that might interfere with the analysis.

Calculate the Mean

Compute the mean of your time series data. This will be used as a reference point for measuring the correlation between the data points.

Calculate the Variance

Calculate the variance of your time series data. This will help in normalising the autocorrelation values.

Compute the Autocovariance

For each lag value, calculate the autocovariance between the original data points and their corresponding lagged values.

The autocovariance at lag "k" is given by the formula:

$$Autocovariance(k) = Σ[(X(t) - mean) * (X(t-k) - mean)] / n$$

Here, X(t) represents the original data point at time "t," X(t-k) represents the lagged value at time "t-k," mean is the mean of the data, and "n" is the total number of data points.

Compute the Autocorrelation Coefficient

Normalise the autocovariance values by dividing them by the variance. This yields the autocorrelation coefficient at lag "k." The autocorrelation coefficient at lag "k" is given by the formula:

$$Autocorrelation(k) = Autocovariance(k) / Variance$$

The autocorrelation coefficient ranges from -1 to 1, where -1 represents a perfect negative correlation, 1 represents a perfect positive correlation, and 0 represents no correlation.

Repeat for Different Lag Values

Compute the autocorrelation coefficient for different lag values of interest. This allows you to observe how the correlation changes over time.

Visualise the Autocorrelation

Plot the computed autocorrelation coefficients against the corresponding lag values. This graphical representation is known as the Autocorrelation Function (ACF) plot.

How to use autocorrelation with Python in trading?

Now, let us find out how to use autocorrelation with Python in trading below::

Step 1: Import necessary libraries

Import necessary libraries: The code begins by importing the required libraries, including pandas, matplotlib.pyplot, plot_acf from statsmodels.graphics.tsaplots, and yfinance for fetching financial data.

Step 2: Fetch data and create DataFrame

The code uses the yf.download function from yfinance to fetch the historical stock price data for AAPL (Apple Inc.) from September 30, 2022, to January 1, 2023. The data is stored in the AAPL_data DataFrame.

A new DataFrame called data is created, which includes the 'Close' prices from AAPL_data. This DataFrame will be used for further analysis.

Step 3: Calculate the 20-period autocorrelation

The code calculates the 20-period autocorrelation for the 'Close' prices by using the rolling function combined with the autocorr method from pandas. The autocorrelation values are stored in the 'autocorr_20' column of the data DataFrame.

Step 4: Generate long and short signals based on autocorrelation

Threshold-based signals are generated using the autocorrelation values and the direction of the market.

A threshold of 0.5 is set, and long (buy) signals are generated when the autocorrelation is greater than or equal to the threshold and the price has decreased.

Short (sell) signals are generated when the autocorrelation is greater than or equal to the threshold and the price has increased. These signals are stored in the 'long_signal' and 'short_signal' columns of the data DataFrame.

Step 5: Plot the time series and signals:

A figure is created using plt.subplots, and the 'Close' prices from the data DataFrame are plotted.

The long (buy) signals are indicated by an upward arrow ('^') at the corresponding dates, and the short (sell) signals are indicated by a downward arrow ('v').

Step 6: Visualise autocorrelation

Another figure is created to visualise the autocorrelation using the plot_acf function from statsmodels.graphics.tsaplots. The autocorrelation values for the 'Close' prices are plotted up to a lag of 20.

Labels, titles, and other customizations are applied to the plots using functions like plt.xlabel, plt.ylabel, and plt.title.

Finally, the plots are displayed using plt.show().

Output:

Following are the observations looking at the graph of autocorrelation above.

We do not look at autocorrelation at lag 0 as the current price is perfectly correlated with itself.
Values from lag 1 to 3 that lie outside the blue region are statistically significant.
Blue region is the confidence interval. The default set confidence interval in the statsmodels library is 95%. It can be interpreted as with 95% confidence level; you can say that from 4th value in the time series data, the price is not correlated with the current price.
A slow decay of autocorrelation shows that the prices are less correlated with the current prices as we move more in the past.

Let us also see how to use PACF for eliminating the indirect effects of lagged prices on the current price.

The python code for the same is as follows:

Output:

The above plot shows the Partial Autocorrelation Function (PACF) for the given time series data.

The blue part shows the confidence interval. The confidence interval helps visualise the direct correlation between the current observation and its lagged values while considering the influence of other lags.

Here are the observations:

Partial autocorrelation plot of prices has both negative and positive values. For example, at lag 1 it is positive, which means the previous month's price of the asset is positively correlated with current month's price.
Similarly, at lag 2 it is negative. It means the prices two months before are negatively correlated with the current month's price.

You can find out more about ACF and PACF in this notebook from the course on Financial Time Series Analysis in Trading.

Autocorrelation in Technical Analysis

Autocorrelation is a statistical concept widely used in technical analysis to study the relationship between past and current price movements in financial markets. By analysing autocorrelation in price data, technical analysts aim to identify patterns and trends that can help predict future price movements.

In technical analysis, autocorrelation is typically applied to the returns or changes in price rather than the absolute price levels. By examining the autocorrelation of returns, analysts can gain insights into the persistence or mean-reverting nature of price movements.

Here are a few key aspects of autocorrelation in technical analysis:

Trend Identification: Autocorrelation can be used to identify and confirm the presence of trends in price data. Positive autocorrelation suggests the presence of a trend, indicating that past price movements have a predictive effect on future prices. This information can assist in trend-following strategies.
Reversal Signals: Negative autocorrelation indicates a tendency for price reversals or mean reversion. It suggests that past price movements have an inverse relationship with future prices. Traders may use this information to identify potential turning points in the market and implement mean-reversion strategies.
Time Frame Selection: Autocorrelation analysis helps traders select an appropriate timeframe for their trading strategies. By observing the autocorrelation patterns at different lags, analysts can determine the optimal time horizon for capturing and exploiting price movements.
Trading System Development: Autocorrelation analysis can contribute to the development and refinement of trading systems. By incorporating autocorrelation measures into trading algorithms, traders can generate buy and sell signals based on the detected patterns and correlations in the price data.

Pros of using autocorrelation

Using autocorrelation in data analysis and trading offers several benefits:

Trend Identification: Autocorrelation helps identify trends and patterns in data. Positive autocorrelation indicates the presence of a trend, enabling traders to follow the trend and potentially capture profitable opportunities.
Predictive Power: Autocorrelation provides insights into the relationship between past and future data points. By analysing autocorrelation, traders can make informed predictions about future price movements and adjust their trading strategies accordingly.
Market Regime Detection: Autocorrelation helps identify different market regimes, such as trending or mean-reverting conditions. This information allows traders to adapt their strategies based on the prevailing market dynamics.
Strategy Development: Autocorrelation aids in the development and refinement of trading strategies. By incorporating autocorrelation analysis, traders can fine-tune their entry and exit signals, optimise risk management techniques, and enhance the overall effectiveness of their strategies.
Risk Management: Autocorrelation analysis provides insights into the volatility and stability of market conditions. Traders can use this information to assess the likelihood of price reversals or trend continuations, helping them make better risk management decisions.
Time Frame Selection: Autocorrelation assists in determining the optimal timeframe for trading strategies. By analysing autocorrelation at different lags, traders can select the appropriate time horizon for capturing price movements and align their trading strategies accordingly.
Confirmation of Signal Reliability: Autocorrelation can act as a confirmation tool for other technical indicators or signals. When multiple indicators or signals align with autocorrelation patterns, it strengthens the reliability of the trading signal. You can learn all about in this course on technical indicators python.

Integrating autocorrelation analysis into trading approaches can enhance decision-making and potentially improve trading outcomes.

Cons of using autocorrelation

While autocorrelation can be a valuable tool in data analysis and trading, there are some limitations and potential drawbacks to consider:

False Signals: Autocorrelation analysis can generate false signals, especially when applied to noisy or random data. In such cases, spurious autocorrelation may lead to incorrect predictions or trading decisions.
Lag Selection: Determining the appropriate lag length for autocorrelation analysis can be subjective and challenging. Choosing the wrong lag length can result in misinterpretation or ineffective analysis.
Overfitting: Autocorrelation analysis involves examining multiple lag values, which increases the risk of overfitting the data. Overfitting occurs when the analysis excessively fits the historical data but fails to generalise well to new data, leading to poor performance in real-world trading scenarios.
Lack of Causality: Autocorrelation only measures the statistical relationship between past and current data points. It does not provide insight into the underlying causes or drivers of price movements. Therefore, it is important to exercise caution when inferring causality solely based on autocorrelation results.
Ignoring Fundamental Factors: Autocorrelation analysis focuses solely on price data and may overlook fundamental factors that can influence market dynamics. It is crucial to consider other relevant factors, such as economic indicators, news events, or company fundamentals, to gain a comprehensive understanding of the market.

Nevertheless, the cons mentioned above can be taken care of if our approach is correct.

In order to overcome the limitations of using autocorrelation, consider the following brief strategies:

Validate results with other indicators or tests.
Explore autocorrelation patterns across multiple lag values.
Assess data stationarity and apply necessary transformations.
Evaluate model performance on out-of-sample data.
Incorporate additional factors beyond autocorrelation.
Adapt analysis to changing market conditions.
Implement proper risk management strategies.
Seek expertise and guidance when needed.

FAQs

Let us find out the answers to some of the frequently asked questions now.

Can machine learning models handle autocorrelation?

Yes, machine learning models can handle autocorrelation. Autocorrelation in the input features of a dataset does not pose a problem for most machine learning algorithms. However, autocorrelation in the residuals or target variable can violate the assumption of independence, which is important for some models like linear regression.

In such cases, it may be necessary to address autocorrelation by applying appropriate techniques such as including lagged variables, differencing, or using specialised models like autoregressive integrated moving average (ARIMA) or recurrent neural networks (RNNs).

How to detect autocorrelation?

Autocorrelation can be detected using various statistical methods and visualisations. Some common approaches include:

Autocorrelation Function (ACF) Plot: Plotting the autocorrelation coefficients against different lag values.
Partial Autocorrelation Function (PACF) Plot: Examining the partial autocorrelation coefficients to identify direct relationships between the current observation and its lagged values.
Statistical Tests: Conducting formal statistical tests such as the Durbin-Watson test or Ljung-Box test to assess the presence of autocorrelation.

Is autocorrelation good or bad?

The presence of autocorrelation is not inherently good or bad. It depends on the context and purpose of the analysis. Autocorrelation can provide valuable insights into the relationships and patterns within time series data, aiding in trend identification, prediction, and strategy development.

However, autocorrelation can also introduce challenges, such as violating independence assumptions in some models or leading to false signals. It is essential to understand and properly handle autocorrelation based on the specific analysis goals and requirements.

What is autocorrelation in regression?

In regression analysis, autocorrelation refers to the correlation between the residuals (or errors) of a regression model at different time points. Autocorrelation in the residuals indicates a systematic relationship or dependency between the error terms, violating the assumption of independence.

Autocorrelation in regression can lead to biassed coefficient estimates, incorrect standard errors, and unreliable statistical inference.

To address autocorrelation in regression, specialised techniques such as autoregressive integrated moving average (ARIMA) models or generalised least squares (GLS) regression can be employed.

Bibliography

Conclusion

Autocorrelation is a powerful statistical tool used to analyse the relationships and patterns within time series data. It has applications in trading, technical analysis, econometrics, and various other fields. By understanding autocorrelation, its computation, and how to interpret the results, analysts and traders can gain valuable insights into the dynamics of data and make more informed decisions.

However, it is important to consider the limitations and potential pitfalls associated with autocorrelation. Nevertheless, you can overcome the challenges by following some easy strategies as we discussed in the blog.

If you wish to learn more about autocorrelation in trading, you must explore our time series analysis course named "Financial Time Series Analysis for Trading". You can practise all the learnings through a capstone project, paper and live trade the strategies covered in the course.

Also, you will learn to apply time series analysis to data exhibiting characteristics like seasonality and non-constant volatility. Implement a trading strategy on stocks, ETFs, currency pairs, as well as VXX.

Disclaimer: All data and information provided in this article are for informational purposes only. QuantInsti^® makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.

EPAT Walkthrough & Live Q&A