
One of the key quantitative trading concepts is mean reversion. This term refers to a time series that displays a tendency to revert to a historical mean value. Such a time series can be exploited to generate trading strategies: we enter the market when a price series is far from its mean, under the expectation that the series will return to the mean, at which point we exit the market for a profit. Mean-reverting strategies are a large component of the activity of statistical arbitrage quant hedge funds. In later chapters we will create both intraday and interday strategies that exploit mean-reverting behaviour.

The basic idea when trying to ascertain if a time series is mean-reverting is to use a statistical test to see if it differs from the behaviour of a random walk. A random walk is a time series where the next directional movement is completely independent of any past movements - in essence the time series has no "memory" of where it has been. A mean-reverting time series, however, is different. The change in the value of the time series in the next time period is proportional to the current value. Specifically, it is proportional to the difference between the mean historical price and the current price.

Mathematically, such a (continuous) time series is referred to as an Ornstein-Uhlenbeck process. If we can show, statistically, that a price series behaves like an Ornstein-Uhlenbeck series then we can begin the process of forming a trading strategy around it. Thus the goal of this chapter is to outline the statistical tests necessary to identify mean reversion and then use Python libraries (in particular statsmodels) in order to implement these tests. In particular, we will study the concept of stationarity and how to test for it.

As stated above, a continuous mean-reverting time series can be represented by an Ornstein-Uhlenbeck stochastic differential equation:

$$ dx_t = \theta (\mu - x_t)\,dt + \sigma\,dW_t \tag{10.1} $$

Where θ is the rate of reversion to the mean, µ is the mean value of the process, σ is the volatility of the process (it scales the random noise term) and W_t is a Wiener process, or Brownian motion.

This equation essentially states that the change of the price series in the next continuous time period is proportional to the difference between the mean price and the current price, with the addition of Gaussian noise.
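To make equation 10.1 concrete, here is a minimal sketch that simulates an Ornstein-Uhlenbeck path with a simple Euler-Maruyama discretisation. The function name `simulate_ou`, the parameter values and the choice of discretisation scheme are illustrative assumptions and do not come from the original text.

```python
import numpy as np


def simulate_ou(theta, mu, sigma, x0, n_steps, dt=1.0, seed=0):
    """Euler-Maruyama sketch of dx_t = theta * (mu - x_t) * dt + sigma * dW_t."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    # Increments of a Wiener process are Gaussian with mean 0 and variance dt
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    for t in range(n_steps):
        # Drift pulls x towards mu; the noise term pushes it around
        x[t + 1] = x[t] + theta * (mu - x[t]) * dt + sigma * dw[t]
    return x


# Hypothetical parameters: start well above the mean and watch the pull-back
path = simulate_ou(theta=0.1, mu=100.0, sigma=1.0, x0=120.0, n_steps=2000)
print("first values:", path[:3])
print("average of the last 500 values:", path[-500:].mean())
```

With these assumed parameters the path starts at 120 and drifts back towards the mean of 100, after which it fluctuates around that level.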

We can use this equation to motivate the definition of the Augmented Dickey-Fuller Test, which we will now describe.

10.1.1 Augmented Dickey-Fuller (ADF) Test

The ADF test makes use of the fact that if a price series possesses mean reversion, then the next change in the price level will be proportional to the current price level. Mathematically, the ADF is based on the idea of testing for the presence of a unit root in an autoregressive time series sample.

We can consider a model for a time series, known as a linear lag model of order p. This model says that the change in the value of the time series is a linear function of a constant, the time itself and the previous p values of the time series, along with an error term:

$$ \Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_{p-1} \Delta y_{t-p+1} + \epsilon_t \tag{10.2} $$

Where α is a constant, β represents the coefficient of a temporal trend and Δy_t = y_t − y_{t−1}. The role of the ADF hypothesis test is to ascertain, statistically, whether γ = 0, which would indicate (with α = β = 0) that the process is a random walk and thus not mean reverting. Hence we are testing the null hypothesis that γ = 0.

If the hypothesis that γ = 0 can be rejected then the following movement of the price series is proportional to the current price and thus it is unlikely to be a random walk. This is what we mean by a "statistical test".

So how is the ADF test carried out?

• Calculate the test statistic, DFτ, which is used in the decision to reject the null hypothesis

• Use the distribution of the test statistic (calculated by Dickey and Fuller), along with the critical values, in order to decide whether to reject the null hypothesis

Let's begin by calculating the test statistic (DFτ). This is given by the sample proportionality constant γ̂ divided by the standard error of the sample proportionality constant:

$$ DF_\tau = \frac{\hat{\gamma}}{SE(\hat{\gamma})} \tag{10.3} $$

Now that we have the test statistic, we can use the distribution of the test statistic calculated by Dickey and Fuller to determine the rejection of the null hypothesis for any chosen percentage critical value. The test statistic is a negative number and thus in order to be significant beyond the critical values, the number must be smaller (i.e. more negative) than these values.
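As a rough sketch of equation 10.3, the following computes the test statistic by hand with NumPy for the simplest version of the lag model: a constant, no time trend (β = 0) and no lagged difference terms. The function name `df_tau` and both synthetic series are illustrative assumptions. The 5 percent critical value is borrowed from the Google output shown later in this section, so it is only approximate for these sample sizes.

```python
import numpy as np


def df_tau(y):
    """gamma_hat / SE(gamma_hat) for dy_t = alpha + gamma * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)  # dy_t = y_t - y_{t-1}
    # Regressors: a constant column and the lagged level y_{t-1}
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, _, _, _ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    dof = len(dy) - X.shape[1]
    s2 = resid @ resid / dof           # residual variance estimate
    cov = s2 * np.linalg.inv(X.T @ X)  # OLS covariance of the coefficients
    return coef[1] / np.sqrt(cov[1, 1])


rng = np.random.default_rng(42)
noise = rng.normal(size=1000)

# Synthetic random walk: each step is independent of the past
random_walk = np.cumsum(noise)

# Synthetic mean-reverting series: AR(1) with coefficient 0.5 (assumed value)
mean_reverting = np.empty(1000)
mean_reverting[0] = 0.0
for t in range(1, 1000):
    mean_reverting[t] = 0.5 * mean_reverting[t - 1] + noise[t]

CRITICAL_5_PERCENT = -2.8629  # approximate, taken from the Google output
for name, series in [("random walk", random_walk),
                     ("mean reverting", mean_reverting)]:
    stat = df_tau(series)
    print(name, round(stat, 2), "reject null:", stat < CRITICAL_5_PERCENT)
```

The mean-reverting series should give a strongly negative statistic, well beyond the critical value. The statistic for the random walk follows the Dickey-Fuller distribution, so it will usually, though not always, fail to reject the null.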

A key practical issue for traders is that any constant long-term drift in a price is of a much smaller magnitude than any short-term fluctuations and so the drift is often assumed to be zero (β = 0) for the linear lag model described above.

Since we are considering a lag model of order p, we need to actually set p to a particular value. It is usually sufficient, for trading research, to set p = 1 to allow us to reject the null hypothesis. However, note that this technically introduces a parameter into a trading model based on the ADF.

To calculate the Augmented Dickey-Fuller test we can make use of the pandas and statsmodels libraries. The former provides us with a straightforward method of obtaining Open-High-Low-Close-Volume (OHLCV) data from Yahoo Finance, while the latter wraps the ADF test in an easy-to-call function. This saves us from having to calculate the test statistic manually.

We will carry out the ADF test on a sample price series of Google stock, from 1st January 2000 to 1st January 2013.

Here is the Python code to carry out the test:

# Import the Time Series library
import statsmodels.tsa.stattools as ts

# Import Datetime and the Pandas DataReader
# (DataReader now lives in the separate pandas-datareader package;
# its "yahoo" source may no longer work, depending on the installed version)
from datetime import datetime
from pandas_datareader.data import DataReader

# Download the Google OHLCV data from 1/1/2000 to 1/1/2013
goog = DataReader("GOOG", "yahoo", datetime(2000, 1, 1), datetime(2013, 1, 1))

# Output the results of the Augmented Dickey-Fuller test for Google
# with a lag order value of 1
ts.adfuller(goog['Adj Close'], 1)

Here is the output of the Augmented Dickey-Fuller test for Google over the period. The first value is the calculated test statistic, while the second value is the p-value. The third is the number of lags used and the fourth is the number of data points in the sample. The fifth value, the dictionary, contains the critical values of the test statistic at the 1, 5 and 10 percent levels. The final value is the information criterion that statsmodels maximises when it selects the lag automatically.

(-2.1900105430326064, 0.20989101040060731, 0, 2106, {'1%': -3.4334588739173006, '10%': -2.5675011176676956, '5%': -2.8629133710702983}, 15436.871010333041)

Since the calculated value of the test statistic is larger (less negative) than all of the critical values at the 1, 5 and 10 percent levels, we cannot reject the null hypothesis of γ = 0 and thus we are unlikely to have found a mean reverting time series. This is in line with our intuition, as most equities behave akin to Geometric Brownian Motion (GBM), i.e. a random walk.
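The comparison just described can be written as a small helper. This is only a sketch: the function name `rejects_unit_root` is a hypothetical choice, and it assumes the result tuple has the same layout as the Google output above (test statistic first, dictionary of critical values fifth).

```python
def rejects_unit_root(result, level="5%"):
    """True if the test statistic is more negative than the critical value."""
    test_stat = result[0]
    critical_values = result[4]
    return test_stat < critical_values[level]


# The Google result quoted in the text, copied verbatim
goog_result = (-2.1900105430326064, 0.20989101040060731, 0, 2106,
               {"1%": -3.4334588739173006, "10%": -2.5675011176676956,
                "5%": -2.8629133710702983}, 15436.871010333041)

for level in ("1%", "5%", "10%"):
    print(level, rejects_unit_root(goog_result, level))
```

For the Google series this prints False at all three levels, matching the conclusion that the null hypothesis cannot be rejected.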

This concludes how we utilise the ADF test. However, there are alternative methods for detecting mean-reversion, particularly via the concept of stationarity, which we will now discuss.
