Friday, June 17, 2011

When cointegration of a pair breaks down

I have written a lot in the past about the cointegration of ETF pairs, and how this condition can lead to profitable pairs trading. However, as every investment advisor could have told you, past cointegration is no guarantee of future cointegration. Often, cointegration for a pair breaks down for an extended period, maybe as long as half a year or more. Naturally, trading this pair during this period is a losing proposition, but abandoning such a pair completely is also unsatisfactory, since cointegration often mysteriously returns after a while.

A case in point is the ETF pair GLD-GDX. When I first tested it in 2006, it was an excellent candidate for pair trading, and I not only traded it in my personal portfolio, but we traded it in our fund too. Unfortunately, it went haywire in 2008. We promptly abandoned it, only to see the strategy recover sharply in 2009.

So the big question is: how do we know whether the loss of cointegration is temporary, and how do we know when to resume trading a pair?

To answer the first question, it is often necessary to go beyond the technicals, and delve into the fundamentals of the pair. Take GLD-GDX as the example. When I taught my pairs trading workshop in South Africa, several portfolio managers in attendance told me that there are 2 reasons why the gold spot price diverged from gold miners' stock prices. Firstly, due to the sharp increase in oil prices during the first half of 2008, it cost the gold miners a lot more in energy to extract the gold from the ground, hence the gold miners' income lagged behind the rise in gold prices. Secondly, many gold miners hedge their exposure to fluctuating gold prices with derivatives. Hence when gold prices rise beyond a certain limit, the gold miners cease to benefit from this rise. Recently, The Economist magazine published an article that essentially confirms this view. But further confirmation can be gained by introducing the oil (futures) price into the cointegration equation. If you do that, and if you trade this triplet of GLD-GDX-USO, you will find that it is profitable throughout the entire period from 2006-2010. If you find trading a triplet too complicated, you can at least backtest a trading filter such that you will cease to trade GLD-GDX whenever USO goes beyond (above, and maybe below too) a certain band. If you have done all these backtests, you will have a plan in place to tell you when to resume trading this pair. But even if you haven't done this backtest, and you find that you need to stop trading a pair because of accumulating losses, you should at least continue paper trading it to see when it is turning around!
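
The USO trading filter mentioned above could look roughly like the following Python sketch. The function names and the band limits are hypothetical placeholders, not values from the post; a usable band would have to come out of your own backtest.

```python
def pair_trading_allowed(uso_price, lower, upper):
    """Return True when USO sits inside the band, i.e. GLD-GDX may be traded."""
    return lower <= uso_price <= upper


def filter_signals(signals, uso_prices, lower, upper):
    """Zero out pair signals on days when USO is outside [lower, upper].

    signals    -- desired pair positions (+1 long spread, -1 short spread, 0 flat)
    uso_prices -- USO prices aligned day by day with `signals`
    lower/upper -- hypothetical band limits, to be chosen by backtesting
    """
    if len(signals) != len(uso_prices):
        raise ValueError("signals and uso_prices must have the same length")
    return [
        s if pair_trading_allowed(p, lower, upper) else 0
        for s, p in zip(signals, uso_prices)
    ]
```

Applying the filter to the signal series, rather than inside the signal logic, makes it easy to backtest the same pair strategy with and without the filter.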

(By the way, if you think trading ETF pairs offers too low returns due to the low leverage allowed, consider the single stock futures on ETFs trading on the OneChicago exchange. Certainly the future on GDX is available there, while you might just trade the futures GC and CL directly on CME. There is, of course, the usual caveat that applies to futures pairs trading: the switch from contango to backwardation and vice versa can ruin many a pairs-trading strategy, even if the spot prices remain cointegrated. But that's a story for another time.)

99 comments:

sjev said...

I've done a quick analysis of a GLD-GDX-USO triplet using PCA. Just can't seem to find a portfolio that is more stable than GLD-GDX for the whole 2007-now period.
Could you give an example of a triplet ratio that holds through the 2008 market?

Ernie Chan said...

sjev,
Try 0.5350*GLD-0.7387*GDX+0.0293*USO.
GLD, GDX, USO refer to their prices, not returns. This triplet should be stationary in the period I mentioned. By the way, I am not sure the PCA is the proper way to analyze cointegration.
Ernie
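
For readers who want to plot this combination, here is a minimal Python sketch that applies the quoted weights to price series. The weights are the ones from the comment above; the function name and the assumption that the three price lists are already aligned by date are mine.

```python
# Weights quoted in the comment above; they apply to prices, not returns.
WEIGHTS = {"GLD": 0.5350, "GDX": -0.7387, "USO": 0.0293}


def triplet_spread(gld, gdx, uso, weights=WEIGHTS):
    """Return the weighted price combination for each date.

    gld, gdx, uso -- price lists of equal length, aligned by date
                     (the alignment is the caller's responsibility).
    """
    if not (len(gld) == len(gdx) == len(uso)):
        raise ValueError("price series must have the same length")
    return [
        weights["GLD"] * g + weights["GDX"] * x + weights["USO"] * u
        for g, x, u in zip(gld, gdx, uso)
    ]
```

If the triplet cointegrates over the chosen period, the resulting series should look stationary when plotted.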

Anonymous said...

Ernie, would the pair return to its previous equilibrium? In other words, would the trade break even after a number of months or years?

Thanks

Ernie Chan said...

Anon,
The pair would not necessarily return to its previous equilibrium, but the triplet would, as evidenced by its cointegration property throughout the period.

However, the profitability of just trading the pair will still return if you use moving averages etc. instead of static parameters.

Ernie

Jozef Rudy said...

Ernie, I suppose you ran OLS regression on levels, with 3 variables.

Or did you use Johansen test?

Ernie Chan said...

Jozef,
I used Johansen tests to get the hedge coefficients (via the eigenvectors). But you can also use regression of one variable (level) against the other 2.
Ernie

eduardo said...

hello ernie,

i work in the industry and have your book and some others about pairs trading, but i never seem to find any info about coint triplets.

could you please share some of your knowledge about where can i find some papers, books, or anything about this topic? (since you dont have any seminars scheduled for us here in Brasil)

thanks

eduardo

Ernie Chan said...

Hi Eduardo,
The concept of cointegration has always been applied to more than 2 time series in econometrics. The Johansen test, for example, is designed for this situation. You can read the documentation at spatial-econometrics.com to see how this is used for multiple time series.
I have not seen this explored in the trading literature though.
Ernie

Dan Rico said...

Below a link to a multivariate pairs trading strategy:

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=952782

@sjev - do you use PCA in the traditional statistical approach (arbitrage pricing theory), i.e. decompose the covariance matrix (correlation vs. cointegration) of security returns and identify the factors? If yes, then the assumption is that the factors are constant, which doesn't hold true for the mean-reverting spread.

@Ernie - is it possible that the cointegration break is due to the fact that gold's ability as a hedging instrument has changed? At least for the past 12 months the price of gold and the S&P 500 moved in tandem.

My $.02 said...

Carol Alexander is a very good source for cointegration.

Ernie Chan said...

Dan,
I don't think the cointegration break was due to a change in gold's hedging ability, because the break only occurred in 2008. I think the reasons are more likely the 2 that I mentioned.
Ernie

kongming said...

Hi Ernie,

Can you kindly advise me on the tax payable by hedge funds on their trading gains? Thanks.

Ben

Ernie Chan said...

Hi Ben,
First, please let me say that I am not a CPA and am not qualified to give tax advice.

Just to share my own experience though, most hedge funds are organized as limited partnerships, and thus all taxable profits pass through to the limited partners. So the limited partners have to pay tax, but not the fund itself.

Ernie

Anonymous said...

Hi Ernie, some suggest using the Hurst exponent to trade mean reversion. What is your opinion?

http://10outof10.blogspot.com/2008/03/introduction-to-hurst-exponent.html

Thx

Ernie Chan said...

Hi Anon,
I believe Hurst exponent can in fact be a reasonable measure of whether a time series is mean-reverting.
Ernie
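
As a rough illustration of the measure being discussed, the sketch below estimates the Hurst exponent H from the scaling of mean squared increments, <|z(t+tau) - z(t)|^2> ~ tau^(2H). It is one common estimator, not necessarily the one used in the linked article; the function names and the default `max_lag` are assumptions. H near 0.5 suggests a random walk, H below 0.5 mean reversion, H above 0.5 trending. The series is usually taken to be log prices.

```python
import math


def _ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def hurst_exponent(series, max_lag=20):
    """Estimate H by regressing log mean-squared increment on log lag."""
    if len(series) <= max_lag + 1:
        raise ValueError("series is too short for the requested max_lag")
    log_lags, log_msd = [], []
    for lag in range(2, max_lag + 1):
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        msd = sum(d * d for d in diffs) / len(diffs)
        # math.log raises ValueError if msd is 0 (e.g. a constant series).
        log_lags.append(math.log(lag))
        log_msd.append(math.log(msd))
    # The fitted slope estimates 2H.
    return _ols_slope(log_lags, log_msd) / 2.0
```

A point estimate like this says nothing about statistical significance, so it is best read alongside a formal test.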

kenny said...

Hi Ernie

Would you mind sharing your experience or articles on how to apply the Hurst exponent to trade mean reversion in a proper way? I applied it to FX trading in the past but not so successfully. I don't feel it is as simple as just looking at whether Hurst < 0.5.

Thx

Ernie Chan said...

Hi Kenny,
I believe the usage of the Hurst exponent is the same as using the ADF test to check for stationarity, or calculating the half-life of mean reversion using the Ornstein-Uhlenbeck formula. Even after any of these criteria indicates that the time series is mean-reverting, you still have to find a suitable trading strategy to take advantage of the mean reversion. E.g., if you trade it using Bollinger bands, then you still need to decide what lookback to use, and whether you should enter at 1, 2 or 3 standard deviations.
Ernie
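
The half-life calculation mentioned above can be sketched as follows, assuming the usual discrete approximation of the Ornstein-Uhlenbeck process: regress the daily change y(t) - y(t-1) on the lagged level y(t-1) to get a coefficient lambda, then take half-life = -ln(2) / lambda. The function name is hypothetical, and returning infinity for a non-negative lambda is my own convention.

```python
import math


def half_life(series):
    """Half-life of mean reversion, in the same time units as the series."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    lagged = series[:-1]
    deltas = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mx = sum(lagged) / n
    my = sum(deltas) / n
    sxx = sum((x - mx) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("series is constant; regression is undefined")
    sxy = sum((x - mx) * (d - my) for x, d in zip(lagged, deltas))
    lam = sxy / sxx  # OLS slope of the change on the lagged level
    if lam >= 0:
        return math.inf  # no mean reversion detected
    return -math.log(2) / lam
```

For example, a series that closes 10% of its distance to the mean each day has lambda = -0.1 and a half-life of about 6.9 days.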

ww said...

Ernie: you mentioned using moving average instead of static parameters. could you say more on how to dynamically update the parameters? is there any good reference on that?

Ernie Chan said...

ww,
By "moving averages" I just mean using the usual Bollinger bands (which are composed of moving averages and standard deviations) method to determine entries and exits into a pair, instead of entering and exiting at a fixed spread value.
Ernie
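
A minimal Python sketch of such a Bollinger-band rule on a spread series is shown below. The function name, the default thresholds, and the choice to compute the moving mean and standard deviation from the `lookback` points before the current one (to avoid look-ahead) are all assumptions for illustration.

```python
import statistics


def bollinger_positions(spread, lookback, entry_z=1.0, exit_z=0.0):
    """Return a position per bar: +1 long the spread, -1 short, 0 flat."""
    if lookback < 2:
        raise ValueError("lookback must be at least 2")
    positions = []
    pos = 0
    for t in range(len(spread)):
        if t < lookback:
            positions.append(0)  # not enough history yet
            continue
        window = spread[t - lookback:t]  # data strictly before bar t
        mean = statistics.mean(window)
        sd = statistics.stdev(window)
        if sd == 0:
            positions.append(pos)  # z-score undefined; hold current position
            continue
        z = (spread[t] - mean) / sd
        if pos == 0:
            if z <= -entry_z:
                pos = 1   # spread is cheap relative to its moving average
            elif z >= entry_z:
                pos = -1  # spread is rich relative to its moving average
        elif pos == 1 and z >= -exit_z:
            pos = 0
        elif pos == -1 and z <= exit_z:
            pos = 0
        positions.append(pos)
    return positions
```

Because the mean and standard deviation move with the window, the entry and exit levels adapt to the spread instead of sitting at fixed values.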

Anonymous said...

Hi Ernie,

How would you optimise for a portfolio of pairs? If i use the mean-variance approach of Markowitz then would i have to constrain the weights of the portfolio to be >= 0?

Thanks!

Ernie Chan said...

Hi Anon,
You can treat each pair as an asset with its own returns and stddev, which you can long only.

(By Long, we mean long that strategy, not that we are long one side of the pair and short the other side.)

Ernie

edba said...

Hello Ernie,
I tried 0.5350*GLD-0.7387*GDX+0.0293*USO and it seems that this triplet is trend stationary (ie w/ a deterministic trend) and then cannot be used for a cointegration strategy.
I ran a Johansen test for the 2006-01-01 to 2011-01-01 period and I found that a stationary triplet is: (1; -4.21; 0.16).
Do you agree with me?

Anonymous said...

Hi Ernie,

So what would you do if you were trading a pair and it was no longer cointegrated (say by the ADF test)?
1.Immediately get out of the pair
2.Get out normally (take profit or stop loss) and do not re-enter trade
3.Continue trading pair normally for a set amount of time and stop
4.Something else?

Thanks

Ernie Chan said...

Hi edba,
I used data from 20060523-20100521 to compute the hedge ratios for GLD-GDX-USO. If you use a different data period, you could certainly get different ratios.
Ernie

Ernie Chan said...

Anon,
If you are trading a "static" mean-reversion strategy, then yes, loss of cointegration compels you to liquidate positions immediately.
"Static" means you don't continuously update the mean and stddev of the spread.
However, short-term mean-reversion does not require cointegration/stationarity, so you can continue trading as long as the time series mean-reverts.
Ernie

edba said...

Hi Ernie,
let me ask my question again:
The triplet that you suggest (0.5350*GLD-0.7387*GDX+0.0293*USO) has a strong trend (ie is not stationary). How do you use it for cointegration?

Ernie Chan said...

edba,
My hedge ratios are given by the Johansen tests, which find the 3 instruments to cointegrate. If you find them to have strong trend in another data period, you will certainly need to use Johansen to compute a new set of hedge ratios. In the period I tested, a mean-reversion strategy is quite profitable with this triplet.
Ernie

Fuzhi Cheng said...

Hi, Ernie:

What do you think of the possibility of pairs trading a commodity ETF with its underlying futures contracts (contract rolling issues aside)? Right now your spread suggestions are all ETF/ETF combination.

Thanks.

Fuzhi

Ernie Chan said...

Hi Fuzhi,
If the commodity ETF holds futures also, then there is no issue, and it should cointegrate very well with the underlying futures.

One possible issue is that the ETF may hold contracts of different months. In that case, you have to hedge with contracts from those months as well.
Ernie

Fuzhi Cheng said...

Thanks a lot Ernie.
Will you by any chance offer a workshop in the New York area this year?

Fuzhi

Ernie Chan said...

Fuzhi,
There may be a plan to offer this in New York in January 2012. Please check back in a few months.
Ernie

basant said...

Hi Ernie,

I have been trying to trade pairs using cointegration. I generally use 2 yrs rolling data to test cointegration and a shorter period to find the static hedge ratio. Once I find cointegration and take the trade, I keep on calculating the spread every day by rolling the shorter period to check for mean reversion. Though the spread mean reverts nicely, the trades are not profitable. Could you please advise what's wrong with my approach.

Ernie Chan said...

Hi basant,
It sounds like your lookback period for your moving averages and/or standard deviations may be too short.
Ernie

basant said...

Hi Ernie,

The lookback period is at least sixty days. Do you advise a longer lookback period?

I think the problem is because I calculate the new beta and new spread as I roll the lookback period and take sigma of the new spread. Since the original portfolio is formed using different beta and spread is with regard different beta, the trades are not profitable with respect to rolling mean. In this case I have two choice 1) Rebalance the portfolio with respect to new beta which leads to higher transaction cost. 2) Keep the beta constant and take mean and sigma of the rolling spread (which is same as bollinger band).

Could you please let me know your views?

Ernie Chan said...

Hi basant,
I like method 2. But in any case, your lookback should be set by halflife calculations, as recommended in my book. A long lookback will prevent the problem you are experiencing.
Ernie

basant said...

Thanks Ernie for your comments. I looked in your book for the lookback period and didn't find any reference to the relation between lookback period and half life. Could you please throw some more light on this?

Ernie Chan said...

Hi Basant,
Yes, I didn't mention this relationship in my book (I discussed this in my workshops.) But you can simply try setting your lookback to (or greater than) the halflife of mean reversion. This usually works out pretty well.
Ernie

basant said...

Thanks a lot Ernie for your valuable comments.

Anonymous said...

Hi Ernie,

1)Does the optimal regression period of a pair depend on the half life? I mean is it wise to take beta of a period where the half life is shortest?
2)Should the prices of stocks be detrended before testing for cointegration?

Ernie Chan said...

Hi Anon,
Yes, generally you should set the period of regression to the halflife, unless the halflife is too short to give a meaningful fit.

You should not detrend stocks beforehand, otherwise the cointegration test has no meaning.
Ernie

Anonymous said...

Hi Ernie,

Could you elaborate on the following?

The half life is calculated after running the regression. Therefore how does one set the period of regression prior to calculating halflife?

How short is a short halflife? Sometime I get half life of 8 days in a regression period of 2 yrs. will this be considered as a short half life?

Ernie Chan said...

Hi Anon,
That is a good question and a good example of the situation one faces in numerical methods quite often.

One typically initiates the iterations by guessing an approximate lookback, and uses this for regression and halflife computations. Then you set the new lookback to equal this halflife, and repeat the process. If this process converges, i.e. if the resulting halflife ceases to change much with each iteration, you have found the correct lookback to use.

I think a linear regression fit should have at least 10 data points.

Ernie
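
The iteration described above can be sketched as a small fixed-point loop. This is only an illustration: `estimate_half_life` is a hypothetical callable that you supply (it should run the regression with the given lookback and return the resulting half-life), and the 10-point floor follows the rule of thumb in the comment.

```python
import math


def converge_lookback(estimate_half_life, initial_lookback,
                      min_lookback=10, max_iter=20):
    """Iterate lookback -> half-life -> lookback until it stops changing.

    Returns (lookback, converged). `converged` is False when the half-life
    falls below `min_lookback`, is not finite, or the loop runs out of
    iterations; in those cases the lookback should be chosen another way.
    """
    lookback = initial_lookback
    for _ in range(max_iter):
        hl = estimate_half_life(lookback)
        if not math.isfinite(hl):
            return lookback, False
        if hl < min_lookback:
            # Too few points for a meaningful fit: report the floor, unconverged.
            return min_lookback, False
        new_lookback = int(round(hl))
        if new_lookback == lookback:
            return lookback, True
        lookback = new_lookback
    return lookback, False
```

When the second element is False, a fallback is to pick the lookback by in-sample optimization and check it out-of-sample.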

Anonymous said...

Thanks Ernie for the clarification.

Does that mean that in the process the beta used in trading is different from the cointegration beta?

Ernie Chan said...

Anon,
Yes, typically the beta used in trading uses a shorter lookback than that used in cointegration test.
Ernie

Anonymous said...

Hi Ernie,

I am trying to replicate your suggestions but probably making some mistakes and not getting the expected results.

1)For example I started with a lookback of 100 days and got a half life of 15 days. Next with a lookback of 15 days I get a half life of 5 days and with lookback of 5 days HL comes to 2 days and so on. The convergence doesn't happen.

2) When I calculate beta with lookback of 10 days for the above data, the sigma of the spread is quite smaller than the 100 days sigma. In the out of sample spread crosses even 6 times the 10 day lookback sigma where as it remains with in 3 times the 100 day sigma. Which sigma should I use?

Ernie Chan said...

Hi Anon,
Did you really use 5 days for linear regression? That's too short. In any case, I think your HL calculations may not be correct. If you can't find the correct lookback this way, just try to find the optimal lookback in-sample, and test it out-of-sample. The same goes for your sigma computations.
Ernie

Henri said...

Hi Ernie,
Thanks much for the valuable information shared on your blog.

It seems that a lookback period of 2 years for the cointegration test is quite usual. However, do you believe that a shorter period might be useful to detect when cointegration breaks down (in other words, could a cointegration test on a 6 month period detect that a pair is no longer stationary when the same test on 2 years would not?). Thanks.
-Henri

Ernie Chan said...

Hi Henri,
It is actually quite hard to detect the breakdown of cointegration except in hindsight, maybe a year afterwards. This is because any drawdown in a pairs strategy can be interpreted as a breakdown. Only when the drawdown lasts for, say, a year can we say that the cointegration is really gone.
The best we can do is to examine past periods of such temporary breakdowns and identify the fundamental reason/variable for it, and then add an additional cointegrating instrument that hopefully will take into account the extra variable.

Ernie

Andy Webb said...

Apologies if this is slightly off topic - it's less about when cointegration breaks down and more about the reliability/stability of a particular cointegration test.

Using EXACTLY the same data set, I'm seeing significantly different results between sequential tests on that same data when using the Spatial Econometrics cadf.m MATLAB function. Anyone else experienced this? Thx Andy

Ernie Chan said...

Hi Andy,
By sequential tests, do you mean running cadf on, say, the 2008 data of a price series over and over again and getting different results each time?
Ernie

Anonymous said...

Hi Ernie,

Let's assume I have a pair that cointegrates with a 95-99 % probability over a 2 year, 3 year and 4 year Lookback data period. But before I enter the trade i test the cointegration on a 1 year Lookback data period and the probability coming out of my adf test is high (eg 30%) . Should I assume the pair no longer cointegrate and not trade the pair OR ignore the 1 year period lookback test and assume they cointegrate??

Thanks
Ca

Jeet said...

I am a student and working on my pairs trading project.
1. I found that a constant beta gives a very slow mean reversion process. The spread sometimes takes 6-9 months to revert to its mean. I think that is not desirable. In this case, how can I make changes such that it reverts quickly?

2. If I change beta every day, portfolio rebalancing comes into the picture. Every day the spread changes with the new beta. If I get an exit signal with a new beta which is, say, less (greater) than the beta when entering the position, I would be left with (fall short of) short securities to square off my positions. How can I overcome this problem?

Thanks

Ernie Chan said...

Ca,
If you find that the pair is not cointegrating in the last 1 year, try to find out the fundamental economic reason/condition why they might diverge. But in any case, I would not trade it until I believe that this condition is over.
Ernie

Ernie Chan said...

Jeet,
Indeed I recommend updating your beta daily with a short lookback period. However, it doesn't necessarily mean you have to adjust your existing positions given the changed beta. You can just use it to generate an exit signal so that you are either in the current position, or exit both sides completely.
Ernie

Jeet said...

Hi Mr. Chan,
Thanks for replying to my previous post. I have regressed S1 on S2 with 1000 points and found this pair is cointegrated by using the ADF test.
Now that I know the pair is cointegrated, I tried to trade with this coefficient and found that sometimes the trade length is very long, like 4-5 months. It is not desirable.

So I decided to find the coefficient dynamically. For the cointegrated pair, I regressed the first 20 values S1[1:20] vs S2[1:20] and found one coefficient, the next day S1[2:21] ~ S2[2:21] and found another coefficient, which I used to calculate the spread. This reduces the trade length dramatically, but the problem is that the spread is calculated with a different coefficient every day while actual trading does not involve rebalancing. The strategy makes losses.

Ernie Chan said...

Jeet,
Did your strategy lose in backtest, or in live trading, based on the method you described?
Ernie

Jeet said...

Hi Ernie,
In some time periods it made losses and in some time periods it gave profits of around 3-4% per annum.

Need your suggestions.

Ernie Chan said...

Jeet,
I suggest you decrease the lookback period in order to reduce the holding period. This usually increases the Sharpe ratio as well.
Ernie

Jeet said...

You mean to say I should use let us say 10 days look back period to calculate coefficients dynamically. S1[1:10]~s2[1:10], next
S1[2:11]~ s2[2:11], then
s1[3:12]~s2[3:12] and so on.

Ernie Chan said...

Jeet,
That's correct.
Ernie
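
The rolling regression described in this exchange can be sketched in plain Python as below. The function name is hypothetical; the hedge ratio here is the OLS slope of S1 on S2 (with an intercept) over each trailing window, which is one reasonable reading of "S1[1:10] ~ S2[1:10]".

```python
def rolling_hedge_ratio(s1, s2, lookback):
    """Return one OLS slope per trailing window of `lookback` points.

    The first value uses points 0..lookback-1, the next one 1..lookback,
    and so on, so the result has len(s1) - lookback + 1 entries.
    """
    if len(s1) != len(s2):
        raise ValueError("s1 and s2 must have the same length")
    if lookback < 2:
        raise ValueError("lookback must be at least 2")
    betas = []
    for end in range(lookback, len(s1) + 1):
        y = s1[end - lookback:end]
        x = s2[end - lookback:end]
        mx = sum(x) / lookback
        my = sum(y) / lookback
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        # A flat S2 window gives an undefined slope; mark it as NaN.
        betas.append(sxy / sxx if sxx else float("nan"))
    return betas
```

Each beta uses only data up to its own window end, so it can be used for that day's signal without look-ahead.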

gcn said...

Ernie, I revisit this post and read the latest conversation between you and @Jeet. I wonder is it meaningful to update hedge coefficient daily? and using such short period (10 days in Jeet's case)?

I think hedge coefficient needs to be updated when fundamental changes but pairs are still cointegrated.

Ernie Chan said...

gcn,
For pairs that are truly cointegrated, and if you don't mind holding for a long enough period for the spread to mean-revert, you can certainly use a static hedge ratio, updated only when fundamental changes occur. But for those traders who desire short holding periods, and particularly when the pair does not really cointegrate but nevertheless mean-reverts on a short time scale, a short lookback enables us to take advantage of this profit opportunity and exit quickly.
Ernie

gcn said...

But what if the half life is short, for example 10 days or shorter? It is less meaningful to run a regression on such a short series.

Ernie Chan said...

gcn,
You can certainly regress on 10 data points. We are used to large uncertainties in finance anyway, so just because one has small error bars doesn't mean the prediction is better.
Ernie

Danie Pretorius said...

Hi Ernie,

In testing for cointegration in the pair, how do you decide (a) whether to use an Augmented DF test, or whether a simple Dickey-Fuller test is enough; and (b) in an ADF, what lag to use (especially if different lag lengths result in different conclusions)?

Thanks

Ernie Chan said...

Hi Danie,
I always use ADF test, because of certain defects in the DF test. And I always use lag=1 in an ADF test, because it is usually the simplest model that can reject the null hypothesis.
Ernie

Danie Pretorius said...

Thanks for the response Ernie.

Conceptually, if the ADF test with 1-lag suggests the series is stationary, but with higher lag orders the null hypothesis cannot be rejected, how confident would you be in trading the pair?

Many thanks.

Ernie Chan said...

Danie,
I actually have never seen such a situation, and am not even sure that it is theoretically possible, since higher lags include the possibility of lag=1. Have you found such an empirical counter-example?
Ernie

Danie Pretorius said...

Thanks Ernie,

It's possible I'm making a mistake in the model specification (I'm working in Excel to keep it simple), but when testing two South African stocks over a 120 day period, the test statistic increases with the lag (i.e., lag=1 gives -3.98, lag=2 gives -3.10, lag=3 gives -2.95). The series being tested is the difference in log prices (i.e. Y = ln(price A) - 0.87*ln(price B))

Any thoughts are welcome!
Thanks, Danie

Ernie Chan said...

Danie,
As long as you have found one lag where the null hypothesis can be rejected, then the series is stationary.
Ernie

Danie Pretorius said...

Thanks Ernie, I appreciate the help

archlight said...

Ernie,
I did calculation on OIH-RKH-RTH in your book example-6-3 but with past one year data. F I got is
[ -8.3926516 , -0.06206326, 38.20104829]. does it mean pair no longer co-integrates and RKH looks insignificant in portfolio.

By the way, I am using pandas python module which is very handy to replicate examples in your book. maybe you like to recommend other readers?

Regards

Ernie Chan said...

archlight,
Example 6.3 is only about optimal allocation of capital based on Kelly formula. It has nothing to do with cointegration of the 3 ETF's. Your numbers merely mean we should short OIH and long RTH.

Thanks for mentioning Python. Yes, I have heard good things about it.
Ernie

Anonymous said...

Hi Ernie,

I have enjoyed reading through the above comments. Very interesting and informative. I have a question regarding triplets. Once you get an entry signal from your model how do you get into the trade? One option is to cross the spread on all 3 legs (which is very expensive) ... another is to use an autospreader to dynamically adjust limit orders to try and leg into the trade as much as possible before crossing the spread on the leftovers. The danger of the latter approach is that you don't get into the trade as the spread reverts.. so there is an opportunity cost with being passive.

What is your opinion on this?

Regards, Rob.

Ernie Chan said...

Hi Anon,
We can place limit orders for the least liquid component. Once it is executed, then use market orders for the remaining two.

Ernie

Anonymous said...

Hi Ernie,

Yes this seems to be a standard approach.

I carry out the following regression on daily prices over a 2 year period:

asset1_t = drift + beta*asset2_t

The residuals, denoted a_t, are given by:

a_t = asset1_t - drift - beta*asset2_t

I find that the residuals are stationary. Thus, using a ratio of 1*asset1 and beta*asset2 I trade the spread.

From reading the comments above it seems that a common approach to trading the spread is to not use this ratio but instead estimate a linear regression over a shorter timeframe on a rolling basis... so they are only using the longer window to identifying a pair that is cointegrated .... Can you please clarify the logic in this? To me it seems that the cointegrating relationship holds for the original ratio (1, beta) .... but not necessarily for the rolling estimate?

Regards,

Rob.

Ernie Chan said...

Hi Rob,
Using a shorter time period to find a rolling estimate of the hedge ratio enables us to exit unprofitable positions naturally. Also, hedge ratio may drift over time in the out-of-sample period. You can backtest this scheme to see if this works better than a static hedge ratio out-of-sample. (Obviously, a static hedge ratio will work best in-sample.)
Ernie

Anonymous said...

Hi Ernie,

Ok, I can see logic in that. I will run some backtests to investigate it further.

From your experience of trading 2 leg and 3 leg mean reverting spreads, what is the highest frequency you can trade the spread at before the transaction costs become too large relative to the expected profit per trade? For example, trading on a 1 second basis is too fast.

Regards,

Rob.

Ernie Chan said...

Hi Rob,
You can trade a pair at any frequency depending on the market and your technology. A holding period of seconds is possible.
Ernie

HASNAT said...

This is probably the best place for coint. Please I need clarification for following.
1. I have 2 series closing data coming in live every minute.
2. I check co-int by stacking every new minute data in the stack, and cointegration is detected/starts at some time = T1 : (T1-20)
3. Then I find the spread at that particular time. and enter a trade if spread suggests so.
4. Then new data comes in, I again stack the new data on to the previous data and keep checking for cointegration. and also calculate the spread for data (T1-20):NOW.
5. This time NOW changes every minute and every minute I calculate new spread from original start point when the cointegration started (i.e T1-20) till NOW.
6. Is it the correct approach to start from the beginning, or should I only use NOW-20:NOW for calculating regression coefficients and then the spread?

Thanks

Ernie Chan said...

Hi HASNAT,
It is not statistically significant to determine cointegration based on 20 data points. You need at least 100 data points. Also, coint is a long term property of time series, so it is not very meaningful to determine cointegration using intraday data.
But in general, you can use NOW-lookback:NOW to test for coint.
Ernie

Anonymous said...

Mr chan, as you have pointed out, when calculating an ADF test to determine cointegration it is a good idea to test both ways (switch independent and dependent variable with each other). I have noticed that if the Y variable is smaller than the X variable it is more likely to be cointegrated for the ADF test, but not as likely when switched. This happens a lot. Does it make sense to calculate a hedge ratio with data before the start of test date and multiply X by that hedge ratio before performing ADF test? X and Y would seem to be normalized then. Is this valid? Thanks

Ernie Chan said...

Hi Anon,
The best way to avoid the order-dependence that you pointed out is to use Johansen test for cointegration. The eigenvector thus obtained is the best hedge ratios you can find.

For details, please see my new book.
Ernie

Anonymous said...

continuing from my post right above. My fear of using Cadf is that when testing both y=bx and x=by and one way shows strong cointegration but the other weak, I will be throwing out some good pairs that are only not working because of the price difference of the 2 stocks. You suggest that I use the Johansen test to avoid this, but I have read you yourself use Engle-Granger. Do you not worry about the effect of order-dependence on your pairs? Thanks!!

Ernie Chan said...

Anon,
In cases where cointegration is strong, which is where I usually operate, using cadf is quite OK.
However, if you have any concern about the strength of cointegration, I recommend you switch to Johansen.
Ernie

Anonymous said...

Mr. Chan, What if the cointegration tests (johansen and adf) show strong cointegration, but a chart of the spread has a pretty clear up or down trend (not at all like the GLD, GDX pictures in your book)? Could the test be giving some kind of false positive?

Thanks again sir

Ernie Chan said...

Hi anon,
If you have set the input parameter p=2, Johansen test does allow for a non-zero slope of the spread as function of time. The mean reversion is then with respect to the trend line.
Ernie

Anonymous said...

hello,
I am still trying to understand Johansen method fully. Can you tell me if the correct number of lags to use is 1? (k=1) I am just testing 2 variables at a time if this makes a difference

thanks,
confused

Ernie Chan said...

Anon,
Yes, I often find k=1 is the minimum. If this can't reject the null hypothesis, try larger numbers, but I think they usually don't help much.
Ernie

experquisite said...

Hi Ernie -- are there any papers or guidance on doing cointegration with tick data, instead of time-sliced bar data? I am thinking about creating 100-1000 tick bars for each asset, but it seems to be that in order to assess cointegration, I need a time-consistent frame across all assets, so I need to close all the "bars" at the same instant. Is there a more fluid but still consistent way of assessing the discrepancy from cointegrated mean?

I hope you understand my question, I believe it's quite basic but I don't want to re-invent the wheel.

Ernie Chan said...

Hi experquisite,
I think the only way to use tick data for coint test is to create volume bars, as you suggested.
Ernie

Hasnat said...

Ernie Chan, I have three basic questions. 1. In coint based trading, every new data point changes the regression coefficients. How can we stabilize the regression coefficient? 2. It seems that coint based strategies only work on daily, weekly data and should not be used for minute, hourly, intraday data, but then the trades would be very slow (not rapid). Is there any rapid version/strategy of coint based trading? 3. Is there a simple, basic paper or example which can give a practical coint based trading strategy? Mostly the papers calculate the regression coefficient in advance with all data (including the data to be tested), which I think is not a correct approach for practical trading.

Ernie Chan said...

Hasnat,
1) You should run regression everyday, and update your coefficients and possibly positions everyday based on the latest coefficients.

2) If you want to analyze coint day trading strategies (i.e. always liquidate all positions at market close), you can concatenate all the intraday prices of different days together, but adjusting them to eliminate the overnight gaps (similar to backadjustment of futures prices to avoid rollover gaps.)

3) Indeed, when computing regression coefficient in a backtest, one should only use data up to the moment you need to enter a trade. I don't know of any such papers, as these strategies are straightforward to construct and backtest yourself.

Ernie
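
The gap adjustment in point 2 could be done along the lines of this Python sketch. It assumes an additive back-adjustment (shifting each earlier session so that its last price equals the adjusted first price of the next session); the function name and the additive rather than multiplicative choice are my assumptions.

```python
def stitch_sessions(sessions):
    """Concatenate intraday sessions with the overnight gaps removed.

    sessions -- list of price lists, one per trading day, oldest first.
    The most recent session keeps its actual prices; earlier sessions are
    shifted by a constant so no jump remains at each session boundary.
    """
    stitched = []
    next_open = None  # adjusted first price of the later session
    for session in reversed(sessions):
        if not session:
            raise ValueError("each session needs at least one price")
        offset = 0.0 if next_open is None else next_open - session[-1]
        adjusted = [p + offset for p in session]
        stitched = adjusted + stitched
        next_open = adjusted[0]
    return stitched
```

Intraday price changes within each day are preserved, which is what matters for a strategy that is always flat overnight. With long histories an additive shift can push early prices far from their actual levels, so the adjusted series is for signal generation only.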

Li said...

Hi Mr. Chan,
I tried to replicate your Johansen test and get the GLD, GDX, USO combination with the identical parameters as yours. But I didn't succeed, where I used daily close prices for the time period 20060523-20100521.

So I am wondering what prices were you using? Did you do any preprocessing on your database?

Ernie Chan said...

Li,
The only adjustments are for splits and dividends, which correspond to the Adj Close column in Yahoo! Finance.
Ernie

cheerful said...

Dr Ernie,

When I find the APR and Sharpe ratio for the USDCAD using the stationarytest file. I get the APR=63% and Sharpe Ratio=0.1. How can APR be so high while the equity line is so poor?

fprintf(1, 'APR=%f Sharpe=%f\n', sum(pnl).^(252/length(pnl)), sqrt(252)*mean(pnl)/std(pnl));

Thank you
Leo

Ernie Chan said...

Hi Leo,
APR can be very high just by luck. Sharpe ratio is a much better measure of consistency.
Ernie
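
For comparison, here is a sketch of one common convention for these two statistics, computed from a list of daily returns: APR compounds (1 + return) over the sample and annualizes it, and the Sharpe ratio is the annualized mean over standard deviation with a zero risk-free rate. The function names and the 252-day year are assumptions; this is not a reproduction of the stationarity test file mentioned above.

```python
import math
import statistics


def annualized_return(daily_returns, periods_per_year=252):
    """Compound (1 + r) over the sample and scale to one year."""
    if not daily_returns:
        raise ValueError("need at least one return")
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(daily_returns)) - 1.0


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized mean/stdev of returns, assuming a zero risk-free rate."""
    sd = statistics.stdev(daily_returns)
    if sd == 0:
        raise ValueError("returns have zero variance")
    return math.sqrt(periods_per_year) * statistics.mean(daily_returns) / sd
```

Note that the APR here compounds daily returns rather than raising a summed P&L to a power, so the two numbers are on a consistent footing.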

cheerful said...

Dr Ernest,

May I know the time zone of your USDCAD, CADAUD 1 minute data? ie NY time or Singapore time?

Thank you
Leo

Ernie Chan said...

Leo,
The data uses ET.
Ernie