Friday, March 03, 2017

More Data or Fewer Predictors: Which is a Better Cure for Overfitting?

One of the perennial problems in building trading models is the sparseness of data and the attendant danger of overfitting. Fortunately, there are systematic methods of dealing with both ends of the problem. These methods are well-known in machine learning, though most traditional machine learning applications have a lot more data than we traders are used to. (E.g. Google used 10 million YouTube videos to train a deep learning network to recognize cats' faces.)

To create more training data out of thin air, we can resample (perhaps more vividly, oversample) our existing data. This resampling-and-averaging procedure is called bagging (bootstrap aggregating). Let's illustrate this using a fundamental factor model described in my new book. It uses 27 factor loadings such as P/E, P/B, Asset Turnover, etc. for each stock. (Note that I call cross-sectional factors, i.e. factors that depend on each stock, "factor loadings" instead of "factors" by convention.) These factor loadings are collected from the quarterly financial statements of S&P 500 companies, and are available from Sharadar's Core US Fundamentals database (as well as from more expensive sources like Compustat). The factor model is very simple: it is just a multiple linear regression model with the next quarter's return of a stock as the dependent (target) variable, and the 27 factor loadings as the independent (predictor) variables. Training consists of finding the regression coefficients of these 27 predictors. The trading strategy based on this predictive factor model is equally simple: if the predicted next-quarter return is positive, buy the stock and hold for a quarter. Vice versa for shorts.

Note there is already a step taken in curing data sparseness: we do not try to build a separate model with a different set of regression coefficients for each stock. We constrain the model such that the same regression coefficients apply to all the stocks. Otherwise, the training data that we use from 200701-201112 will only have 1,260 rows, instead of 1,260 x 500 = 630,000 rows.
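For concreteness, here is a minimal sketch of this pooled regression in Python with scikit-learn. The data below are synthetic stand-ins for the Sharadar factor loadings, and all variable names are hypothetical; this is not the book's actual code.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_rows, n_factors = 10_000, 27                # pooled rows x factor loadings
X = rng.standard_normal((n_rows, n_factors))  # stand-in for the real loadings
y = 0.001 * (X @ rng.standard_normal(n_factors)) + 0.1 * rng.standard_normal(n_rows)

model = LinearRegression().fit(X, y)  # one coefficient set for all stocks
pred = model.predict(X)               # predicted next-quarter returns
longs, shorts = pred > 0, pred < 0    # buy if positive, short if negative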

The result of this baseline trading model isn't bad: it has a CAGR of 14.7% and Sharpe ratio of 1.8 in the out-of-sample period 201201-201401. (Caution: this portfolio is not necessarily market or dollar neutral. Hence the return could be due to a long bias enjoying the bull market in the test period. Interested readers can certainly test a market-neutral version of this strategy hedged with SPY.) I plotted the equity curve below.

[Figure: equity curve of the baseline factor model, out-of-sample 201201-201401]

Next, we resample the data by randomly picking N (=630,000) data points with replacement to form a new training set (a "bag"), and we repeat this K (=100) times to form K bags. For each bag, we train a new regression model. At the end, we average over the predicted returns of these K models to serve as our official predicted returns. This results in a marginal improvement of the CAGR to 15.1%, with no change in Sharpe ratio.
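Here is a minimal sketch of this bagging procedure, under the same assumptions as the snippet above (X and y are the pooled training arrays; X_test holds the loadings we want predictions for):

import numpy as np
from sklearn.linear_model import LinearRegression

def bagged_predictions(X, y, X_test, K=100, seed=0):
    # Train one regression per bootstrap "bag" (N rows drawn with
    # replacement), then average the K models' predictions.
    rng = np.random.default_rng(seed)
    N = len(y)
    preds = []
    for _ in range(K):
        idx = rng.integers(0, N, size=N)
        m = LinearRegression().fit(X[idx], y[idx])
        preds.append(m.predict(X_test))
    return np.mean(preds, axis=0)  # the "official" predicted returns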

Now, we try to reduce the predictor set, using a method called "random subspace". We randomly pick half of the original predictors to train a model, and repeat this K (=100) times. Once again, we average over the predicted returns of all these models. Combined with bagging, this also results in a marginal improvement of the CAGR, to 15.1%, again with little change in Sharpe ratio.
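A sketch of the random subspace step, with the same assumptions as before:

import numpy as np
from sklearn.linear_model import LinearRegression

def random_subspace_predictions(X, y, X_test, K=100, seed=0):
    # Train each of the K models on a random half of the predictor
    # columns, then average the predictions. (To combine with bagging,
    # feed in bootstrap-resampled X and y.)
    rng = np.random.default_rng(seed)
    n_pred = X.shape[1]
    preds = []
    for _ in range(K):
        cols = rng.choice(n_pred, size=n_pred // 2, replace=False)
        m = LinearRegression().fit(X[:, cols], y)
        preds.append(m.predict(X_test[:, cols]))
    return np.mean(preds, axis=0)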

The improvements from either method may not seem large so far, but at least they show that the original model is robust with respect to randomization.

But there is another method of reducing the number of predictors: stepwise regression. The idea is simple: we add one predictor from the original set to the model at a time, keeping the addition only if the BIC (Bayesian Information Criterion) decreases. BIC is essentially the negative log likelihood of the training data under the regression model, plus a penalty term proportional to the number of parameters: BIC = k ln(n) - 2 ln(L), where L is the likelihood, k is the number of parameters, and n is the number of data points. That is, if two models have the same log likelihood, the one with the larger number of parameters will have a larger BIC and is thus penalized. Once we reach the minimum BIC, we then try to remove one predictor from the model at a time, until the BIC cannot decrease any further. Applying this to our fundamental factor loadings, we achieve a quite significant improvement of the CAGR over the base model: 19.1% vs. 14.7%, with the same Sharpe ratio.
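Below is a sketch of this forward-then-backward selection, using the Gaussian-error OLS form of BIC. Matlab's stepwiselm offers similar functionality; the details here (initialization, tie-breaking) are my own assumptions:

import numpy as np

def bic(X, y):
    # BIC for OLS with Gaussian errors: n*ln(RSS/n) + k*ln(n),
    # where k counts the fitted coefficients (including intercept).
    n = len(y)
    Xi = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    rss = np.sum((y - Xi @ beta) ** 2)
    return n * np.log(rss / n) + Xi.shape[1] * np.log(n)

def stepwise_select(X, y):
    selected, remaining = [], list(range(X.shape[1]))
    best = bic(X[:, []], y)  # start from the intercept-only model
    improved = True
    while improved and remaining:  # forward: add while BIC decreases
        improved = False
        s, j = min((bic(X[:, selected + [j]], y), j) for j in remaining)
        if s < best:
            best, improved = s, True
            selected.append(j)
            remaining.remove(j)
    improved = True
    while improved and len(selected) > 1:  # backward: drop while BIC decreases
        improved = False
        s, j = min((bic(X[:, [c for c in selected if c != j]], y), j)
                   for j in selected)
        if s < best:
            best, improved = s, True
            selected.remove(j)
    return selected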

It is also satisfying that the stepwise regression model picked only two variables out of the original 27. Let that sink in for a moment: just two variables account for all of the predictive power of a quarterly financial report! As to which two variables these are - I will reveal that in my talk at QuantCon 2017 on April 29.

===

My Upcoming Workshops

March 11 and 18: Cryptocurrency Trading with Python

I will be moderating this online workshop for my friend Nick Kirk, who taught a similar course at CQF in London to wide acclaim.

May 13 and 20: Artificial Intelligence Techniques for Traders

I will discuss in detail AI techniques such as those described above, with other examples and in-class exercises. As usual, nuances and pitfalls will be covered.

Wednesday, November 16, 2016

Pre-earnings Announcement Strategies

Much has been written about the Post-Earnings Announcement Drift (PEAD) strategy (see, for example, my book), but less has been written about pre-earnings announcement strategies. That changed recently with the publication of two papers. Just as with PEAD, these pre-announcement strategies do not make use of any actual earnings numbers or even estimates. They are based entirely on announcement dates (expected or actual) and perhaps recent price movement.

The first one, by So and Wang 2014, suggests various simple mean reversion strategies for US stocks that enter into positions at the market close just before an expected announcement. Here is my paraphrase of one such strategy (a code sketch follows the list):

1) Suppose t is the expected earnings announcement date for a stock in the Russell 3000 index.
2) Compute the pre-announcement return from day t-4 to t-2 (counting trading days only).
3) Subtract a market index return over the same lookback period from the pre-announcement return, and call this market-adjusted return PAR.
4) Pick the 18 stocks with the best PAR and short them (with equal dollars) at the market close of t-1, liquidating at the market close of t+1. Pick the 18 stocks with the worst PAR, and do the opposite. Hedge any net exposure with a market-index ETF or future.
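Here is a minimal sketch of the ranking logic in pandas; the data structures (a close-price DataFrame, an index-price Series, and a dict of expected announcement dates) are hypothetical:

import pandas as pd

def par_rankings(prices, market, ann_dates, n=18):
    # prices: daily closes (rows: trading days, columns: tickers);
    # market: index closes on the same dates;
    # ann_dates: {ticker: expected announcement date t, a trading day}.
    par = {}
    for tkr, t in ann_dates.items():
        i = prices.index.get_loc(t)
        stock_ret = prices[tkr].iloc[i - 2] / prices[tkr].iloc[i - 4] - 1
        mkt_ret = market.iloc[i - 2] / market.iloc[i - 4] - 1
        par[tkr] = stock_ret - mkt_ret     # market-adjusted return (PAR)
    ranked = pd.Series(par).sort_values()
    longs = list(ranked.index[:n])         # worst PAR: buy at close of t-1
    shorts = list(ranked.index[-n:])       # best PAR: short at close of t-1
    return longs, shorts                   # liquidate at close of t+1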

I backtested this strategy using Wall Street Horizon (WSH)'s expected earnings dates data, applying it to stocks in the Russell 3000 index, and hedging with IWV. I got a CAGR of 9.1% and a Sharpe ratio of 1 from 2011/08/03-2016/09/30. The equity curve is displayed below.

[Figure: equity curve of the So-Wang pre-announcement strategy, 2011/08/03-2016/09/30]

Note that WSH's data was used instead of Yahoo! Finance, Compustat, or even Thomson Reuters' I/B/E/S earnings data, because only WSH's data is "point-in-time". WSH captured the expected earnings announcement date on the day before the announcement, just as we would have if we were trading live. We did not use the actual announcement date as captured in most other data sources, because we could not be sure whether a company changed its expected announcement date on that same date. The actual announcement date can only be known with certainty after the fact, and therefore isn't point-in-time. If we were to run the same backtest using Yahoo! Finance's historical earnings data, the CAGR would drop to 6.8%, and the Sharpe ratio to 0.8.

The notion that companies do change their expected announcement dates takes us to the second strategy, created by Ekaterina Kramarenko of Deltix's Quantitative Research Team. In her paper "An Automated Trading Strategy Using Earnings Date Movements from Wall Street Horizon", she describes the following strategy that explicitly makes use of such changes as a trading signal (a sketch of the signal logic follows the list):

1) At the market close prior to an earnings announcement expected between the current close and the next day's open, compute deltaD, the most recent change in the expected announcement date for the upcoming announcement, measured in calendar days. deltaD > 0 if the company moved the announcement date later, and deltaD < 0 if the company moved the announcement date earlier.
2) Also at the same market close, compute deltaU, the number of calendar days since that last change of the expected announcement date.
3) If deltaD < 0 and deltaU < 45, buy the stock at the market close and liquidate at the next day's market open. If deltaD > 0 and deltaU >= 45, do the opposite.
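Here is a minimal sketch of the signal logic; the structure I assume for the date-revision history is hypothetical (the real WSH feed is richer):

from datetime import date

def earnings_move_signal(revisions, today):
    # revisions: chronological list of (revision_date, expected_ann_date)
    # pairs for the upcoming announcement; all entries are datetime.date.
    (_, exp_prev), (rev_last, exp_last) = revisions[-2], revisions[-1]
    deltaD = (exp_last - exp_prev).days  # >0: moved later; <0: moved earlier
    deltaU = (today - rev_last).days     # days since the last change
    if deltaD < 0 and deltaU < 45:
        return 1    # buy at the close, sell at the next open
    if deltaD > 0 and deltaU >= 45:
        return -1   # short at the close, cover at the next open
    return 0

# e.g. an announcement moved from Feb 10 to Feb 3 on Jan 20, evaluated
# on Feb 2: deltaD = -7, deltaU = 13, so the signal is +1 (buy).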

The intuition behind this strategy is that if a company moves an expected announcement date earlier, especially if that happens close to the expected date, that is an indication of good news, and vice versa. Kramarenko found a CAGR of 14.95% and a Sharpe ratio of 2.08 by applying this strategy to SPX stocks from 2006/1/3 - 2015/9/2.

In order to reproduce this result, one needs to make sure that the capital allocation is based on the following formula: suppose the total buying power is M and the number of trading signals at the market close is n; then the trading size per stock is M/5 if n <= 5, and M/n if n > 5.
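In code, the allocation is a one-liner (use floor=30 for the Russell 3000 variant below):

def size_per_stock(M, n, floor=5):
    # M: total buying power; n: number of signals at the close.
    # M/floor per stock if n <= floor, else M/n.
    return M / max(n, floor) if n > 0 else 0.0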

I backtested this strategy from 2011/8/3-2016/9/30 on the SPX universe as fixed on 2011/7/5, and obtained a CAGR of 17.6% and a Sharpe ratio of 0.6.

Backtesting this on the Russell 3000 index universe yielded better results, with CAGR=17% and Sharpe ratio=1.9. Here, I adjusted the trading size per stock to M/30 if n <= 30, and to M/n if n > 30, given that the total number of stocks in the Russell 3000 is about 6 times that of the SPX. The equity curve is displayed below:

[Figure: equity curve of the earnings-date-movement strategy on the Russell 3000 universe]

Interestingly, a market-neutral version of this strategy (using IWV to hedge any net exposure) does not improve the Sharpe ratio, but does significantly depress the CAGR.

===

Acknowledgement: I thank Michael Raines at Wall Street Horizon for providing the historical point-in-time expected earnings dates data for this research. Further, I thank Stuart Farr and Ekaterina Kramarenko at Deltix for providing me with a copy of their paper and explaining to me the nuances of their strategy.

===

My Upcoming Workshop

January 14 and 21: Algorithmic Options Strategies

This online course is different from most other options workshops offered elsewhere. It will cover backtesting intraday option strategies and portfolio option strategies.

Wednesday, September 28, 2016

Really, Beware of Low Frequency Data

I wrote in a previous article about why we should backtest even end-of-day (daily) strategies with intraday quote data. Otherwise, the performance of such strategies can be inflated. Here is another brilliant example that I came across recently.

Consider the oil futures ETF USO and its evil twin, the inverse oil futures ETF DNO*. In theory, if USO has a daily return of x%, DNO will have a daily return of -x%. In practice, if we plot the daily returns of DNO against that of USO from 2010/9/27-2016/9/9, using the usual consolidated end-of-day data that you can find on Yahoo! Finance or any other vendor,

[Figure: scatter plot of DNO daily returns against USO daily returns, 2010/9/27-2016/9/9]

we see that though the slope is indeed -1 (to within a standard error of 0.004), there are many days with significant deviation from the straight line. The trader in us will immediately think "arbitrage opportunities!"

Indeed, if we backtest a simple mean reversion strategy on this pair - just buy equal dollar amounts of USO and DNO at the market close when the sum of their daily returns is less than -40 bps, hold one day, and short both when the sum is greater than +40 bps - we will find a strategy with a decent Sharpe ratio of 1 even after deducting 5 bps per side as transaction costs.
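In code, the entry rule is just a threshold on the summed returns (a sketch, assuming Series of daily close-to-close returns for each ETF):

import pandas as pd

def pair_entry_signal(ret_uso, ret_dno, threshold=0.004):
    # The two returns should sum to ~0; fade large deviations, hold 1 day.
    s = ret_uso + ret_dno
    sig = pd.Series(0, index=s.index)
    sig[s < -threshold] = 1    # buy equal dollars of USO and DNO
    sig[s > threshold] = -1    # short equal dollars of both
    return sig

Here is the equity curve: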

[Figure: equity curve of the USO/DNO mean reversion strategy, backtested with consolidated EOD data]

Looks reasonable, doesn't it? However, if we backtest this strategy again with BBO (best bid and offer) data at the market close, taking care to subtract half the bid-ask spread as transaction cost, we find this equity curve:

[Figure: equity curve of the same strategy, backtested with BBO quote data]

We can see that the problem is not only that we lose money on practically every trade, but that there was seldom any trade triggered. When the daily EOD data suggests a trade should be triggered, the 1-min bar BBO data tells us that in fact there was no deviation from the mean.

(By the way, the returns above were calculated before even deducting the borrow costs of occasionally shorting these ETFs. The "rebate rate" for USO is about 1% per annum on Interactive Brokers, but a steep 5.6% for DNO.)

In case you think this problem is peculiar to USO vs DNO, you can try TBT vs UBT as well.

Incidentally, we have just verified a golden rule of financial markets: apparent deviations from market efficiency are allowed to persist when no one can profitably trade on the arbitrage opportunity.

===
*Note: according to www.etf.com, "The issuer [of DNO] has temporarily suspended creations for this fund as of Mar 22, 2016 pending the filing of new paperwork with the SEC. This action could create unusual or excessive premiums— an increase of the market price of the fund relative to its fair value. Redemptions are not affected. Trade with care; check iNAV vs. price." For an explanation of "creation" of ETF units, see my article "Things You Don't Want to Know about ETFs and ETNs".

===

Industry Update
  • Quantiacs.com just recently registered as a CTA and operates a marketplace for trading algorithms that anyone can contribute. They also published an educational blog post for Python and Matlab backtesters: https://quantiacs.com/Blog/Intro-to-Algorithmic-Trading-with-Heikin-Ashi.aspx
  • I will be moderating a panel discussion on "How can funds leverage non-traditional data sources to drive investment returns?" at Quant World Canada in Toronto, November 10, 2016. 

===

Upcoming Workshops
Momentum strategies are for those who want to benefit from tail events. I will discuss the fundamental reasons for the existence of momentum in various markets, as well as specific momentum strategies that hold positions from hours to days.

A senior director at a major bank wrote me: "…thank you again for the Momentum Strategies training course this week. It was very beneficial. I found your explanations of the concepts very clear and the examples well developed. I like the rigorous approach that you take to strategy evaluation."

Friday, June 17, 2016

Things You Don't Want to Know about ETFs and ETNs

Everybody loves trading or investing in ETPs. ETP is the acronym for exchange-traded products, which include both exchange-traded funds (ETFs) and exchange-traded notes (ETNs). They seem simple, transparent, and easy to understand. But there are a few subtleties that you may not know about.

1) The most popular ETN is VXX, the volatility ETN that tracks short-term VIX futures. Unlike an ETF, an ETN is actually an unsecured bond issued by the issuer. This means that the price of the ETN may not depend only on the underlying assets or index - it could also depend on the credit-worthiness of the issuer. Now, VXX is issued by Barclays. You may think that Barclays is a big bank, Too Big To Fail, and you may be right. Nevertheless, nobody promises that its credit rating will never be downgraded. Trading the VX future, however, doesn't have that problem.

2) The ETP issuer, together with the "Authorized Participants" (the market makers who can ask the issuer to issue more ETP shares or to redeem such shares for the underlying assets or cash), are supposed to keep the total market value of the ETP shares closely tracking the NAV of the underlying assets. However, there was one notable instance when the issuer deliberately did not do so, resulting in big losses for some investors.

That was when Credit Suisse, the issuer of TVIX (the leveraged ETN that tracks 2x the daily returns of VXX), temporarily stopped all creation of new TVIX shares on February 22, 2012 (see sixfigureinvesting.com/2015/10/how-does-tvix-work/). Credit Suisse might have found that the transaction costs of rebalancing this highly volatile ETN were becoming too high. Because of this stoppage, TVIX temporarily turned into a closed-end fund, and its market value diverged significantly from its NAV: by the end of March, TVIX was trading at a premium of 90% relative to the underlying index. In other words, investors who bought TVIX in the stock market then were paying 90% more than they would have if they had been able to buy the underlying index instead. Right after that, Credit Suisse announced it would resume the creation of TVIX shares. The TVIX market price immediately plummeted to its NAV per share, causing huge losses for those investors who bought just before the resumption.

3) You may be familiar with the fact that a β-levered ETF is supposed to track only β times the daily returns of the underlying index, not its long-term return. But you may be less familiar with the fact that it is also not supposed to track β times the intraday return of that index (although most of the time it actually does, thanks to the many arbitrageurs).
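A two-day toy calculation shows why daily tracking implies nothing about multi-day returns:

# Index: +10% then -10%; a -1x daily-rebalanced ETF returns -10% then +10%.
index_total = (1 + 0.10) * (1 - 0.10) - 1    # -1.0%
inverse_total = (1 - 0.10) * (1 + 0.10) - 1  # also -1.0%, not +1.0%
print(index_total, inverse_total)            # both are about -0.01

Both legs lose about 1% over the two days, even though one is the mirror image of the other on each single day.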

Case in point: during the May 2010 Flash Crash, many inverse levered ETFs experienced a decrease in price as the market was crashing downwards. As these are inverse ETFs, many investors thought they were supposed to rise in price and act as a hedge against market declines. For example, this comment letter to the SEC pointed out that DOG, the inverse ETF that tracks -1x the Dow 30 index, went down more than 60% from its value at the beginning (2:40 pm ET) of the Flash Crash. This is because various market makers, including the Authorized Participants for DOG, weren't making markets at that time. But an equally important point to note is that at the end of the trading day, DOG did return 3.2%, almost exactly -1x the return of DIA (the ETF that tracks the Dow 30). So it functioned as advertised. Lesson learned: we aren't supposed to use inverse ETFs for either intraday or long-term hedging!

4) The NAV (not the NAV per share) of an ETF does not have to change by the same percentage as the underlying asset's unit market value. For example, that same comment letter I quoted above wrote that GLD, the gold ETF, declined in price by 24% from March 1 to December 31, 2013, tracking the same 24% drop in the spot gold price. However, its NAV dropped 52%. Why? The Authorized Participants redeemed many GLD shares, causing the shares outstanding of GLD to decrease from 416 million to 266 million. (Indeed, 266/416 x (1 - 24%) ≈ 0.49, a drop of roughly 51%, consistent with the reported figure.) Is that a problem? Not at all. An investor in that ETF only cares that she experienced the same return as spot gold, not how much in assets the ETF held. The author of that comment letter strangely wrote that "Investors wishing to participate in the gold market would not buy the GLD if they knew that a price decline in gold could result in twice as much underlying asset decline for the GLD." That, I believe, is nonsense.

For further reading on ETP, see www.ici.org/pdf/per20-05.pdf and www.ici.org/pdf/ppr_15_aps_etfs.pdf.

===

Industry Update

Alex Boykov co-developed the WFAToolbox – Walk-Forward Analysis Toolbox for MATLAB, which automates the process of using a moving window to optimize parameters and entering trades only in the out-of-sample period. He also compiled a standalone application from MATLAB that allows any user (with or without MATLAB) to upload quotes in csv format from Google Finance for further import to other programs and for working in Excel. You can download it here: wfatoolbox.com/epchan.

Upcoming Workshop

July 16 and 23, Saturdays: Artificial Intelligence Techniques for Traders

AI/machine learning techniques are most useful when someone gives us newfangled technical or fundamental indicators, and we haven't yet developed the intuition of how to use them. AI techniques can suggest ways to incorporate them into your trading strategy, and quicken your understanding of these indicators. Of course, sometimes these techniques can also suggest unexpected strategies in familiar markets.

My course covers the basic AI techniques useful to a trader, with emphasis on the many ways to avoid overfitting.

Thursday, April 07, 2016

Mean reversion, momentum, and volatility term structure

Everybody knows that volatility depends on the measurement frequency: the standard deviation of 5-minute returns is different from that of daily returns. To be precise, if z is the log price, then the volatility, sampled at intervals of τ, is

volatility(τ)=√(Var(z(t)-z(t-τ)))

where Var means taking the variance over many sample times. If the prices really follow a geometric random walk, then Var(τ) ≡ Var(z(t)-z(t-τ)) ∝ τ, and the volatility simply scales with the square root of the sampling interval. This is why, if we measure daily returns, we need to multiply the daily volatility by √252 to obtain the annualized volatility.

Traders also know that prices do not really follow a geometric random walk. If prices are mean reverting, we will find that they do not wander away from their initial value as fast as a random walk. If prices are trending, they wander away faster. In general, we can write

Var(τ) ∝ τ^(2H)

where H is called the "Hurst exponent", and it is equal to 0.5 for a true geometric random walk, but will be less than 0.5 for mean reverting prices, and greater than 0.5 for trending prices.

If we annualize the volatility of a mean-reverting price series, it will end up having a lower annualized volatility than that of a geometric random walk, even if both have exactly the same volatility measured at, say, 5-min bars. The opposite is true for a trending price series. For example, if we try this on AUDCAD, an obviously mean-reverting time series, we will get H=0.43.

All of the above are well-known to many traders, and are in fact discussed in my book. But what is more interesting is that the Hurst exponent itself can change at some time scale, and this change sometimes signals a shift from a mean reversion to a momentum regime, or vice versa. To see this, let's plot volatility (or more conveniently, variance) as a function of τ. This is often called the term structure of (realized) volatility. 
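Estimating H from the term structure is a simple log-log regression. Here is a sketch in Python (the function and variable names are my own):

import numpy as np

def hurst_exponent(log_prices, taus):
    # Regress log Var(z(t) - z(t-tau)) on log tau; the slope is 2H.
    z = np.asarray(log_prices)
    v = [np.var(z[tau:] - z[:-tau]) for tau in taus]
    slope, _ = np.polyfit(np.log(taus), np.log(v), 1)
    return slope / 2.0

# e.g. on 1-minute log midprices, with dyadic lags of 1 to 2^10 minutes:
# H = hurst_exponent(log_mid, taus=2 ** np.arange(11))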

Start with the familiar SPY. We can compute the intraday returns using midprices from 1 minute to 2^10 minutes (~17 hrs), and plot log(Var(τ)) against log(τ). The fit, shown below, is excellent. The slope, divided by 2, is the Hurst exponent, which turns out to be 0.494±0.003: very slightly mean-reverting.

[Figure: log-log plot of intraday variance vs. τ for SPY]

But if we do the same for daily returns of SPY, for intervals of 1 day up to 2^8 (=256) days, we find that H is now 0.469±0.007, which is significantly mean reverting. 

[Figure: log-log plot of daily variance vs. τ for SPY]

Conclusion: mean reversion strategies on SPY should work better interday than intraday.

We can do the same analysis for USO (the WTI crude oil futures ETF). The intraday H is 0.515±0.001, indicating significant trending behavior. The daily H is 0.56±0.02, even more significantly trending. So momentum strategies should work for crude oil futures at any reasonable time scales.


Let's turn now to GLD, the gold ETF. Intraday H=0.505±0.002, which is slightly trending. But daily H=0.469±0.007: significantly mean reverting! Momentum strategies on gold may work intraday, but mean reversion strategies certainly work better over multiple days. Where does the transition occur? We can examine the term structure closely:

[Figure: term structure of variance for GLD, from intraday through daily intervals]

We can see that at around 16-32 days, the volatilities depart from the straight line extrapolated from intraday frequencies. That's where we should switch from momentum to mean reversion strategies.

One side note of interest: when we compute the variance of returns over periods that straddle two trading days and plot them as a function of log(τ), should τ include the hours when the market was closed? It turns out that the answer is yes, but not completely. In order to produce the chart above, where the daily variances initially fall on the same straight line as the intraday variances, we have to count 1 trading day as equivalent to 10 trading hours. Not 6.5 (the length of the US equities/ETF trading session), and not 24. The precise number of equivalent trading hours, of course, varies across different instruments.

===


Upcoming Workshops

There is a lot more to mean reversion strategies than just pairs trading. Find out how to thrive in the current low-volatility environment, which favors this type of strategy.

Friday, November 27, 2015

Predicting volatility

Predicting volatility is a very old topic. Every finance student has been taught to use the GARCH model for it. But like most things we learned in school, we don't necessarily expect them to be useful in practice, or to work well out-of-sample. (When was the last time you needed to use calculus in your job?) But out of curiosity, I did a quick investigation of its power in predicting the volatility of SPY daily close-to-close returns. I estimated the parameters of a GARCH model on training data from December 21, 2005 to December 5, 2011 using Matlab's Econometrics Toolbox, and tested how often the sign of the predicted 1-day change in volatility agrees with reality on the test set from December 6, 2011 to November 25, 2015. (The 1-day change in realized volatility is defined as the change in the absolute value of the 1-day return.) A pleasant surprise: the agreement is 58% of the days.
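For readers who want to replicate this, here is a rough sketch using the Python arch package in place of Matlab; the scoring scheme is my reconstruction of the test described above, and r_train, r_test are assumed to be arrays of daily returns in percent:

import numpy as np
from arch import arch_model

def garch_sign_agreement(r_train, r_test):
    # Fit GARCH(1,1) on the training returns, then score how often the
    # sign of the predicted 1-day change in volatility matches the sign
    # of the change in |1-day return| over the test set.
    res = arch_model(r_train, vol='Garch', p=1, q=1).fit(disp='off')
    w = res.params['omega']
    a, b = res.params['alpha[1]'], res.params['beta[1]']
    sigma2 = np.var(r_train)
    hits, total = 0, 0
    for t in range(1, len(r_test)):
        sigma2_next = w + a * r_test[t - 1] ** 2 + b * sigma2  # ignores the mean term
        pred = np.sign(np.sqrt(sigma2_next) - np.sqrt(sigma2))
        real = np.sign(abs(r_test[t]) - abs(r_test[t - 1]))
        hits += int(pred == real)
        total += 1
        sigma2 = sigma2_next
    return hits / total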

If this were the accuracy for predicting the sign of the SPY return itself, we should prepare to retire in luxury. Volatility is easier to predict than signed returns, as every finance student has also been taught. But what good is a good volatility prediction? Would that be useful to options traders, who can trade implied volatilities instead of directional returns? The answer is yes, realized volatility prediction is useful for implied volatility prediction, but not in the way you would expect.

If GARCH tells us that the realized volatility will increase tomorrow, most of us would instinctively go out and buy ourselves some options (i.e. implied volatility). In the case of SPY, we would probably go buy some VXX. But that would be a terrible mistake. Remember that the volatility we predicted is an unsigned return: a prediction of increased volatility may mean a very bullish day tomorrow. A high positive return in SPY is usually accompanied by a steep drop in VXX. In other words, an increase in realized volatility is usually accompanied by a decrease in implied volatility in this case. But what is really strange is that this anti-correlation between change in realized volatility and change in implied volatility also holds when the return is negative (57% of the days with negative returns). A very negative return in SPY is indeed usually accompanied by an increase in implied volatility or VXX, inducing positive correlation. But on average, an increase in realized volatility due to negative returns is still accompanied by a decrease in implied volatility.

The upshot of all this is that if you predict the volatility of SPY will increase tomorrow, you should short VXX instead.

====

Industry Update
  • Quantiacs.com just launched a trading system competition with guaranteed investments of $2.25M for the best three trading systems. (Quantiacs helps Quants get investments for their trading algorithms and helps investors find the right trading system.)
  • A new book called "Momo Traders - Tips, Tricks, and Strategies from Ten Top Traders" features extensive interviews with ten top day and swing traders who find stocks that move and capitalize on that momentum. 
  • Another new book called "Algorithmic and High-Frequency Trading" by 3 mathematical finance professors describes the sophisticated mathematical tools that are being applied to high frequency trading and optimal execution. Yes, calculus is required here.
My Upcoming Workshop

January 27-28: Algorithmic Options Strategies

This is a new online course that is different from most other options workshops offered elsewhere. It will cover how one can backtest intraday option strategies and portfolio option strategies.

March 7-11: Statistical Arbitrage, Quantitative Momentum, and Artificial Intelligence for Traders.

These courses are highly intensive training sessions held in London for a full week. I typically need to walk for an hour along the Thames to rejuvenate after each day's class.

The AI course is new, and to my amazement, some of the improved techniques actually work.

My Upcoming Talk

I will be speaking at QuantCon 2016 on April 9 in New York. The topic will be "The Peculiarities of Volatility". I pointed out one peculiarity above, but there are others.

====

QTS Partners, L.P. has a net return of +1.56% in October (YTD: +11.50%). Details available to Qualified Eligible Persons as defined in CFTC Rule 4.7.

====

Follow me on Twitter: @chanep




Friday, October 16, 2015

An open-source genetic algorithm software (Guest post)

By Lukasz Wojtow

Mechanical traders never stop searching for the next market edge - not only to get better results, but also to have more than one system. The best trading results can be achieved with multiple non-correlated systems traded simultaneously. Unfortunately, most traders exploit similar market inefficiencies: some traders specialize in trend following, some in mean reversion, and so on. That's because learning to exploit one kind of edge is hard enough; mastering all of them is impossible. It would be beneficial to have software that creates many unrelated systems.

Recently I released Genotick - open source software that can create and manage a group of trading systems. At Genotick's core lies an epiphany: if it's possible to create any software with just a handful of assembler instructions, it should be possible to create any trading system with a handful of similarly simple instructions. These instructions, simple and meaningless on their own, become extremely powerful when combined. The right instructions in the right order can create any type of mechanical system: trend following, mean reverting, or even one based on fundamental data.

The driving engine behind Genotick's power is a genetic algorithm. The current implementation is quite basic, but with some extra quirks. For example, if any of the systems is really bad, it stays in the population but its predictions are reversed. Another trick is used to help recognize biased trading systems: a system can be removed if it doesn't give mirrored predictions on mirrored data. So, for example, the position on GBP/USD must be opposite to the one on USD/GBP. Genotick also supports optional elitism (where the best systems always stay in the population, while others are retired due to old age), protection for new systems (to avoid removing systems that haven't yet had a chance to prove themselves), and letting a new system inherit its initial weight from its parents. These options give users plenty of room for experimentation.

When Genotick is run for the first time, there are no systems. They are created at the start using randomly chosen instructions. Then a genetic algorithm takes over: each system is executed to check its prediction on historical data. Systems that predicted correctly gain weight for future predictions; systems that predicted incorrectly lose weight. Gradually, day after day, the population of systems grows. Bad systems are removed and good systems breed. The prediction for each day is calculated by adding up the predictions of all systems available at the time. Genotick doesn't iterate over the same historical data more than once - the training process looks exactly as if it were executed in real life: one day at a time. In fact, there is no separate "training" phase; the program learns a little more as each day passes.

Interestingly, Genotick doesn't check for any rationale behind the created systems. As each system is created out of random instructions, it's possible (and actually very likely) that some systems use ridiculous logic. For example, it's possible that a system will give a "Buy" signal if the Volume was positive 42 days ago. Another system may want to go short each time the third digit in yesterday's High is the same as the second digit in today's Open. Of course, such systems would never survive in the real world, and they wouldn't survive for long in Genotick's population either. Because each system's initial weight is zero, they never gain any significant weight and therefore don't spoil the cumulative prediction given by the program. It may seem a little silly to allow such systems in the first place, but it enables Genotick to test algorithms that are free from traders' beliefs, misguided opinions, and personal limitations. The sad fact is, the market doesn't care what system you use or how much sweat and tears you put into it. The market is going to do what it wants to do - no questions asked, no prisoners taken. The market doesn't even care whether you use any sort of intelligence, artificial or not. And so the only rationale behind every trading system should be very simple: "Does it work?" Nothing more, nothing less. This is the only metric Genotick uses to gauge systems.

Each program run will be a little different. The equity chart below shows one possible performance. The years shown are 2007 through 2015, with actual training starting in 2000. There is nothing special about the year 2007; remember, Genotick learns as it goes along. However, I felt it was important to look at how it performed during the financial crisis. The markets traded were:

USD/CHF, USD/JPY, 10 Year US Bond Yield, SPX, EUR/USD, GBP/USD and Gold.

(In some cases, I tested the system on a market index such as SPX instead of an instrument that tracks the index, such as SPY, but the difference should be minor.) All markets were mirrored to allow the removal of biased systems. Some vital numbers:

CAGR: 9.88%
Maximum drawdown: -21.6%
Longest drawdown: 287 trading days
Profitable days: 53.3%
Calmar ratio: 0.644
Sharpe ratio: 1.06
Mean annual gain: 24.1%
Losing year: 2013 (-12%)

[Figure: Cumulative Returns (%) since 2007]


These numbers represent only the "directional edge" offered by the software. There were no stop-losses, no leverage, and no position sizing, all of which could greatly improve real-life results. The performance assumes that at the end of each day, the positions are rebalanced so that each instrument starts with an equal dollar value. (I.e., this is a constant rebalanced portfolio.)
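That rebalancing rule is trivial to express in code: with equal dollars in each instrument at every close, the portfolio's daily return is just the cross-sectional average of the instruments' daily returns.

import numpy as np

def crp_daily_returns(instrument_returns):
    # instrument_returns: array of shape (days, instruments).
    return np.asarray(instrument_returns).mean(axis=1)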

Artificial Intelligence is a hot topic. Self-driving cars that drive better than the average human and chess algorithms that beat the average player are facts. The difference is that using AI for trading is perfectly legal, and opponents may never know. Unlike chess and driving, there is a lot of randomness in financial markets, and it may take us longer to notice when AI starts winning. The best hedge funds may still be run by humans, but if any trading method is really superior, AI will figure it out as well.

At the moment, Genotick is more of a proof of concept than production-ready software. It is very limited in usability, it doesn't forgive mistakes, and it's best to ask before using it for real trading. You will need Java 7 to run it. It's tested on both Linux and Windows 10. Example historical data is included. Any questions or comments are welcome.

Genotick website: http://genotick.com

For a general reference on genetic algorithms, see "How to Solve It: Modern Heuristics". 

===

My Upcoming Workshop


Momentum strategies have performed superbly in the recent market turmoil, since they are long volatility. This course will cover momentum strategies on a variety of asset classes and with a range of trading horizons.

====

Follow me on Twitter: @chanep

Friday, September 18, 2015

Interview with Euan Sinclair

I have been a big fan of options trader and author Euan Sinclair for a long time. I have cited his highly readable and influential book Option Trading in my own work, and it is always within easy reach from my desk. His more recent book Volatility Trading is another must-read. I ran into him at the Chicago Trading Show a few months ago where he was a panelist on volatility trading, and he graciously agreed to be interviewed by me.

What is your educational background, and how did you start your trading career?

I got a Ph.D. in theoretical physics, studying the transition from quantum to classical mechanics. I had always intended to become a professor, but the idea became less appealing once I saw what they did all day. At that time Nick Leeson was making news by blowing up Barings Bank, and I thought I could do that. I mean trade derivatives, not blow up a bank (although I could probably manage that as well).

Do you recommend a new graduate with a similar educational background as yours to pursue finance or trading as a career today?

I don't think I would for a few reasons.

The world of derivatives and trading in general is now so much more visible than it was and there are now far better ways to prepare. When I started, physics Ph.D.s were hired only because they were smart and numerate and could pick things up on their own. My first trading firm had no training program. You just had to figure stuff out on your own. Now there are many good MFE courses or you could do a financial economics Ph.D.

Further, it would very much depend on exactly what kind of physics had been studied. I did a lot of classical mechanics, which is really geometry. This kind of "pure" theory isn't nearly as useful as a background heavy with stats or simulation.

I think I could still make the transition, but it is no longer close to the ideal background.

You have been a well-known options trader with a long track record: what do you think is the biggest obstacle to success for a retail options trader?

Trading costs. Most option trading ideas are still built on the Black-Scholes-Merton framework and the idea of dynamic hedging (albeit heavily modified). Most pro firms have stat-arb-like execution methods to reduce the effective bid-ask spread they pay in the underlying. They also pay practically no ticket charges and probably get rebates. Even then, their average profit per option trade is very small and has been steadily decreasing.

Further, a lot of positional option trading relies on a large universe of possible trades to consider. This means a trader needs good scanning software to find trades, and a decent risk system because she will tend to have hundreds of positions on at one time. This is all expensive as well. 

Retail traders can't play this game at all. They have to look for situations that require little or no rebalancing and that can be limited to a much smaller universe. I would recommend the VIX complex or equity earnings events.

As an options trader, do you tend to be short or long volatility?

I am short about 95% of the time, but about 35% of my profits come from the long trades.

Do you find it possible to fully automate options trading in the same way that stock, futures, and FX trading has been automated?

I see no reason why not. 

You have recently started a new website called FactorWave.com. Can you tell us about it? What prompted the transition of your focus from options to stocks?

FactorWave is a set of stock and portfolio tools that do analysis in terms of factors such as value, size, quality, and momentum. There is a lot of research by both academics and investors showing that these (and other) factors can deliver market-beating returns and lower volatility.

I've been interested in stocks for a long time. Most of my option experience has been with stock options, and some of my best research was on how these factors affected volatility trading returns. Also, equity markets are a great place to build wealth over the long term. They are a far more suitable vehicle for retirement planning than options!

I actually think the distinction between trading and investing is fairly meaningless. The only difference seems to be the time scale, and that is very dependent on the person involved as well, with long-term meaning anything from months to inter-generational. All I've ever done as a trader is look for meaningful edges, and I found a lot of these in options. But I've never found anything as persistent as the stock factors. There is over a hundred years of statistical evidence, studies in many countries, and economic and behavioral reasons for their existence. They present some of the best edges I have ever found. That should be appealing to any trader or investor.

Thank you! These are really valuable insights.

====

My Upcoming Workshop


Momentum strategies have performed superbly in the recent market turmoil, since they are long volatility. This course will cover momentum strategies on a variety of asset classes and with a range of trading horizons.

====

QTS Partners, L.P. has a net return of 1.25% in August (YTD: 10.44%).

====

Reader Burak B. has converted some of the Matlab codes from my book Algorithmic Trading into Python codes and made them open-source: https://github.com/burakbayramli/quant_at.

====

Follow me on Twitter: @chanep