Backtesting is a method for assessing the validity of an investment strategy by using historical data to see how an asset (or portfolio of assets) would have performed in past periods. If results were successful, it might encourage traders to use that strategy going forward.
The underlying theory is that any strategy that worked well in the past is likely to work well in the future, and conversely, any strategy that performed poorly in the past is likely to perform poorly in the future.
But is this true?
In many cases, back-tested strategies fail once applied to the real world, as the sudden collapse of Long-Term Capital Management (LTCM) graphically illustrated. This could be due to a number of factors, but the most common are dependence on correlations that later disappear and biases in the back-testing process.
Examples of the pitfalls of back-testing include:
- There are as many potential investment models as there are ideas that can be quantified. The temptation, therefore, is to devise a trading model that generates the maximum possible returns for a given (chosen) period.
This is known as curve-fitting (or data mining), where an analyst creates a portfolio strategy that optimises returns during the period being studied, but which may not translate into superior returns during other, out-of-sample periods. This is because such strategies rely on the continuation of past correlations between the behaviour of assets or asset classes, and those correlations do not remain stable over time.
A consistent relationship between back-testing, out-of-sample testing and forward performance testing (e.g. “paper trading”) can reduce the risk of generating misleading results. Generally, though, the more complicated the investment criteria, the worse a back-tested system performs in the real world.
- Survivorship bias, where the returns generated fail to consider the possibility that some firms go bankrupt over time. Firms that fail are excluded from later datasets and therefore from the simulations, so the firms that remain inevitably show above-average returns relative to all the companies originally in the sample. This overstates the potential returns generated by the investment process.
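The size of the survivorship effect can be illustrated with a small sketch. The tickers and returns below are invented purely for illustration and do not come from any real dataset; the point is only that averaging over survivors gives a higher figure than averaging over every firm that was in the sample at the start.

```python
from statistics import mean

# Hypothetical universe (invented numbers, for illustration only).
# Each entry: (ticker, return over the test period, survived to the end?)
universe = [
    ("AAA", 0.40, True),
    ("BBB", 0.15, True),
    ("CCC", 0.05, True),
    ("DDD", -1.00, False),  # went bankrupt: investors lost everything
    ("EEE", -0.60, False),  # delisted after a large loss
]


def average_return(firms):
    """Equal-weighted average return of the given firms."""
    return mean(ret for _, ret, _ in firms)


# A dataset built only from firms that still exist today.
survivors = [firm for firm in universe if firm[2]]

survivors_only = average_return(survivors)
full_universe = average_return(universe)
bias = survivors_only - full_universe

print(f"Survivors only: {survivors_only:.1%}")
print(f"Full universe:  {full_universe:.1%}")
print(f"Overstatement:  {bias:.1%}")
```

A back-test run on the survivors alone would report the first figure, whereas an investor choosing from the full universe at the start of the period would have faced the second.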
These problems do not make back-testing useless, but like many things related to markets, it should not be relied on exclusively.
Market risk cannot be measured in an objective way because it is not directly observable; it can only be inferred from variables that can be measured (Value-at-Risk, probabilities such as confidence intervals, etc.). Ultimately, there is no substitute for “live” trading, as it incorporates the real-world pressures and biases involved in actual trading.
Out-of-sample periods are those investment horizons not originally used to generate the strategy. A 10-year dataset might be used to construct a trading system, but to be useful the system would need to demonstrate its viability in periods other than that 10-year horizon. The data outside of that 10-year period would be out-of-sample data.
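As a rough sketch of how an out-of-sample check exposes curve-fitting, the example below tunes a toy trading rule on one part of a series and then scores the chosen setting on data it never saw. Everything here is an assumption made for illustration: the returns are random noise rather than market data, the momentum rule and the `lookback` parameter are hypothetical, and performance is measured as a simple sum of daily returns.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical daily returns containing no real pattern: pure noise.
returns = [random.gauss(0.0, 0.01) for _ in range(2000)]

# First 70% is used to build the strategy; the rest is held back.
split = len(returns) * 7 // 10
in_sample = returns[:split]
out_of_sample = returns[split:]


def strategy_return(data, lookback):
    """Sum of daily returns for a toy momentum rule: hold the asset for
    a day only if the previous `lookback` returns summed to a gain."""
    total = 0.0
    for t in range(lookback, len(data)):
        if sum(data[t - lookback:t]) > 0:
            total += data[t]
    return total


# Curve-fitting step: try many settings and keep the in-sample winner.
candidates = range(1, 51)
in_scores = {lb: strategy_return(in_sample, lb) for lb in candidates}
best = max(in_scores, key=in_scores.get)

# Honest step: score the winner on data it was not tuned on.
out_score = strategy_return(out_of_sample, best)

print(f"Best lookback in-sample: {best}")
print(f"In-sample result:        {in_scores[best]:.4f}")
print(f"Out-of-sample result:    {out_score:.4f}")
```

Because the winning `lookback` was selected precisely for having the highest in-sample score, that score is flattering by construction. The out-of-sample figure is the one that says anything about whether the rule has captured a real effect, and on pure noise there is no reason to expect it to resemble the in-sample figure.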