1. Overfitting bias
When backtesting a trading strategy, it is always tempting to keep tweaking your model until you get a “perfect” result. But unfortunately, the result of all this tinkering could be overfitting bias. If you fall into this trap, you might realise that your model perfectly fits historical data but has little or no predictive power.
Simplicity is key: if you keep tweaking parameters, you may eventually convince yourself that your model can accurately predict returns. But once the model is run on out-of-sample data, the apparent edge may turn out to be the result of pure chance.
What can you do? Firstly, remember that the risk of overfitting grows with the number of parameters you introduce and optimise. Less is more: typically, two or three parameters are more than enough, but if you can do it with a single parameter, that’s even better. On the other hand, if you find yourself trialling endless variations of a strategy across many parameters, this could be a warning sign that you are overfitting.
Secondly, to improve the reliability of your model, always aim to find a stable parameter range within which slight tweaks do not have a significant effect on returns. In other words, while it can be tempting to hunt for steep peaks, it is often more prudent to look for long, gentle slopes.
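One way to make the "gentle slopes" idea concrete is to score each parameter value by the worst return in its immediate neighbourhood rather than by its own return. The sketch below is only an illustration: the sweep results are made-up numbers, and `worst_case_nearby` is a hypothetical helper, not part of any backtesting library.

```python
def worst_case_nearby(sweep, window=1):
    """Score each parameter by the lowest return among itself and
    its `window` neighbours on either side.

    sweep: list of (parameter, backtest_return), sorted by parameter.
    """
    values = [ret for _, ret in sweep]
    scores = {}
    for i, (param, _) in enumerate(sweep):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        scores[param] = min(values[lo:hi])
    return scores


# Hypothetical sweep: one steep peak at 30, a gentle plateau around 60-80.
sweep = [
    (10, 0.02), (20, 0.03), (30, 0.25), (40, 0.04), (50, 0.11),
    (60, 0.12), (70, 0.13), (80, 0.12), (90, 0.05),
]

best_raw = max(sweep, key=lambda pair: pair[1])[0]
scores = worst_case_nearby(sweep, window=1)
best_stable = max(scores, key=scores.get)

print("highest single return at parameter:", best_raw)
print("most stable neighbourhood at parameter:", best_stable)
```

With these illustrative numbers, the raw optimum sits on the isolated peak, while the neighbourhood score prefers the middle of the plateau, where a slight tweak in either direction barely changes the result.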
Finally, learn to expect and accept a degree of imperfection in your model’s ability to represent the past; what you care about is the future. Always keep the underlying market logic of your model front and centre in your mind, and test against as much data as possible before pulling the trigger. It is also good practice to keep running your backtest on live market data once you have started trading, so you can check whether its results match your live trading results.
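A simple guard against fooling yourself is to hold back the most recent part of your data, choose parameters only on the earlier part, and then check the chosen parameter on the held-back part. The following is a minimal sketch under stated assumptions: the price series is a seeded random walk (so any in-sample "edge" is pure chance by construction), and `momentum_return` is a toy rule invented for this example.

```python
import random


def momentum_return(prices, lookback):
    """Total return of a toy rule: hold the asset for the next bar
    whenever the price is above its level `lookback` bars earlier."""
    equity = 1.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            equity *= prices[t + 1] / prices[t]
    return equity - 1.0


def chronological_split(prices, train_fraction=0.7):
    """Split without shuffling, so the test data lies strictly later in time."""
    cut = int(len(prices) * train_fraction)
    return prices[:cut], prices[cut:]


# Synthetic random walk: there is no real signal to find here.
random.seed(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] * (1 + random.gauss(0, 0.01)))

in_sample, out_of_sample = chronological_split(prices)

# Optimise the single parameter on in-sample data only.
candidates = [5, 10, 20, 50]
best = max(candidates, key=lambda lb: momentum_return(in_sample, lb))

print(f"best lookback in-sample: {best}")
print(f"in-sample return:     {momentum_return(in_sample, best):+.2%}")
print(f"out-of-sample return: {momentum_return(out_of_sample, best):+.2%}")
```

A large gap between the two printed returns is a hint that the in-sample figure owes more to parameter selection than to any genuine predictive power.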
2. Transaction costs, fees, and slippage
During backtesting, it is tempting to assume that trades are executed 100% of the time and at the current market price. This is rarely the case in the real world. It is easy to overlook, but it is crucial to include reasonable assumptions for transaction costs, spreads and slippage when running your backtests. It’s also important to factor in partial or delayed executions, especially when trading larger orders.
Bear in mind that spreads may be much wider if you are trading out-of-the-money options or futures with distant expiration dates. This holds true for traditional assets as well as crypto assets.
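To see how quickly these frictions eat into a result, it can help to compute the return of a single round trip with and without them. The sketch below assumes a long trade and uses illustrative rates for fees, half-spread and slippage; they are placeholders, not figures for any particular venue, and partial or delayed fills are not modelled.

```python
def net_return(entry_price, exit_price,
               fee_rate=0.001, half_spread=0.0005, slippage=0.0005):
    """Round-trip return of a long trade after costs.

    All rates are fractions of price and are illustrative assumptions.
    """
    # You buy above the quoted mid price and sell below it.
    buy_fill = entry_price * (1 + half_spread + slippage)
    sell_fill = exit_price * (1 - half_spread - slippage)
    # Fees are charged on the traded notional on both legs.
    cost = buy_fill * (1 + fee_rate)
    proceeds = sell_fill * (1 - fee_rate)
    return proceeds / cost - 1


gross = 101.0 / 100.0 - 1          # what a frictionless backtest reports
net = net_return(100.0, 101.0)     # the same trade with costs applied

print(f"gross return: {gross:.4%}")
print(f"net return:   {net:.4%}")
```

With these assumed rates, a 1% gross move shrinks to roughly 0.6% net, which matters most for strategies that trade frequently for small gains.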
3. Look-ahead bias
It is easy to judge in hindsight. Look-ahead bias occurs when you introduce information into your model that was not accessible or known to the market during the analysis period, which can lead to inaccurate results.
Look-ahead bias can be hard to detect if you build your own backtesting system. Off-the-shelf quant trading and backtesting products, by contrast, are generally designed so that at each point in the observation period your model can only access data that was known at that time.
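A common source of look-ahead bias in home-built systems is an off-by-one error, where a signal computed from a bar's close is applied to that same bar's return. The sketch below uses a toy rule and a hypothetical price list to contrast the biased and the correctly lagged versions; `strategy_returns` is an illustrative helper, not a library function.

```python
def strategy_returns(prices, lag):
    """Per-bar returns of a toy rule: be long for one bar after an up bar.

    lag=1 trades on the previous bar's signal (information that was available).
    lag=0 applies each bar's signal to the same bar (look-ahead).
    """
    bar_returns = [prices[i] / prices[i - 1] - 1 for i in range(1, len(prices))]
    signals = [1 if r > 0 else 0 for r in bar_returns]
    return [signals[t - lag] * bar_returns[t]
            for t in range(lag, len(bar_returns))]


# Hypothetical zig-zag price series.
prices = [100, 101, 100, 102, 101, 103]

biased = sum(strategy_returns(prices, lag=0))
honest = sum(strategy_returns(prices, lag=1))

print(f"with look-ahead:    {biased:+.4f}")
print(f"without look-ahead: {honest:+.4f}")
```

The biased version only ever collects the up bars, so it cannot lose; once the signal is lagged by one bar, the same rule loses money on this series.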
4. Survivorship bias
If you are trading equities, you should pay attention to corporate actions like dividends, stock splits, and de-listings. If you don’t, then you run the risk of analysing investment performance based only on the survivors of a selected investment universe.
Survivorship bias is a form of sample selection bias, in that a skewed sample results in erroneous conclusions that would not be borne out if the entire population were used.
This bias can creep into your dataset in numerous ways, for instance, when selecting data from indices like the S&P 500, and from industry data where financial information about acquired or bankrupt companies is excluded.
Similarly, in the crypto world, when trading ICOs, it is important to consider not just the coins that are still alive today but also those that died along the way. What to do? Seek to obtain a dataset that covers the whole market, including the assets that failed. It’s extra work, but it is worth it – an essential lesson for DeFi and ICO token traders.
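The size of the distortion is easy to illustrate by computing the same statistic on the survivors and on the full universe. The dataset below is entirely hypothetical (symbols and returns are made up for the example); the point is only the comparison between the two averages.

```python
from statistics import mean

# Hypothetical universe: total return per asset over the test period,
# including assets that were delisted or went to (almost) zero.
universe = [
    {"symbol": "AAA", "total_return": 0.80, "delisted": False},
    {"symbol": "BBB", "total_return": 0.20, "delisted": False},
    {"symbol": "CCC", "total_return": -0.10, "delisted": False},
    {"symbol": "DDD", "total_return": -1.00, "delisted": True},
    {"symbol": "EEE", "total_return": -0.90, "delisted": True},
]

survivors_only = mean(a["total_return"] for a in universe if not a["delisted"])
full_universe = mean(a["total_return"] for a in universe)

print(f"average return, survivors only: {survivors_only:+.1%}")
print(f"average return, full universe:  {full_universe:+.1%}")
```

In this made-up example, dropping the two failed assets turns an average loss of about 20% into an apparent average gain of about 30%.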
5. Time period bias
This brings us to another form of sample selection bias: time period bias. This may occur if you choose an observation period where results are likely to deviate from the norm due to unique or atypical circumstances. This will hamper the predictive power of your model when it is operating outside these particular circumstances. To take the most obvious example, a model backtested only with data from a bull run, or only from a period of macroeconomic shock, is likely to produce very different results from one tested across a full market cycle.
In the crypto world, regime shifts are often triggered by events such as forks, block reward halvings or temporary changes in regulation.
What can you do to prevent time period bias? The simple answer is to ensure that you are using a large data set that spans as wide a time range as possible. By doing this you will mitigate the influence of short-term effects and will be better equipped to draw conclusions that will hold water across a wider range of scenarios.
Keep in mind, however, that crypto market dynamics have shifted considerably since the early years. While you might ideally like to consider the entire lifespan of Bitcoin, for example, there has been a considerable increase in trading volume over the years as well as in the total amount of Bitcoin in circulation. Thus, it may be more instructive to consider the last three or four years of price data.
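A quick check for time period bias is to break the test window into consecutive sub-periods and look at the result in each one, rather than relying on a single full-sample figure. The sketch below assumes you already have a strategy equity curve as a list of values; the numbers here are hypothetical, and `subperiod_returns` is an illustrative helper.

```python
def subperiod_returns(equity_curve, n_periods):
    """Return over each of `n_periods` consecutive, equally sized windows."""
    size = (len(equity_curve) - 1) // n_periods
    results = []
    for k in range(n_periods):
        start = k * size
        end = start + size
        results.append(equity_curve[end] / equity_curve[start] - 1)
    return results


# Hypothetical equity curve: a strong run, a drawdown, then a flat stretch.
equity_curve = [100, 110, 121, 133.1, 120, 108, 97.2, 98, 99, 100]

full_sample = equity_curve[-1] / equity_curve[0] - 1
by_period = subperiod_returns(equity_curve, n_periods=3)

print(f"full sample: {full_sample:+.1%}")
for i, r in enumerate(by_period, start=1):
    print(f"period {i}:    {r:+.1%}")
```

If the sub-period results differ wildly, as they do in this made-up curve, a backtest confined to any one of them would give a misleading picture of how the model behaves in other conditions.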
The trickiest thing about bias is that it is typically unconscious: almost no trader chooses to think in a biased way. However, many small, seemingly innocuous decisions can combine into a biased outlook on a macro level. In order to avoid falling into these traps, always keep market logic in mind, make sure your sample selection is robust, and trial your model against live market data.