By Megan Czasonis, Mark Kritzman, and David Turkington.
Published in the Journal of Portfolio Management, September 2020.
Similar to how economists might think about past events, regression models consider historical relevance when generating predictions. Censoring the least relevant periods can improve their predictive power.
Any introductory statistics course teaches that, when it comes to regression analysis, the more data the better, because larger samples should produce more reliable predictions. That is not always the case. In reality, some historical periods are more relevant than others. Just as an economist might extrapolate from a subset of relevant historical events, we propose that regression models should do the same. However, rather than relying on judgement, we introduce a precise methodology for measuring relevance that takes into account an observation’s informativeness and its similarity to the current period. We show that by focusing on a subset of the most relevant data points, we can forecast factor returns better than traditional full-sample regression analysis does.
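To make the idea concrete, here is a minimal sketch of a regression fitted only on the most relevant observations. This summary does not give the formulas, so the sketch assumes that similarity and informativeness are both measured with squared Mahalanobis distances: similarity as closeness to the current period, informativeness as distance from the historical average. The function names, the `keep_fraction` parameter, and the synthetic data are hypothetical and chosen for illustration only; this is not presented as the authors' implementation.

```python
import numpy as np


def relevance_scores(X, x_now):
    """Score each past observation by similarity plus informativeness.

    Assumption (not stated in the summary): both terms use squared
    Mahalanobis distances based on the full-sample covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    x_now = np.asarray(x_now, dtype=float)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    inv_cov = np.linalg.pinv(cov)

    def sq_mahalanobis(diff):
        # Row-wise quadratic form diff_i' * inv_cov * diff_i
        return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

    # Observations close to the current period are more similar.
    similarity = -0.5 * sq_mahalanobis(X - x_now)
    # Observations far from the average are treated as more informative.
    informativeness = 0.5 * sq_mahalanobis(X - mean)
    return similarity + informativeness


def partial_sample_forecast(X, y, x_now, keep_fraction=0.25):
    """Fit OLS on the most relevant fraction of the sample and predict y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_now = np.asarray(x_now, dtype=float)
    scores = relevance_scores(X, x_now)

    # Keep enough rows to estimate an intercept and all slopes.
    n_keep = max(X.shape[1] + 2, int(np.ceil(keep_fraction * len(y))))
    n_keep = min(n_keep, len(y))
    keep = np.argsort(scores)[-n_keep:]  # indices of the highest scores

    design = np.column_stack([np.ones(n_keep), X[keep]])
    coef, *_ = np.linalg.lstsq(design, y[keep], rcond=None)
    forecast = float(coef[0] + x_now @ coef[1:])
    return forecast, keep


# Synthetic illustration only; these numbers are not from the paper.
rng = np.random.default_rng(0)
X_hist = rng.normal(size=(200, 2))
y_hist = 1.0 + 2.0 * X_hist[:, 0] - X_hist[:, 1] + rng.normal(scale=0.1, size=200)
x_today = np.array([1.5, -0.5])

forecast, kept = partial_sample_forecast(X_hist, y_hist, x_today)
print(f"forecast from {len(kept)} of {len(y_hist)} observations: {forecast:.3f}")
```

Setting `keep_fraction=1.0` recovers the ordinary full-sample regression, which makes it easy to compare the two forecasts on the same data.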