The Factor Zoo & Replication Crisis
With 400+ published factors, most are likely false discoveries.
Typical IS Sharpe: 0.3 – 0.9 (varies widely)
Typical OOS Sharpe: −0.1 – 0.3 (post-publication average)
Capacity: Small-cap
Signal decay: ~12-month half-life
Overview
Cochrane (2011) famously asked "which factors matter?" in his AFA Presidential Address, giving the "factor zoo" problem its name. Harvey, Liu, and Zhu (2016) documented 316 published factors by 2012 and argued that the standard significance threshold (t > 2.0) is far too permissive given the scale of multiple testing. They proposed a t-statistic hurdle of 3.0 or higher for new factor claims, derived from multiple-testing adjustments such as Bonferroni. McLean and Pontiff (2016) showed that anomaly returns decay by 58% after publication, consistent with both investors learning from published research and in-sample data-mining bias. Hou, Xue, and Zhang (2020) attempted to replicate 452 anomalies and found that 65% fail the conventional |t| ≥ 1.96 hurdle; under a multiple-testing hurdle of 2.78, the failure rate rises to 82%.
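The arithmetic behind a multiple-testing hurdle can be sketched in a few lines. The example below applies a plain Bonferroni correction under a normal approximation: the family-wise error rate is split equally across all tests, and the two-sided critical value is read off the standard normal. This is only an illustration of why the hurdle rises with the number of factors tried, not a reproduction of the Harvey, Liu, and Zhu framework; the function name `bonferroni_t_hurdle` is hypothetical.

```python
from statistics import NormalDist


def bonferroni_t_hurdle(num_tests: int, alpha: float = 0.05) -> float:
    """Two-sided critical value when a family-wise error rate `alpha`
    is split equally across `num_tests` tests (normal approximation)."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    per_test_alpha = alpha / num_tests
    # Two-sided test: half of the per-test alpha sits in each tail.
    return NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)


# The hurdle for a single test is the familiar ~1.96; it climbs as the
# number of factors tested on the same data grows.
for num_tests in (1, 10, 100, 316):
    print(num_tests, round(bonferroni_t_hurdle(num_tests), 2))
```

Bonferroni is the most conservative common adjustment, so the hurdle it produces should be read as an upper bound relative to less strict procedures.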
Economic Intuition
The problem is fundamental to any empirical science that runs many regressions on the same dataset. Given enough variables and enough researchers, some combination will look significant by chance. In finance, the situation is especially acute because: (1) the data history is short (roughly 70 years of reliable US data), (2) researchers share the same datasets (CRSP, Compustat), (3) publication bias favors positive results, and (4) reported t-statistics are often inflated by specification searches that are not fully disclosed. The result is that many "discoveries" reflect sample-specific noise rather than genuine risk premia.
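A small simulation makes the "significant by chance" point concrete. The sketch below generates pure-noise return series (true mean of zero, so any apparent significance is a false discovery) and counts how many clear a given t-statistic hurdle. All parameters (1,000 candidate factors, 240 months, 5% monthly volatility, Gaussian returns) and the function names are assumptions chosen for illustration.

```python
import math
import random


def t_stat(returns):
    """t-statistic of the sample mean against a null of zero mean."""
    n = len(returns)
    avg = math.fsum(returns) / n
    var = math.fsum((r - avg) ** 2 for r in returns) / (n - 1)
    return avg / math.sqrt(var / n)


def count_false_discoveries(num_factors=1000, num_months=240,
                            hurdle=2.0, seed=0):
    """Count pure-noise 'factors' whose |t| exceeds the hurdle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_factors):
        # Zero-mean noise: there is no real premium to find.
        returns = [rng.gauss(0.0, 0.05) for _ in range(num_months)]
        if abs(t_stat(returns)) > hurdle:
            hits += 1
    return hits


hits_at_2 = count_false_discoveries(hurdle=2.0)
hits_at_3 = count_false_discoveries(hurdle=3.0)
print(f"|t| > 2.0: {hits_at_2} of 1000 noise factors look significant")
print(f"|t| > 3.0: {hits_at_3} of 1000 noise factors look significant")
```

Under these assumptions roughly one noise factor in twenty should clear |t| > 2.0, while far fewer clear 3.0. Because both calls share a seed, they score the same simulated series, so the comparison isolates the effect of the hurdle.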
Out-of-Sample Evidence
Weak OOS survival
This is the core theme of ConvexPi. The platform is built around one question: does your strategy survive out-of-sample? The factor zoo literature shows that most do not. The right mental model: treat every in-sample result as a hypothesis, not a fact. The OOS Sharpe ratio on fresh data is the only credible evidence of real alpha. Strategies that use more parameters, more indicators, and longer lookback periods are more susceptible to the multiple testing problem, even if each individual test looks conservative. Simplicity is a form of robustness.
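One way to put the in-sample versus out-of-sample comparison into practice is to fix a split date before looking at results and compute the Sharpe ratio on each side separately. The sketch below is a minimal version, assuming monthly returns, a zero risk-free rate, and square-root-of-time annualization. The function names and the toy return series are hypothetical and are not ConvexPi's implementation or real data.

```python
import math


def annualized_sharpe(returns, periods_per_year=12):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    avg = math.fsum(returns) / n
    var = math.fsum((r - avg) ** 2 for r in returns) / (n - 1)
    if var == 0:
        raise ValueError("returns have zero variance")
    return avg / math.sqrt(var) * math.sqrt(periods_per_year)


def is_oos_sharpe(returns, split_index):
    """Sharpe ratios before and after a split chosen in advance."""
    in_sample = returns[:split_index]
    out_of_sample = returns[split_index:]
    return annualized_sharpe(in_sample), annualized_sharpe(out_of_sample)


# Toy monthly returns, invented purely for illustration.
toy_returns = [0.02, 0.01, 0.03, -0.01, 0.02, 0.01,   # "discovery" period
               0.01, -0.02, 0.00, 0.01, -0.01, 0.01]  # fresh data
is_sr, oos_sr = is_oos_sharpe(toy_returns, split_index=6)
print(f"IS Sharpe: {is_sr:.2f}, OOS Sharpe: {oos_sr:.2f}")
```

The key design choice is that the split index is set before any tuning. If the out-of-sample window is inspected repeatedly while adjusting the strategy, it effectively becomes in-sample data.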
Key Papers
Foundational research on this factor — start here.
Harvey, C. R., Liu, Y., & Zhu, H.
2016
Review of Financial Studies
McLean, R. D., & Pontiff, J.
2016
Journal of Finance
Hou, K., Xue, C., & Zhang, L.
2020
Review of Financial Studies
Cochrane, J. H.
2011
Journal of Finance
Further Reading
Is There a Replication Crisis in Finance?
Jensen, T. I., Kelly, B., & Pedersen, L. H.
2023
Journal of Finance
The History of the Cross-Section of Stock Returns
Linnainmaa, J. T., & Roberts, M. R.
2018
Review of Financial Studies

