Aggarwal, C., and Reddy, C. (2014): Data Clustering – Algorithms and Applications. 1st ed. CRC Press.

Ahmed, N., Atiya, A., Gayar, N., and El-Shishiny, H. (2010): “An Empirical Comparison of Machine Learning Models for Time Series Forecasting.” Econometric Reviews, Vol. 29, No. 5–6, pp. 594–621.

Anderson, G., Guionnet, A., and Zeitouni, O. (2009): An Introduction to Random Matrix Theory. 1st ed. Cambridge Studies in Advanced Mathematics. Cambridge University Press.

Ballings, M., van den Poel, D., Hespeels, N., and Gryp, R. (2015): “Evaluating Multiple Classifiers for Stock Price Direction Prediction.” Expert Systems with Applications, Vol. 42, No. 20, pp. 7046–56.

Benjamini, Y., and Yekutieli, D. (2001): “The Control of the False Discovery Rate in Multiple Testing under Dependency.” Annals of Statistics, Vol. 29, pp. 1165–88.

Benjamini, Y., and Liu, W. (1999): “A Step-Down Multiple Hypotheses Testing Procedure that Controls the False Discovery Rate under Independence.” Journal of Statistical Planning and Inference, Vol. 82, pp. 163–70.

Benjamini, Y., and Hochberg, Y. (1995): “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society, Series B, Vol. 57, pp. 289–300.

Bontempi, G., Taieb, S., and Le Borgne, Y. (2012): “Machine Learning Strategies for Time Series Forecasting.” Lecture Notes in Business Information Processing, Vol. 138, No. 1, pp. 62–77.

Booth, A., Gerding, E., and McGroarty, F. (2014): “Automated Trading with Performance Weighted Random Forests and Seasonality.” Expert Systems with Applications, Vol. 41, No. 8, pp. 3651–61.

Cao, L., and Tay, F. (2001): “Financial Forecasting Using Support Vector Machines.” Neural Computing and Applications, Vol. 10, No. 2, pp. 184–92.

Cao, L., Tay, F., and Hock, F. (2003): “Support Vector Machine with Adaptive Parameters in Financial Time Series Forecasting.” IEEE Transactions on Neural Networks, Vol. 14, No. 6, pp. 1506–18.

Cervello-Royo, R., Guijarro, F., and Michniuk, K. (2015): “Stockmarket Trading Rule Based on Pattern Recognition and Technical Analysis: Forecasting the DJIA Index with Intraday Data.” Expert Systems with Applications, Vol. 42, No. 14, pp. 5963–75.

Chang, P., Fan, C., and Lin, J. (2011): “Trend Discovery in Financial Time Series Data Using a Case-Based Fuzzy Decision Tree.” Expert Systems with Applications, Vol. 38, No. 5, pp. 6070–80.

Chen, B., and Pearl, J. (2013): “Regression and Causation: A Critical Examination of Six Econometrics Textbooks.” Real-World Economics Review, Vol. 65, pp. 2–20.

Creamer, G., and Freund, Y. (2007): “A Boosting Approach for Automated Trading.” Journal of Trading, Vol. 2, No. 3, pp. 84–96.

Creamer, G., and Freund, Y. (2010): “Automated Trading with Boosting and Expert Weighting.” Quantitative Finance, Vol. 10, No. 4, pp. 401–20.

Creamer, G., Ren, Y., Sakamoto, Y., and Nickerson, J. (2016): “A Textual Analysis Algorithm for the Equity Market: The European Case.” Journal of Investing, Vol. 25, No. 3, pp. 105–16.

Dixon, M., Klabjan, D., and Bang, J. (2017): “Classification-Based Financial Markets Prediction Using Deep Neural Networks.” Algorithmic Finance, Vol. 6, No. 3, pp. 67–77.

Dunis, C., and Williams, M. (2002): “Modelling and Trading the Euro/US Dollar Exchange Rate: Do Neural Network Models Perform Better?” Journal of Derivatives and Hedge Funds, Vol. 8, No. 3, pp. 211–39.

Easley, D., and Kleinberg, J. (2010): Networks, Crowds, and Markets: Reasoning about a Highly Connected World. 1st ed. Cambridge University Press.

Easley, D., López de Prado, M., O’Hara, M., and Zhang, Z. (2011): “Microstructure in the Machine Age.” Working paper.

Efroymson, M. (1960): “Multiple Regression Analysis.” In Ralston, A., and Wilf, H. (eds.), Mathematical Methods for Digital Computers. 1st ed. Wiley.

Einav, L., and Levin, J. (2014): “Economics in the Age of Big Data.” Science, Vol. 346, No. 6210. Available at http://science.sciencemag.org/content/346/6210/1243089

Feuerriegel, S., and Prendinger, H. (2016): “News-Based Trading Strategies.” Decision Support Systems, Vol. 90, pp. 65–74.

Greene, W. (2012): Econometric Analysis. 7th ed. Pearson Education.

Harvey, C., and Liu, Y. (2015): “Backtesting.” The Journal of Portfolio Management, Vol. 42, No. 1, pp. 13–28.

Harvey, C., and Liu, Y. (2018): “False (and Missed) Discoveries in Financial Economics.” Working paper. Available at https://ssrn.com/abstract=3073799

Harvey, C., and Liu, Y. (2018): “Lucky Factors.” Working paper. Available at https://ssrn.com/abstract=2528780

Hastie, T., Tibshirani, R., and Friedman, J. (2016): The Elements of Statistical Learning: Data Mining, Inference and Prediction. 2nd ed. Springer.

Hayashi, F. (2000): Econometrics. 1st ed. Princeton University Press.

Holm, S. (1979): “A Simple Sequentially Rejective Multiple Test Procedure.” Scandinavian Journal of Statistics, Vol. 6, pp. 65–70.

Hsu, S., Hsieh, J., Chih, T., and Hsu, K. (2009): “A Two-Stage Architecture for Stock Price Forecasting by Integrating Self-Organizing Map and Support Vector Regression.” Expert Systems with Applications, Vol. 36, No. 4, pp. 7947–51.

Huang, W., Nakamori, Y., and Wang, S. (2005): “Forecasting Stock Market Movement Direction with Support Vector Machine.” Computers and Operations Research, Vol. 32, No. 10, pp. 2513–22.

Ioannidis, J. (2005): “Why Most Published Research Findings Are False.” PLoS Medicine, Vol. 2, No. 8. Available at https://doi.org/10.1371/journal.pmed.0020124

James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013): An Introduction to Statistical Learning. 1st ed. Springer.

Kahn, R. (2018): The Future of Investment Management. 1st ed. CFA Institute Research Foundation.

Kara, Y., Boyacioglu, M., and Baykan, O. (2011): “Predicting Direction of Stock Price Index Movement Using Artificial Neural Networks and Support Vector Machines: The Sample of the Istanbul Stock Exchange.” Expert Systems with Applications, Vol. 38, No. 5, pp. 5311–19.

Kim, K. (2003): “Financial Time Series Forecasting Using Support Vector Machines.” Neurocomputing, Vol. 55, No. 1, pp. 307–19.

Kolanovic, M., and Krishnamachari, R. (2017): “Big Data and AI Strategies: Machine Learning and Alternative Data Approach to Investing.” J.P. Morgan Quantitative and Derivative Strategy, May.

Kolm, P., Tutuncu, R., and Fabozzi, F. (2010): “60 Years of Portfolio Optimization.” European Journal of Operational Research, Vol. 234, No. 2, pp. 356–71.

Krauss, C., Do, X., and Huck, N. (2017): “Deep Neural Networks, Gradient-Boosted Trees, Random Forests: Statistical Arbitrage on the S&P 500.” European Journal of Operational Research, Vol. 259, No. 2, pp. 689–702.

Kuan, C., and Tung, L. (1995): “Forecasting Exchange Rates Using Feedforward and Recurrent Neural Networks.” Journal of Applied Econometrics, Vol. 10, No. 4, pp. 347–64.

Kuhn, H. W., and Tucker, A. W. (1952): “Nonlinear Programming.” In Proceedings of 2nd Berkeley Symposium. University of California Press, pp. 481–92.

Laborda, R., and Laborda, J. (2017): “Can Tree-Structured Classifiers Add Value to the Investor?” Finance Research Letters, Vol. 22, pp. 211–26.

López de Prado, M. (2018): “A Practical Solution to the Multiple-Testing Crisis in Financial Research.” Journal of Financial Data Science, Vol. 1, No. 1. Available at https://ssrn.com/abstract=3177057

López de Prado, M., and Lewis, M. (2018): “Confidence and Power of the Sharpe Ratio under Multiple Testing.” Working paper. Available at https://ssrn.com/abstract=3193697

MacKay, D. (2003): Information Theory, Inference, and Learning Algorithms. 1st ed. Cambridge University Press.

Marcenko, V., and Pastur, L. (1967): “Distribution of Eigenvalues for Some Sets of Random Matrices.” Matematicheskii Sbornik, Vol. 72, No. 4, pp. 507–36.

Michaud, R. (1998): Efficient Asset Allocation: A Practical Guide to Stock Portfolio Optimization and Asset Allocation. Boston: Harvard Business School Press.

Nakamura, E. (2005): “Inflation Forecasting Using a Neural Network.” Economics Letters, Vol. 86, No. 3, pp. 373–78.

Olson, D., and Mossman, C. (2003): “Neural Network Forecasts of Canadian Stock Returns Using Accounting Ratios.” International Journal of Forecasting, Vol. 19, No. 3, pp. 453–65.

Otto, M. (2016): Chemometrics: Statistics and Computer Application in Analytical Chemistry. 3rd ed. Wiley.

Patel, J., Sha, S., Thakkar, P., and Kotecha, K. (2015): “Predicting Stock and Stock Price Index Movement Using Trend Deterministic Data Preparation and Machine Learning Techniques.” Expert Systems with Applications, Vol. 42, No. 1, pp. 259–68.

Pearl, J. (2009): “Causal Inference in Statistics: An Overview.” Statistics Surveys, Vol. 3, pp. 96–146.

Plerou, V., Gopikrishnan, P., Rosenow, B., Nunes Amaral, L., and Stanley, H. (1999): “Universal and Nonuniversal Properties of Cross Correlations in Financial Time Series.” Physical Review Letters, Vol. 83, No. 7, pp. 1471–74.

Porter, K. (2017): “Estimating Statistical Power When Using Multiple Testing Procedures.” Available at www.mdrc.org/sites/default/files/PowerMultiplicity-IssueFocus.pdf

Potters, M., Bouchaud, J. P., and Laloux, L. (2005): “Financial Applications of Random Matrix Theory: Old Laces and New Pieces.” Acta Physica Polonica B, Vol. 36, No. 9, pp. 2767–84.

Qin, Q., Wang, Q., Li, J., and Shuzhi, S. (2013): “Linear and Nonlinear Trading Models with Gradient Boosted Random Forests and Application to Singapore Stock Market.” Journal of Intelligent Learning Systems and Applications, Vol. 5, No. 1, pp. 1–10.

Shafer, G. (1982): “Lindley’s Paradox.” Journal of the American Statistical Association, Vol. 77, No. 378, pp. 325–34.

Simon, H. (1962): “The Architecture of Complexity.” Proceedings of the American Philosophical Society, Vol. 106, No. 6, pp. 467–82.

SINTEF (2013): “Big Data, for Better or Worse: 90% of World’s Data Generated over Last Two Years.” Science Daily, May 22. Available at www.sciencedaily.com/releases/2013/05/130522085217.htm

Sorensen, E., Miller, K., and Ooi, C. (2000): “The Decision Tree Approach to Stock Selection.” Journal of Portfolio Management, Vol. 27, No. 1, pp. 42–52.

Theofilatos, K., Likothanassis, S., and Karathanasopoulos, A. (2012): “Modeling and Trading the EUR/USD Exchange Rate Using Machine Learning Techniques.” Engineering, Technology and Applied Science Research, Vol. 2, No. 5, pp. 269–72.

Trafalis, T., and Ince, H. (2000): “Support Vector Machine for Regression and Applications to Financial Forecasting.” Neural Networks, Vol. 6, No. 1, pp. 348–53.

Trippi, R., and DeSieno, D. (1992): “Trading Equity Index Futures with a Neural Network.” Journal of Portfolio Management, Vol. 19, No. 1, pp. 27–33.

Tsai, C., and Wang, S. (2009): “Stock Price Forecasting by Hybrid Machine Learning Techniques.” Proceedings of the International Multi-Conference of Engineers and Computer Scientists, Vol. 1, No. 1, pp. 755–60.

Tsai, C., Lin, Y., Yen, D., and Chen, Y. (2011): “Predicting Stock Returns by Classifier Ensembles.” Applied Soft Computing, Vol. 11, No. 2, pp. 2452–59.

Tsay, R. (2013): Multivariate Time Series Analysis: With R and Financial Applications. 1st ed. Wiley.

Wang, J., and Chan, S. (2006): “Stock Market Trading Rule Discovery Using Two-Layer Bias Decision Tree.” Expert Systems with Applications, Vol. 30, No. 4, pp. 605–11.

Wang, Q., Li, J., Qin, Q., and Ge, S. (2011): “Linear, Adaptive and Nonlinear Trading Models for Singapore Stock Market with Random Forests.” In Proceedings of the 9th IEEE International Conference on Control and Automation, pp. 726–31.

Wei, P., and Wang, N. (2016): “Wikipedia and Stock Return: Wikipedia Usage Pattern Helps to Predict the Individual Stock Movement.” In Proceedings of the 25th International Conference Companion on World Wide Web, Vol. 1, pp. 591–94.

Wooldridge, J. (2010): Econometric Analysis of Cross Section and Panel Data. 2nd ed. MIT Press.

Wright, S. (1921): “Correlation and Causation.” Journal of Agricultural Research, Vol. 20, pp. 557–85.

Żbikowski, K. (2015): “Using Volume Weighted Support Vector Machines with Walk Forward Testing and Feature Selection for the Purpose of Creating Stock Trading Strategy.” Expert Systems with Applications, Vol. 42, No. 4, pp. 1797–1805.

Zhang, G., Patuwo, B., and Hu, M. (1998): “Forecasting with Artificial Neural Networks: The State of the Art.” International Journal of Forecasting, Vol. 14, No. 1, pp. 35–62.

Zhu, M., Philpotts, D., and Stevenson, M. (2012): “The Benefits of Tree-Based Models for Stock Selection.” Journal of Asset Management, Vol. 13, No. 6, pp. 437–48.

Zhu, M., Philpotts, D., Sparks, R., and Stevenson, J. (2011): “A Hybrid Approach to Combining CART and Logistic Regression for Stock Ranking.” Journal of Portfolio Management, Vol. 38, No. 1, pp. 100–109.

American Statistical Association (2016): “Statement on Statistical Significance and P-Values.” Available at www.amstat.org/asa/files/pdfs/P-ValueStatement.pdf

Apley, D. (2016): “Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models.” Available at https://arxiv.org/abs/1612.08468

Athey, S. (2015): “Machine Learning and Causal Inference for Policy Evaluation.” In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 5–6. ACM.

Bailey, D., and López de Prado, M. (2012): “The Sharpe Ratio Efficient Frontier.” Journal of Risk, Vol. 15, No. 2, pp. 3–44.

Bailey, D., and López de Prado, M. (2013): “An Open-Source Implementation of the Critical-Line Algorithm for Portfolio Optimization.” Algorithms, Vol. 6, No. 1, pp. 169–96. Available at http://ssrn.com/abstract=2197616

Bailey, D., and López de Prado, M. (2014): “The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting and Non-Normality.” Journal of Portfolio Management, Vol. 40, No. 5, pp. 94–107.

Bailey, D., Borwein, J., López de Prado, M., and Zhu, J. (2014): “Pseudo-mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance.” Notices of the American Mathematical Society, Vol. 61, No. 5, pp. 458–71. Available at http://ssrn.com/abstract=2308659

Black, F., and Litterman, R. (1991): “Asset Allocation Combining Investor Views with Market Equilibrium.” Journal of Fixed Income, Vol. 1, No. 2, pp. 7–18.

Black, F., and Litterman, R. (1992): “Global Portfolio Optimization.” Financial Analysts Journal, Vol. 48, No. 5, pp. 28–43.

Brian, E., and Jaisson, M. (2007): “Physico-theology and Mathematics (1710–1794).” In The Descent of Human Sex Ratio at Birth. Springer Science & Business Media, pp. 1–25.

Brooks, C., and Kat, H. (2002): “The Statistical Properties of Hedge Fund Index Returns and Their Implications for Investors.” Journal of Alternative Investments, Vol. 5, No. 2, pp. 26–44.

Cavallo, A., and Rigobon, R. (2016): “The Billion Prices Project: Using Online Prices for Measurement and Research.” NBER Working Paper 22111, March.

CFTC (2010): “Findings Regarding the Market Events of May 6, 2010.” Report of the Staffs of the CFTC and SEC to the Joint Advisory Committee on Emerging Regulatory Issues, September 30.

Christie, S. (2005): “Is the Sharpe Ratio Useful in Asset Allocation?” MAFC Research Paper 31. Applied Finance Centre, Macquarie University.

Clarke, K. A. (2005): “The Phantom Menace: Omitted Variable Bias in Econometric Research.” Conflict Management and Peace Science, Vol. 22, No. 1, pp. 341–52.

Clarke, R., De Silva, H., and Thorley, S. (2002): “Portfolio Constraints and the Fundamental Law of Active Management.” Financial Analysts Journal, Vol. 58, pp. 48–66.

Cohen, L., and Frazzini, A. (2008): “Economic Links and Predictable Returns.” Journal of Finance, Vol. 63, No. 4, pp. 1977–2011.

De Miguel, V., Garlappi, L., and Uppal, R. (2009): “Optimal versus Naive Diversification: How Inefficient Is the 1/N Portfolio Strategy?” Review of Financial Studies, Vol. 22, pp. 1915–53.

Ding, C., and He, X. (2004): “K-Means Clustering via Principal Component Analysis.” In Proceedings of the 21st International Conference on Machine Learning. Available at http://ranger.uta.edu/~chqding/papers/KmeansPCA1.pdf

Easley, D., López de Prado, M., and O’Hara, M. (2011a): “Flow Toxicity and Liquidity in a High-Frequency World.” Review of Financial Studies, Vol. 25, No. 5, pp. 1457–93.

Easley, D., López de Prado, M., and O’Hara, M. (2011b): “The Microstructure of the ‘Flash Crash’: Flow Toxicity, Liquidity Crashes and the Probability of Informed Trading.” Journal of Portfolio Management, Vol. 37, No. 2, pp. 118–28.

Efron, B., and Hastie, T. (2016): Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. 1st ed. Cambridge University Press.

Embrechts, P., Klueppelberg, C., and Mikosch, T. (2003): Modelling Extremal Events. 1st ed. Springer.

Goutte, C., Toft, P., Rostrup, E., Nielsen, F., and Hansen, L. (1999): “On Clustering fMRI Time Series.” NeuroImage, Vol. 9, No. 3, pp. 298–310.

Grinold, R., and Kahn, R. (1999): Active Portfolio Management. 2nd ed. McGraw-Hill.

Gryak, J., Haralick, R., and Kahrobaei, D. (Forthcoming): “Solving the Conjugacy Decision Problem via Machine Learning.” Experimental Mathematics. Available at https://doi.org/10.1080/10586458.2018.1434704

Hacine-Gharbi, A., and Ravier, P. (2018): “A Binning Formula of Bi-histogram for Joint Entropy Estimation Using Mean Square Error Minimization.” Pattern Recognition Letters, Vol. 101, pp. 21–28.

Hacine-Gharbi, A., Ravier, P., Harba, R., and Mohamadi, T. (2012): “Low Bias Histogram-Based Estimation of Mutual Information for Feature Selection.” Pattern Recognition Letters, Vol. 33, pp. 1302–8.

Hamilton, J. (1994): Time Series Analysis. 1st ed. Princeton University Press.

Harvey, C., Liu, Y., and Zhu, C. (2016): “… and the Cross-Section of Expected Returns.” Review of Financial Studies, Vol. 29, No. 1, pp. 5–68. Available at https://ssrn.com/abstract=2249314

Hodge, V., and Austin, J. (2004): “A Survey of Outlier Detection Methodologies.” Artificial Intelligence Review, Vol. 22, No. 2, pp. 85–126.

IDC (2014): “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things.” EMC Digital Universe with Research and Analysis. April. Available at www.emc.com/leadership/digital-universe/2014iview/index.htm

Ingersoll, J., Spiegel, M., Goetzmann, W., and Welch, I. (2007): “Portfolio Performance Manipulation and Manipulation-Proof Performance Measures.” The Review of Financial Studies, Vol. 20, No. 5, pp. 1504–46.

Jaynes, E. (2003): Probability Theory: The Logic of Science. 1st ed. Cambridge University Press.

Jolliffe, I. (2002): Principal Component Analysis. 2nd ed. Springer.

Kraskov, A., Stoegbauer, H., and Grassberger, P. (2008): “Estimating Mutual Information.” Working paper. Available at https://arxiv.org/abs/cond-mat/0305641v1

Laloux, L., Cizeau, P., Bouchaud, J. P., and Potters, M. (2000): “Random Matrix Theory and Financial Correlations.” International Journal of Theoretical and Applied Finance, Vol. 3, No. 3, pp. 391–97.

Ledoit, O., and Wolf, M. (2004): “A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices.” Journal of Multivariate Analysis, Vol. 88, No. 2, pp. 365–411.

Lewandowski, D., Kurowicka, D., and Joe, H. (2009): “Generating Random Correlation Matrices Based on Vines and Extended Onion Method.” Journal of Multivariate Analysis, Vol. 100, pp. 1989–2001.

Liu, Y. (2004): “A Comparative Study on Feature Selection Methods for Drug Discovery.” Journal of Chemical Information and Modeling, Vol. 44, No. 5, pp. 1823–28. Available at https://pubs.acs.org/doi/abs/10.1021/ci049875d

Lo, A. (2002): “The Statistics of Sharpe Ratios.” Financial Analysts Journal, July, pp. 36–52.

Lochner, M., McEwen, J., Peiris, H., Lahav, O., and Winter, M. (2016): “Photometric Supernova Classification with Machine Learning.” The Astrophysical Journal, Vol. 225, No. 2. Available at http://iopscience.iop.org/article/10.3847/0067-0049/225/2/31/meta

López de Prado, M. (2016): “Building Diversified Portfolios that Outperform Out-of-Sample.” Journal of Portfolio Management, Vol. 42, No. 4, pp. 59–69.

López de Prado, M. (2018a): Advances in Financial Machine Learning. 1st ed. Wiley.

López de Prado, M. (2018b): “The 10 Reasons Most Machine Learning Funds Fail.” The Journal of Portfolio Management, Vol. 44, No. 6, pp. 120–33.

López de Prado, M. (2019a): “A Data Science Solution to the Multiple-Testing Crisis in Financial Research.” Journal of Financial Data Science, Vol. 1, No. 1, pp. 99–110.

López de Prado, M. (2019b): “Beyond Econometrics: A Roadmap towards Financial Machine Learning.” Working paper. Available at https://ssrn.com/abstract=3365282

López de Prado, M. (2019c): “Ten Applications of Financial Machine Learning.” Working paper. Available at https://ssrn.com/abstract=3365271

López de Prado, M., and Lewis, M. (2018): “Detection of False Investment Strategies Using Unsupervised Learning Methods.” Working paper. Available at https://ssrn.com/abstract=3167017

Louppe, G., Wehenkel, L., Sutera, A., and Geurts, P. (2013): “Understanding Variable Importances in Forests of Randomized Trees.” In Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 431–39.

Meila, M. (2007): “Comparing Clusterings – an Information Based Distance.” Journal of Multivariate Analysis, Vol. 98, pp. 873–95.

Mertens, E. (2002): “Variance of the IID estimator in Lo (2002).” Working paper, University of Basel.

Molnar, C. (2019): “Interpretable Machine Learning: A Guide for Making Black-Box Models Explainable.” Available at https://christophm.github.io/interpretable-ml-book/

Mullainathan, S., and Spiess, J. (2017): “Machine Learning: An Applied Econometric Approach.” Journal of Economic Perspectives, Vol. 31, No. 2, pp. 87–106.

Neyman, J., and Pearson, E. (1933): “IX. On the Problem of the Most Efficient Tests of Statistical Hypotheses.” Philosophical Transactions of the Royal Society, Series A, Vol. 231, No. 694–706, pp. 289–337.

Opdyke, J. (2007): “Comparing Sharpe Ratios: So Where Are the p-Values?” Journal of Asset Management, Vol. 8, No. 5, pp. 308–36.

Parzen, E. (1962): “On Estimation of a Probability Density Function and Mode.” The Annals of Mathematical Statistics, Vol. 33, No. 3, pp. 1065–76.

Resnick, S. (1987): Extreme Values, Regular Variation and Point Processes. 1st ed. Springer.

Romer, P. (2016): “The Trouble with Macroeconomics.” The American Economist, September 14.

Rosenblatt, M. (1956): “Remarks on Some Nonparametric Estimates of a Density Function.” The Annals of Mathematical Statistics, Vol. 27, No. 3, pp. 832–37.

Rousseeuw, P. (1987): “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis.” Computational and Applied Mathematics, Vol. 20, pp. 53–65.

Schlecht, J., Kaplan, M., Barnard, K., Karafet, T., Hammer, M., and Merchant, N. (2008): “Machine-Learning Approaches for Classifying Haplogroup from Y Chromosome STR Data.” PLOS Computational Biology, Vol. 4, No. 6. Available at https://doi.org/10.1371/journal.pcbi.1000093

Sharpe, W. (1966): “Mutual Fund Performance.” Journal of Business, Vol. 39, No. 1, pp. 119–38.

Sharpe, W. (1975): “Adjusting for Risk in Portfolio Performance Measurement.” Journal of Portfolio Management, Vol. 1, No. 2, pp. 29–34.

Sharpe, W. (1994): “The Sharpe Ratio.” Journal of Portfolio Management, Vol. 21, No. 1, pp. 49–58.

Šidák, Z. (1967): “Rectangular Confidence Regions for the Means of Multivariate Normal Distributions.” Journal of the American Statistical Association, Vol. 62, No. 318, pp. 626–33.

Solow, R. (2010): “Building a Science of Economics for the Real World.” Prepared statement of Robert Solow, Professor Emeritus, MIT, to the House Committee on Science and Technology, Subcommittee on Investigations and Oversight, July 20.

Steinbach, M., Levent, E., and Kumar, V. (2004): “The Challenges of Clustering High Dimensional Data.” In Wille, L. (ed.), New Directions in Statistical Physics. 1st ed. Springer, pp. 273–309.

Štrumbelj, E., and Kononenko, I. (2014): “Explaining Prediction Models and Individual Predictions with Feature Contributions.” Knowledge and Information Systems, Vol. 41, No. 3, pp. 647–65.

Varian, H. (2014): “Big Data: New Tricks for Econometrics.” Journal of Economic Perspectives, Vol. 28, No. 2, pp. 3–28.

Wasserstein, R., Schirm, A., and Lazar, N. (2019): “Moving to a World beyond p<0.05.” The American Statistician, Vol. 73, No. 1, pp. 1–19.

Wasserstein, R., and Lazar, N. (2016): “The ASA’s Statement on p-Values: Context, Process, and Purpose.” The American Statistician, Vol. 70, pp. 129–33.

Witten, D., Shojaie, A., and Zhang, F. (2013): “The Cluster Elastic Net for High-Dimensional Regression with Unknown Variable Grouping.” Technometrics, Vol. 56, No. 1, pp. 112–22.