Machine Learning Reading List

(Asterisks indicate “must reads”!)

Articles:

Seminal work by Breiman on the distinction between the classical statistical modeling approach and the more recent “algorithmic” modeling approach.

*Breiman (2001) Statistical Modeling: The Two Cultures.

Excellent introduction to concepts and issues in using machine learning for epidemiologists.

Bi et al. (2019) What is Machine Learning? A Primer for the Epidemiologist

Attempt to demonstrate the fundamentals behind the super learner.

Naimi & Balzer (2018) Stacked Generalization: An Introduction to Super Learning

Detailed resource on using the super learner in real data settings.

Kennedy (2017) Guide to Super Learner. URL: https://cran.r-project.org/web/packages/SuperLearner/vignettes/Guide-to-SuperLearner.html

Important example of some fundamental constraints on using data with algorithms to predict outcomes fairly.

Chouldechova (2016) Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. https://arxiv.org/abs/1610.07524

Excellent introduction to machine learning (emphasis on econometrics but very useful for epidemiologists).

Mullainathan & Spiess (2017) Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives 31(2): 87-106

Important example of how ML algorithms can yield very misleading predictions when deeper aspects of the data-modeling complex are not taken into account.

Caruana et al. (2015) Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.

Books:

Technical Skills

*Burkov (2019) The Hundred-Page Machine Learning Book
Burkov (2021) Machine Learning Engineering
Kuhn and Johnson (2016) Applied Predictive Modeling

Advanced Texts

Wasserman (2006) All of Nonparametric Statistics
Shalev-Shwartz and Ben-David (2014) Understanding Machine Learning: From Theory to Algorithms
Efron and Hastie (2017) Computer Age Statistical Inference: Algorithms, Evidence, and Data Science
Hastie, Tibshirani, Friedman (2009) The Elements of Statistical Learning
James, Witten, Hastie, Tibshirani (2017) An Introduction to Statistical Learning
