Last updated: 2022-10-11 14:06:21. Homework answers available on
request.
Before Class
- Syllabus.
- Textbooks:
- “Elements of Statistical Learning” (ESL) by Hastie, Tibshirani, and
Friedman – online and here
- “Statistical Consequences of Fat Tails” by NN Taleb – online and here
- “Causal Inference: The Mixtape” by Scott Cunningham – online
- HW0: An ungraded assignment for making sure everyone is up to speed.
Entirely optional.
Week 1
- Lecture 1 – Admin, Syllabus, Stats 101, FDR Control – .Rmd file and PDF
- HW1a – Prediction Competition.
- Lecture 2 – Predictions, FDR Control, and Loss – .Rmd and PDF
- Lipids example:
- Code.
- Data is on Canvas under Lecture 2: “jointGwasMc_LDL.txt”
- Diabetes Example:
- Simulations in lecture: Code
- Homework 1: FDR control
- Textbook references:
- False discovery rate: Ch. 18.7 of ESL
- Loss: Ch. 2.4, Ch. 7.2, and Ch. 10.6 of ESL
Week 2
- Lecture 3 – Regression: Linear and Logit – Rmd and PDF
- Homework 2 – Due Wednesday, April 14
- Lecture 4 – Deviance, OOS, Bootstrap – Rmd and PDF
- Textbook References:
- Linear Regression: ESL Ch. 3.2
- Logistic Regression: ESL Ch. 4.4
- Deviance: ESL p. 124
- Out-of-Sample: ESL Ch. 7.1 (uses the term ‘generalization error’ or
‘validation error’)
- Bootstrap: ESL Ch. 7.11
Week 3
- Lecture 5 – Variable Selection – Rmd and PDF
- For those of you struggling with the homework, there is a lot of
valuable code in the old lecture Rmd files.
- Semiconductors: Data, code in lecture Rmd
- Comscore: Code in lecture Rmd, Data: domains, sites, total spend
- Lecture 6 – Cross Validation – Rmd and PDF
- The last few slides include a lot of useful code. If you can
interpret and run the cross-validation code there, you should gain a
solid grasp on cross-validation generally.
- Homework 3 and Rmd
- Due next Wednesday at midnight.
- Some of the ratios, etc., may be inverted in the solutions. This
shouldn’t change any interpretations.
- Textbook References:
- AIC: ESL Ch. 7.5
- BIC: ESL Ch. 7.7
- Stepwise: ESL Ch. 3.3
- LASSO/Ridge/Shrinkage: ESL Ch. 3.4, 3.6, 3.8
- Cross-Validation: ESL Ch. 7.10
- Bias-Variance Decomp: ESL Ch. 7.2, 7.3
Week 5
- Lecture 9 – ROC and Trees – Rmd
- Lecture 10 – Trees and Forests – and Rmd
- Textbook References:
- Trees: ESL Ch. 9.2
- Bagging: ESL Ch. 8.7
- Forests: ESL Ch. 15
Week 7
- Lecture 13 – Data Cleaning – Rmd
- Lecture 14 – Data Cleaning and Bayes – Rmd
- Homework 6 and Rmd – Cleaning
- The textbooks below may help.
- Textbook References: (giant references for R) Both free online.
Week 8
- Lecture 15 – Causal Inference 1: RCTs – Rmd
- Optional Predictions 3 – Sunday Cases Redux – Rmd
- Lecture 16 – Causal Inference 2: Targeting, Observational Methods – Rmd
- HW7 and Rmd
- Textbook References:
- RCTs/ATEs: Causal Inference Mixtape Ch. 4
- Observational Causal: Causal Inference Mixtape, Chapters 6, 7, 8,
and 10 for Regression Discontinuity, Instrumental Variables,
Diff-in-Diff, and Synthetic Controls respectively
- Targeting/Heterogeneous Treatment Effects: (no good texts at the
moment, so I’ve linked a few papers) Tibshirani, Hastie, et al. (and
here). CATE meets ML. Optimal Targeting with Heterogeneous TEs