- Assorted Business
- Predictions
- Questions

- Quick Review
- Regression
- False Discovery Intro

- False Discovery Rate, More than you wanted to know
- Loss functions
- My prediction walkthrough
- Homework intro (if time?)

April 1st, 2021

“How many people in the US will have had at least one dose by end of day on April 30th?”

Prediction: 148 million

90% CI: [130,169] million.

Based on CDC trend data, not what I gave you

- but clearly available on the page with target numbers.
- I pulled it in directions that felt better. Code online/later.

You don't always have the best data

But you could probably still do pretty well with the data I gave. CI calibration would be tough.

- R comments
- 1-indexing
- Usually you want to save scripts, not workspaces.
- Stay organized. Folders for homeworks, etc.
- Consider using shared drives or github to collaborate

- Office hours will be Fridays at 9AM

The basic model is as follows:

\(Perc.OneDose = \beta_0 + \beta_1 Delivered.100k + \beta_2 Perc.TwoDose + \epsilon\)

Where \(E[\epsilon] = 0\).

We care about \(\beta_1\) or perhaps \(\beta_2\). What are they?
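To make the coefficients concrete, here is a minimal sketch that fits this model by ordinary least squares on simulated data. Everything in it is an assumption for illustration: the data are synthetic, the "true" coefficient values are made up, and `ols_fit` is a hypothetical helper written in plain Python rather than anything used in class. It shows that \(\beta_1\) is the expected change in Perc.OneDose per one-unit increase in Delivered.100k with Perc.TwoDose held fixed, and likewise for \(\beta_2\).

```python
import random

def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination. Hypothetical helper for illustration."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Zero out this column in every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
true_beta = [5.0, 0.3, 1.2]  # made-up values, NOT estimates from real CDC data

X, y = [], []
for _ in range(200):
    delivered = random.uniform(40, 80)  # synthetic stand-in for Delivered.100k
    two_dose = random.uniform(10, 25)   # synthetic stand-in for Perc.TwoDose
    eps = random.gauss(0, 1)            # noise with E[eps] = 0
    X.append([1.0, delivered, two_dose])
    y.append(true_beta[0] + true_beta[1] * delivered
             + true_beta[2] * two_dose + eps)

beta_hat = ols_fit(X, y)
print("intercept, beta_1, beta_2 estimates:", [round(b, 3) for b in beta_hat])
```

With 200 simulated rows the two slope estimates should land close to the made-up values, which is the sense in which the fit "recovers" \(\beta_1\) and \(\beta_2\).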

We can compare p-values, which are measures of extremity, to a pre-set threshold (\(\alpha\)) which controls our chance of a false discovery.

But with lots of variables, how do we think about things?

- No correction? About \(p\alpha\) false rejections expected if every null is true.
- Bonferroni? At most a 5% chance of any false rejection (with \(\alpha = 0.05\)).

Both seem extreme. Want a middle ground.
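The two extremes can be checked with a small simulation. This is a sketch under stated assumptions: all \(p\) nulls are true, the tests are independent, and p-values are uniform on [0, 1] under the null. The function name `simulate_null_tests` and the choices \(p = 20\) and \(\alpha = 0.05\) are illustrative, not from the lecture.

```python
import random

def simulate_null_tests(p, alpha, n_sims, seed=0):
    """Simulate p independent tests where every null is true.

    Returns (average number of false rejections with no correction,
             share of simulations with at least one uncorrected false rejection,
             share of simulations with at least one false rejection under Bonferroni).
    """
    rng = random.Random(seed)
    total_false = 0
    any_uncorrected = 0
    any_bonferroni = 0
    for _ in range(n_sims):
        # Under a true null, a p-value is uniform on [0, 1] (assumption above).
        pvals = [rng.random() for _ in range(p)]
        n_rejected = sum(pv < alpha for pv in pvals)
        total_false += n_rejected
        any_uncorrected += n_rejected > 0
        # Bonferroni compares each p-value to alpha / p instead of alpha.
        any_bonferroni += any(pv < alpha / p for pv in pvals)
    return (total_false / n_sims,
            any_uncorrected / n_sims,
            any_bonferroni / n_sims)

p, alpha = 20, 0.05
avg_false, share_any, share_any_bonf = simulate_null_tests(p, alpha, n_sims=20000)

print("expected false rejections, no correction (p * alpha):", p * alpha)
print("simulated average false rejections:", round(avg_false, 3))
print("simulated P(any false rejection), no correction:", round(share_any, 3))
print("simulated P(any false rejection), Bonferroni:", round(share_any_bonf, 3))
```

Under these assumptions the uncorrected average should sit near \(p\alpha\), and the Bonferroni share should stay at or below roughly \(\alpha\), while an uncorrected analysis makes at least one false rejection in most runs.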

Notation Changed

We wish to test \(K\) simultaneous null hypotheses: \[H0_1, H0_2, \ldots, H0_K \] Out of the \(K\) null hypotheses, \(N_0\) are true nulls and \(N_1 = K-N_0\) are false, i.e. there is an effect.