---
title: "SampleSize"
author: "Jasper Olthuis"
date: "23-3-2023"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r Packages, message=FALSE, warning=FALSE}
# Load the required packages; library() stops with an error if one is missing
packages <- c('caret', 'pmsampsize', 'ggplot2')
invisible(lapply(packages, library, character.only = TRUE))
```
## Sample Size Calculation for a Predictive Model
The sample size calculations for predictive models are described in [this paper](https://www.bmj.com/content/bmj/368/bmj.m441.full.pdf). We use the calculations for a binary outcome variable. The R package [pmsampsize](https://cran.r-project.org/web/packages/pmsampsize/index.html) provides most of the calculations, and [this website](https://mvansmeden.shinyapps.io/BeyondEPV/) provides one that is not available in the package.
```{r pmsampsize}
pmsampsize::pmsampsize(
  type = 'b',          # binary outcome variable
  parameters = 16,     # number of candidate predictor parameters
  shrinkage = 0.9,     # recommended value
  prevalence = 0.327,  # outcome prevalence in the dataset
  seed = 1330,
  cstatistic = 0.90    # AUC for LR1 is around 0.95; 0.90 is a cautious choice
)
```
The calculation gives a minimum required sample size of $339$ individuals. Because we assume a C-statistic of $0.90$ rather than the observed value of around $0.95$, this estimate is conservative. Since our sample size is ~$1000$ individuals, it should be sufficient.
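
As a rough sanity check on this number, the chunk below recomputes one of the criteria from the linked paper by hand: estimating the overall outcome risk to within a given margin of error. It is a minimal sketch, not a replacement for the `pmsampsize` output. The margin of $0.05$ is an assumption taken from the paper's recommendation, and the variable names are illustrative only. Under these assumptions, this criterion alone already requires $339$ individuals.

```{r RiskPrecisionCheck}
# Inputs copied from the pmsampsize call above
prevalence <- 0.327
parameters <- 16

# Assumed margin of error for the overall outcome risk (not set in this document)
margin <- 0.05

# Sample size needed so that the 95% CI of the overall risk is within +/- margin
n_risk <- ceiling((1.96 / margin)^2 * prevalence * (1 - prevalence))
n_risk  # 339

# Expected events per candidate parameter at the minimum and at our sample size
epp_minimum   <- n_risk * prevalence / parameters
epp_available <- 1000 * prevalence / parameters
round(c(minimum = epp_minimum, available = epp_available), 1)
```

With roughly $1000$ individuals and a prevalence of $0.327$, we expect about $327$ events, or about $20$ events per candidate parameter, compared with about $7$ at the minimum sample size.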