diff --git a/.nojekyll b/.nojekyll index 03fe33e2..64309386 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -8d593c3d \ No newline at end of file +f1989b81 \ No newline at end of file diff --git a/Aalto2023.html b/Aalto2023.html index 947f98ba..0d46c1f7 100644 --- a/Aalto2023.html +++ b/Aalto2023.html @@ -6,7 +6,7 @@ - + Bayesian Data Analysis course - Aalto 2023 – Bayesian Data Analysis course @@ -103,6 +69,7 @@ "search-label": "Search" } } + @@ -299,7 +266,10 @@

On this page

@@ -308,7 +278,7 @@

On this page

-

Assignment 7

+

Assignment 7

@@ -333,11 +303,415 @@

On this page

1 General information

-

This assignment will be published later!

+

The exercises here refer to the lecture 7/BDA chapter 5 content. The main topics for this assignment are the MCSE and importance sampling.

+

The exercises constitute 96% of the Quiz 7 grade.

+

We prepared a quarto notebook specific to this assignment to help you get started. You still need to fill in your answers on Mycourses! You can inspect this and future templates

+ +
+
+
+ +
+
+General Instructions for Answering the Assignment Questions +
+
+
+
+
+
    +
  • Questions below are exact copies of the text found in the MyCourses quiz and should serve as a notebook where you can store notes and code.
  • +
  • We recommend opening these notebooks in the Aalto JupyterHub, see how to use R and RStudio remotely.
  • +
  • For inspiration for code, have a look at the BDA R Demos and the specific Assignment code notebooks
  • +
  • Recommended additional self study exercises for each chapter in BDA3 are listed in the course web page. These will help to gain deeper understanding of the topic.
  • +
  • Common questions and answers regarding installation and technical problems can be found in Frequently Asked Questions (FAQ).
  • +
  • Deadlines for all assignments can be found on the course web page and in MyCourses.
  • +
  • You are allowed to discuss assignments with your friends, but it is not allowed to copy solutions directly from other students or from internet.
  • +
  • Do not share your answers publicly.
  • +
  • Do not copy answers from the internet or from previous years. We compare the answers to the answers from previous years and to the answers from other students this year.
  • +
  • Use of AI is allowed on the course, but the most of the work needs to by the student, and you need to report whether you used AI and in which way you used them (See points 5 and 6 in Aalto guidelines for use of AI in teaching).
  • +
  • All suspected plagiarism will be reported and investigated. See more about the Aalto University Code of Academic Integrity and Handling Violations Thereof.
  • +
  • If you have any suggestions or improvements to the course material, please post in the course chat feedback channel, create an issue, or submit a pull request to the public repository!
  • +
+
+
+
+
+

1.1 Assignment questions

+

For convenience the assignment questions are copied below. Answer the questions in MyCourses.

+Lecture 7/Chapter 5 of BDA Quiz (96% of grade) + +This week's assignment focuses on hierarchical models and modelling with brms. + +1. Exchangeability +An important building block of hierarchical models is the assumption of exchangeability. Let's review. + +1.1 Consider parameters \(\theta_j\) for j in 1...J. Which of these statements correctly describes exchangeability? + + +1.2 What best describes the difference between independence and exchangeability? + + +Consider the following: assume a box has N black and white balls but we do not know how many of each color. We pick a ball \(y_1\) at random, and we do not put it back, we then pick another ball \(y_2\) at random. Denote the number of black balls by B and white by W. + +1.3: Are observations \( y_1 \) and \( y_2 \) exchangeable? + + +1.4: Are observations \(y_1\) and \(y_2\) independent? + + +1.5: Can we act as if the two observations are independent? + + +1.6: Exchangeability allows us to express dependencies of data and parameters in a convenient form. Suppose we model a sequence of exchangeable random variables \( \theta \) via a governing, or population distribution, where conditional on some unknown parameters \( \phi \), we may assume independence between the elements of \( \theta \). Assume that \( \theta \) has J elements, write down an equation that conveniently factors the joint probability \( p(\theta,\phi) \): + + +De Finetti's theorem provides the theoretical basis for the equivalence to the independence assumption when J goes to infinity. In practice J is finite, but the difference may be small when J is relatively large. Check out the examples mentioned in the Chapter notes if you need more convincing. + +\( \phi \) is in general unknown. so the marginal prior for \( \theta \) must average over uncertainty in \( \phi \): \( \int \prod^J_{j=1} p(\theta_j | \phi) p(\phi)d\phi \). + +1.7 Based on the marginal prior formulation in the paragraph above, what can you say about the covariances \( \text{cov}( \theta_i, \theta_j) \)? + + +1.8 As with the parameters above, we often think of exchangeability as arising for our data model conditional on extra information x, such that the tuple \((x_i,y_i)\) are exchangeable, whereas \(y_i\) might not be. In which modeling context is this useful? Assume that the target data is denoted by \( y_i \). + + +2. Overview of hierarchical models + +2.1 Which of these best describes a hierarchical model? + + + +2.2: Consider that there are observations \(y\) indexed by observation number \(i\) and group \(j\). Suppose the data are modeled dependent on parameters \( \theta_j \) where \( j = 1,\dotsc,J\) indexes some meaningful grouping such as hospital-j specific health-outcomes, conditional on parameters \( \phi \) which are hyper-parameters of the prior distribution of \( \theta_j \). \( \phi \) are themselved modeled by prior distribution \( p(\phi) \). We think of the distribution \( p( \phi ) \) as the population distribution which generates the values for \( \theta \) in the hierarchical model. What are some of the benefits of hierarchical models compared to seperate models, which assume no relationship between j (and seperate models are estimated), and pooled models, which consider all j jointly, but do not model j specific parameters? + + +Throughout the rest of the course, we will often compare the hierarchical model, to a seperate model and the pooled model. Suppose for the questions below that you are modeling data \( y_{ij} \) where \( i \) refers to an observation within the \( j^{\text{th}}\) group, the observation model for \( y_{ij} \) is normal, and we consider hierarchies at the level of the parameters describing the location of \( y_{ij} \). + +2.3 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a seperate model for \( y_{ij} \)? \(\pi() \) stands for some prior distribution and \( \eta_j \) may be a vector of parameters such that \(cov(\eta_s,\eta_r) = 0 \) for \( s \neq r \). + + +2.4 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \( y_{ij} \)? \(\pi() \) stands for some prior distribution and \( \eta_j \) may be a vector of parameters such that \(cov(\eta_s,\eta_r) = 0 \) for \( s \neq r )\. + + + +2.5: Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \( y_{ij} \)? \(\pi() \) stands for some prior distribution and \( \eta_j \) may be a vector where \(cov(\eta_s,\eta_r) = 0 \) for \( s \neq r \). + + + +2.6: Why are hierarchical models sometimes also referred to as partially pooled models. You may find it helpful to review the first unnumbered equation on page 115 of BDA3? + + +It may help, before translating the model to code, to first describe the relationship between variables as a directed acyclic graph (DAG) (see lecture 7). We interpret the DAG starting from top to bottom as describing the generative mechanism implied by priors and observation model for \( y_{ij} \). Top level variables feed into distributions of lower level parameters and the observation model. Distributions are typically not further described in the DAG. By drawing variables sequentially from top to bottom, you can generate fake observations (push-forward distribution). In the case that the parameters have not yet been updated by the data information, this push-forward distribution is called the prior predictive distribution of \( y \). You have already created such a prior predictive distribution in Assignment 1, question 7 and we will return to this in the sections below. + +2.7: What is the difference between sequential draws from priors and the data model as described above to drawing from the joint posterior in Stan? + + +In the Figure 1 below, assume that all circular nodes indicate that the object inside can be generated according to some distribution. + +Figure 1 + +2.8: Which of these equations properly describes the posterior for the model shown in the diagram? + + + +3. Meta-analysis and hierarchical models + +Often similar research studies in areas such as medicine or social science will be published under slightly different conditions or replicated with different subjects. It is of interest to summarise and integrate those findings for which hierarchical models have become increasingly popular. + +3.1: Suppose \(y_i\) is the point estimate for an effect size of a single study, \( i \). Why is it often appropriate to model \( y_i \) by a normal distribution with known standard deviation \( \sigma_i \) which is taken as the standard error estimate for \( y_i \)? + + +3.2: Why are hiearchical models preferable to seperate and pooled models for meta-analysis? + + +3.3: Suppose the assumption of exchangeability is false because you know the estimates of effects across studies depends on extra information \( x_i \), e.g. geolocation and geolocation has a significant impact on the estimates. What should you do in this circumstance? + + +3.4: Based on the discussion above, which of the below would refer to a hierarchical model for \( y_i\)? + + + +4. Introduction to brms + +brms is an R package which makes writing models with Stan easier. + +Suppose you have oberved the following variables, from different groups: \(x, z, y\). \(i\) is the observation number and \(j\) the group indicator. Assume a model \(y_ij \sim N(\mu_ij, \sigma)\). +Translate the following equations for the linear predictor term (\mu_ij\) into brms syntax: + +4.1 \(\mu_ij = \alpha_0\): + + + + +4.2 \(\mu_ij = \alpha_0 + \beta_1 x_i\) + + + +4.3 \(\mu_ij = \alpha_0 + \beta_1 x_i + \beta_2 z_i\) + + + +4.4 \(\mu_ij = \alpha_0 + \alpha_j + \beta_1 x_i\) + + + +5. Simulation warmup + +In this task, you will simulate data from a hierarchical model to gain better insight into it. + + +The following R code simulates data from a hierarchical structure, and then plots the results. Experiment with this function by using it on different values of the arguments, and answer the following questions. This code is also included in the notebook for this assignment. + +hierarchical_sim <- function(group_pop_mean, + between_group_sd, + within_group_sd, + n_groups, + n_obs_per_group + ) { + # Generate group means + group_means <- rnorm( + n = n_groups, + mean = group_pop_mean, + sd = between_group_sd + ) + + # Generate observations + + ## Create an empty vector for observations + y <- numeric() + ## Create a vector for the group identifier + group <- rep(1:n_groups, each = n_obs_per_group) + + for (j in 1:n_groups) { + ### Generate one group observations + group_y <- rnorm( + n = n_obs_per_group, + mean = group_means[j], + sd = within_group_sd + ) + ### Append the group observations to the vector + y <- c(y, group_y) + } + + # Combine into a data frame + data <- data.frame( + group = factor(group), + y = y + ) + + # Plot the data + ggplot(data, aes(x = y, y = group)) + + geom_point() + + geom_vline(xintercept = group_pop_mean, linetype = "dashed") +} + +5.1 Which of the following generative models does the code correspond to? + +\(i\) is the observation number, \(j\) is the group indicator. + + + +5.2 What does the vertical dashed line in the plot represent? + + +5.3 Which function call would plausibly create a plot like the following? + + + +5.4 Which of these statements correctly describes the behaviour when the number of groups is increased? + + +5.5 Which of these statements correctly describes the behaviour when the ratio of the between group and within group variance is changed? + + +6. Sleep deprivation +In many cases the same people will have several data points collected (e.g. at different time points). A hierarchical model is well suited for this, with each group corresponding to a person. The sleepstudy dataset contains observations of reaction times for different people after differing number of days of sleep deprivation. + +Your task is to fit a hierarchical normal linear model in brms. +The observation model will be \(Reaction_{ij} \sim N(\mu_{ij}, \sigma)\). But depending on the setup, the \(\mu_{ij}\) term will differ. + +First fit a model with a population-level intercept, population-level effect of Days and varying intercepts per Subject. Note: in order to have the Intercept parameter be directly interpretable, use `center = FALSE` when creating the brms formula. See the code notebook for more details. + +Use the following priors (check the code notebook for how to specify): + +\(\alpha_0 \sim normal(250, 100)\) +\(\beta_0 \sim normal(0, 20)\) +\(\alpha_j \sim normal(0, \tau)\) +\(\tau \sim normal^+(0, 100)\) +\(\sigma \sim normal^+(0, 100)\) + + +6.1 Which of these formulae correctly defines the linear predictor term in a model with population-level intercept, population-level effect of Days and varying intercepts per Subject? - +6.2 Which is the correct brms formula for this model? + + +6.3 Based on the posterior, the estimate of the population-level intercept is . + +6.4 The population-level intercept can be interpreted as: + + +6.5 Based on the posterior, the estimate of the population-level effect of Days on Reaction is . + +6.6 The population-level effect of Days can be interpreted as: + + +6.7 Based on the posterior, the estimate of the standard deviation of the Intercept between Subjects is . + +6.8 The standard deviation of the Subject-specific Intercept can be interpreted as: + + +Next fit a model with subject-specific effects of Days in addition to the other terms. + +Use the prior \(\beta_j \sim normal(0, 20)\) +In this model, we can consider a correlation between the by-Subject intercept and Days effects. Use an \(LKJ(2)\) prior on this, \(\rho \sim LKJ(2)\). + +6.9 Which of these formulae correctly defines the linear predictor in this case? + + +6.10 Which is the correct brms formula for this model? + + + +6.11 Based on the posterior, the estimate of the standard deviation of Subject-specific effect of Days is . + +6.12 The the standard deviation of the Subject specific Intercept can be interpreted as: + +6.13 Based on the posterior, the estimate of correlation between the Subject-specific Intercept and effect of Days is . + +6.14 After fitting both models, which of the statements describe the results best (here interpret "plausible" as what is contained within the 95% posterior interval): + + + + +7. School calendar dataset + +As you've learned from the above, hierarchical models are also useful for meta-analyses. +The dataset dat.konstantopolous2011 from the metadat package contains results from different studies conducted on the effect of changing the school calendar from a standard one with a long summer break to a modified one with more regular but shorter breaks. + +Each result of a study is a standardized estimated mean effect on student achievement, where positive values indicate improvement (yi) and the variance of the estimate (vi) for a school. + +The meta-analysis model can be written as follows: \(y_{ijk} \sim normal(\mu_{ijk}, \sigma_{ijk})\). \(i\) refers to an observation, \(j\) refers to a school, \(k\) refers to a district. \(\mu_{ijk}\) thus refers to one observation \(i\) at school \(j\) in district \(k\). \(\mu_0\) is the population-level effect, \(\mu_j\) is the school-specific effect, and \(\mu_k\) is the district-specific effect. + +In this case, the \(\sigma_{ijk}\) values are known (the standard errors reported from each study) and the \(\mu_{ijk}\) term will differ depending on the model. + +In this exercise you will fit four different models to the same data set: Pooled, separate, partially pooled within school-specific effects, and partially pooled within schools and district effects. + +For all models, use brms with the default priors and use the following settings: +iter = 2000, warmup = 1000, chains = 4, seed = 2024 + +The dataset has separate columns for school and district, but for the models, we will want a unique identifier for each school. Before fitting any models, add a new column to the dataset called "district_school" which is the combination of the district and the school. This is then unique for each school, and will be used in the model formulae. + + +Pooled model: + +Use brms to fit a pooled model and answer the following questions. Check the code notebook for help starting. + +The pooled model formula is: \(\mu_{ijk} = \mu_0\). +7.1 The pooled model is written as: + + + +7.2 How many parameters are estimated in the pooled model? + + +7.3 What is the Rhat value for the Intercept term? + + +7.4 Based on the Rhat value, have the chains converged? + + +7.5 What is the posterior mean (labelled Estimate) and lower and upper 95% posterior interval bounds (labelled CI) for the Intercept? +Mean: , lower 95% interval bound: , upper 95% interval bound: + +7.6 Based on the posterior of the Intercept, what would you conclude about the effect of the intervention: + + +7.7 Suppose there is a new school in an existing district (School = 9, District = 86). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument. + + +7.8 Suppose there is a new school in a new district (School = 1, District = 30). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument. + + +Separate model: + +Use brms to fit a separate model and answer the following questions. Check the code notebook for help starting. + +The separate model formula is: \(\mu_{ijk}= \mu_{ijk}\). +7.9 The separate model is written as: + + + +7.10 How many parameters are estimated in the separate model? + + +7.11 What is the estimated effect for School 3 in District 71? +Mean: , lower 95% interval bound: , upper 95% interval bound: + +7.12 What is the estimated effect of School 7 in District 86? +Mean: , lower 95% interval bound: , upper 95% interval bound: + +7.13 Based on the posterior for the separate model, what could reasonably be concluded about the effect of the intervention: + + +7.14 Suppose there is a new school in an existing district (District = 86, School = 9). Why can we not easily predict the effect for this school based on the separate model posterior? + + +Partially pooled model for schools. + +The partially pooled model formula is: \(\mu_{ijk} = \mu_0 + \mu_j\) +7.15 And the partially pooled hierarchical model: + + + +7.16 How many parameters are estimated in the partially pooled for schools model? + + +7.17 Based on the posterior for the pooled model, what could reasonably be concluded about the effect of the intervention: + + +7.18 What is the estimated effect for School 3 in District 71? +Mean: , lower 95% interval bound: , upper 95% interval bound: + +7.19 What is the estimated effect of School 7 in District 86? +Mean: , lower 95% interval bound: , upper 95% interval bound: + +7.20 Suppose there is a new school in an existing district (District = 86, School = 9). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE. + + +7.21 Suppose there is a new school in a new district (District = 30, School = 1). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE. + + +Partially pooled model for schools in districts: + +In this data set, as there is a two-level structure, where schools are nested in districts, we can add another level to the hierarchy. The additional hierarchy formula is: \(\mu_{ijk} = \mu_0 + \mu_j + \mu_k\) + +7.22 What is the correct brms formula? + + +7.23 How many parameters are estimated in the partially pooled for schools in districts model? + + +7.24 Suppose there is a new school in an existing district (District = 86, School = 9). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE. + + +7.25 Suppose there is a new school in a new district (District = 30, School = 1). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the `newdata` argument with `allow_new_levels = TRUE`. + + +7.26 Which of the following statements are true relating to exchangeability assumptions in the models: + + +7.27 After fitting all the models, choose which of the following statements relating to the results are true: + + + + + +
@@ -430,57 +804,6 @@

1 General informa container: window.document.getElementById('quarto-embedded-source-code-modal') }); clipboardModal.on('success', onCopySuccess); - } - const viewSource = window.document.getElementById('quarto-view-source') || - window.document.getElementById('quarto-code-tools-source'); - if (viewSource) { - const sourceUrl = viewSource.getAttribute("data-quarto-source-url"); - viewSource.addEventListener("click", function(e) { - if (sourceUrl) { - // rstudio viewer pane - if (/\bcapabilities=\b/.test(window.location)) { - window.open(sourceUrl); - } else { - window.location.href = sourceUrl; - } - } else { - const modal = new bootstrap.Modal(document.getElementById('quarto-embedded-source-code-modal')); - modal.show(); - } - return false; - }); - } - function toggleCodeHandler(show) { - return function(e) { - const detailsSrc = window.document.querySelectorAll(".cell > details > .sourceCode"); - for (let i=0; i .sourceCode"); - const fromCls = show ? "hidden" : "unhidden"; - const toCls = show ? "unhidden" : "hidden"; - for (let i=0; i1 General informa Assignment 8 - + diff --git a/assignments/search.json b/assignments/search.json index ccad8669..ec2a9420 100644 --- a/assignments/search.json +++ b/assignments/search.json @@ -37,7 +37,7 @@ "href": "template7.html", "title": "Notebook for Assignment 7", "section": "", - "text": "1 General information\nThis notebook will be published later!", + "text": "1 General information\nThis assignment relates to Lecture 7 and Chapter 5.\nWe recommend using JupyterHub (which has all the needed packages pre-installed).\n\nReading instructions:\n\nThe reading instructions for BDA3 Chapter 5.\n\n\n\n\n\n\n\n\nGeneral Instructions for Answering the Assignment Questions\n\n\n\n\n\n\nQuestions below are exact copies of the text found in the MyCourses quiz and should serve as a notebook where you can store notes and code.\nWe recommend opening these notebooks in the Aalto JupyterHub, see how to use R and RStudio remotely.\nFor inspiration for code, have a look at the BDA R Demos and the specific Assignment code notebooks\nRecommended additional self study exercises for each chapter in BDA3 are listed in the course web page. These will help to gain deeper understanding of the topic.\nCommon questions and answers regarding installation and technical problems can be found in Frequently Asked Questions (FAQ).\nDeadlines for all assignments can be found on the course web page and in MyCourses.\nYou are allowed to discuss assignments with your friends, but it is not allowed to copy solutions directly from other students or from internet.\nDo not share your answers publicly.\nDo not copy answers from the internet or from previous years. We compare the answers to the answers from previous years and to the answers from other students this year.\nUse of AI is allowed on the course, but the most of the work needs to by the student, and you need to report whether you used AI and in which way you used them (See points 5 and 6 in Aalto guidelines for use of AI in teaching).\nAll suspected plagiarism will be reported and investigated. See more about the Aalto University Code of Academic Integrity and Handling Violations Thereof.\nIf you have any suggestions or improvements to the course material, please post in the course chat feedback channel, create an issue, or submit a pull request to the public repository!\n\n\n\n\n\nif (!require(tidybayes)) {\n install.packages(\"tidybayes\")\n library(tidybayes)\n}\n\nLoading required package: tidybayes\n\n\nWarning in library(package, lib.loc = lib.loc, character.only = TRUE,\nlogical.return = TRUE, : there is no package called 'tidybayes'\n\n\nInstalling package into '/home/runner/work/_temp/Library'\n(as 'lib' is unspecified)\n\n\nalso installing the dependencies 'svUnit', 'arrayhelpers'\n\nif (!require(brms)) {\n install.packages(\"brms\")\n library(brms)\n}\n\nLoading required package: brms\n\n\nLoading required package: Rcpp\n\n\nLoading 'brms' package (version 2.22.0). Useful instructions\ncan be found by typing help('brms'). A more detailed introduction\nto the package is available through vignette('brms_overview').\n\n\n\nAttaching package: 'brms'\n\n\nThe following objects are masked from 'package:tidybayes':\n\n dstudent_t, pstudent_t, qstudent_t, rstudent_t\n\n\nThe following object is masked from 'package:stats':\n\n ar\n\nif (!require(metadat)) {\n install.packages(\"metadat\")\n library(metadat)\n}\n\nLoading required package: metadat\n\n\nWarning in library(package, lib.loc = lib.loc, character.only = TRUE,\nlogical.return = TRUE, : there is no package called 'metadat'\n\n\nInstalling package into '/home/runner/work/_temp/Library'\n(as 'lib' is unspecified)\n\n\nalso installing the dependency 'mathjaxr'\n\nif(!require(cmdstanr)){\n install.packages(\"cmdstanr\", repos = c(\"https://mc-stan.org/r-packages/\", getOption(\"repos\")))\n library(cmdstanr)\n}\n\nLoading required package: cmdstanr\n\n\nThis is cmdstanr version 0.8.1\n\n\n- CmdStanR documentation and vignettes: mc-stan.org/cmdstanr\n\n\n- CmdStan path: /home/runner/.cmdstan/cmdstan-2.35.0\n\n\n- CmdStan version: 2.35.0\n\ncmdstan_installed <- function(){\n res <- try(out <- cmdstanr::cmdstan_path(), silent = TRUE)\n !inherits(res, \"try-error\")\n}\nif(!cmdstan_installed()){\n install_cmdstan()\n}\n\n\n\n2 Simulation warm-up\nHere is the function to simulate and plot observations from a hierarchical data-generating process.\n\nhierarchical_sim <- function(group_pop_mean,\n between_group_sd,\n within_group_sd,\n n_groups,\n n_obs_per_group\n ) {\n # Generate group means\n group_means <- rnorm(\n n = n_groups,\n mean = group_pop_mean,\n sd = between_group_sd\n )\n\n # Generate observations\n\n ## Create an empty vector for observations\n y <- numeric()\n ## Create a vector for the group identifier\n group <- rep(1:n_groups, each = n_obs_per_group)\n \n for (j in 1:n_groups) {\n ### Generate one group observations\n group_y <- rnorm(\n n = n_obs_per_group,\n mean = group_means[j],\n sd = within_group_sd\n )\n ### Append the group observations to the vector\n y <- c(y, group_y)\n }\n\n # Combine into a data frame\n data <- data.frame(\n group = factor(group),\n y = y\n )\n\n # Plot the data\n ggplot(data, aes(x = y, y = group)) +\n geom_point() +\n geom_vline(xintercept = group_pop_mean, linetype = \"dashed\")\n}\n\nExample using the function:\n\nhierarchical_sim(\n group_pop_mean = 50,\n between_group_sd = 5,\n within_group_sd = 1,\n n_groups = 10,\n n_obs_per_group = 5\n )\n\nError in ggplot(data, aes(x = y, y = group)): could not find function \"ggplot\"\n\n\n\n\n3 Sleep deprivation\nThe dataset sleepstudy is available by using the command data(sleepstudy, package = \"lme4\")\nBelow is some code for fitting a brms model. This model is a simple pooled model. You will need to fit a hierarchical model as explained in the assignment, but this code should help getting started.\nLoad the dataset\n\ndata(sleepstudy, package = \"lme4\")\n\nError in find.package(package, lib.loc, verbose = verbose): there is no package called 'lme4'\n\n\nSpecify the formula and observation family:\n\nsleepstudy_pooled_formula <- bf(\n Reaction ~ 1 + Days,\n family = \"gaussian\",\n center = FALSE\n)\n\nWe can see the parameters and default priors with\n\nget_prior(pooled_formula, data = sleepstudy)\n\nError: object 'pooled_formula' not found\n\n\nWe can then specify the priors:\n\n(sleepstudy_pooled_priors <- c(\n prior(\n normal(400, 100),\n class = \"b\",\n coef = \"Intercept\"\n ),\n prior(\n normal(0, 50),\n class = \"b\",\n coef = \"Days\"\n ),\n prior(\n normal(0, 50),\n class = \"sigma\"\n )\n))\n\n prior class coef group resp dpar nlpar lb ub source\n normal(400, 100) b Intercept <NA> <NA> user\n normal(0, 50) b Days <NA> <NA> user\n normal(0, 50) sigma <NA> <NA> user\n\n\nAnd then fit the model:\n\nsleepstudy_pooled_fit <- brm(\n formula = pooled_formula,\n prior = pooled_priors,\n data = sleepstudy\n)\n\nError: object 'pooled_formula' not found\n\n\nWe can inspect the model fit:\n\nsummary(pooled_fit)\n\nError: object 'pooled_fit' not found\n\n\n\n\n4 School calendar\nMeta-analysis models can be fit in brms. When the standard error is known, the se() function can be used to specify it.\nThe dataset dat.konstantopoulos2011 has the observations for the school calendar intervention meta-analysis.\n\ndata(dat.konstantopoulos2011, package = \"metadat\")\n\nAs mentioned in the assignment instructions, a unique identifier for school needs to be created by combining the district and school:\n\nschoolcalendar_data <- dat.konstantopoulos2011 |>\n dplyr::mutate(\n school = factor(school),\n district = factor(district),\n district_school = interaction(district, school, drop = TRUE, sep = \"_\")\n )\n\nThen the models can be fit\n\nschoolcalendar_pooled_formula <- bf(\n formula = yi | se(sqrt(vi)) ~ 1,\n family = \"gaussian\"\n) \n\nschoolcalendar_pooled_fit <- brm(\n formula = schoolcalendar_pooled_formula,\n data = schoolcalendar_data\n)\nPredictions for a new school can be made using the posterior_epred function:\n\nnew_school <- data.frame(\n school = factor(1),\n district = factor(1),\n district_school = factor(\"1_1\"),\n vi = 0 # the expectation of the prediction is not affected by the sampling variance, so this can be any number\n)\n \n\nschoolcalendar_post_epred <- posterior_epred(\n schoolcalendar_pooled_fit,\n newdata = new_school,\n allow_new_levels = TRUE\n )\nIt can be helpful to plot the posterior estimates. Here is a function that will do this:\n\nplot_school_posteriors <- function(fit, dataset) {\n tidybayes::add_predicted_draws(dataset, fit) |>\n ggplot(\n aes(\n x = .prediction,\n y = interaction(district, school, sep = \", \", lex.order = TRUE))) +\n tidybayes::stat_halfeye() +\n ylab(\"District, school\") +\n xlab(\"Posterior effect\")\n}\n\nAnd can be used as follows:\nplot_school_posteriors(\n fit = schoolcalendar_pooled_fit,\n dataset = school_calendar_data\n)", "crumbs": [ "Templates", "Notebook for Assignment 7" @@ -389,7 +389,18 @@ "href": "assignment7.html", "title": "Assignment 7", "section": "", - "text": "1 General information\nThis assignment will be published later!", + "text": "The exercises here refer to the lecture 7/BDA chapter 5 content. The main topics for this assignment are the MCSE and importance sampling.\nThe exercises constitute 96% of the Quiz 7 grade.\nWe prepared a quarto notebook specific to this assignment to help you get started. You still need to fill in your answers on Mycourses! You can inspect this and future templates\n\nas a rendered html file (to access the qmd file click the “</> Code” button at the top right hand corner of the template)\n\n\n\n\n\n\n\nGeneral Instructions for Answering the Assignment Questions\n\n\n\n\n\n\nQuestions below are exact copies of the text found in the MyCourses quiz and should serve as a notebook where you can store notes and code.\nWe recommend opening these notebooks in the Aalto JupyterHub, see how to use R and RStudio remotely.\nFor inspiration for code, have a look at the BDA R Demos and the specific Assignment code notebooks\nRecommended additional self study exercises for each chapter in BDA3 are listed in the course web page. These will help to gain deeper understanding of the topic.\nCommon questions and answers regarding installation and technical problems can be found in Frequently Asked Questions (FAQ).\nDeadlines for all assignments can be found on the course web page and in MyCourses.\nYou are allowed to discuss assignments with your friends, but it is not allowed to copy solutions directly from other students or from internet.\nDo not share your answers publicly.\nDo not copy answers from the internet or from previous years. We compare the answers to the answers from previous years and to the answers from other students this year.\nUse of AI is allowed on the course, but the most of the work needs to by the student, and you need to report whether you used AI and in which way you used them (See points 5 and 6 in Aalto guidelines for use of AI in teaching).\nAll suspected plagiarism will be reported and investigated. See more about the Aalto University Code of Academic Integrity and Handling Violations Thereof.\nIf you have any suggestions or improvements to the course material, please post in the course chat feedback channel, create an issue, or submit a pull request to the public repository!\n\n\n\n\n\n\nFor convenience the assignment questions are copied below. Answer the questions in MyCourses.\nLecture 7/Chapter 5 of BDA Quiz (96% of grade)\n\nThis week's assignment focuses on hierarchical models and modelling with brms.\n\n1. Exchangeability\nAn important building block of hierarchical models is the assumption of exchangeability. Let's review.\n\n1.1 Consider parameters \\(\\theta_j\\) for j in 1...J. Which of these statements correctly describes exchangeability?\n\n\n1.2 What best describes the difference between independence and exchangeability? \n\n\nConsider the following: assume a box has N black and white balls but we do not know how many of each color. We pick a ball \\(y_1\\) at random, and we do not put it back, we then pick another ball \\(y_2\\) at random. Denote the number of black balls by B and white by W.\n\n1.3: Are observations \\( y_1 \\) and \\( y_2 \\) exchangeable?\n\n\n1.4: Are observations \\(y_1\\) and \\(y_2\\) independent?\n\n\n1.5: Can we act as if the two observations are independent?\n\n\n1.6: Exchangeability allows us to express dependencies of data and parameters in a convenient form. Suppose we model a sequence of exchangeable random variables \\( \\theta \\) via a governing, or population distribution, where conditional on some unknown parameters \\( \\phi \\), we may assume independence between the elements of \\( \\theta \\). Assume that \\( \\theta \\) has J elements, write down an equation that conveniently factors the joint probability \\( p(\\theta,\\phi) \\):\n\n\nDe Finetti's theorem provides the theoretical basis for the equivalence to the independence assumption when J goes to infinity. In practice J is finite, but the difference may be small when J is relatively large. Check out the examples mentioned in the Chapter notes if you need more convincing. \n\n\\( \\phi \\) is in general unknown. so the marginal prior for \\( \\theta \\) must average over uncertainty in \\( \\phi \\): \\( \\int \\prod^J_{j=1} p(\\theta_j | \\phi) p(\\phi)d\\phi \\). \n\n1.7 Based on the marginal prior formulation in the paragraph above, what can you say about the covariances \\( \\text{cov}( \\theta_i, \\theta_j) \\)?\n\n\n1.8 As with the parameters above, we often think of exchangeability as arising for our data model conditional on extra information x, such that the tuple \\((x_i,y_i)\\) are exchangeable, whereas \\(y_i\\) might not be. In which modeling context is this useful? Assume that the target data is denoted by \\( y_i \\).\n \n\n2. Overview of hierarchical models\n\n2.1 Which of these best describes a hierarchical model?\n\n\n\n2.2: Consider that there are observations \\(y\\) indexed by observation number \\(i\\) and group \\(j\\). Suppose the data are modeled dependent on parameters \\( \\theta_j \\) where \\( j = 1,\\dotsc,J\\) indexes some meaningful grouping such as hospital-j specific health-outcomes, conditional on parameters \\( \\phi \\) which are hyper-parameters of the prior distribution of \\( \\theta_j \\). \\( \\phi \\) are themselved modeled by prior distribution \\( p(\\phi) \\). We think of the distribution \\( p( \\phi ) \\) as the population distribution which generates the values for \\( \\theta \\) in the hierarchical model. What are some of the benefits of hierarchical models compared to seperate models, which assume no relationship between j (and seperate models are estimated), and pooled models, which consider all j jointly, but do not model j specific parameters?\n \n\nThroughout the rest of the course, we will often compare the hierarchical model, to a seperate model and the pooled model. Suppose for the questions below that you are modeling data \\( y_{ij} \\) where \\( i \\) refers to an observation within the \\( j^{\\text{th}}\\) group, the observation model for \\( y_{ij} \\) is normal, and we consider hierarchies at the level of the parameters describing the location of \\( y_{ij} \\). \n\n2.3 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a seperate model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector of parameters such that \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r \\).\n\n\n2.4 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector of parameters such that \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r )\\.\n\n\n\n2.5: Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector where \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r \\).\n\n\n\n2.6: Why are hierarchical models sometimes also referred to as partially pooled models. You may find it helpful to review the first unnumbered equation on page 115 of BDA3?\n\n\nIt may help, before translating the model to code, to first describe the relationship between variables as a directed acyclic graph (DAG) (see lecture 7). We interpret the DAG starting from top to bottom as describing the generative mechanism implied by priors and observation model for \\( y_{ij} \\). Top level variables feed into distributions of lower level parameters and the observation model. Distributions are typically not further described in the DAG. By drawing variables sequentially from top to bottom, you can generate fake observations (push-forward distribution). In the case that the parameters have not yet been updated by the data information, this push-forward distribution is called the prior predictive distribution of \\( y \\). You have already created such a prior predictive distribution in Assignment 1, question 7 and we will return to this in the sections below. \n\n2.7: What is the difference between sequential draws from priors and the data model as described above to drawing from the joint posterior in Stan?\n\n\nIn the Figure 1 below, assume that all circular nodes indicate that the object inside can be generated according to some distribution.\n\nFigure 1\n\n2.8: Which of these equations properly describes the posterior for the model shown in the diagram?\n\n\n\n3. Meta-analysis and hierarchical models\n\nOften similar research studies in areas such as medicine or social science will be published under slightly different conditions or replicated with different subjects. It is of interest to summarise and integrate those findings for which hierarchical models have become increasingly popular. \n\n3.1: Suppose \\(y_i\\) is the point estimate for an effect size of a single study, \\( i \\). Why is it often appropriate to model \\( y_i \\) by a normal distribution with known standard deviation \\( \\sigma_i \\) which is taken as the standard error estimate for \\( y_i \\)?\n\n\n3.2: Why are hiearchical models preferable to seperate and pooled models for meta-analysis?\n\n\n3.3: Suppose the assumption of exchangeability is false because you know the estimates of effects across studies depends on extra information \\( x_i \\), e.g. geolocation and geolocation has a significant impact on the estimates. What should you do in this circumstance?\n\n\n3.4: Based on the discussion above, which of the below would refer to a hierarchical model for \\( y_i\\)?\n\n\n\n4. Introduction to brms\n\nbrms is an R package which makes writing models with Stan easier.\n\nSuppose you have oberved the following variables, from different groups: \\(x, z, y\\). \\(i\\) is the observation number and \\(j\\) the group indicator. Assume a model \\(y_ij \\sim N(\\mu_ij, \\sigma)\\).\nTranslate the following equations for the linear predictor term (\\mu_ij\\) into brms syntax:\n\n4.1 \\(\\mu_ij = \\alpha_0\\):\n\n\n\n\n4.2 \\(\\mu_ij = \\alpha_0 + \\beta_1 x_i\\)\n\n\n\n4.3 \\(\\mu_ij = \\alpha_0 + \\beta_1 x_i + \\beta_2 z_i\\)\n\n\n\n4.4 \\(\\mu_ij = \\alpha_0 + \\alpha_j + \\beta_1 x_i\\)\n\n\n\n5. Simulation warmup\n\nIn this task, you will simulate data from a hierarchical model to gain better insight into it.\n\n\nThe following R code simulates data from a hierarchical structure, and then plots the results. Experiment with this function by using it on different values of the arguments, and answer the following questions. This code is also included in the notebook for this assignment.\n\nhierarchical_sim <- function(group_pop_mean,\n between_group_sd,\n within_group_sd,\n n_groups,\n n_obs_per_group\n ) {\n # Generate group means\n group_means <- rnorm(\n n = n_groups,\n mean = group_pop_mean,\n sd = between_group_sd\n )\n\n # Generate observations\n\n ## Create an empty vector for observations\n y <- numeric()\n ## Create a vector for the group identifier\n group <- rep(1:n_groups, each = n_obs_per_group)\n\n for (j in 1:n_groups) {\n ### Generate one group observations\n group_y <- rnorm(\n n = n_obs_per_group,\n mean = group_means[j],\n sd = within_group_sd\n )\n ### Append the group observations to the vector\n y <- c(y, group_y)\n }\n\n # Combine into a data frame\n data <- data.frame(\n group = factor(group),\n y = y\n )\n\n # Plot the data\n ggplot(data, aes(x = y, y = group)) +\n geom_point() +\n geom_vline(xintercept = group_pop_mean, linetype = \"dashed\")\n}\n\n5.1 Which of the following generative models does the code correspond to?\n\n\\(i\\) is the observation number, \\(j\\) is the group indicator.\n\n\n\n5.2 What does the vertical dashed line in the plot represent?\n\n\n5.3 Which function call would plausibly create a plot like the following?\n\n\n\n5.4 Which of these statements correctly describes the behaviour when the number of groups is increased?\n\n\n5.5 Which of these statements correctly describes the behaviour when the ratio of the between group and within group variance is changed?\n\n\n6. Sleep deprivation\nIn many cases the same people will have several data points collected (e.g. at different time points). A hierarchical model is well suited for this, with each group corresponding to a person. The sleepstudy dataset contains observations of reaction times for different people after differing number of days of sleep deprivation.\n\nYour task is to fit a hierarchical normal linear model in brms.\nThe observation model will be \\(Reaction_{ij} \\sim N(\\mu_{ij}, \\sigma)\\). But depending on the setup, the \\(\\mu_{ij}\\) term will differ.\n\nFirst fit a model with a population-level intercept, population-level effect of Days and varying intercepts per Subject. Note: in order to have the Intercept parameter be directly interpretable, use `center = FALSE` when creating the brms formula. See the code notebook for more details.\n\nUse the following priors (check the code notebook for how to specify):\n\n\\(\\alpha_0 \\sim normal(250, 100)\\)\n\\(\\beta_0 \\sim normal(0, 20)\\)\n\\(\\alpha_j \\sim normal(0, \\tau)\\)\n\\(\\tau \\sim normal^+(0, 100)\\)\n\\(\\sigma \\sim normal^+(0, 100)\\)\n\n\n6.1 Which of these formulae correctly defines the linear predictor term in a model with population-level intercept, population-level effect of Days and varying intercepts per Subject?\n\n\n6.2 Which is the correct brms formula for this model?\n\n\n\n6.3 Based on the posterior, the estimate of the population-level intercept is .\n\n6.4 The population-level intercept can be interpreted as:\n\n\n6.5 Based on the posterior, the estimate of the population-level effect of Days on Reaction is .\n\n6.6 The population-level effect of Days can be interpreted as:\n\n\n6.7 Based on the posterior, the estimate of the standard deviation of the Intercept between Subjects is .\n\n6.8 The standard deviation of the Subject-specific Intercept can be interpreted as:\n\n\nNext fit a model with subject-specific effects of Days in addition to the other terms.\n\nUse the prior \\(\\beta_j \\sim normal(0, 20)\\)\nIn this model, we can consider a correlation between the by-Subject intercept and Days effects. Use an \\(LKJ(2)\\) prior on this, \\(\\rho \\sim LKJ(2)\\).\n\n6.9 Which of these formulae correctly defines the linear predictor in this case?\n\n\n6.10 Which is the correct brms formula for this model?\n\n\n\n6.11 Based on the posterior, the estimate of the standard deviation of Subject-specific effect of Days is .\n\n6.12 The the standard deviation of the Subject specific Intercept can be interpreted as:\n\n6.13 Based on the posterior, the estimate of correlation between the Subject-specific Intercept and effect of Days is .\n\n6.14 After fitting both models, which of the statements describe the results best (here interpret \"plausible\" as what is contained within the 95% posterior interval):\n\n\n\n\n7. School calendar dataset\n\nAs you've learned from the above, hierarchical models are also useful for meta-analyses.\nThe dataset dat.konstantopolous2011 from the metadat package contains results from different studies conducted on the effect of changing the school calendar from a standard one with a long summer break to a modified one with more regular but shorter breaks.\n\nEach result of a study is a standardized estimated mean effect on student achievement, where positive values indicate improvement (yi) and the variance of the estimate (vi) for a school.\n\nThe meta-analysis model can be written as follows: \\(y_{ijk} \\sim normal(\\mu_{ijk}, \\sigma_{ijk})\\). \\(i\\) refers to an observation, \\(j\\) refers to a school, \\(k\\) refers to a district. \\(\\mu_{ijk}\\) thus refers to one observation \\(i\\) at school \\(j\\) in district \\(k\\). \\(\\mu_0\\) is the population-level effect, \\(\\mu_j\\) is the school-specific effect, and \\(\\mu_k\\) is the district-specific effect.\n\nIn this case, the \\(\\sigma_{ijk}\\) values are known (the standard errors reported from each study) and the \\(\\mu_{ijk}\\) term will differ depending on the model.\n\nIn this exercise you will fit four different models to the same data set: Pooled, separate, partially pooled within school-specific effects, and partially pooled within schools and district effects.\n\nFor all models, use brms with the default priors and use the following settings:\niter = 2000, warmup = 1000, chains = 4, seed = 2024\n\nThe dataset has separate columns for school and district, but for the models, we will want a unique identifier for each school. Before fitting any models, add a new column to the dataset called \"district_school\" which is the combination of the district and the school. This is then unique for each school, and will be used in the model formulae.\n\n\nPooled model:\n\nUse brms to fit a pooled model and answer the following questions. Check the code notebook for help starting.\n\nThe pooled model formula is: \\(\\mu_{ijk} = \\mu_0\\).\n7.1 The pooled model is written as:\n\n\n\n7.2 How many parameters are estimated in the pooled model?\n\n\n7.3 What is the Rhat value for the Intercept term?\n\n\n7.4 Based on the Rhat value, have the chains converged?\n\n\n7.5 What is the posterior mean (labelled Estimate) and lower and upper 95% posterior interval bounds (labelled CI) for the Intercept?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.6 Based on the posterior of the Intercept, what would you conclude about the effect of the intervention:\n\n\n7.7 Suppose there is a new school in an existing district (School = 9, District = 86). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument.\n\n\n7.8 Suppose there is a new school in a new district (School = 1, District = 30). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument.\n\n\nSeparate model:\n\nUse brms to fit a separate model and answer the following questions. Check the code notebook for help starting.\n\nThe separate model formula is: \\(\\mu_{ijk}= \\mu_{ijk}\\).\n7.9 The separate model is written as:\n\n\n\n7.10 How many parameters are estimated in the separate model?\n\n\n7.11 What is the estimated effect for School 3 in District 71?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.12 What is the estimated effect of School 7 in District 86?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.13 Based on the posterior for the separate model, what could reasonably be concluded about the effect of the intervention:\n\n\n7.14 Suppose there is a new school in an existing district (District = 86, School = 9). Why can we not easily predict the effect for this school based on the separate model posterior?\n\n\nPartially pooled model for schools.\n\nThe partially pooled model formula is: \\(\\mu_{ijk} = \\mu_0 + \\mu_j\\)\n7.15 And the partially pooled hierarchical model:\n\n\n\n7.16 How many parameters are estimated in the partially pooled for schools model?\n\n\n7.17 Based on the posterior for the pooled model, what could reasonably be concluded about the effect of the intervention:\n\n\n7.18 What is the estimated effect for School 3 in District 71?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.19 What is the estimated effect of School 7 in District 86?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.20 Suppose there is a new school in an existing district (District = 86, School = 9). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\n7.21 Suppose there is a new school in a new district (District = 30, School = 1). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\nPartially pooled model for schools in districts:\n\nIn this data set, as there is a two-level structure, where schools are nested in districts, we can add another level to the hierarchy. The additional hierarchy formula is: \\(\\mu_{ijk} = \\mu_0 + \\mu_j + \\mu_k\\)\n\n7.22 What is the correct brms formula?\n\n\n7.23 How many parameters are estimated in the partially pooled for schools in districts model?\n\n\n7.24 Suppose there is a new school in an existing district (District = 86, School = 9). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\n7.25 Suppose there is a new school in a new district (District = 30, School = 1). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the `newdata` argument with `allow_new_levels = TRUE`.\n\n\n7.26 Which of the following statements are true relating to exchangeability assumptions in the models:\n\n\n7.27 After fitting all the models, choose which of the following statements relating to the results are true:", + "crumbs": [ + "Assignments", + "Assignment 7" + ] + }, + { + "objectID": "assignment7.html#assignment-questions", + "href": "assignment7.html#assignment-questions", + "title": "Assignment 7", + "section": "", + "text": "For convenience the assignment questions are copied below. Answer the questions in MyCourses.\nLecture 7/Chapter 5 of BDA Quiz (96% of grade)\n\nThis week's assignment focuses on hierarchical models and modelling with brms.\n\n1. Exchangeability\nAn important building block of hierarchical models is the assumption of exchangeability. Let's review.\n\n1.1 Consider parameters \\(\\theta_j\\) for j in 1...J. Which of these statements correctly describes exchangeability?\n\n\n1.2 What best describes the difference between independence and exchangeability? \n\n\nConsider the following: assume a box has N black and white balls but we do not know how many of each color. We pick a ball \\(y_1\\) at random, and we do not put it back, we then pick another ball \\(y_2\\) at random. Denote the number of black balls by B and white by W.\n\n1.3: Are observations \\( y_1 \\) and \\( y_2 \\) exchangeable?\n\n\n1.4: Are observations \\(y_1\\) and \\(y_2\\) independent?\n\n\n1.5: Can we act as if the two observations are independent?\n\n\n1.6: Exchangeability allows us to express dependencies of data and parameters in a convenient form. Suppose we model a sequence of exchangeable random variables \\( \\theta \\) via a governing, or population distribution, where conditional on some unknown parameters \\( \\phi \\), we may assume independence between the elements of \\( \\theta \\). Assume that \\( \\theta \\) has J elements, write down an equation that conveniently factors the joint probability \\( p(\\theta,\\phi) \\):\n\n\nDe Finetti's theorem provides the theoretical basis for the equivalence to the independence assumption when J goes to infinity. In practice J is finite, but the difference may be small when J is relatively large. Check out the examples mentioned in the Chapter notes if you need more convincing. \n\n\\( \\phi \\) is in general unknown. so the marginal prior for \\( \\theta \\) must average over uncertainty in \\( \\phi \\): \\( \\int \\prod^J_{j=1} p(\\theta_j | \\phi) p(\\phi)d\\phi \\). \n\n1.7 Based on the marginal prior formulation in the paragraph above, what can you say about the covariances \\( \\text{cov}( \\theta_i, \\theta_j) \\)?\n\n\n1.8 As with the parameters above, we often think of exchangeability as arising for our data model conditional on extra information x, such that the tuple \\((x_i,y_i)\\) are exchangeable, whereas \\(y_i\\) might not be. In which modeling context is this useful? Assume that the target data is denoted by \\( y_i \\).\n \n\n2. Overview of hierarchical models\n\n2.1 Which of these best describes a hierarchical model?\n\n\n\n2.2: Consider that there are observations \\(y\\) indexed by observation number \\(i\\) and group \\(j\\). Suppose the data are modeled dependent on parameters \\( \\theta_j \\) where \\( j = 1,\\dotsc,J\\) indexes some meaningful grouping such as hospital-j specific health-outcomes, conditional on parameters \\( \\phi \\) which are hyper-parameters of the prior distribution of \\( \\theta_j \\). \\( \\phi \\) are themselved modeled by prior distribution \\( p(\\phi) \\). We think of the distribution \\( p( \\phi ) \\) as the population distribution which generates the values for \\( \\theta \\) in the hierarchical model. What are some of the benefits of hierarchical models compared to seperate models, which assume no relationship between j (and seperate models are estimated), and pooled models, which consider all j jointly, but do not model j specific parameters?\n \n\nThroughout the rest of the course, we will often compare the hierarchical model, to a seperate model and the pooled model. Suppose for the questions below that you are modeling data \\( y_{ij} \\) where \\( i \\) refers to an observation within the \\( j^{\\text{th}}\\) group, the observation model for \\( y_{ij} \\) is normal, and we consider hierarchies at the level of the parameters describing the location of \\( y_{ij} \\). \n\n2.3 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a seperate model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector of parameters such that \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r \\).\n\n\n2.4 Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector of parameters such that \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r )\\.\n\n\n\n2.5: Based on the description of the above, the lecture and BDA chapter 5 content, which of these below would suggest a pooled model for \\( y_{ij} \\)? \\(\\pi() \\) stands for some prior distribution and \\( \\eta_j \\) may be a vector where \\(cov(\\eta_s,\\eta_r) = 0 \\) for \\( s \\neq r \\).\n\n\n\n2.6: Why are hierarchical models sometimes also referred to as partially pooled models. You may find it helpful to review the first unnumbered equation on page 115 of BDA3?\n\n\nIt may help, before translating the model to code, to first describe the relationship between variables as a directed acyclic graph (DAG) (see lecture 7). We interpret the DAG starting from top to bottom as describing the generative mechanism implied by priors and observation model for \\( y_{ij} \\). Top level variables feed into distributions of lower level parameters and the observation model. Distributions are typically not further described in the DAG. By drawing variables sequentially from top to bottom, you can generate fake observations (push-forward distribution). In the case that the parameters have not yet been updated by the data information, this push-forward distribution is called the prior predictive distribution of \\( y \\). You have already created such a prior predictive distribution in Assignment 1, question 7 and we will return to this in the sections below. \n\n2.7: What is the difference between sequential draws from priors and the data model as described above to drawing from the joint posterior in Stan?\n\n\nIn the Figure 1 below, assume that all circular nodes indicate that the object inside can be generated according to some distribution.\n\nFigure 1\n\n2.8: Which of these equations properly describes the posterior for the model shown in the diagram?\n\n\n\n3. Meta-analysis and hierarchical models\n\nOften similar research studies in areas such as medicine or social science will be published under slightly different conditions or replicated with different subjects. It is of interest to summarise and integrate those findings for which hierarchical models have become increasingly popular. \n\n3.1: Suppose \\(y_i\\) is the point estimate for an effect size of a single study, \\( i \\). Why is it often appropriate to model \\( y_i \\) by a normal distribution with known standard deviation \\( \\sigma_i \\) which is taken as the standard error estimate for \\( y_i \\)?\n\n\n3.2: Why are hiearchical models preferable to seperate and pooled models for meta-analysis?\n\n\n3.3: Suppose the assumption of exchangeability is false because you know the estimates of effects across studies depends on extra information \\( x_i \\), e.g. geolocation and geolocation has a significant impact on the estimates. What should you do in this circumstance?\n\n\n3.4: Based on the discussion above, which of the below would refer to a hierarchical model for \\( y_i\\)?\n\n\n\n4. Introduction to brms\n\nbrms is an R package which makes writing models with Stan easier.\n\nSuppose you have oberved the following variables, from different groups: \\(x, z, y\\). \\(i\\) is the observation number and \\(j\\) the group indicator. Assume a model \\(y_ij \\sim N(\\mu_ij, \\sigma)\\).\nTranslate the following equations for the linear predictor term (\\mu_ij\\) into brms syntax:\n\n4.1 \\(\\mu_ij = \\alpha_0\\):\n\n\n\n\n4.2 \\(\\mu_ij = \\alpha_0 + \\beta_1 x_i\\)\n\n\n\n4.3 \\(\\mu_ij = \\alpha_0 + \\beta_1 x_i + \\beta_2 z_i\\)\n\n\n\n4.4 \\(\\mu_ij = \\alpha_0 + \\alpha_j + \\beta_1 x_i\\)\n\n\n\n5. Simulation warmup\n\nIn this task, you will simulate data from a hierarchical model to gain better insight into it.\n\n\nThe following R code simulates data from a hierarchical structure, and then plots the results. Experiment with this function by using it on different values of the arguments, and answer the following questions. This code is also included in the notebook for this assignment.\n\nhierarchical_sim <- function(group_pop_mean,\n between_group_sd,\n within_group_sd,\n n_groups,\n n_obs_per_group\n ) {\n # Generate group means\n group_means <- rnorm(\n n = n_groups,\n mean = group_pop_mean,\n sd = between_group_sd\n )\n\n # Generate observations\n\n ## Create an empty vector for observations\n y <- numeric()\n ## Create a vector for the group identifier\n group <- rep(1:n_groups, each = n_obs_per_group)\n\n for (j in 1:n_groups) {\n ### Generate one group observations\n group_y <- rnorm(\n n = n_obs_per_group,\n mean = group_means[j],\n sd = within_group_sd\n )\n ### Append the group observations to the vector\n y <- c(y, group_y)\n }\n\n # Combine into a data frame\n data <- data.frame(\n group = factor(group),\n y = y\n )\n\n # Plot the data\n ggplot(data, aes(x = y, y = group)) +\n geom_point() +\n geom_vline(xintercept = group_pop_mean, linetype = \"dashed\")\n}\n\n5.1 Which of the following generative models does the code correspond to?\n\n\\(i\\) is the observation number, \\(j\\) is the group indicator.\n\n\n\n5.2 What does the vertical dashed line in the plot represent?\n\n\n5.3 Which function call would plausibly create a plot like the following?\n\n\n\n5.4 Which of these statements correctly describes the behaviour when the number of groups is increased?\n\n\n5.5 Which of these statements correctly describes the behaviour when the ratio of the between group and within group variance is changed?\n\n\n6. Sleep deprivation\nIn many cases the same people will have several data points collected (e.g. at different time points). A hierarchical model is well suited for this, with each group corresponding to a person. The sleepstudy dataset contains observations of reaction times for different people after differing number of days of sleep deprivation.\n\nYour task is to fit a hierarchical normal linear model in brms.\nThe observation model will be \\(Reaction_{ij} \\sim N(\\mu_{ij}, \\sigma)\\). But depending on the setup, the \\(\\mu_{ij}\\) term will differ.\n\nFirst fit a model with a population-level intercept, population-level effect of Days and varying intercepts per Subject. Note: in order to have the Intercept parameter be directly interpretable, use `center = FALSE` when creating the brms formula. See the code notebook for more details.\n\nUse the following priors (check the code notebook for how to specify):\n\n\\(\\alpha_0 \\sim normal(250, 100)\\)\n\\(\\beta_0 \\sim normal(0, 20)\\)\n\\(\\alpha_j \\sim normal(0, \\tau)\\)\n\\(\\tau \\sim normal^+(0, 100)\\)\n\\(\\sigma \\sim normal^+(0, 100)\\)\n\n\n6.1 Which of these formulae correctly defines the linear predictor term in a model with population-level intercept, population-level effect of Days and varying intercepts per Subject?\n\n\n6.2 Which is the correct brms formula for this model?\n\n\n\n6.3 Based on the posterior, the estimate of the population-level intercept is .\n\n6.4 The population-level intercept can be interpreted as:\n\n\n6.5 Based on the posterior, the estimate of the population-level effect of Days on Reaction is .\n\n6.6 The population-level effect of Days can be interpreted as:\n\n\n6.7 Based on the posterior, the estimate of the standard deviation of the Intercept between Subjects is .\n\n6.8 The standard deviation of the Subject-specific Intercept can be interpreted as:\n\n\nNext fit a model with subject-specific effects of Days in addition to the other terms.\n\nUse the prior \\(\\beta_j \\sim normal(0, 20)\\)\nIn this model, we can consider a correlation between the by-Subject intercept and Days effects. Use an \\(LKJ(2)\\) prior on this, \\(\\rho \\sim LKJ(2)\\).\n\n6.9 Which of these formulae correctly defines the linear predictor in this case?\n\n\n6.10 Which is the correct brms formula for this model?\n\n\n\n6.11 Based on the posterior, the estimate of the standard deviation of Subject-specific effect of Days is .\n\n6.12 The the standard deviation of the Subject specific Intercept can be interpreted as:\n\n6.13 Based on the posterior, the estimate of correlation between the Subject-specific Intercept and effect of Days is .\n\n6.14 After fitting both models, which of the statements describe the results best (here interpret \"plausible\" as what is contained within the 95% posterior interval):\n\n\n\n\n7. School calendar dataset\n\nAs you've learned from the above, hierarchical models are also useful for meta-analyses.\nThe dataset dat.konstantopolous2011 from the metadat package contains results from different studies conducted on the effect of changing the school calendar from a standard one with a long summer break to a modified one with more regular but shorter breaks.\n\nEach result of a study is a standardized estimated mean effect on student achievement, where positive values indicate improvement (yi) and the variance of the estimate (vi) for a school.\n\nThe meta-analysis model can be written as follows: \\(y_{ijk} \\sim normal(\\mu_{ijk}, \\sigma_{ijk})\\). \\(i\\) refers to an observation, \\(j\\) refers to a school, \\(k\\) refers to a district. \\(\\mu_{ijk}\\) thus refers to one observation \\(i\\) at school \\(j\\) in district \\(k\\). \\(\\mu_0\\) is the population-level effect, \\(\\mu_j\\) is the school-specific effect, and \\(\\mu_k\\) is the district-specific effect.\n\nIn this case, the \\(\\sigma_{ijk}\\) values are known (the standard errors reported from each study) and the \\(\\mu_{ijk}\\) term will differ depending on the model.\n\nIn this exercise you will fit four different models to the same data set: Pooled, separate, partially pooled within school-specific effects, and partially pooled within schools and district effects.\n\nFor all models, use brms with the default priors and use the following settings:\niter = 2000, warmup = 1000, chains = 4, seed = 2024\n\nThe dataset has separate columns for school and district, but for the models, we will want a unique identifier for each school. Before fitting any models, add a new column to the dataset called \"district_school\" which is the combination of the district and the school. This is then unique for each school, and will be used in the model formulae.\n\n\nPooled model:\n\nUse brms to fit a pooled model and answer the following questions. Check the code notebook for help starting.\n\nThe pooled model formula is: \\(\\mu_{ijk} = \\mu_0\\).\n7.1 The pooled model is written as:\n\n\n\n7.2 How many parameters are estimated in the pooled model?\n\n\n7.3 What is the Rhat value for the Intercept term?\n\n\n7.4 Based on the Rhat value, have the chains converged?\n\n\n7.5 What is the posterior mean (labelled Estimate) and lower and upper 95% posterior interval bounds (labelled CI) for the Intercept?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.6 Based on the posterior of the Intercept, what would you conclude about the effect of the intervention:\n\n\n7.7 Suppose there is a new school in an existing district (School = 9, District = 86). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument.\n\n\n7.8 Suppose there is a new school in a new district (School = 1, District = 30). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument.\n\n\nSeparate model:\n\nUse brms to fit a separate model and answer the following questions. Check the code notebook for help starting.\n\nThe separate model formula is: \\(\\mu_{ijk}= \\mu_{ijk}\\).\n7.9 The separate model is written as:\n\n\n\n7.10 How many parameters are estimated in the separate model?\n\n\n7.11 What is the estimated effect for School 3 in District 71?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.12 What is the estimated effect of School 7 in District 86?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.13 Based on the posterior for the separate model, what could reasonably be concluded about the effect of the intervention:\n\n\n7.14 Suppose there is a new school in an existing district (District = 86, School = 9). Why can we not easily predict the effect for this school based on the separate model posterior?\n\n\nPartially pooled model for schools.\n\nThe partially pooled model formula is: \\(\\mu_{ijk} = \\mu_0 + \\mu_j\\)\n7.15 And the partially pooled hierarchical model:\n\n\n\n7.16 How many parameters are estimated in the partially pooled for schools model?\n\n\n7.17 Based on the posterior for the pooled model, what could reasonably be concluded about the effect of the intervention:\n\n\n7.18 What is the estimated effect for School 3 in District 71?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.19 What is the estimated effect of School 7 in District 86?\nMean: , lower 95% interval bound: , upper 95% interval bound: \n\n7.20 Suppose there is a new school in an existing district (District = 86, School = 9). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\n7.21 Suppose there is a new school in a new district (District = 30, School = 1). What would the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\nPartially pooled model for schools in districts:\n\nIn this data set, as there is a two-level structure, where schools are nested in districts, we can add another level to the hierarchy. The additional hierarchy formula is: \\(\\mu_{ijk} = \\mu_0 + \\mu_j + \\mu_k\\)\n\n7.22 What is the correct brms formula?\n\n\n7.23 How many parameters are estimated in the partially pooled for schools in districts model?\n\n\n7.24 Suppose there is a new school in an existing district (District = 86, School = 9). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the newdata argument with allow_new_levels = TRUE.\n\n\n7.25 Suppose there is a new school in a new district (District = 30, School = 1). What is the mean prediction for the effect in this school? Use the function`posterior_epred` and the `newdata` argument with `allow_new_levels = TRUE`.\n\n\n7.26 Which of the following statements are true relating to exchangeability assumptions in the models:\n\n\n7.27 After fitting all the models, choose which of the following statements relating to the results are true:", "crumbs": [ "Assignments", "Assignment 7" diff --git a/assignments/template3_files/figure-html/unnamed-chunk-10-2.png b/assignments/template3_files/figure-html/unnamed-chunk-10-2.png index 49f60ed9..72aa2212 100644 Binary files a/assignments/template3_files/figure-html/unnamed-chunk-10-2.png and b/assignments/template3_files/figure-html/unnamed-chunk-10-2.png differ diff --git a/assignments/template7.html b/assignments/template7.html index 49fbfb3d..dd6afbac 100644 --- a/assignments/template7.html +++ b/assignments/template7.html @@ -300,6 +300,9 @@

On this page

@@ -333,7 +336,328 @@

On this page

1 General information

-

This notebook will be published later!

+

This assignment relates to Lecture 7 and Chapter 5.

+

We recommend using JupyterHub (which has all the needed packages pre-installed).

+
+

Reading instructions:

+ +
+
+
+
+ +
+
+General Instructions for Answering the Assignment Questions +
+
+
+
+
+
    +
  • Questions below are exact copies of the text found in the MyCourses quiz and should serve as a notebook where you can store notes and code.
  • +
  • We recommend opening these notebooks in the Aalto JupyterHub, see how to use R and RStudio remotely.
  • +
  • For inspiration for code, have a look at the BDA R Demos and the specific Assignment code notebooks
  • +
  • Recommended additional self study exercises for each chapter in BDA3 are listed in the course web page. These will help to gain deeper understanding of the topic.
  • +
  • Common questions and answers regarding installation and technical problems can be found in Frequently Asked Questions (FAQ).
  • +
  • Deadlines for all assignments can be found on the course web page and in MyCourses.
  • +
  • You are allowed to discuss assignments with your friends, but it is not allowed to copy solutions directly from other students or from internet.
  • +
  • Do not share your answers publicly.
  • +
  • Do not copy answers from the internet or from previous years. We compare the answers to the answers from previous years and to the answers from other students this year.
  • +
  • Use of AI is allowed on the course, but the most of the work needs to by the student, and you need to report whether you used AI and in which way you used them (See points 5 and 6 in Aalto guidelines for use of AI in teaching).
  • +
  • All suspected plagiarism will be reported and investigated. See more about the Aalto University Code of Academic Integrity and Handling Violations Thereof.
  • +
  • If you have any suggestions or improvements to the course material, please post in the course chat feedback channel, create an issue, or submit a pull request to the public repository!
  • +
+
+
+
+
+
if (!require(tidybayes)) {
+    install.packages("tidybayes")
+    library(tidybayes)
+}
+
+
Loading required package: tidybayes
+
+
+
Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
+logical.return = TRUE, : there is no package called 'tidybayes'
+
+
+
Installing package into '/home/runner/work/_temp/Library'
+(as 'lib' is unspecified)
+
+
+
also installing the dependencies 'svUnit', 'arrayhelpers'
+
+
if (!require(brms)) {
+    install.packages("brms")
+    library(brms)
+}
+
+
Loading required package: brms
+
+
+
Loading required package: Rcpp
+
+
+
Loading 'brms' package (version 2.22.0). Useful instructions
+can be found by typing help('brms'). A more detailed introduction
+to the package is available through vignette('brms_overview').
+
+
+

+Attaching package: 'brms'
+
+
+
The following objects are masked from 'package:tidybayes':
+
+    dstudent_t, pstudent_t, qstudent_t, rstudent_t
+
+
+
The following object is masked from 'package:stats':
+
+    ar
+
+
if (!require(metadat)) {
+  install.packages("metadat")
+  library(metadat)
+}
+
+
Loading required package: metadat
+
+
+
Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
+logical.return = TRUE, : there is no package called 'metadat'
+
+
+
Installing package into '/home/runner/work/_temp/Library'
+(as 'lib' is unspecified)
+
+
+
also installing the dependency 'mathjaxr'
+
+
if(!require(cmdstanr)){
+    install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
+    library(cmdstanr)
+}
+
+
Loading required package: cmdstanr
+
+
+
This is cmdstanr version 0.8.1
+
+
+
- CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
+
+
+
- CmdStan path: /home/runner/.cmdstan/cmdstan-2.35.0
+
+
+
- CmdStan version: 2.35.0
+
+
cmdstan_installed <- function(){
+  res <- try(out <- cmdstanr::cmdstan_path(), silent = TRUE)
+  !inherits(res, "try-error")
+}
+if(!cmdstan_installed()){
+    install_cmdstan()
+}
+
+
+
+

2 Simulation warm-up

+

Here is the function to simulate and plot observations from a hierarchical data-generating process.

+
+
hierarchical_sim <- function(group_pop_mean,
+                             between_group_sd,
+                             within_group_sd,
+                             n_groups,
+                             n_obs_per_group
+                             ) {
+  # Generate group means
+  group_means <- rnorm(
+    n = n_groups,
+    mean = group_pop_mean,
+    sd = between_group_sd
+  )
+
+  # Generate observations
+
+  ## Create an empty vector for observations
+  y <- numeric()
+  ## Create a vector for the group identifier
+  group <- rep(1:n_groups, each = n_obs_per_group)
+  
+  for (j in 1:n_groups) {
+    ### Generate one group observations
+    group_y <- rnorm(
+      n = n_obs_per_group,
+      mean = group_means[j],
+      sd = within_group_sd
+    )
+    ### Append the group observations to the vector
+    y <- c(y, group_y)
+  }
+
+  # Combine into a data frame
+  data <- data.frame(
+    group = factor(group),
+    y = y
+  )
+
+  # Plot the data
+  ggplot(data, aes(x = y, y = group)) +
+    geom_point() +
+    geom_vline(xintercept = group_pop_mean, linetype = "dashed")
+}
+
+

Example using the function:

+
+
hierarchical_sim(
+  group_pop_mean = 50,
+  between_group_sd = 5,
+  within_group_sd = 1,
+  n_groups = 10,
+  n_obs_per_group = 5
+  )
+
+
Error in ggplot(data, aes(x = y, y = group)): could not find function "ggplot"
+
+
+
+
+

3 Sleep deprivation

+

The dataset sleepstudy is available by using the command data(sleepstudy, package = "lme4")

+

Below is some code for fitting a brms model. This model is a simple pooled model. You will need to fit a hierarchical model as explained in the assignment, but this code should help getting started.

+

Load the dataset

+
+
data(sleepstudy, package = "lme4")
+
+
Error in find.package(package, lib.loc, verbose = verbose): there is no package called 'lme4'
+
+
+

Specify the formula and observation family:

+
+
sleepstudy_pooled_formula <- bf(
+  Reaction ~ 1 + Days,
+  family = "gaussian",
+  center = FALSE
+)
+
+

We can see the parameters and default priors with

+
+
get_prior(pooled_formula, data = sleepstudy)
+
+
Error: object 'pooled_formula' not found
+
+
+

We can then specify the priors:

+
+
(sleepstudy_pooled_priors <- c(
+  prior(
+    normal(400, 100),
+    class = "b",
+    coef = "Intercept"
+  ),
+  prior(
+    normal(0, 50),
+    class = "b",
+    coef = "Days"
+  ),
+  prior(
+    normal(0, 50),
+    class = "sigma"
+  )
+))
+
+
            prior class      coef group resp dpar nlpar   lb   ub source
+ normal(400, 100)     b Intercept                       <NA> <NA>   user
+    normal(0, 50)     b      Days                       <NA> <NA>   user
+    normal(0, 50) sigma                                 <NA> <NA>   user
+
+
+

And then fit the model:

+
+
sleepstudy_pooled_fit <- brm(
+  formula = pooled_formula,
+  prior = pooled_priors,
+  data = sleepstudy
+)
+
+
Error: object 'pooled_formula' not found
+
+
+

We can inspect the model fit:

+
+
summary(pooled_fit)
+
+
Error: object 'pooled_fit' not found
+
+
+
+
+

4 School calendar

+

Meta-analysis models can be fit in brms. When the standard error is known, the se() function can be used to specify it.

+

The dataset dat.konstantopoulos2011 has the observations for the school calendar intervention meta-analysis.

+
+
data(dat.konstantopoulos2011, package = "metadat")
+
+

As mentioned in the assignment instructions, a unique identifier for school needs to be created by combining the district and school:

+
+
schoolcalendar_data <- dat.konstantopoulos2011 |>
+  dplyr::mutate(
+    school = factor(school),
+    district = factor(district),
+    district_school = interaction(district, school, drop = TRUE, sep = "_")
+  )
+
+

Then the models can be fit

+

+schoolcalendar_pooled_formula <- bf(
+  formula = yi | se(sqrt(vi)) ~ 1,
+  family = "gaussian"
+)  
+
+schoolcalendar_pooled_fit <- brm(
+  formula = schoolcalendar_pooled_formula,
+  data = schoolcalendar_data
+)
+

Predictions for a new school can be made using the posterior_epred function:

+

+new_school <- data.frame(
+  school = factor(1),
+  district = factor(1),
+  district_school = factor("1_1"),
+  vi = 0 # the expectation of the prediction is not affected by the sampling variance, so this can be any number
+)
+  
+
+schoolcalendar_post_epred <- posterior_epred(
+    schoolcalendar_pooled_fit,
+    newdata = new_school,
+    allow_new_levels = TRUE
+  )
+

It can be helpful to plot the posterior estimates. Here is a function that will do this:

+
+
plot_school_posteriors <- function(fit, dataset) {
+  tidybayes::add_predicted_draws(dataset, fit) |>
+    ggplot(
+      aes(
+        x = .prediction,
+        y = interaction(district, school, sep = ", ", lex.order = TRUE))) +
+    tidybayes::stat_halfeye() +
+    ylab("District, school") +
+    xlab("Posterior effect")
+}
+
+

And can be used as follows:

+
plot_school_posteriors(
+  fit = schoolcalendar_pooled_fit,
+  dataset = school_calendar_data
+)
@@ -824,22 +1148,270 @@

1 General informa diff --git a/assignments_gsu.html b/assignments_gsu.html index 611a1c7c..82ff2c41 100644 --- a/assignments_gsu.html +++ b/assignments_gsu.html @@ -6,7 +6,7 @@ - + Bayesian Data Analysis course - Assignments (GSU) – Bayesian Data Analysis course