From 490dc8f254b4bfe724367e45e20e28a55c327bad Mon Sep 17 00:00:00 2001
From: a1ho <a1ho@ucsd.edu>
Date: Mon, 24 Feb 2025 16:56:08 -0800
Subject: [PATCH] disc09 solutions

---
 docs/disc09/index.html | 792 +++++++++++++++++++++++++++++++++++++++++
 pages/disc/disc09.yml  |   2 +-
 2 files changed, 793 insertions(+), 1 deletion(-)
diff --git a/docs/disc09/index.html b/docs/disc09/index.html
index 18b5daf..ef6e2e5 100644
--- a/docs/disc09/index.html
+++ b/docs/disc09/index.html
@@ -156,6 +156,35 @@ <h3 id="problem-1.1">Problem 1.1</h3>
 <li><p><input type="radio" disabled="" /> <code>b / np.sqrt(c)</code></p></li>
 <li><p><input type="radio" disabled="" /> <code>b * np.sqrt(c)</code></p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_1" aria-expanded="true" aria-controls="collapse1_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_1" class="accordion-collapse collapse"
+aria-labelledby="heading1_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> <code>b</code></p>
+<p>The function <code>np.std</code> directly calculates the standard
+deviation of the array <code>oren</code>. Even though <code>oren</code> is
+only a sample of the population, its standard deviation is still a good
+estimate for the standard deviation of the population because it is a
+random sample. The other options don’t really make sense in this
+context.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 57%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.2">Problem 1.2</h3>
 <p>Which expression best estimates the mean of <code>boots</code>?</p>
@@ -165,6 +194,33 @@ <h3 id="problem-1.2">Problem 1.2</h3>
 <li><p><input type="radio" disabled="" /> <code>(oren - a).mean()</code></p></li>
 <li><p><input type="radio" disabled="" /> <code>(oren - a) / b</code></p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_2" aria-expanded="true" aria-controls="collapse1_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_2" class="accordion-collapse collapse"
+aria-labelledby="heading1_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> <code>a</code></p>
+<p>Note that <code>a</code> is equal to the mean of <code>oren</code>,
+which is a pretty good estimator of the mean of the overall population
+as well as the mean of the distribution of sample means. The other
+options don’t really make sense in this context.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 89%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.3">Problem 1.3</h3>
 <p>What expression best estimates the standard deviation of
@@ -175,6 +231,35 @@ <h3 id="problem-1.3">Problem 1.3</h3>
 <li><p><input type="radio" disabled="" /> <code>b / np.sqrt(c)</code></p></li>
 <li><p><input type="radio" disabled="" /> <code>(a -b) / np.sqrt(c)</code></p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_3">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_3" aria-expanded="true" aria-controls="collapse1_3">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_3" class="accordion-collapse collapse"
+aria-labelledby="heading1_3" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> <code>b / np.sqrt(c)</code></p>
+<p>We can use the Central Limit Theorem here, which states that the
+standard deviation (SD) of the distribution of sample means is equal to
+<code>(population SD) / np.sqrt(sample size)</code>. We don’t know the
+population SD, but the SD of the sample, <code>b</code>, is a good
+estimate of it. Plugging in <code>b</code> and the sample size
+<code>c</code>, we get <code>b / np.sqrt(c)</code>.</p>
+<hr/>
+<h5>Difficulty: ⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 91%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.4">Problem 1.4</h3>
 <p>What is the dog price of $560 in standard units?</p>
@@ -185,6 +270,34 @@ <h3 id="problem-1.4">Problem 1.4</h3>
 <li><p><input type="radio" disabled="" /> <code>abs(560 - a) / b</code></p></li>
 <li><p><input type="radio" disabled="" /> <code>abs(560 - a) / (b / np.sqrt(c))</code></p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_4">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_4" aria-expanded="true" aria-controls="collapse1_4">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_4" class="accordion-collapse collapse"
+aria-labelledby="heading1_4" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> <code>(560 - a) / b</code></p>
+<p>To convert a value to standard units, we take the value, subtract the
+mean from it, and divide by SD. In this case that is
+<code>(560 - a) / b</code>, because <code>a</code> is the mean of our
+dog prices sample array and <code>b</code> is the SD of the dog prices
+sample array.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 80%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.5">Problem 1.5</h3>
 <p>The distribution of <code>boots</code> is normal because of the
@@ -193,6 +306,33 @@ <h3 id="problem-1.5">Problem 1.5</h3>
 <li><p><input type="radio" disabled="" /> True</p></li>
 <li><p><input type="radio" disabled="" /> False</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_5">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_5" aria-expanded="true" aria-controls="collapse1_5">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_5" class="accordion-collapse collapse"
+aria-labelledby="heading1_5" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> True</p>
+<p>True. The Central Limit Theorem states that if you draw random
+samples of a sufficiently large size from a population, then the
+distribution of the sample mean is approximately normal, regardless of
+the shape of the population distribution.</p>
+<hr/>
+<h5>Difficulty: ⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 91%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.6">Problem 1.6</h3>
 <p>If Oren’s sample was 400 dogs instead of 200, the standard deviation
@@ -204,6 +344,34 @@ <h3 id="problem-1.6">Problem 1.6</h3>
 <li><p><input type="radio" disabled="" /> Decrease by a factor of <span class="math inline">\sqrt{2}</span></p></li>
 <li><p><input type="radio" disabled="" /> None of the above</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_6">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_6" aria-expanded="true" aria-controls="collapse1_6">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_6" class="accordion-collapse collapse"
+aria-labelledby="heading1_6" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> Decrease by a factor of <span class="math inline">\sqrt{2}</span></p>
+<p>Recall that the Central Limit Theorem states that the SD of the
+distribution of the sample mean is equal to
+<code>(population SD) / np.sqrt(sample size)</code>. So if we increase
+the sample size by a factor of 2, the SD of the distribution of the
+sample mean will decrease by a factor of <span class="math inline">\sqrt{2}</span>.</p>
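+<p>As a rough check of the factor, here is the ratio of the two SDs
+written out, assuming the sample SD <code>b</code> stays about the same
+when the sample grows from 200 to 400 dogs (an assumption, since a new
+sample would have a slightly different SD):</p>
+<p><span class="math display">
+\frac{b / \sqrt{400}}{b / \sqrt{200}} = \sqrt{\frac{200}{400}} =
+\frac{1}{\sqrt{2}} \approx 0.71
+</span></p>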
+<hr/>
+<h5>Difficulty: ⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 80%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.7">Problem 1.7</h3>
 <p>If Oren took 4000 bootstrap resamples instead of 1000, the standard
@@ -215,6 +383,33 @@ <h3 id="problem-1.7">Problem 1.7</h3>
 <li><p><input type="radio" disabled="" /> Decrease by a factor of 4</p></li>
 <li><p><input type="radio" disabled="" /> None of the above</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_7">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_7" aria-expanded="true" aria-controls="collapse1_7">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_7" class="accordion-collapse collapse"
+aria-labelledby="heading1_7" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> None of the above</p>
+<p>Again, by the formula from the Central Limit Theorem, the SD of the
+distribution of sample means depends on the size of the sample, not on
+the number of bootstrap resamples. Thus increasing the number of resamples
+from 1000 to 4000 will have essentially no effect on the SD of <code>boots</code>.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 74%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-1.8">Problem 1.8</h3>
 <p>Write one line of code that evaluates to the <strong>right
@@ -222,6 +417,40 @@ <h3 id="problem-1.8">Problem 1.8</h3>
 dog price. The following expressions may help:</p>
 <div class="sourceCode" id="cb2"><pre class="sourceCode py"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>stats.norm.cdf(<span class="fl">1.75</span>) <span class="co"># =&gt; 0.96</span></span>
 <span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>stats.norm.cdf(<span class="fl">1.4</span>)  <span class="co"># =&gt; 0.92</span></span></code></pre></div>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading1_8">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse1_8" aria-expanded="true" aria-controls="collapse1_8">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse1_8" class="accordion-collapse collapse"
+aria-labelledby="heading1_8" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> <code>a + 1.75 * b / np.sqrt(c)</code></p>
+<p>Recall that a 92% confidence interval consists of the middle 92% of
+the distribution of the sample mean. In other words, we want to “chop”
+off 4% from either end of the distribution. Thus, to get the right
+endpoint, we want the value at the 96th percentile of the
+mean dog price distribution, or
+<code>mean + 1.75 * (SD of population / np.sqrt(sample size))</code>, or
+<code>a + 1.75 * b / np.sqrt(c)</code> (we divide by
+<code>np.sqrt(c)</code> due to the Central Limit Theorem). Note that the
+second expression that was given,
+<code>stats.norm.cdf(1.4)</code>, is irrelevant to this particular
+problem.</p>
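+<p>The percentile bookkeeping behind the 1.75 multiplier can be sketched
+as follows, using only the value of <code>stats.norm.cdf(1.75)</code>
+given in the problem. The symbol <span class="math inline">z</span> is introduced here for illustration
+and stands for the number of SDs to go above the mean:</p>
+<p><span class="math display">
+0.92 + \frac{1 - 0.92}{2} = 0.96 \approx \text{stats.norm.cdf}(1.75)
+\implies z = 1.75
+</span></p>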
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 48%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="problem-2">Problem 2</h2>
@@ -239,6 +468,34 @@ <h3 id="problem-2.1">Problem 2.1</h3>
 <li><p><input type="radio" disabled="" /> The mean will be approximately equal to 400.</p></li>
 <li><p><input type="radio" disabled="" /> The mean will be approximately equal to 500.</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading2_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse2_1" aria-expanded="true" aria-controls="collapse2_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse2_1" class="accordion-collapse collapse"
+aria-labelledby="heading2_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> The mean will be approximately equal to
+400.</p>
+<p>The mean of the distribution of bootstrapped means will be
+approximately 400, since that is the mean of the sample and bootstrapping
+repeatedly resamples from the original sample. The mean will not be
+exactly 400 due to randomness, though it will be very close.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 54%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-2.2">Problem 2.2</h3>
 <p>Which of the following is closest to the standard deviation of the
@@ -249,6 +506,32 @@ <h3 id="problem-2.2">Problem 2.2</h3>
 <li><p><input type="radio" disabled="" /> 4</p></li>
 <li><p><input type="radio" disabled="" /> 0.4</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading2_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse2_2" aria-expanded="true" aria-controls="collapse2_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse2_2" class="accordion-collapse collapse"
+aria-labelledby="heading2_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> 4</p>
+<p>To find the standard deviation of the distribution, we can take the
+sample standard deviation S divided by the square root of the sample
+size. From plugging in, we get 40 / 10 = 4.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 51%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="problem-3">Problem 3</h2>
@@ -256,6 +539,77 @@ <h2 id="problem-3">Problem 3</h2>
 and standard deviation 15. What is the probability that your sample has
 a mean between 50 and 53? Input the probability below, as a number
 between 0 and 1, rounded to <strong>two decimal places</strong>.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading3">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse3" aria-expanded="true" aria-controls="collapse3">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse3" class="accordion-collapse collapse"
+aria-labelledby="heading3" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> 0.48</p>
+<p>This problem is testing our understanding of the Central Limit
+Theorem and normal distributions. Recall, the Central Limit Theorem
+tells us that the distribution of the sample mean is roughly normal,
+with the following characteristics:</p>
+<p><span class="math display">\begin{align*}
+\text{Mean of Distribution of Possible Sample Means} &amp;=
+\text{Population Mean} = 50 \\
+\text{SD of Distribution of Possible Sample Means} &amp;=
+\frac{\text{Population SD}}{\sqrt{\text{Sample Size}}} =
+\frac{15}{\sqrt{100}} = 1.5
+\end{align*}
+</span></p>
+<p>Given this information, it may be easier to express the problem as
+“We draw a value from a normal distribution with mean 50 and SD 1.5.
+What is the probability that the value is between 50 and 53?” Note that
+this probability is equal to the <strong>proportion of values between 50
+and 53</strong> in a normal distribution whose mean is 50 and SD is 1.5 (since
+probabilities can be thought of as proportions).</p>
+<p>In class, we typically worked with the <em>standard</em> normal
+distribution, in which the mean was 0, the SD was 1, and the <span class="math inline">x</span>-axis represented values in standard units.
+Let’s convert the quantities of interest in this problem to standard
+units, keeping in mind that the mean and SD we’re using now are the mean
+and SD of the distribution of possible sample means, not of the
+population.</p>
+<ul>
+<li>50 converted to standard units is <span class="math inline">\frac{50
+- \text{mean}}{\text{SD}} = \frac{50 - 50}{1.5} = 0</span> (no
+calculation was necessary – 0 in standard units is equal to the mean in
+original units).</li>
+<li>53 converted to standard units is <span class="math inline">\frac{53
+- \text{mean}}{\text{SD}} = \frac{53 - 50}{1.5} = 2</span>.</li>
+</ul>
+<p>Now, our problem boils down to finding the <strong>proportion of
+values in a standard normal distribution that are between 0 and
+2</strong>, or <strong>the proportion of values in a normal distribution
+that are in the interval <span class="math inline">[\text{mean},
+\text{mean} + 2 \text{ SDs}]</span></strong>.</p>
+<p>From class, we know that in a normal distribution, roughly 95% of
+values are within 2 standard deviations of the mean, i.e. the proportion
+of values in the interval <span class="math inline">[\text{mean} - 2
+\text{ SDs}, \text{mean} + 2 \text{ SDs}]</span> is 0.95.</p>
+<center><img src="../assets/images/wi21-final/normal-solution.png" width="50%"/></center>
+<p>Since the normal distribution is symmetric about the mean, half of
+the values in this interval are to the right of the mean, and half are
+to the left. This means that the proportion of values in the interval
+<span class="math inline">[\text{mean}, \text{mean} + 2 \text{
+SDs}]</span> is <span class="math inline">\frac{0.95}{2} = 0.475</span>,
+which rounds to 0.48, and thus the desired result is 0.48.</p>
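+<p>As a sanity check on the 95% rule of thumb, the same proportion can be
+written with the standard normal CDF, denoted <span class="math inline">\Phi</span> here (notation
+not used elsewhere in these solutions). The value <span class="math inline">\Phi(2) \approx 0.9772</span>
+is a standard table value, not something given in the problem:</p>
+<p><span class="math display">
+P(50 \leq \text{sample mean} \leq 53) \approx \Phi(2) - \Phi(0) \approx
+0.9772 - 0.5 = 0.4772
+</span></p>
+<p>This also rounds to 0.48, so the rule of thumb and the more precise
+value agree to two decimal places.</p>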
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 48%.</p>
+</div>
+</div>
+</div>
+</div>
 <hr />
 <h2 id="problem-4">Problem 4</h2>
 <p>The DataFrame <code>apps</code> contains application data for a
@@ -275,6 +629,42 @@ <h3 id="problem-4.1">Problem 4.1</h3>
 <p>Give the endpoints of the CLT-based 95% confidence interval for the
 mean age of all applicants in <code>apps</code>, based on the data in
 <code>hundred_apps</code>.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading4_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse4_1" aria-expanded="true" aria-controls="collapse4_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse4_1" class="accordion-collapse collapse"
+aria-labelledby="heading4_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> Left endpoint = 33, Right endpoint = 37</p>
+<p>According to the Central Limit Theorem, the standard deviation of the
+distribution of the sample mean is <span class="math inline">\frac{\text{sample SD}}{\sqrt{\text{sample size}}} =
+\frac{10}{\sqrt{100}} = 1</span>. Then using the fact that the
+distribution of the sample mean is roughly normal, since 95% of the area
+of a normal curve falls within two standard deviations of the mean, we
+can find the endpoints of the 95% CLT-based confidence interval as <span class="math inline">35 - 2 = 33</span> and <span class="math inline">35
++ 2 = 37</span>.</p>
+<p>We can think of this as using the formula below: <span class="math display">
+\left[\text{sample mean} - 2\cdot \frac{\text{sample
+SD}}{\sqrt{\text{sample size}}}, \: \text{sample mean} + 2\cdot
+\frac{\text{sample SD}}{\sqrt{\text{sample size}}}
+\right].</span> Plugging in the appropriate quantities yields <span class="math inline">[35 - 2\cdot\frac{10}{\sqrt{100}}, 35 +
+2\cdot\frac{10}{\sqrt{100}}] = [33, 37]</span>.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 67%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-4.2">Problem 4.2</h3>
 <p>BruinCard reinstates our access to <code>apps</code> so that we can
@@ -292,6 +682,41 @@ <h3 id="problem-4.2">Problem 4.2</h3>
 distribution of <code>sample_means</code>?</p>
 <center><img src='../assets/images/fa22-final/3hist.png' width=45%></center>
 <p><br></p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading4_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse4_2" aria-expanded="true" aria-controls="collapse4_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse4_2" class="accordion-collapse collapse"
+aria-labelledby="heading4_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> Option 1</p>
+<p>As we found in the previous part, the distribution of the sample mean
+should have a standard deviation of 1. We also know it should be
+centered at the mean of our sample, at 35, but since all the options are
+centered here, that’s not too helpful. Only Option 1, however, has a
+standard deviation of 1. Remember, we can approximate the standard
+deviation of a normal curve as the distance between the mean and either
+of the inflection points. Only Option 1 looks like it has inflection
+points at 34 and 36, a distance of 1 from the mean of 35.</p>
+<p>If you chose Option 2, you probably confused the standard deviation
+of our original sample, 10, with the standard deviation of the
+distribution of the sample mean, which comes from dividing that value by
+the square root of the sample size.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 57%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-4.3">Problem 4.3</h3>
 <p>Which of the following statements are guaranteed to be true? Select
@@ -311,6 +736,83 @@ <h3 id="problem-4.3">Problem 4.3</h3>
 age of applicants in <code>apps</code>.</p></li>
 <li><p><input type="checkbox" disabled="" /> None of the above.</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading4_3">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse4_3" aria-expanded="true" aria-controls="collapse4_3">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse4_3" class="accordion-collapse collapse"
+aria-labelledby="heading4_3" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> A CLT-based 90% confidence interval for the
+mean age of credit card applicants, based on the data in
+<code>hundred_apps</code>, would be narrower than the interval you gave
+in part (a).</p>
+<p>Let’s analyze each of the options:</p>
+<ul>
+<li><p>Option 1: We are not using bootstrapping to compute sample means
+since we are sampling from the <code>apps</code> DataFrame, which is our
+population here. If we were bootstrapping, we’d need to sample from our
+first sample, which is <code>hundred_apps</code>.</p></li>
+<li><p>Option 2: We can’t be sure what the distribution of the ages of
+credit card applicants is. The Central Limit Theorem says that the
+distribution of <code>sample_means</code> is roughly normal, but it
+tells us nothing about the shape of the population
+distribution.</p></li>
+<li><p>Option 3: The CLT-based 95% confidence interval that we
+calculated in part (a) was computed as follows: <span class="math display">\left[\text{sample mean} - 2\cdot
+\frac{\text{sample SD}}{\sqrt{\text{sample size}}},
+\text{sample mean} + 2\cdot \frac{\text{sample SD}}{\sqrt{\text{sample
+size}}}
+\right]</span> A CLT-based 90% confidence interval would be computed as
+<span class="math display">\left[\text{sample mean} - z\cdot
+\frac{\text{sample SD}}{\sqrt{\text{sample size}}},
+\text{sample mean} + z\cdot \frac{\text{sample SD}}{\sqrt{\text{sample
+size}}}
+\right]</span> for some value of <span class="math inline">z</span> less
+than 2. We know that 95% of the area of a normal curve is within two
+standard deviations of the mean, so to only pick up 90% of the area,
+we’d have to go slightly less than 2 standard deviations away. This
+means the 90% confidence interval will be narrower than the 95%
+confidence interval.</p></li>
+<li><p>Option 4: The left endpoint of the interval from part (a) was
+calculated using the Central Limit Theorem, whereas
+<code>np.percentile(sample_means, 2.5)</code> is calculated empirically,
+using the data in <code>sample_means</code>. An empirically calculated
+confidence interval doesn’t necessarily have exactly the same
+endpoints as one based on the Central Limit Theorem, though the two
+should be close. These values are likely very similar,
+but they are not guaranteed to be the same. One way to see this is that
+if we ran the code to generate <code>sample_means</code> again, we’d
+probably get a different value for
+<code>np.percentile(sample_means, 2.5)</code>.</p></li>
+<li><p>Option 5: The key observation is that if we used the data in
+<code>hundred_apps</code> to create 1,000 CLT-based 95% confidence
+intervals for the mean age of applicants in <code>apps</code>, all of
+these intervals would be exactly the same. Given a sample, there is only
+one CLT-based 95% confidence interval associated with it. In our case,
+given the sample <code>hundred_apps</code>, the one and only CLT-based
+95% confidence interval based on this sample is the one we found in part
+(a). Therefore if we generated 1,000 of these intervals, either they
+would all contain the parameter or none of them would. In order for a
+statement like the one here to be true, we would need to collect 1,000
+different samples, and calculate a confidence interval from each
+one.</p></li>
+</ul>
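+<p>To make Option 3 concrete, here is a rough sketch of the 90% interval.
+It assumes the standard normal value <span class="math inline">z \approx 1.645</span> for 90%
+coverage (a standard table value, not given in the problem) and uses the
+sample mean of 35 and sample SD of 10 from part (a):</p>
+<p><span class="math display">
+\left[35 - 1.645 \cdot \frac{10}{\sqrt{100}}, \: 35 + 1.645 \cdot
+\frac{10}{\sqrt{100}}\right] = [33.355, 36.645]
+</span></p>
+<p>This interval has width 3.29, compared with width 4 for the 95%
+interval <span class="math inline">[33, 37]</span>.</p>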
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 49%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="problem-5">Problem 5</h2>
@@ -323,6 +825,118 @@ <h2 id="problem-5">Problem 5</h2>
 dataset of 0’s and 1’s is no more than 0.5, calculate the minimum number
 of people you would need to survey. Input your answer below, as an
 <strong>integer</strong>.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading5">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse5" aria-expanded="true" aria-controls="collapse5">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse5" class="accordion-collapse collapse"
+aria-labelledby="heading5" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer: </strong> 625</p>
+<p><em>Note: Before reviewing these solutions, it’s highly recommended
+to revisit the lecture on “Choosing Sample Sizes,” since this problem
+follows the main example from that lecture almost exactly.</em></p>
+<p>While this solution is long, keep in mind from the start that our
+goal is to solve for the <strong>smallest sample size necessary</strong>
+to create a confidence interval that achieves certain criteria.</p>
+<p>The Central Limit Theorem tells us that the distribution of the
+sample mean is roughly normal, regardless of the distribution of the
+population from which the samples are drawn. At first, it may not be
+clear how the Central Limit Theorem is relevant, but remember that
+proportions are means too – for instance, the proportion of adults who
+want to be vaccinated is equal to the mean of a collection of 1s and 0s,
+where we have a 1 for each adult that wants to be vaccinated and a 0 for
+each adult who doesn’t want to be vaccinated. What this means (😉) is
+that <strong>the Central Limit Theorem applies to the distribution of
+the sample proportion, so we can use it here too</strong>.</p>
+<p>Not only do we know that the distribution of sample proportions is
+roughly normal, but we know its mean and standard deviation, too:</p>
+<p><span class="math display">\begin{align*}
+\text{Mean of Distribution of Possible Sample Means} &amp;=
+\text{Population Mean} = \text{Population Proportion} \\
+\text{SD of Distribution of Possible Sample Means} &amp;=
+\frac{\text{Population SD}}{\sqrt{\text{Sample Size}}}
+\end{align*}
+</span></p>
+<p>Using this information, we can create a 95% confidence interval for
+the population proportion, using the fact that in a normal distribution,
+roughly 95% of values are within 2 standard deviations of the mean:</p>
+<p><span class="math display">\left[ \text{Population Proportion} - 2
+\cdot \frac{\text{Population SD}}{\sqrt{\text{Sample Size}}}, \:
+\text{Population Proportion} + 2 \cdot \frac{\text{Population
+SD}}{\sqrt{\text{Sample Size}}}  \right]</span></p>
+<p>However, this interval depends on the population proportion (mean)
+and SD, which we don’t know. (If we did know these parameters, there
+would be no need to collect a sample!) Instead, we’ll use the sample
+proportion and SD as rough estimates:</p>
+<p><span class="math display">\left[ \text{Sample Proportion} - 2 \cdot
+\frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}}, \: \text{Sample
+Proportion} + 2 \cdot \frac{\text{Sample SD}}{\sqrt{\text{Sample
+Size}}}  \right]</span></p>
+<p>Note that the width of this interval – that is, its right endpoint
+minus its left endpoint – is: <span class="math display">\text{width} =
+4 \cdot \frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}}</span></p>
+<p>In the problem, we’re told that we want our interval to be accurate
+to within 0.04, which is equivalent to wanting the width of our interval
+to be less than or equal to 0.08 (since the interval extends the same
+amount above and below the sample proportion). As such, we need to pick
+the <strong>smallest sample size necessary</strong> such that:</p>
+<p><span class="math display">\text{width} = 4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}} \leq 0.08</span></p>
+<p>We can re-arrange the inequality above to solve for our sample’s
+size:</p>
+<p><span class="math display">
+\begin{align*}
+4 \cdot \frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}} &amp;\leq
+0.08 \\
+\frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}} &amp;\leq 0.02 \\
+\frac{1}{\sqrt{\text{Sample Size}}} &amp;\leq \frac{0.02}{\text{Sample
+SD}} \\
+\frac{\text{Sample SD}}{0.02} &amp;\leq \sqrt{\text{Sample Size}} \\
+\left( \frac{\text{Sample SD}}{0.02} \right)^2 &amp;\leq \text{Sample
+Size}
+\end{align*}
+</span></p>
+<p>All we now need to do is pick the smallest sample size that satisfies
+the above inequality. But there’s an issue – <strong>we don’t know what
+our sample SD is, because we haven’t collected our sample</strong>!
+Notice that in the inequality above, as the sample SD increases, so does
+the minimum necessary sample size. In order to ensure we don’t collect
+too small of a sample (which would result in the width of our confidence
+interval being <em>larger</em> than desired), we can use an upper bound
+for the SD of our sample. In the problem, we’re told that the largest
+possible SD of a sample of 0s and 1s is 0.5 – this means that if we
+replace our sample SD with 0.5, we will find a sample size such that the
+width of our confidence interval is guaranteed to be less than or equal
+to 0.08. This sample size may be larger than necessary, but that’s
+better than it being smaller than necessary.</p>
+<p>By substituting 0.5 for the sample SD in the last inequality above,
+we get</p>
+<p><span class="math display">
+\begin{align*}
+\left( \frac{\text{Sample SD}}{0.02} \right)^2 &amp;\leq \text{Sample
+Size} \\
+\left( \frac{0.5}{0.02} \right)^2 &amp;\leq \text{Sample Size} \\
+25^2 &amp;\leq \text{Sample Size} \implies \text{Sample Size} \geq 625
+\end{align*}
+</span></p>
+<p>We need to pick the smallest possible sample size that is greater
+than or equal to 625; that’s just 625.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 40%.</p>
+</div>
+</div>
+</div>
+</div>
 <hr />
 <h2 id="problem-6">Problem 6</h2>
 <p>It’s your first time playing a new game called <em>Brunch Menu</em>.
@@ -353,6 +967,55 @@ <h3 id="problem-6.1">Problem 6.1</h3>
 <li><p><input type="radio" disabled="" /> <span class="math inline">\frac{17}{81}</span></p></li>
 <li><p><input type="radio" disabled="" /> <span class="math inline">\frac{17}{96}</span></p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading6_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse6_1" aria-expanded="true" aria-controls="collapse6_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse6_1" class="accordion-collapse collapse"
+aria-labelledby="heading6_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> <span class="math inline">\frac{17}{27}</span></p>
+<p>A Central Limit Theorem-based 95% confidence interval for a
+population proportion is given by the following:</p>
+<p><span class="math display">\left[ \text{Sample Proportion} - 2 \cdot
+\frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}}, \text{Sample
+Proportion} + 2 \cdot \frac{\text{Sample SD}}{\sqrt{\text{Sample Size}}}
+\right]</span></p>
+<p>Note that this interval uses the fact that (about) 95% of values in a
+normal distribution are within 2 standard deviations of the mean. It’s
+key to divide the sample SD by <span class="math inline">\sqrt{\text{Sample
+Size}}</span>, because the distribution that is roughly normal is the
+distribution of the sample mean (and hence, sample proportion), not the
+distribution of the sample itself, and its standard deviation is estimated
+by the sample SD divided by the square root of the sample size.</p>
+<p>The width of the above interval – that is, the right endpoint minus
+the left endpoint – is</p>
+<p><span class="math display">\text{width} = 4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}}</span></p>
+<p>From the provided hint, we have that</p>
+<p><span class="math display">\text{Sample SD} = \sqrt{(\text{Prop. of
+0s}) \cdot (\text{Prop. of 1s})} = \sqrt{\frac{3}{9} \cdot \frac{6}{9}} =
+\frac{\sqrt{18}}{9}</span></p>
+<p>Then, since we know that the sample size is 9 and that <span class="math inline">\sqrt{18}</span> is about <span class="math inline">\frac{17}{4}</span>, we have</p>
+<p><span class="math display">\text{width} =  4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}} = 4 \cdot
+\frac{\frac{\sqrt{18}}{9}}{\sqrt{9}} = 4 \cdot \frac{\sqrt{18}}{9 \cdot
+3} \approx 4 \cdot \frac{\frac{17}{4}}{27} = \frac{17}{27}</span></p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 51%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-6.2">Problem 6.2</h3>
 <p>Which of the following are limitations of trying to use the Central
@@ -367,6 +1030,44 @@ <h3 id="problem-6.2">Problem 6.2</h3>
 been normally distributed.</p></li>
 <li><p><input type="checkbox" disabled="" /> The CLT is for sample means and sums, not sample proportions.</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading6_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse6_2" aria-expanded="true" aria-controls="collapse6_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse6_2" class="accordion-collapse collapse"
+aria-labelledby="heading6_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> Options 1 and 2</p>
+<p><strong>Option 1:</strong> The Central Limit Theorem (CLT) applies to
+large random samples, and a sample of 9 is very small, so the normal
+approximation it provides is unreliable for this problem.</p>
+<p><strong>Option 2:</strong> Recall that the CLT assumes the sample is
+drawn with replacement. When we are handed nine cards, we never put
+cards back into the deck, which means that we are sampling without
+replacement.</p>
+<p><strong>Option 3:</strong> This is wrong because the CLT states that
+the distribution of the sample mean is approximately normal for a large
+enough sample, even if the data itself is not normally distributed. So
+if we had a large enough sample, we could use the CLT regardless of the
+shape of the data’s distribution.</p>
+<p><strong>Option 4:</strong> This is wrong because the CLT does apply to
+the distribution of the sample proportion. Recall that a proportion is
+the mean of a collection of 0s and 1s, so it can be treated like a mean.</p>
+<hr/>
+<h5>Difficulty: ⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 77%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="problem-7">Problem 7</h2>
@@ -378,6 +1079,28 @@ <h3 id="problem-7.1">Problem 7.1</h3>
 <strong>widest possible width</strong> for the resulting confidence
 interval? Give your answer as a <strong>fully simplified
 fraction</strong>.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading7_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse7_1" aria-expanded="true" aria-controls="collapse7_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse7_1" class="accordion-collapse collapse"
+aria-labelledby="heading7_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> <span class="math inline">\frac{1}{15}</span> </p>
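+<p>A sketch of the reasoning, assuming a CLT-based 95% confidence
+interval and a sample of 900 students (the sample size quoted in Problem
+7.2); neither assumption is restated in this solution. The width of such
+an interval is <span class="math inline">4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}}</span>, and it is widest when the sample
+SD takes its largest possible value for a dataset of 0s and 1s, which is
+0.5:</p>
+<p><span class="math display">\text{width} \leq 4 \cdot
+\frac{0.5}{\sqrt{900}} = \frac{2}{30} = \frac{1}{15}</span></p>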
+<hr/>
+<h5>Difficulty:
+⭐️⭐️⭐️⭐️</h5>
+<p>The average score on this problem was 38%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-7.2">Problem 7.2</h3>
 <p>If you decide to survey 450 students instead of 900 students for your
@@ -388,6 +1111,29 @@ <h3 id="problem-7.2">Problem 7.2</h3>
 <li><p><input type="radio" disabled="" /> increase by more than double</p></li>
 <li><p><input type="radio" disabled="" /> increase by less than double</p></li>
 </ul>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading7_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse7_2" aria-expanded="true" aria-controls="collapse7_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse7_2" class="accordion-collapse collapse"
+aria-labelledby="heading7_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer:</strong> increase by less than double</p>
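+<p>One way to see this, assuming a CLT-based 95% confidence interval
+whose width is <span class="math inline">4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}}</span>, and assuming the sample SD is
+roughly the same in both surveys: the width is proportional to <span class="math inline">\frac{1}{\sqrt{\text{Sample Size}}}</span>, so
+halving the sample size scales the width by</p>
+<p><span class="math display">\frac{4 \cdot \frac{\text{Sample
+SD}}{\sqrt{450}}}{4 \cdot \frac{\text{Sample SD}}{\sqrt{900}}} =
+\sqrt{\frac{900}{450}} = \sqrt{2} \approx 1.41</span></p>
+<p>Under these assumptions the width grows by a factor of about 1.41,
+which is an increase, but by less than double.</p>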
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 60%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="problem-8">Problem 8</h2>
@@ -404,12 +1150,58 @@ <h3 id="problem-8.1">Problem 8.1</h3>
 proportion must be at most <span class="math inline">T</span>. What is
 <span class="math inline">T</span>? Give your answer as an exact
 decimal.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading8_1">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse8_1" aria-expanded="true" aria-controls="collapse8_1">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse8_1" class="accordion-collapse collapse"
+aria-labelledby="heading8_1" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer</strong>: 0.025</p>
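+<p>A possible line of reasoning, assuming the interval is a CLT-based 95%
+confidence interval whose width should be at most 0.10 (the target quoted
+in Problem 8.2), and assuming <span class="math inline">T</span> bounds
+the standard deviation of the distribution of the sample proportion. Such
+an interval extends 2 of these standard deviations on each side of the
+sample proportion, so its width is 4 of them:</p>
+<p><span class="math display">4 \cdot \text{SD of Sample Proportion} \leq
+0.10 \implies \text{SD of Sample Proportion} \leq \frac{0.10}{4} =
+0.025</span></p>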
+<hr/>
+<h5>Difficulty: ⭐️⭐️⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 46%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <h3 id="problem-8.2">Problem 8.2</h3>
 <p>Using the fact that the standard deviation of any dataset of 0s and
 1s is no more than 0.5, calculate the minimum number of people you would
 need to survey so that the width of your confidence interval is at most
 0.10. Give your answer as an integer.</p>
+<div id="accordionExample" class="accordion">
+<div class="accordion-item">
+<h2 class="accordion-header" id="heading8_2">
+<button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapse8_2" aria-expanded="true" aria-controls="collapse8_2">
+Click to view the solution.
+</button>
+</h2>
+<div id="collapse8_2" class="accordion-collapse collapse"
+aria-labelledby="heading8_2" data-bs-parent="#accordionExample">
+<div class="accordion-body">
+<header id="title-block-header">
+<h1 class="title"> </h1>
+</header>
+<p><strong>Answer</strong>: 400</p>
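+<p>A sketch of the calculation, assuming a CLT-based 95% confidence
+interval with width <span class="math inline">4 \cdot \frac{\text{Sample
+SD}}{\sqrt{\text{Sample Size}}}</span>. Substituting the upper bound of
+0.5 for the sample SD guarantees the width is at most 0.10 whatever the
+sample turns out to be:</p>
+<p><span class="math display">
+\begin{align*}
+4 \cdot \frac{0.5}{\sqrt{\text{Sample Size}}} &amp;\leq 0.10 \\
+\frac{2}{0.10} &amp;\leq \sqrt{\text{Sample Size}} \\
+20^2 &amp;\leq \text{Sample Size} \implies \text{Sample Size} \geq 400
+\end{align*}
+</span></p>
+<p>The smallest sample size satisfying this inequality is 400.</p>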
+<hr/>
+<h5>Difficulty: ⭐️⭐️</h5>
+<p>
+</p>
+<p>The average score on this problem was 81%.</p>
+</div>
+</div>
+</div>
+</div>
 <p><br></p>
 <hr />
 <h2 id="section"><span class="math display"> </span></h2>
diff --git a/pages/disc/disc09.yml b/pages/disc/disc09.yml
index a85f11d..acd12b1 100644
--- a/pages/disc/disc09.yml
+++ b/pages/disc/disc09.yml
@@ -3,7 +3,7 @@ context: >
   These problems are taken from past quizzes and exams. Work on them **on paper**, since the quizzes and exams you take in this course will also be on paper. 
   <br><br>We encourage you to complete these problems during discussion section. 
   Solutions will be made available after all discussion sections have concluded. You don't need to submit your answers anywhere.<br><br><b>Note: We do not plan to cover all of these problems during the discussion section</b>; the problems we don't cover can be used for extra practice.
-show_solution: false
+show_solution: true
 problems:
  - su22-final/q8-oren-stats
  - wi21-final/q22-bootstrap