Skip to content

Commit

Permalink
adding work done during lecture
Browse files Browse the repository at this point in the history
  • Loading branch information
rafalab committed Nov 27, 2023
1 parent 13ea128 commit ddc23d8
Show file tree
Hide file tree
Showing 8 changed files with 117 additions and 2 deletions.
56 changes: 56 additions & 0 deletions 33-smoothing.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -574,4 +574,60 @@ $$
head(pr_death_counts)
```

Explore the data and notice we should remove missing or incomplete values

```{r}
pr_death_counts |>
ggplot(aes(date, deaths)) +
geom_point()
dat <- pr_death_counts |> filter(date < make_date(2018, 4, 1))
```


We start by fitting a seasonal model. We will fit the model to 2015-2016 and then obtain residulas for 2017-2018

```{r}
dat <- mutate(dat, tt = as.numeric(date - make_date(2017, 9, 20))) |>
filter(!is.na(deaths)) |>
mutate(x1 = sin(2*pi*tt/365), x2 = cos(2*pi*tt/365),
x3 = sin(4*pi*tt/365), x4 = cos(4*pi*tt/365))
seasonal_fit <- lm(deaths ~ x1 + x2 + x3 + x4, data = filter(dat, year(date) < 2017))
```

Check the fit

```{r}
with(dat, plot(tt, deaths))
s <- predict(seasonal_fit, newdata = dat)
lines(dat$tt, s, col = 2, lwd = 2)
```


Let's now fit smooth curve to the residuals

```{r}
dat <- mutate(dat, resid = deaths - s)
fit <- loess(resid ~ tt, span = 1/20, degree = 1, data = dat)
fit <- predict(fit, se = TRUE)
dat <- mutate(dat, f = fit$fit) |>
mutate(lower = f - qnorm(0.995)*fit$se.fit, upper = f + qnorm(.995)*fit$se.fit)
dat |> ggplot(aes(x = date)) +
geom_point(aes(y = resid)) +
geom_line(aes(y = f), color = "red")
```

Is there some evidence of indirect effect after the first week after landfall?

```{r}
dat |>
filter(year(date) >= 2017) |>
ggplot(aes(date, f, ymin = lower, ymax = upper)) +
geom_ribbon(alpha = 0.5) +
geom_line(color = "red") +
geom_hline(yintercept = 0) +
theme_bw()
```

Note that this is only an exploratory analysis. A formal analysis would take into account changing demographics and perhaps fit a model with a discontinuity on the day the hurricane made landfall.
59 changes: 59 additions & 0 deletions docs/33-smoothing.html
Original file line number Diff line number Diff line change
Expand Up @@ -835,6 +835,65 @@ <h2 data-number="34.7" class="anchored" data-anchor-id="case-study-estimating-in
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-4_723154c76e938bc69641b89ca170d608">
<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span>(pr_death_counts)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>Explore the data and notice we should remove missing or incomplete values</p>
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-5_37464810bb977d58c953df3f620fec16">
<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a>pr_death_counts <span class="sc">|&gt;</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">ggplot</span>(<span class="fu">aes</span>(date, deaths)) <span class="sc">+</span></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: Removed 4 rows containing missing values (`geom_point()`).</code></pre>
</div>
<div class="cell-output-display">
<p><img src="33-smoothing_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid" width="672"></p>
</div>
<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a>dat <span class="ot">&lt;-</span> pr_death_counts <span class="sc">|&gt;</span> <span class="fu">filter</span>(date <span class="sc">&lt;</span> <span class="fu">make_date</span>(<span class="dv">2018</span>, <span class="dv">4</span>, <span class="dv">1</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>We start by fitting a seasonal model. We will fit the model to 2015-2016 and then obtain residulas for 2017-2018</p>
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-6_63bfa258211141a871fe57a032cc43f0">
<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a>dat <span class="ot">&lt;-</span> <span class="fu">mutate</span>(dat, <span class="at">tt =</span> <span class="fu">as.numeric</span>(date <span class="sc">-</span> <span class="fu">make_date</span>(<span class="dv">2017</span>, <span class="dv">9</span>, <span class="dv">20</span>))) <span class="sc">|&gt;</span></span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(deaths)) <span class="sc">|&gt;</span></span>
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">x1 =</span> <span class="fu">sin</span>(<span class="dv">2</span><span class="sc">*</span>pi<span class="sc">*</span>tt<span class="sc">/</span><span class="dv">365</span>), <span class="at">x2 =</span> <span class="fu">cos</span>(<span class="dv">2</span><span class="sc">*</span>pi<span class="sc">*</span>tt<span class="sc">/</span><span class="dv">365</span>),</span>
<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a> <span class="at">x3 =</span> <span class="fu">sin</span>(<span class="dv">4</span><span class="sc">*</span>pi<span class="sc">*</span>tt<span class="sc">/</span><span class="dv">365</span>), <span class="at">x4 =</span> <span class="fu">cos</span>(<span class="dv">4</span><span class="sc">*</span>pi<span class="sc">*</span>tt<span class="sc">/</span><span class="dv">365</span>)) </span>
<span id="cb18-5"><a href="#cb18-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb18-6"><a href="#cb18-6" aria-hidden="true" tabindex="-1"></a>seasonal_fit <span class="ot">&lt;-</span> <span class="fu">lm</span>(deaths <span class="sc">~</span> x1 <span class="sc">+</span> x2 <span class="sc">+</span> x3 <span class="sc">+</span> x4, <span class="at">data =</span> <span class="fu">filter</span>(dat, <span class="fu">year</span>(date) <span class="sc">&lt;</span> <span class="dv">2017</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>Check the fit</p>
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-7_d0824b7011b17acc9af3d58c0046890f">
<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="fu">with</span>(dat, <span class="fu">plot</span>(tt, deaths))</span>
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a>s <span class="ot">&lt;-</span> <span class="fu">predict</span>(seasonal_fit, <span class="at">newdata =</span> dat)</span>
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a><span class="fu">lines</span>(dat<span class="sc">$</span>tt, s, <span class="at">col =</span> <span class="dv">2</span>, <span class="at">lwd =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output-display">
<p><img src="33-smoothing_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid" width="672"></p>
</div>
</div>
<p>Let’s now fit smooth curve to the residuals</p>
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-8_8db0c83b75fb8e0be5aa93314c1ee797">
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>dat <span class="ot">&lt;-</span> <span class="fu">mutate</span>(dat, <span class="at">resid =</span> deaths <span class="sc">-</span> s)</span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a>fit <span class="ot">&lt;-</span> <span class="fu">loess</span>(resid <span class="sc">~</span> tt, <span class="at">span =</span> <span class="dv">1</span><span class="sc">/</span><span class="dv">20</span>, <span class="at">degree =</span> <span class="dv">1</span>, <span class="at">data =</span> dat)</span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a>fit <span class="ot">&lt;-</span> <span class="fu">predict</span>(fit, <span class="at">se =</span> <span class="cn">TRUE</span>)</span>
<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a>dat <span class="ot">&lt;-</span> <span class="fu">mutate</span>(dat, <span class="at">f =</span> fit<span class="sc">$</span>fit) <span class="sc">|&gt;</span></span>
<span id="cb20-5"><a href="#cb20-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">lower =</span> f <span class="sc">-</span> <span class="fu">qnorm</span>(<span class="fl">0.995</span>)<span class="sc">*</span>fit<span class="sc">$</span>se.fit, <span class="at">upper =</span> f <span class="sc">+</span> <span class="fu">qnorm</span>(.<span class="dv">995</span>)<span class="sc">*</span>fit<span class="sc">$</span>se.fit)</span>
<span id="cb20-6"><a href="#cb20-6" aria-hidden="true" tabindex="-1"></a>dat <span class="sc">|&gt;</span> <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> date)) <span class="sc">+</span></span>
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="fu">aes</span>(<span class="at">y =</span> resid)) <span class="sc">+</span></span>
<span id="cb20-8"><a href="#cb20-8" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>(<span class="fu">aes</span>(<span class="at">y =</span> f), <span class="at">color =</span> <span class="st">"red"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output-display">
<p><img src="33-smoothing_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid" width="672"></p>
</div>
</div>
<p>Is there some evidence of indirect effect after the first week after landfall?</p>
<div class="cell" data-hash="33-smoothing_cache/html/unnamed-chunk-9_9b674bb377bc629c626b2e7ca131076a">
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a>dat <span class="sc">|&gt;</span> </span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">filter</span>(<span class="fu">year</span>(date) <span class="sc">&gt;=</span> <span class="dv">2017</span>) <span class="sc">|&gt;</span></span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">ggplot</span>(<span class="fu">aes</span>(date, f, <span class="at">ymin =</span> lower, <span class="at">ymax =</span> upper)) <span class="sc">+</span></span>
<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_ribbon</span>(<span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
<span id="cb21-5"><a href="#cb21-5" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_line</span>(<span class="at">color =</span> <span class="st">"red"</span>) <span class="sc">+</span></span>
<span id="cb21-6"><a href="#cb21-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_hline</span>(<span class="at">yintercept =</span> <span class="dv">0</span>) <span class="sc">+</span></span>
<span id="cb21-7"><a href="#cb21-7" aria-hidden="true" tabindex="-1"></a> <span class="fu">theme_bw</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output-display">
<p><img src="33-smoothing_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid" width="672"></p>
</div>
</div>
<p>Note that this is only an exploratory analysis. A formal analysis would take into account changing demographics and perhaps fit a model with a discontinuity on the day the hurricane made landfall.</p>


</section>
Expand Down
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion docs/search.json
Original file line number Diff line number Diff line change
Expand Up @@ -1915,6 +1915,6 @@
"href": "33-smoothing.html#case-study-estimating-indirect-effects-of-natural-dissasters",
"title": "34  Smoothing",
"section": "34.7 Case Study: Estimating indirect effects of Natural Dissasters",
"text": "34.7 Case Study: Estimating indirect effects of Natural Dissasters\n\nhead(pr_death_counts)"
"text": "34.7 Case Study: Estimating indirect effects of Natural Dissasters\n\nhead(pr_death_counts)\n\nExplore the data and notice we should remove missing or incomplete values\n\npr_death_counts |&gt;\n ggplot(aes(date, deaths)) +\n geom_point()\n\nWarning: Removed 4 rows containing missing values (`geom_point()`).\n\n\n\n\ndat &lt;- pr_death_counts |&gt; filter(date &lt; make_date(2018, 4, 1))\n\nWe start by fitting a seasonal model. We will fit the model to 2015-2016 and then obtain residulas for 2017-2018\n\ndat &lt;- mutate(dat, tt = as.numeric(date - make_date(2017, 9, 20))) |&gt;\n filter(!is.na(deaths)) |&gt;\n mutate(x1 = sin(2*pi*tt/365), x2 = cos(2*pi*tt/365),\n x3 = sin(4*pi*tt/365), x4 = cos(4*pi*tt/365)) \n\nseasonal_fit &lt;- lm(deaths ~ x1 + x2 + x3 + x4, data = filter(dat, year(date) &lt; 2017))\n\nCheck the fit\n\nwith(dat, plot(tt, deaths))\ns &lt;- predict(seasonal_fit, newdata = dat)\nlines(dat$tt, s, col = 2, lwd = 2)\n\n\n\n\nLet’s now fit smooth curve to the residuals\n\ndat &lt;- mutate(dat, resid = deaths - s)\nfit &lt;- loess(resid ~ tt, span = 1/20, degree = 1, data = dat)\nfit &lt;- predict(fit, se = TRUE)\ndat &lt;- mutate(dat, f = fit$fit) |&gt;\n mutate(lower = f - qnorm(0.995)*fit$se.fit, upper = f + qnorm(.995)*fit$se.fit)\ndat |&gt; ggplot(aes(x = date)) +\n geom_point(aes(y = resid)) +\n geom_line(aes(y = f), color = \"red\")\n\n\n\n\nIs there some evidence of indirect effect after the first week after landfall?\n\ndat |&gt; \n filter(year(date) &gt;= 2017) |&gt;\n ggplot(aes(date, f, ymin = lower, ymax = upper)) +\n geom_ribbon(alpha = 0.5) +\n geom_line(color = \"red\") +\n geom_hline(yintercept = 0) +\n theme_bw()\n\n\n\n\nNote that this is only an exploratory analysis. A formal analysis would take into account changing demographics and perhaps fit a model with a discontinuity on the day the hurricane made landfall."
}
]
2 changes: 1 addition & 1 deletion docs/sitemap.xml
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,6 @@
</url>
<url>
<loc>http://datasciencelabs.github.io/2023/33-smoothing.html</loc>
<lastmod>2023-11-27T14:46:28.983Z</lastmod>
<lastmod>2023-11-27T17:43:41.566Z</lastmod>
</url>
</urlset>

0 comments on commit ddc23d8

Please sign in to comment.