-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy path07-functions.html
193 lines (193 loc) · 13.6 KB
/
07-functions.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: R for reproducible scientific analysis</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<a href="index.html"><h1 class="title">R for reproducible scientific analysis</h1></a>
<h2 class="subtitle">Creating functions</h2>
<section class="objectives panel panel-warning">
<div class="panel-heading">
<h2 id="learning-objectives"><span class="glyphicon glyphicon-certificate"></span>Learning objectives</h2>
</div>
<div class="panel-body">
<ul>
<li>Define a function that takes arguments.</li>
<li>Return a value from a function.</li>
<li>Test a function.</li>
<li>Set default values for function arguments.</li>
<li>Explain why we should divide programs into small, single-purpose functions.</li>
</ul>
</div>
</section>
<p>Any operation you will perform more than once can be put into a function. That way, rather than retyping all the commands (and potentially making errors), you can simply call the function, passing it a new dataset or parameters. This may seem cumbersome at first, but writing functions to automate repetitive tasks is incredibly powerful. E.g. each time you call <code>ggplot</code> you are calling a function that someone wrote. Imagine if each time you wanted to make a plot you had to copy and paste or write that code from scratch!</p>
<h3 id="defining-a-function">Defining a function</h3>
<p>Recall the components of a function. E.g. the <code>log</code> function (see <code>?log</code>) takes “arguments” <code>x</code> and <code>base</code> and “returns” the base-<code>base</code> logarithm of <code>x</code>. Functions take arguments as input and yield return-values as output. You can define functions to do any number of operations on any number of arguments, but always output a single return value (however there are complex objects into which you can put multiple objects, should you need to).</p>
<p>Let’s start by defining a simple function to add two numbers. This is the basic structure, which you can read as “assign to the variable <code>my_sum</code> a function that takes arguments <code>a</code> and <code>b</code> and returns <code>the_sum</code>.” The body of the function is delimited by the curly-braces. The statements in the body are indented. This makes the code easier to read but does not affect how the code operates.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">my_sum <-<span class="st"> </span>function(a, b) {
the_sum <-<span class="st"> </span>a +<span class="st"> </span>b
<span class="kw">return</span>(the_sum)
}</code></pre></div>
<p>Notice that no numbers were summed when we ran that code, but now the Environment has an object called <code>my_sum</code> that has type function. You can call <code>my_sum</code> just like you would any other function. When you do, the code between the curly-braces of the <code>my_sum</code> definition is run with whatever values you pass to <code>a</code> and <code>b</code> substituted in their place.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">my_sum</span>(<span class="dt">a =</span> <span class="dv">2</span>, <span class="dt">b =</span> <span class="dv">2</span>)</code></pre></div>
<pre class="output"><code>[1] 4
</code></pre>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">my_sum</span>(<span class="dv">3</span>, <span class="dv">4</span>)</code></pre></div>
<pre class="output"><code>[1] 7
</code></pre>
<p>Just like <code>log</code> provides a default value of <code>base</code> (<code>exp(1)</code>) so that you don’t have to type it every time, you can provide default values to any arguments of your function. Then if the user doesn’t specify them, the defaults will be used.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">my_sum2 <-<span class="st"> </span>function(<span class="dt">a =</span> <span class="dv">1</span>, <span class="dt">b =</span> <span class="dv">2</span>) {
the_sum <-<span class="st"> </span>a +<span class="st"> </span>b
<span class="kw">return</span>(the_sum)
}
<span class="kw">my_sum2</span>()</code></pre></div>
<pre class="output"><code>[1] 3
</code></pre>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">my_sum2</span>(<span class="dt">b =</span> <span class="dv">7</span>)</code></pre></div>
<pre class="output"><code>[1] 8
</code></pre>
<aside class="callout panel panel-info">
<div class="panel-heading">
<h2 id="tip"><span class="glyphicon glyphicon-pushpin"></span>Tip</h2>
</div>
<div class="panel-body">
<p>One feature unique to R is that the return statement is not required. R automatically returns the output of the last line of the body of the function unless a <code>return</code> statement is specified elsewhere. Since other languages require a <code>return</code> statement and because it can make reading a funciton easier, we will explicitly define the return statement.</p>
</div>
</aside>
<h4 id="temperature-conversion">Temperature conversion</h4>
<p>Let’s define a function F_to_K that converts temperatures from Fahrenheit to Kelvin:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">F_to_K <-<span class="st"> </span>function(temp) {
K <-<span class="st"> </span>((temp -<span class="st"> </span><span class="dv">32</span>) *<span class="st"> </span>(<span class="dv">5</span> /<span class="st"> </span><span class="dv">9</span>)) +<span class="st"> </span><span class="fl">273.15</span>
<span class="kw">return</span>(K)
}</code></pre></div>
<p>Calling our own function is no different from calling any other function:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># freezing point of water</span>
<span class="kw">F_to_K</span>(<span class="dv">32</span>)</code></pre></div>
<pre class="output"><code>[1] 273.15
</code></pre>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># boiling point of water</span>
<span class="kw">F_to_K</span>(<span class="dv">212</span>)</code></pre></div>
<pre class="output"><code>[1] 373.15
</code></pre>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h4 id="challenge-1"><span class="glyphicon glyphicon-pencil"></span>Challenge 1</h4>
</div>
<div class="panel-body">
<ul>
<li>Write a function called <code>K_to_C</code> that takes a temperature in K and returns that temperature in C</li>
<li>Hint: To convert from K to C you subtract 273.15</li>
<li>Create a new R script, copy <code>F_to_K</code> and <code>K_to_C</code> in it, and save it as functions.R in the <code>code</code> directory of your project.</li>
</ul>
</div>
</section>
<h4 id="sourceing-functions"><code>source()</code>ing functions</h4>
<p>You can load all the functions in your <code>code/functions.R</code> script without even opening the file, via the <code>source</code> function. This allows you to keep your functions separate from the analyses which use them.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">source</span>(<span class="st">'code/functions.R'</span>)</code></pre></div>
<h4 id="combining-functions">Combining functions</h4>
<p>The real power of functions comes from mixing, matching and combining them into ever large chunks to get the effect we want.</p>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h4 id="challenge-2"><span class="glyphicon glyphicon-pencil"></span>Challenge 2</h4>
</div>
<div class="panel-body">
<ul>
<li>Write a new function called <code>F_to_C</code> in your functions.R file that converts temperature directly from F to C by reusing the two functions above.</li>
<li>Load the function not by highlighting the code but by <code>source</code>-ing your functions.R file.</li>
<li>Use the function to find today’s high temperature in your location in C.</li>
</ul>
</div>
</section>
<h3 id="a-more-useful-function">A more-useful function</h3>
<p>Let’s write a function to calculate countries’ GDPs in their local currency.</p>
<ul>
<li><p>Start with Canada, exchange rate is 1.279098</p></li>
<li><p>Turn that into a function that takes country and exchange rate.</p></li>
<li><p>Here is a <a href="https://raw.githubusercontent.com/michaellevy/gapminder-R/gh-pages/data/simple_exchange_rate.csv">table of exchange rates for many countries</a>.</p></li>
<li><p>Call the function with some row</p></li>
<li><p>lapply over rows</p></li>
</ul>
<h3 id="do-by-joins">Do by joins</h3>
<p>And by year</p>
<ul>
<li><p>Get all the exchange rates over time. You can <a href="https://raw.githubusercontent.com/michaellevy/gapminder-R/gh-pages/data/exchange_rate.csv">download that here</a>.</p></li>
<li><p>join, filter</p></li>
</ul>
<aside class="callout panel panel-info">
<div class="panel-heading">
<h4 id="tip-pass-by-value"><span class="glyphicon glyphicon-pushpin"></span>Tip: Pass by value</h4>
</div>
<div class="panel-body">
<p>Functions in R almost always make copies of the data to operate on inside of a function body. When we modify <code>dat</code> inside the function we are modifying the copy of the gapminder dataset stored in <code>dat</code>, not the original variable we gave as the first argument.</p>
<p>This is called “pass-by-value” and it makes writing code much safer: you can always be sure that whatever changes you make within the body of the function, stay inside the body of the function.</p>
</div>
</aside>
<aside class="callout panel panel-info">
<div class="panel-heading">
<h4 id="tip-function-scope"><span class="glyphicon glyphicon-pushpin"></span>Tip: Function scope</h4>
</div>
<div class="panel-body">
<p>A related concept is scoping: any variables you create or modify inside the body of a function only exist for the lifetime of the function’s execution. When we call <code>calcGDP</code>, the variables <code>dat</code>, <code>years</code>, and <code>filteredDF</code> only exist inside the body of the function. Even if we have variables of the same name in our interactive R session, they are not modified in any way when executing a function.</p>
</div>
</aside>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h4 id="challenge-a-new-function"><span class="glyphicon glyphicon-pencil"></span>Challenge – A new function</h4>
</div>
<div class="panel-body">
<p>Write a new function that takes two arguments, the gapminder data.frame and the name of a country, and plots the change in the country’s population over time. That is, the return value from the function should be a ggplot object. - It is often easier to modify existing code than to start from scratch. Feel free to start with the calcGDP function code.</p>
</div>
</section>
<h3 id="challenge-solutions">Challenge solutions</h3>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h4 id="solution-a-new-function"><span class="glyphicon glyphicon-pencil"></span>Solution – A new function</h4>
</div>
<div class="panel-body">
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">plotPopGrowth <-<span class="st"> </span>function(countrytoplot, <span class="dt">dat =</span> gapminder) {
df <-<span class="st"> </span><span class="kw">filter</span>(dat, country ==<span class="st"> </span>countrytoplot)
plot <-<span class="st"> </span><span class="kw">ggplot</span>(df, <span class="kw">aes</span>(year, pop)) +<span class="st"> </span>
<span class="st"> </span><span class="kw">geom_line</span>()
<span class="kw">return</span>(plot)
}
<span class="kw">plotPopGrowth</span>(<span class="st">'Canada'</span>)</code></pre></div>
<p><img src="fig/07-functions/unnamed-chunk-9-1.png" title="plot of chunk unnamed-chunk-9" alt="plot of chunk unnamed-chunk-9" style="display: block; margin: auto;" /></p>
</div>
</section>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/lesson-template">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
</body>
</html>