<!DOCTYPE html>
<html lang="en">
<head>
<!-- START ANALYTICS -->
<!-- Global site tag (gtag.js) - Google Analytics -->
<!-- This snippet is provided by Google from the Admin page of the Analytics account -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-JP82FVH378"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-JP82FVH378');
</script>
<!-- END ANALYTICS -->
<!--Meta stuff-->
<meta charset="utf-8">
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="viewport" content="width=device-width, minimum-scale=1.0, maximum-scale=1.0, user-scalable=no">
<!--Facebook meta - this comes up when you post a link to the site-->
<meta property="og:title" content="Covid Data">
<meta property="og:description" content="A dashboard of interactive charts that runs directly from official government data. Automatically updates daily.">
<meta property="og:image" content="linkimage.PNG">
<meta property="og:url" content="http://coviddata.uk/index.htm">
<!--Favicon-->
<link rel="icon" href="icon-small.PNG">
<link rel="shortcut icon" href="icon.PNG" />
<link rel="apple-touch-icon" href="icon.PNG" />
<!--Title - this is the bit that comes up in the tab-->
<title>Data Science</title>
<!--Main Stylesheet-->
<link rel="stylesheet" href="css/main2.css">
<link rel="stylesheet" href="css/crypto.css">
<link rel="stylesheet" href="css/datascience.css">
<!-- Add icon library -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<!--The next three lines allow the Vega embed-->
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
</head>
<body>
<!-- heading for page-->
<div class="wrap">
<h1><span style="background: linear-gradient(to right, #ffff0b, #af0236);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent">data</span>science</h1>
<h2>for economics</h2>
<span style="display:block; height: 30px;"></span>
<p class="rubric">Resources for the Data Science course designed and taught by Richard Davies with Denes Csala, Charlie Meyrick and Emilien Valat at the University of Bristol.</p>
<p class="rubric">Examples of projects produced by the 2021-22 cohort are <a href="datascience2022">here.</a></p>
<span style="display:block; height: 10px;"></span>
</div>
<!--Begin Grid 1 WEEK BY WEEK -->
<div class="grid_items">
<!--Week 1, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 1</p>
<p class="title">data portfolio</p>
<p class="description">In this course we extract data from websites, often by building our own scrapers. To prepare for that, it helps to understand the basics of how the web works. In the first lecture and lab we introduce HTML, CSS and JavaScript as you build and style your first website. You will also embed your first automated and interactive charts.</p>
<p class="concepts">Skills and concepts: Text editors, HTML, CSS, GitHub.</p>
</div>
</div>
<!--Week 2, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 2</p>
<p class="title">live data</p>
<p class="description">This week we will build two live charts, embedding them in the website you built in week 1. The first will run directly from data provided by an API, updating itself daily. The second will run from your GitHub repository. We will discuss the strengths and weaknesses of these two approaches, and how you can use them in your project.</p>
<p class="concepts">Skills and concepts: APIs, JavaScript, JSON, Vega-Lite, Chart.js.</p>
</div>
</div>
<!--Week 3, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 3</p>
<p class="title">api data</p>
<p class="description">Many analysts access data by clicking download icons to get Excel or CSV files. As data scientists we want to access data programmatically, without touching our mouse or keyboard, since this avoids errors and repetitive tasks, makes our work transparent and verifiable, and makes time-saving automation possible. In our third class we access data from APIs, discussing their benefits and pitfalls, and how to debug them.</p>
<p class="concepts">Skills and concepts: APIs, CORS, JSON.</p>
</div>
</div>
<!--Week 4, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 4</p>
<p class="title">scraping data</p>
<p class="description">Lots of interesting and useful data is not provided by an API but is embedded in a website. This week we will build our first scrapers to extract data from websites and build our own datasets. We learn the art of inspecting a website to find the data within it, and use this new skill to extract data from three different websites, comparing the results we get.</p>
<p class="concepts">Skills and concepts: Python, BeautifulSoup, HTML, Stata.</p>
</div>
</div>
<!--Week 5, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 5</p>
<p class="title">automating data</p>
<p class="description">Repetition is dull, slow and a source of error. This is a problem, since much of what we have learned so far (fetching data from an API, scraping a website) is something you will want to repeat many times. This week is devoted to loops, one of the most powerful tools in any coder’s arsenal. Our class will show how a loop takes you from a small dataset to the world of big data.</p>
<p class="concepts">Skills and concepts: Loops, layers, Python, JavaScript.</p>
</div>
</div>
<!--Week 6, B-->
<div class="grid_item">
<div class="readingWeek">
<p class="fly">Week 6</p>
<p class="title">reading week</p>
<p class="description">There are no classes or office hours this week.</p>
<p class="concepts">Relax.</p>
<p class="concepts">Then work on your project!</p>
</div>
</div>
<!--Week 7, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 7</p>
<p class="title">cleaning data</p>
<p class="description">Data is only helpful when it is in a clean and usable form. This week we discuss how to get your dataset into shape for analysis. We focus on three unglamorous skills that are the foundation of data science: cleaning data, matching and merging datasets, and reshaping data.</p>
<p class="concepts">Skills and concepts: Data manipulation functions, Python, Stata, JavaScript.</p>
</div>
</div>
<!--Week 8, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 8</p>
<p class="title">data learning</p>
<p class="description">By this stage you have the tools to access data, and the loops that allow you to automate and repeat this. The result is large and interesting datasets. We now build tools to learn from this data. This week we begin to discuss Machine Learning (ML), the difference between “supervised” and “unsupervised” learning, and the use of labelled and unlabelled data.</p>
<p class="concepts">Skills and concepts: Machine learning, Python (PyTorch, TensorFlow).</p>
</div>
</div>
<!--Week 9, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 9</p>
<p class="title">data patterns</p>
<p class="description">We continue building our ML skills by discussing four common tasks that these tools can be used for: classification and regression (supervised learning), and clustering and association (unsupervised learning). We apply these tools to example datasets and discuss how you could use them.</p>
<p class="concepts">Skills and concepts: Machine learning, Python (PyTorch, TensorFlow).</p>
</div>
</div>
<!--Week 10, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 10</p>
<p class="title">data stories</p>
<p class="description">In our final week of analysis we discuss how data can be used to prove or disprove a point. We recap how to use the moments of a distribution (the spread and range of data), correlation, and the steps needed to establish causation. We discuss, with examples, ways to calculate and visualise the results of this in-depth analysis.</p>
<p class="concepts">Skills and concepts: Machine learning, Python (PyTorch, TensorFlow).</p>
</div>
</div>
<!--Week 11, B-->
<div class="grid_item">
<div class="practical">
<p class="fly">Week 11</p>
<p class="title">interactive data</p>
<p class="description">As the course draws to a close we have the tools to define a research question, then build, clean and analyse a complex dataset. In our final session we discuss how to make charts interactive in ways that help users draw their own stories and conclusions. We use large datasets to demonstrate this.</p>
<p class="concepts">Skills and concepts: Interactives (filters, toggles, sliders), colour, tone and opacity, Vega-Lite.</p>
</div>
</div>
<!--Project deadline-->
<div class="grid_item">
<div class="project">
<p class="fly">Deadline day: Monday 9th January 2023.</p>
<p class="title">Your project</p>
<p class="description">We will discuss project ideas each week, and help you with code and data.</p>
<p class="description">A reminder that the deadline for your DS project is Monday 9th January 2023.</p>
<p class="concepts">Good luck!</p>
</div>
</div>
<!--END Grid 1 WEEK BY WEEK -->
</div>
<span style="display:block; height: 30px;"></span>
<hr>
<span style="display:block; height: 30px;"></span>
<!--Begin Grid 2 COURSEWORK AND READINGS -->
<!-- heading for page-->
<div class="wrap">
<h1><span style="background: linear-gradient(to right, #380bff, #af0236);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent">data</span>resources</h1>
<span style="display:block; height: 10px;"></span>
<p class="rubric">Guidance on coursework, readings, further material and office hours.</p>
<span style="display:block; height: 10px;"></span>
</div>
<div class="grid_items">
<!--Coursework-->
<div class="grid_item">
<div class="admin">
<p class="fly">build</p>
<p class="title">Coursework</p>
<p class="description">Your project will present between 3 and 8 charts. These must be embedded in your site, hosted by GitHub Pages. You must also briefly discuss four topics: (1) the aims of your project; (2) the data you used and how you accessed it, including notes on automation/replication; (3) challenges in data cleaning and/or analysis, and the tools you used to overcome them; (4) your conclusions. Each section must not exceed 200 words.</p>
</div>
</div>
<!--Office hours-->
<div class="grid_item">
<div class="admin">
<p class="fly">Get help</p>
<p class="title">office hours</p>
<p class="description">There are four office hours each week, at the following times:</p>
<ul class="officeHour">
<li>RD: Mon, 14:00-15:00</li>
<li>DC: Thu, 15:00-16:00</li>
<li>CM: Wed, 09:00-10:00</li>
<li>EV: Thu, 14:00-15:00</li>
</ul>
<p class="description">There are no office hours during reading week (W6). The final slots are held in week 11.</p>
</div>
</div>
<!--Reading list-->
<div class="grid_item">
<div class="admin">
<p class="fly">Read</p>
<p class="title">books, papers and sites</p>
<p class="description">Some useful books, papers and sites:</p>
<ul class="officeHour">
<li><a class="readingList" href="https://www.playfairprize.com/william-playfair">Heroes and heroines.</a> Biographies of some key figures in data, past and present.</li>
<li><a class="readingList" href="https://nightingaledvs.com/">Nightingale Magazine.</a> The publication of the Data Visualisation Society.</li>
<li><a class="readingList" href="https://developer.mozilla.org/en-US/docs/Learn/Getting_started_with_the_web">MDN Starters guide.</a> A superb intro to HTML, CSS and JavaScript from Mozilla.</li>
</ul>
</div>
</div>
<!--Videos-->
<div class="grid_item">
<div class="admin">
<p class="fly">Watch</p>
<p class="title">videos</p>
<p class="description">Links to videos that will help you with the course material and your project:</p>
<ul class="videos">
<li><a class="libraryButton" href="https://www.dropbox.com/s/b456p0m5iclhwo5/Data%20Science%20%28EFIM30006%29%20Set%20Up%20Video.mp4?dl=0">1. Day one: Setting up</a></li>
<!-- <li><a class="libraryButton" href="https://www.youtube.com/watch?v=XoQq8WEVEuY">2. Big Data: Big Responsibility!</a></li>
<li><a class="libraryButton" href="https://www.youtube.com/watch?v=j1PHCJ9KOdo&ab_channel=CsalaD%C3%A9nes">3. Week 1: recap</a></li>
<li><a class="libraryButton" href="https://www.youtube.com/watch?v=lD6uj2yP0_Y">4. Week 2: recap</a></li>
<li><a class="libraryButton" href="https://www.youtube.com/watch?v=NvxoWjx95Tk">5. Week 3: recap</a></li>
<li><a class="libraryButton" href="https://www.youtube.com/watch?v=UWVZX4JsXYo">6. Week 4: recap</a></li> -->
</ul>
</div>
</div>
<!--Resource links-->
<div class="grid_item">
<div class="admin">
<p class="fly">Resources</p>
<p class="title">links</p>
<p class="description">Tools and links to assist your Data Science project.</p>
<ul class="videos">
<li><a class="libraryButton" href="https://github.com/RDeconomist/RDeconomist.github.io">RapidCharts repo</a></li>
<li><a class="libraryButton" href="https://www.dropbox.com/sh/6g6zriuds5s9m6t/AACrhLEr31T4fY0Vk9fp9QQva?dl=0">DropBox</a></li>
<!-- <li><a class="libraryButton" href="https://www.dropbox.com/sh/xkw12cmttvs2z93/AAD-EJReAFimuKkpxiIXBFVca?dl=0">DropBox2021</a></li> -->
<li><a class="libraryButton" href="https://docs.google.com/spreadsheets/d/1LkAxdFIolQU7MfsUCGJdt4nqktMkjXjupdtGLlrlFIs/edit#gid=0">Google Sheet</a></li>
<li><a class="libraryButton" href="https://vega.github.io/vega-lite/">Vega-Lite</a></li>
<li><a class="libraryButton" href="https://www.economicsobservatory.com/data-hub">ECO Data Hub</a></li>
<li><a class="libraryButton" href="https://www.playfairprize.com/">Playfair Prize</a></li>
<li><a class="libraryButton" href="library">Chart library</a></li>
<li><a class="libraryButton" href="apiTester">API debug</a></li>
<li><a class="libraryButton" href="apiLibrary">API list</a></li>
<li><a class="libraryButton" href="https://developer.mozilla.org/en-US/">MDN Web Docs</a></li>
<li><a class="libraryButton" href="https://www.dropbox.com/sh/2bgbujfhz5jtnw0/AABCDYCCQ3ASmouctU09zQUda?dl=0">Prices Data</a></li>
</ul>
</div>
</div>
<span style="display:block; height: 40px;"></span>
<!--End Grid 2-->
</div>
<span style="display:block; height: 40px;"></span>
<!--/////////////////////////////////////BEGIN SECTION - SAME ALL PAGES//////////////////////////-->
<div class="footer-dark">
<footer>
<div class="container">
<div class="row">
<div class="column">
<h3>About</h3>
<ul>
<li><a href="#">Mission</a></li>
<li><a href="#">Team</a></li>
</ul>
</div>
<div class="column">
<h3>Resources</h3>
<ul>
<li><a href="datascience">Course</a></li>
<li><a href="build">Chart Builder</a></li>
</ul>
</div>
</div>
<div class="footer-finalrow">
<h3>Rapid Charts</h3>
<p>Automated and interactive data analysis.</p>
</div>
<div class="col item social">
<a href="https://www.economicsobservatory.com" class="fa fa-university"></a>
<a href="https://twitter.com/RD_Economist" class="fa fa-twitter"></a>
<a href="https://www.instagram.com/rapidcharts" class="fa fa-instagram"></a>
<a href="https://www.linkedin.com/in/rd-economist" class="fa fa-linkedin"></a>
</div>
</div>
<p class="copyright">Richard Davies © 2022</p>
</footer>
</div>
<!--/////////////////////////////////////END SECTION - SAME ALL PAGES//////////////////////////-->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.1.3/js/bootstrap.bundle.min.js"></script>
</body>
</html>