Introduction to data analysis in Python

Last updated: November 2024

Author: Aislinn Keogh

Welcome

This two-class course will introduce you to working with structured (tabular) data in Python. We will cover the basics of:

  • importing datasets
  • data cleaning, including dealing with missing or incorrectly-formatted values
  • data wrangling
  • extracting summary statistics
  • data visualisation
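As a rough preview of the first four topics, here is a minimal pandas sketch. The dataset, its column names (`species`, `mass_g`, `year`) and the cleaning choices are invented for illustration and are not the course's sample dataset; `io.StringIO` stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical data standing in for a CSV file; not the course dataset.
raw_csv = io.StringIO(
    "name,species,mass_g,year\n"
    "a1,Adelie,3750,2007\n"
    "a2,adelie ,3800,2007\n"
    "g1,Gentoo,,2008\n"
    "g2,Gentoo,5000,2008\n"
)

# Importing: read the tabular data into a dataframe.
df = pd.read_csv(raw_csv)

# Cleaning: normalise inconsistently formatted labels,
# then drop rows where the measurement is missing.
df["species"] = df["species"].str.strip().str.capitalize()
df = df.dropna(subset=["mass_g"])

# Wrangling: derive a new column from an existing one.
df["mass_kg"] = df["mass_g"] / 1000

# Summary statistics: mean mass per species.
summary = df.groupby("species")["mass_kg"].mean()
print(summary)
```

Dropping rows with missing values is only one possible way to handle them; whether it is appropriate depends on the dataset.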

By the end of this course, you will be familiar with two key Python libraries used for data analysis: pandas (for working with dataframes) and seaborn (for data visualisation).
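To give a sense of how the two libraries fit together, the sketch below builds a small dataframe with pandas and passes it to seaborn for plotting. The data and column names are made up for illustration. The `matplotlib.use("Agg")` line is an assumption for running the code as a plain script without a display; in a Jupyter notebook it should not be needed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; assumed unnecessary in a notebook

import pandas as pd
import seaborn as sns

# Hypothetical data; not the course dataset.
df = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
        "mass_kg": [3.75, 3.80, 5.00, 5.20],
    }
)

# seaborn takes the dataframe directly and refers to columns by name.
ax = sns.barplot(data=df, x="species", y="mass_kg")
ax.set_ylabel("Body mass (kg)")
```

In a notebook, the figure is normally displayed below the cell once the plotting call has run.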

We will work together on a sample dataset, but you are also welcome to bring your own.

This is an intermediate-level course. We will assume that you are already familiar with the basics of Python, including variables, loops, functions, and basic data types like lists and dictionaries. No prior knowledge of any external libraries is assumed.

How to use these materials

Throughout this course we will be using the Noteable platform to run Jupyter notebooks. This is a cloud-based computational notebook system that runs in your browser from any device.

Start Noteable

  1. Open the following link in a new tab: https://noteable.edina.ac.uk/login.
  2. Log in with your EASE credentials.
  3. Under 'Standard Notebook (Python 3)', click 'Start'.

Download the files to Noteable

  1. From the Noteable home page, click on the '+GitRepo' button at the top right of the screen.
  2. In the 'Git Repository URL' field, paste the link to this GitHub repository: "https://github.com/DCS-training/data-analysis-in-python". Ignore all other fields.
  3. Click the 'clone' button. After a few moments, a new folder containing the files will appear.

Feedback

If you attended this course in November 2024, please fill in our feedback form.

License: CC BY-NC 4.0