The purpose of this project is to set up and effectively use a cloud-based Jupyter Notebook environment, with a specific emphasis on Google Colab. The project also carries out a range of data manipulation tasks on a provided sample dataset.
Dataset: Iris Dataset
To achieve the objectives of this project, the following steps are taken:
- Setup and Configuration
a. Access Google Colab via a web browser using a Google account.
b. Create a new Jupyter Notebook in Google Colab.
c. Configure the runtime settings for optimal performance.
- Data Manipulation Tasks
a. Load the dataset into the Jupyter Notebook.
b. Explore the dataset's structure, data types, and basic statistics.
c. Perform data manipulation tasks such as filtering, sorting, and grouping the data.
d. Create visualizations using Matplotlib to convey insights from the data.
e. Conduct basic data analysis, including calculating summary statistics.
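The data manipulation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: it assumes pandas is used, inlines six rows of the Iris dataset so it runs without any file or network access, and uses column names (sepal_length, species, and so on) that are assumptions and may differ from the project's data file.

```python
import pandas as pd

# Six rows from the classic Iris dataset, inlined so the sketch runs offline.
# Column names are assumed; the project's file may name them differently.
iris = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "species": [
            "setosa", "setosa",
            "versicolor", "versicolor",
            "virginica", "virginica",
        ],
    }
)

# Explore structure, data types, and basic statistics.
print(iris.shape)
print(iris.dtypes)
print(iris.describe())

# Filter: keep rows whose sepal length exceeds 6.0 cm.
long_sepals = iris[iris["sepal_length"] > 6.0]

# Sort: order rows by petal length, longest first.
by_petal = iris.sort_values("petal_length", ascending=False)

# Group: mean sepal length per species.
mean_by_species = iris.groupby("species")["sepal_length"].mean()
print(mean_by_species)
```

In a Colab notebook each step would typically live in its own cell, so the intermediate DataFrames can be inspected as they are created.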
To run the project, you can use the Makefile and follow these commands:
- To install the required Python packages: make install
- To check code style: make lint
- To run tests: make test
- To format the code: make format
All of the above commands run successfully.
The documentation demonstrating the tasks performed can be found in Documentation.md.
As part of these tasks, a scatter plot was created to visualize the relationship between Sepal Length and Sepal Width for the different species.
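A scatter plot of this kind might be produced along the following lines. This is a hedged sketch rather than the project's own plotting code: the helper name plot_sepal_scatter, the column names, and the six-row sample are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_sepal_scatter(df):
    """Scatter sepal length against sepal width, one series per species.

    Hypothetical helper; assumes columns named sepal_length,
    sepal_width, and species.
    """
    fig, ax = plt.subplots()
    # Draw each species as its own series so it gets a distinct
    # color and a legend entry.
    for species, group in df.groupby("species"):
        ax.scatter(group["sepal_length"], group["sepal_width"], label=species)
    ax.set_xlabel("Sepal Length (cm)")
    ax.set_ylabel("Sepal Width (cm)")
    ax.set_title("Sepal Length vs. Sepal Width by Species")
    ax.legend(title="Species")
    return ax


# Small sample of Iris rows, inlined so the sketch runs offline.
sample = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "species": [
            "setosa", "setosa",
            "versicolor", "versicolor",
            "virginica", "virginica",
        ],
    }
)

ax = plot_sepal_scatter(sample)
```

In Colab the figure renders inline when the cell finishes. Returning the Axes object also makes it possible to adjust or save the plot afterwards.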