This workshop walks you through the model development stages of the Data Science / Machine Learning workflow. We will explore different ways to build machine learning models and deploy them. By the end of these labs/tutorials, you should understand:
- The use of Jupyter Notebooks in IBM Watson Studio
- Code-based approaches to building ML models
- Low-code and no-code approaches to building ML models
- Deploying ML models to Watson Machine Learning.
- Watson Studio (docs)
- Watson Machine Learning (docs)
- Jupyter Notebooks
- Scikit Learn
- Several Python Libraries: Pandas, Seaborn, Matplotlib, PixieDust
Most of these labs are written in Python using Jupyter Notebooks. One of the most common Python libraries for data analysis and manipulation is pandas, which is used throughout the labs. For a quick pandas tutorial, feel free to run through the material found here - IBMDeveloperUK pandas-workshop.
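As a warm-up, here is a minimal sketch of the kind of pandas operations the labs rely on. The DataFrame and column names below are illustrative, not data from the labs themselves:

```python
import pandas as pd

# Hypothetical customer data, just to illustrate common operations
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "churn": ["no", "yes", "no", "yes"],
})

# Inspect structure and summary statistics
print(df.dtypes)
print(df.describe())

# Filter rows by a condition
over_30 = df[df["age"] > 30]

# Derive a new boolean column from an existing one
df["is_churn"] = df["churn"] == "yes"
print(df["is_churn"].sum())
```

Selecting, filtering, and deriving columns like this covers most of the data preparation you will do before handing a DataFrame to a modeling library.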
Store this repository on your local computer.
If you have Git on your machine, clone this repository locally. Open a terminal and run:

```
$ git clone https://github.com/lee-zhg/intro-machine-learning.git
```

If you do NOT have Git on your machine, you can download the repository as a ZIP file. In the browser window, select:
Ensure you have access to a Watson Studio instance. If you need to provision an instance, see the instructions in the Setup Watson Studio doc.
In this first lab, we will explore the traditional approach to building models in code, using Python and Spark as our implementation of choice. Follow the instructions in the Readme for the code-based approach.
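To give a flavor of the code-based approach, here is a minimal sketch of training and evaluating a classifier with scikit-learn. The dataset and model are illustrative placeholders, not the ones used in the lab:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small sample dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a simple model on the training split
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out test split
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

The same load, split, fit, evaluate loop underlies the notebook labs, whatever library or dataset they use.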
Finally, in this lab, we will explore an approach to building models that uses Watson Studio to optimize the pipeline process. Instead of selecting configuration options as we did in Lab 3, we simply select what kind of model output we are looking for and AutoAI does the rest. Follow the instructions in the Readme for the AutoAI approach.
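Whichever way a model is built, scoring a Watson Machine Learning online deployment comes down to sending a JSON payload. The sketch below shows the general shape of that payload; the field names and values are made up, and you should consult the WML docs for your deployment's actual schema and endpoint:

```python
import json

# Illustrative scoring payload for a WML online deployment.
# "fields" lists the feature names; each inner list in "values"
# is one row to score. These names are hypothetical.
payload = {
    "input_data": [
        {
            "fields": ["age", "income"],
            "values": [[35, 52000]],
        }
    ]
}

# The payload is serialized to JSON and POSTed to the deployment's
# scoring endpoint with an authorization token.
body = json.dumps(payload)
print(body)
```

The response comes back in a similar fields/values shape, with the prediction and (for classifiers) class probabilities per row.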
There is lots of great information, tutorials, and articles on the IBM Developer site as well as the broader web. Here is a subset of good examples related to data understanding, visualization, and processing: