Added Hyperdrive Examples (#2)

* Added Hyperdrive Example via CLI
* Updated readme
* Added hyperdrive notebooks
csiebler · Dec 17, 2020 · dc45473
1 parent 15d9cea
Showing 10 changed files with 433 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -22,6 +22,7 @@ A workshop for doing MLOps on Azure Machine Learning.
   * :weight_lifting: Exercise - Single-step pipeline - [`pipelines-single-training-step`](pipelines-single-training-step/)
   * :weight_lifting: Exercise - Multi-step pipeline with parameters - [`pipelines-multi-step-pipeline`](pipelines-multi-step-pipeline/)
   * :weight_lifting: Exercise - ParallelRunStep pipeline for batch scoring - [`pipelines-parallel-run-step`](pipelines-parallel-run-step/)
+  * :weight_lifting: Exercise - Hyperparameter tuning pipeline - [`pipelines-hyperdrive-step`](pipelines-hyperdrive-step/)
 * MLOps on Azure DevOps
   * :weight_lifting_woman: Exercise - Deploy AML pipeline as Published Endpoint - [`devops-deploy-simple-pipeline`](devops-deploy-simple-pipeline/)
   * :weight_lifting_woman: Exercise - Deploy AML pipeline as Published Endpoint, automatically test it and then add it to a Pipeline Endpoint - [`devops-deploy-pipeline-with-tests`](devops-deploy-pipeline-with-tests/)

diff --git a/media/hyperdrive_example.png b/media/hyperdrive_example.png
diff --git a/pipelines-hyperdrive-step/README.md b/pipelines-hyperdrive-step/README.md
@@ -0,0 +1,27 @@
+# Exercise Instructions
+
+Open [`hyperdrive_pipeline.ipynb`](hyperdrive_pipeline.ipynb) and follow the instructions in the notebook.
+
+# Running this via CLI
+
+You can also run the HyperDrive hyperparameter tuning job via the CLI:
+
+```console
+az ml folder attach -w <YOUR WORKSPACE NAME> -g <YOUR RESOURCE GROUP>
+az ml run submit-hyperdrive --hyperdrive-configuration-name hyperdrive_config.yml -c hyperdrive -e hyperdrive-test
+```
+
+In this case:
+* [`hyperdrive_config.yml`](hyperdrive_config.yml) holds the configuration for the hyperparameter tuning. For full details on the parameters, see [Define the search space](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#define-the-search-space).
+* [`hyperdrive.runconfig`](hyperdrive.runconfig) holds the general script definition (which dataset, cluster, etc.)
+* [`train.py`](train.py) takes all the hyperparameters as argument inputs
+
+You can check the results in the Studio UI (navigate to the run, then select `Child Runs`):
+
+![Hyperdrive in Studio UI](../media/hyperdrive_example.png)
+
+Each sampled hyperparameter combination is executed as its own child run.
+
+# Knowledge Check
+
+To be written
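
The README states that `train.py` takes all hyperparameters as argument inputs, but the script itself does not appear above. A minimal sketch of what that argument handling might look like, assuming only the `--data-path` and `--c` flags used in `hyperdrive.runconfig` and `hyperdrive_config.yml`. The function name `parse_args` and the default value `1.0` are hypothetical and not taken from the commit:

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that HyperDrive would pass to a training script."""
    parser = argparse.ArgumentParser(description="Hypothetical train.py argument handling")
    # Folder where the dataset is made available on the compute target
    parser.add_argument("--data-path", type=str, required=True)
    # Hyperparameter sampled by HyperDrive; the default is an assumption
    parser.add_argument("--c", type=float, default=1.0)
    return parser.parse_args(argv)


# Example invocation with the values one child run might receive
args = parse_args(["--data-path", "/data", "--c", "0.5"])
print(args.data_path, args.c)
```

Each child run would receive a different `--c` value, while `--data-path` stays the same for all of them.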
diff --git a/pipelines-hyperdrive-step/conda.yml b/pipelines-hyperdrive-step/conda.yml
@@ -0,0 +1,12 @@
+name: workshop-env
+channels:
+  - conda-forge
+  - defaults
+dependencies:
+  - python=3.6.2
+  - pip:
+    - azureml-defaults
+    - azureml-sdk
+    - scikit-learn==0.20.3
+    - pandas==0.25.3
+    - joblib==0.13.2
diff --git a/pipelines-hyperdrive-step/hyperdrive.runconfig b/pipelines-hyperdrive-step/hyperdrive.runconfig
@@ -0,0 +1,29 @@
+script: train.py
+arguments: [--data-path, /data]
+target: cpu-cluster
+framework: Python
+communicator: None
+nodeCount: 1
+environment:
+  environmentVariables:
+    EXAMPLE_ENV_VAR: EXAMPLE_VALUE
+  python:
+    userManagedDependencies: false
+    interpreterPath: python
+    condaDependenciesFile: conda.yml
+  docker:
+    enabled: true
+    baseImage: mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04
+    arguments: []
+mpi:
+    processCountPerNode: 1
+data:
+  training_dataset:
+    environmentVariableName: training_dataset
+    dataLocation:
+      dataset:
+        name: german-credit-train-tutorial
+        version: 1
+    mechanism: download
+    pathOnCompute: /data
+    overwrite: true
diff --git a/pipelines-hyperdrive-step/hyperdrive_config.yml b/pipelines-hyperdrive-step/hyperdrive_config.yml
@@ -0,0 +1,17 @@
+# For more details, visit:
+# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters#define-the-search-space
+sampling: 
+    type: random # Supported options: Random, Grid, Bayesian
+    parameter_space: # specify a name|expression|values tuple for each parameter.
+    - name: --c # The name of a script parameter to generate values for.
+      expression: choice # supported options: choice, randint, uniform, quniform, loguniform, qloguniform, normal, qnormal, lognormal, qlognormal
+      values: [0.5, 1, 1.5] # The list of values; how many are needed depends on the expression specified.
+policy: 
+    type: BanditPolicy # Supported options: BanditPolicy, MedianStoppingPolicy, TruncationSelectionPolicy, NoTerminationPolicy
+    evaluation_interval: 1 # Policy properties are policy specific. See the above link for policy specific parameter details.
+    slack_factor: 0.2
+primary_metric_name: Test accuracy # The metric used when evaluating the policy
+primary_metric_goal: Maximize # Maximize or Minimize
+max_total_runs: 8 # The maximum number of runs to generate
+max_concurrent_runs: 1 # The number of runs that can run concurrently.
+max_duration_minutes: 60 # The maximum length of time to run the experiment before cancelling.
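
To make `slack_factor` concrete: according to the Azure ML documentation for the Bandit policy, when the goal is to maximize the metric, a run is terminated if its best metric falls below `best_metric / (1 + slack_factor)`. The sketch below only illustrates that arithmetic. The function names are hypothetical, and this is not the service's actual implementation:

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric a run may report without being terminated (maximize goal)."""
    return best_metric / (1 + slack_factor)


def should_terminate(run_metric, best_metric, slack_factor):
    """Return True if a run falls outside the allowed slack of the best run."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)


# With slack_factor 0.2 and a best "Test accuracy" of 0.8, the cutoff is 0.8 / 1.2
cutoff = bandit_cutoff(0.8, 0.2)
print(round(cutoff, 3))
print(should_terminate(0.60, 0.8, 0.2), should_terminate(0.70, 0.8, 0.2))
```

A smaller `slack_factor` (the notebook uses `0.1`) raises the cutoff and therefore stops more runs early.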
diff --git a/pipelines-hyperdrive-step/hyperdrive_pipeline.ipynb b/pipelines-hyperdrive-step/hyperdrive_pipeline.ipynb
@@ -0,0 +1,197 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Hyperparameter Tuning pipeline examples\n",
+    "\n",
+    "In this example, we'll build a pipeline for Hyperparameter tuning. This pipeline will test multiple hyperparameter permutations and then register the best model.\n",
+    "\n",
+    "**Note:** This example requires that you've run the notebook from the first tutorial, so that the dataset and compute cluster are set up."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import azureml.core\n",
+    "from azureml.core import Workspace, Experiment, Dataset, RunConfiguration\n",
+    "from azureml.pipeline.core import Pipeline, PipelineData\n",
+    "from azureml.pipeline.steps import PythonScriptStep, HyperDriveStep, HyperDriveStepRun\n",
+    "from azureml.data.dataset_consumption_config import DatasetConsumptionConfig\n",
+    "from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveConfig, PrimaryMetricGoal\n",
+    "from azureml.train.hyperdrive import choice, loguniform, uniform\n",
+    "from azureml.core import ScriptRunConfig\n",
+    "\n",
+    "print(\"Azure ML SDK version:\", azureml.core.VERSION)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First, we will connect to the workspace. The command `Workspace.from_config()` will either:\n",
+    "* Read the local `config.json` with the workspace reference (if it exists) or\n",
+    "* Use the `az` CLI to connect to the workspace that was attached via `az ml folder attach -g <resource group> -w <workspace name>`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "ws = Workspace.from_config()\n",
+    "print(f'WS name: {ws.name}\\nRegion: {ws.location}\\nSubscription id: {ws.subscription_id}\\nResource group: {ws.resource_group}')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Preparation\n",
+    "\n",
+    "Let's reference the dataset from the first tutorial:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_dataset = Dataset.get_by_name(ws, \"german-credit-train-tutorial\")\n",
+    "training_dataset_consumption = DatasetConsumptionConfig(\"training_dataset\", training_dataset).as_download()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here, we define the parameter sampling (the search space for the hyperparameters we want to try) and the early termination policy (which stops poorly performing runs early). We then put these together in a `HyperDriveConfig` and execute it in a `HyperDriveStep`. Lastly, we have a short step to register the best model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "runconfig = RunConfiguration.load(\"runconfig.yml\")\n",
+    "script_run_config = ScriptRunConfig(source_directory=\"./\",\n",
+    "                                    run_config=runconfig)\n",
+    "script_run_config.data_references = None\n",
+    "\n",
+    "ps = RandomParameterSampling(\n",
+    "    {\n",
+    "        '--c': uniform(0.1, 1.9)\n",
+    "    }\n",
+    ")\n",
+    "early_termination_policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)\n",
+    "\n",
+    "hd_config = HyperDriveConfig(run_config=script_run_config, \n",
+    "                             hyperparameter_sampling=ps,\n",
+    "                             policy=early_termination_policy,\n",
+    "                             primary_metric_name='Test accuracy', \n",
+    "                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n",
+    "                             max_total_runs=4,\n",
+    "                             max_concurrent_runs=1)\n",
+    "\n",
+    "hd_step = HyperDriveStep(name='hyperparameter-tuning',\n",
+    "                         hyperdrive_config=hd_config,\n",
+    "                         estimator_entry_script_arguments=['--data-path', training_dataset_consumption],\n",
+    "                         inputs=[training_dataset_consumption],\n",
+    "                         outputs=None)\n",
+    "\n",
+    "register_step = PythonScriptStep(script_name='register.py',\n",
+    "                                 runconfig=runconfig,\n",
+    "                                 name=\"register-model\",\n",
+    "                                 compute_target=\"cpu-cluster\",\n",
+    "                                 arguments=['--model_name', 'best_model'],\n",
+    "                                 allow_reuse=False)\n",
+    "\n",
+    "# Explicitly state that registration runs after training, as there is no direct dependency through inputs/outputs\n",
+    "register_step.run_after(hd_step)\n",
+    "\n",
+    "steps = [hd_step, register_step]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Finally, we can create our pipeline object and validate it. This checks that the inputs and outputs are properly linked and that the pipeline graph is acyclic:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "pipeline = Pipeline(workspace=ws, steps=steps)\n",
+    "pipeline.validate()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Lastly, we can submit the pipeline against an experiment:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": [
+     "outputPrepend"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "pipeline_run = Experiment(ws, 'hyperparameter-pipeline').submit(pipeline)\n",
+    "pipeline_run.wait_for_completion()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.9-final"
+  },
+  "orig_nbformat": 2,
+  "kernelspec": {
+   "name": "python3",
+   "display_name": "Python 3",
+   "language": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
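
The notebook samples `--c` with `uniform(0.1, 1.9)` and caps the sweep at `max_total_runs=4`. Conceptually, random sampling draws one value per parameter for every child run. The following standard-library sketch imitates that idea locally. It does not use the Azure ML SDK, and the `sample_runs` helper and the fixed seed are hypothetical:

```python
import random


def sample_runs(space, max_total_runs, seed=0):
    """Draw one value per parameter for each run, similar in spirit to random sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        {name: draw(rng) for name, draw in space.items()}
        for _ in range(max_total_runs)
    ]


# Search space mirroring the notebook: '--c' drawn uniformly from [0.1, 1.9]
space = {"--c": lambda rng: rng.uniform(0.1, 1.9)}

for arguments in sample_runs(space, max_total_runs=4):
    print(arguments)
```

Each printed dictionary corresponds to the extra arguments one child run would receive in addition to `--data-path`.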
diff --git a/pipelines-hyperdrive-step/register.py b/pipelines-hyperdrive-step/register.py
@@ -0,0 +1,41 @@
+import json
+import os
+import ast
+import argparse
+import azureml.core
+from azureml.core import Run
+from azureml.pipeline.steps.hyper_drive_step import HyperDriveStepRun
+
+def getRuntimeArgs():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--model_name', type=str)
+    args = parser.parse_args()
+    return args
+
+def main():
+    args = getRuntimeArgs()
+    model_name = args.model_name
+
+    # current run is the registration step
+    current_run = Run.get_context()
+
+    # parent run is the overall pipeline
+    parent_run = current_run.parent
+
+    # Get the HyperDriveStep of the pipeline by name (make sure only 1 exists)
+    hd_step_run = HyperDriveStepRun(step_run=parent_run.find_step_run('hyperparameter-tuning')[0])
+
+    # Get RunID for best run
+    best_run = hd_step_run.get_best_run_by_primary_metric()
+    best_run_id = best_run.id
+
+    # Get the best run's metrics and hyperparameters
+    hyperparameters = ast.literal_eval(hd_step_run.get_hyperparameters()[best_run_id].replace('--', ''))
+    metrics = hd_step_run.get_metrics()[best_run_id]
+
+    best_run.register_model(model_path='outputs/model.pkl',
+                            model_name=model_name,
+                            properties={**metrics, **hyperparameters})
+
+if __name__ == "__main__":
+    main()
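
The registration script turns the best run's hyperparameter string into a dictionary and merges it with the metrics before passing both as model properties. The sketch below reproduces that step in isolation, assuming the hyperparameters arrive as a string such as `'{"--c": 0.73}'`. The exact string format, the sample values, and the helper name `build_properties` are assumptions for illustration. Unlike `.replace('--', '')`, which removes every `--` in the string, this variant strips the leading dashes from the keys only:

```python
import ast


def build_properties(hyperparameters_str, metrics):
    """Merge a run's metrics and hyperparameters into one properties dictionary."""
    parsed = ast.literal_eval(hyperparameters_str)
    # Strip the CLI-style leading dashes from the keys only, e.g. '--c' -> 'c'
    hyperparameters = {name.lstrip("-"): value for name, value in parsed.items()}
    # Hyperparameters win if a metric happens to share the same name
    return {**metrics, **hyperparameters}


# Hypothetical values for a single best run
properties = build_properties('{"--c": 0.73}', {"Test accuracy": 0.75})
print(properties)
```

Restricting the cleanup to the keys avoids accidentally altering a value that contains `--`.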
diff --git a/pipelines-hyperdrive-step/runconfig.yml b/pipelines-hyperdrive-step/runconfig.yml
@@ -0,0 +1,19 @@
+script: train.py
+arguments: [] # This is set in our pipeline definition script
+target: cpu-cluster
+framework: Python
+communicator: None
+nodeCount: 1
+environment:
+  environmentVariables:
+    EXAMPLE_ENV_VAR: EXAMPLE_VALUE
+  python:
+    userManagedDependencies: false
+    interpreterPath: python
+    condaDependenciesFile: conda.yml
+  docker:
+    enabled: true
+    baseImage: mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04
+    arguments: []
+mpi:
+    processCountPerNode: 1