diff --git a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multiclass-task-sentiment-analysis/automl-nlp-multiclass-sentiment-mlflow.ipynb b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multiclass-task-sentiment-analysis/automl-nlp-multiclass-sentiment-mlflow.ipynb
index 8dbabf2265..4a41279bec 100644
--- a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multiclass-task-sentiment-analysis/automl-nlp-multiclass-sentiment-mlflow.ipynb
+++ b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multiclass-task-sentiment-analysis/automl-nlp-multiclass-sentiment-mlflow.ipynb
@@ -1,6 +1,7 @@
{
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -15,6 +16,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -46,6 +48,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -88,6 +91,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -111,6 +115,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -174,6 +179,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -183,6 +189,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -242,6 +249,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -288,6 +296,108 @@
]
},
{
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.3 Runs with models from Hugging Face (Preview)\n",
+ "\n",
+ "In addition to the model algorithms supported natively by AutoML, you can launch individual runs to explore any model algorithm from the Hugging Face Transformers library that supports text classification. Please refer to this [documentation](https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending) for the list of models.\n",
+ "\n",
+ "If you wish to try a model algorithm (say roberta-base-openai-detector), you can specify the job for your AutoML NLP runs as follows:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Compute target setup\n",
+ "\n",
+ "from azure.ai.ml.entities import AmlCompute\n",
+ "from azure.core.exceptions import ResourceNotFoundError\n",
+ "\n",
+ "compute_name = \"gpu-cluster-nc6s-v3\"\n",
+ "\n",
+ "try:\n",
+ " _ = ml_client.compute.get(compute_name)\n",
+ " print(\"Found existing compute target.\")\n",
+ "except ResourceNotFoundError:\n",
+ " print(\"Creating a new compute target...\")\n",
+ " compute_config = AmlCompute(\n",
+ " name=compute_name,\n",
+ " type=\"amlcompute\",\n",
+ " size=\"Standard_NC6s_v3\",\n",
+ " idle_time_before_scale_down=120,\n",
+ " min_instances=0,\n",
+ " max_instances=4,\n",
+ " )\n",
+ " ml_client.begin_create_or_update(compute_config).result()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Create the AutoML job with the related factory-function.\n",
+ "\n",
+ "text_classification_hf_job = automl.text_classification(\n",
+ " experiment_name=exp_name,\n",
+ " compute=compute_name,\n",
+ " training_data=my_training_data_input,\n",
+ " validation_data=my_validation_data_input,\n",
+ " target_column_name=\"Sentiment\",\n",
+ " primary_metric=\"accuracy\",\n",
+ " tags={\"my_custom_tag\": \"My custom value\"},\n",
+ ")\n",
+ "\n",
+ "text_classification_hf_job.set_limits(timeout_minutes=120)\n",
+ "text_classification_hf_job.set_featurization(dataset_language=dataset_language_code)\n",
+ "text_classification_hf_job.set_training_parameters(\n",
+ " model_name=\"roberta-base-openai-detector\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Submit the AutoML job\n",
+ "\n",
+ "returned_hf_job = ml_client.jobs.create_or_update(\n",
+ " text_classification_hf_job\n",
+ ") # submit the job to the backend\n",
+ "\n",
+ "print(f\"Created job: {returned_hf_job}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ml_client.jobs.stream(returned_hf_job.name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.4 Hyperparameter Sweep Runs (Public Preview)\n",
+ "\n",
+ "AutoML allows you to easily train models for Single Label Text Classification on your text data. You can control the model algorithm to be used, specify hyperparameter values for your model, as well as perform a sweep across the hyperparameter space to generate an optimal model.\n",
+ "\n",
+ "When using AutoML for text tasks, you can specify the model algorithm using the `model_name` parameter. You can either specify a single model or choose to sweep over multiple models. Please refer to the sweep notebook for detailed instructions on configuring and submitting a sweep job."
+ ]
+ },
+ {
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -308,6 +418,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -366,6 +477,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -388,6 +500,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -419,6 +532,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -476,6 +590,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -561,6 +676,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -627,6 +743,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -702,6 +819,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -741,6 +859,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -807,6 +926,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -839,6 +959,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
diff --git a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multilabel-task-paper-categorization/automl-nlp-multilabel-paper-cat.ipynb b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multilabel-task-paper-categorization/automl-nlp-multilabel-paper-cat.ipynb
index 0845fb007d..333f4cc48a 100644
--- a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multilabel-task-paper-categorization/automl-nlp-multilabel-paper-cat.ipynb
+++ b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-classification-multilabel-task-paper-categorization/automl-nlp-multilabel-paper-cat.ipynb
@@ -1,6 +1,7 @@
{
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -21,6 +22,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -50,6 +52,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -91,6 +94,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -125,6 +129,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -134,6 +139,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -182,6 +188,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -227,6 +234,108 @@
]
},
{
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.3 Runs with models from Hugging Face (Preview)\n",
+ "\n",
+ "In addition to the model algorithms supported natively by AutoML, you can launch individual runs to explore any model algorithm from the Hugging Face Transformers library that supports text classification. Please refer to this [documentation](https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending) for the list of models.\n",
+ "\n",
+ "If you wish to try a model algorithm (say roberta-base-openai-detector), you can specify the job for your AutoML NLP runs as follows:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Compute target setup\n",
+ "\n",
+ "from azure.ai.ml.entities import AmlCompute\n",
+ "from azure.core.exceptions import ResourceNotFoundError\n",
+ "\n",
+ "compute_name = \"gpu-cluster-nc6s-v3\"\n",
+ "\n",
+ "try:\n",
+ " _ = ml_client.compute.get(compute_name)\n",
+ " print(\"Found existing compute target.\")\n",
+ "except ResourceNotFoundError:\n",
+ " print(\"Creating a new compute target...\")\n",
+ " compute_config = AmlCompute(\n",
+ " name=compute_name,\n",
+ " type=\"amlcompute\",\n",
+ " size=\"Standard_NC6s_v3\",\n",
+ " idle_time_before_scale_down=120,\n",
+ " min_instances=0,\n",
+ " max_instances=4,\n",
+ " )\n",
+ " ml_client.begin_create_or_update(compute_config).result()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Create the AutoML job with the related factory-function.\n",
+ "\n",
+ "text_classification_multilabel_hf_job = automl.text_classification_multilabel(\n",
+ " experiment_name=exp_name,\n",
+ " compute=compute_name,\n",
+ " training_data=my_training_data_input,\n",
+ " validation_data=my_validation_data_input,\n",
+ " target_column_name=\"terms\",\n",
+ " primary_metric=\"accuracy\",\n",
+ " tags={\"my_custom_tag\": \"My custom value\"},\n",
+ ")\n",
+ "\n",
+ "text_classification_multilabel_hf_job.set_limits(timeout_minutes=exp_timeout)\n",
+ "\n",
+ "text_classification_multilabel_hf_job.set_training_parameters(\n",
+ " model_name=\"roberta-base-openai-detector\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Submit the AutoML job\n",
+ "\n",
+ "returned_hf_job = ml_client.jobs.create_or_update(\n",
+ " text_classification_multilabel_hf_job\n",
+ ") # submit the job to the backend\n",
+ "\n",
+ "print(f\"Created job: {returned_hf_job}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ml_client.jobs.stream(returned_hf_job.name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.4 Hyperparameter Sweep Runs (Public Preview)\n",
+ "\n",
+ "AutoML allows you to easily train models for Multilabel Text Classification on your text data. You can control the model algorithm to be used, specify hyperparameter values for your model, as well as perform a sweep across the hyperparameter space to generate an optimal model.\n",
+ "\n",
+ "When using AutoML for text tasks, you can specify the model algorithm using the `model_name` parameter. You can either specify a single model or choose to sweep over multiple models. Please refer to the sweep notebook for detailed instructions on configuring and submitting a sweep job."
+ ]
+ },
+ {
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -247,6 +356,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -341,6 +451,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -379,6 +490,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -436,6 +548,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -501,6 +614,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -576,6 +690,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -645,6 +760,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -715,6 +831,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -754,6 +871,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -838,6 +956,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -870,6 +989,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
diff --git a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task-distributed-sweeping/automl-nlp-text-ner-task-distributed-with-sweeping.ipynb b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task-distributed-sweeping/automl-nlp-text-ner-task-distributed-with-sweeping.ipynb
index 68b5bfe3b6..6f9ed0795d 100644
--- a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task-distributed-sweeping/automl-nlp-text-ner-task-distributed-with-sweeping.ipynb
+++ b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task-distributed-sweeping/automl-nlp-text-ner-task-distributed-with-sweeping.ipynb
@@ -1,6 +1,7 @@
{
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -29,6 +30,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -62,6 +64,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -103,6 +106,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -126,6 +130,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -163,6 +168,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -172,6 +178,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -208,6 +215,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -218,6 +226,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -309,6 +318,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -354,6 +364,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -443,6 +454,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -460,6 +472,57 @@
]
},
{
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 4.3 Manual hyperparameter sweeping for models from Hugging Face (Preview)\n",
+ "You can use any model algorithm from the Hugging Face Transformers library for an individual run, or include these models in a hyperparameter sweep. You can also combine model algorithms supported natively by [AutoML](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-nlp-models?view=azureml-api-2&tabs=cli) with model algorithms from [Hugging Face](https://huggingface.co/models?pipeline_tag=token-classification&library=transformers&sort=trending).\n",
+ "\n",
+ "In this example, we sweep over the bert-large-cased, roberta-base, and roberta-base-openai-detector models, sampling weight_decay from a uniform range for the last of these, to generate a model with the optimal 'accuracy'."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Create the AutoML job with the related factory-function.\n",
+ "\n",
+ "text_ner_job = automl.text_ner(\n",
+ " compute=compute_name,\n",
+ " # name=\"dpv2-text-ner-job-02\",\n",
+ " experiment_name=exp_name,\n",
+ " training_data=my_training_data_input,\n",
+ " validation_data=my_validation_data_input,\n",
+ " tags={\"my_custom_tag\": \"My custom value\"},\n",
+ ")\n",
+ "\n",
+ "text_ner_job.set_limits(timeout_minutes=120, max_trials=4, max_concurrent_trials=2)\n",
+ "\n",
+ "text_ner_job.extend_search_space(\n",
+ " [\n",
+ " SearchSpace(\n",
+ " model_name=Choice([\"bert-large-cased\", \"roberta-base\"]),\n",
+ " ),\n",
+ " SearchSpace(\n",
+ " model_name=Choice([\"roberta-base-openai-detector\"]),\n",
+ " weight_decay=Uniform(0.01, 0.1),\n",
+ " ),\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "text_ner_job.set_sweep(\n",
+ " sampling_algorithm=\"Random\",\n",
+ " early_termination=BanditPolicy(\n",
+ " evaluation_interval=2, slack_factor=0.05, delay_evaluation=6\n",
+ " ),\n",
+ ")"
+ ]
+ },
+ {
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -475,6 +538,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -523,6 +587,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -558,6 +623,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -581,6 +647,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -601,6 +668,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -646,6 +714,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -653,6 +722,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -707,6 +777,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -749,6 +820,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -805,6 +877,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -828,6 +901,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -879,6 +953,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -897,6 +972,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -904,6 +980,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
diff --git a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task/automl-nlp-text-ner-task.ipynb b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task/automl-nlp-text-ner-task.ipynb
index 0935ec1da3..a8decbbab6 100644
--- a/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task/automl-nlp-text-ner-task.ipynb
+++ b/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task/automl-nlp-text-ner-task.ipynb
@@ -1,6 +1,7 @@
{
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -21,6 +22,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -51,6 +53,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -92,6 +95,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -129,6 +133,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -138,6 +143,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -183,6 +189,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -229,6 +236,103 @@
]
},
{
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.3 Runs with models from Hugging Face (Preview)\n",
+ "\n",
+ "In addition to the model algorithms supported natively by AutoML, you can launch individual runs to explore any model algorithm from the Hugging Face Transformers library that supports token classification. Please refer to this [documentation](https://huggingface.co/models?pipeline_tag=token-classification&library=transformers&sort=trending) for the list of models.\n",
+ "\n",
+ "If you wish to try a model algorithm (say roberta-base-openai-detector), you can specify the job for your AutoML NLP runs as follows:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Compute target setup\n",
+ "\n",
+ "from azure.ai.ml.entities import AmlCompute\n",
+ "from azure.core.exceptions import ResourceNotFoundError\n",
+ "\n",
+ "compute_name = \"gpu-cluster-nc6s-v3\"\n",
+ "\n",
+ "try:\n",
+ " _ = ml_client.compute.get(compute_name)\n",
+ " print(\"Found existing compute target.\")\n",
+ "except ResourceNotFoundError:\n",
+ " print(\"Creating a new compute target...\")\n",
+ " compute_config = AmlCompute(\n",
+ " name=compute_name,\n",
+ " type=\"amlcompute\",\n",
+ " size=\"Standard_NC6s_v3\",\n",
+ " idle_time_before_scale_down=120,\n",
+ " min_instances=0,\n",
+ " max_instances=4,\n",
+ " )\n",
+ " ml_client.begin_create_or_update(compute_config).result()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Create the AutoML job with the related factory-function.\n",
+ "\n",
+ "text_ner_hf_job = automl.text_ner(\n",
+ " experiment_name=exp_name,\n",
+ " compute=compute_name,\n",
+ " training_data=my_training_data_input,\n",
+ " validation_data=my_validation_data_input,\n",
+ " tags={\"my_custom_tag\": \"My custom value\"},\n",
+ ")\n",
+ "\n",
+ "text_ner_hf_job.set_limits(timeout_minutes=exp_timeout)\n",
+ "text_ner_hf_job.set_training_parameters(model_name=\"roberta-base-openai-detector\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Submit the AutoML job\n",
+ "\n",
+ "returned_hf_job = ml_client.jobs.create_or_update(\n",
+ " text_ner_hf_job\n",
+ ") # submit the job to the backend\n",
+ "\n",
+ "print(f\"Created job: {returned_hf_job}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ml_client.jobs.stream(returned_hf_job.name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2.4 Hyperparameter Sweep Runs (Public Preview)\n",
+ "\n",
+ "AutoML allows you to easily train models for Named Entity Recognition on your text data. You can control the model algorithm to be used, specify hyperparameter values for your model, as well as perform a sweep across the hyperparameter space to generate an optimal model.\n",
+ "\n",
+ "When using AutoML for text tasks, you can specify the model algorithm using the `model_name` parameter. You can either specify a single model or choose to sweep over multiple models. Please refer to the sweep notebook for detailed instructions on configuring and submitting a sweep job."
+ ]
+ },
+ {
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -249,6 +353,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -343,6 +448,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -381,6 +487,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -438,6 +545,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -503,6 +611,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -578,6 +687,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -647,6 +757,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -717,6 +828,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -756,6 +868,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -817,6 +930,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -849,6 +963,7 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [