diff --git a/demos/README.ipynb b/demos/README.ipynb
index 6711c930..a0cd480f 100644
--- a/demos/README.ipynb
+++ b/demos/README.ipynb
@@ -38,17 +38,15 @@
" \n",
"## Image Classification\n",
"\n",
- "The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.\n",
+ "The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates an end-to-end solution for image recognition: the application uses TensorFlow, Keras, Horovod, and Nuclio to build and train an ML model that identifies (recognizes) and classifies images. \n",
+ "The application consists of four MLRun and Nuclio functions for performing the following operations:\n",
"\n",
- "This example is using TensorFlow, Horovod, and Nuclio demonstrating end to end solution for image classification, \n",
- "it consists of 4 MLRun and Nuclio functions:\n",
+ "1. Import an image archive from from an Amazon Simple Storage (S3) bucket to the platform's data store.\n",
+ "2. Tag the images based on their name structure.\n",
+ "3. Train the image-classification ML model by using [TensorFlow](https://www.tensorflow.org/) and [Keras](https://keras.io/); use [Horovod](https://eng.uber.com/horovod/) to perform distributed training over either GPUs or CPUs.\n",
+ "4. Automatically deploy a Nuclio model-serving function from [Jupyter Notebook](nuclio-serving-tf-images.ipynb) or from a [Dockerfile](./inference-docker).\n",
"\n",
- "1. import an image archive from S3 to the cluster file system\n",
- "2. Tag the images based on their name structure \n",
- "3. Distrubuted training using TF, Keras and Horovod\n",
- "4. Automated deployment of Nuclio model serving function (form [Notebook](nuclio-serving-tf-images.ipynb) and from [Dockerfile](./inference-docker))\n",
- "\n",
- "The Example also demonstrate an [automated pipeline](mlrun_mpijob_pipe.ipynb) using MLRun and KubeFlow pipelines "
+ "This demo also provides an example of an [automated pipeline](image-classification/02-create_pipeline.ipynb) using [MLRun](https://github.com/mlrun/mlrun) and [Kubeflow pipelines](https://github.com/kubeflow/pipelines)."
]
},
{
diff --git a/demos/README.md b/demos/README.md
index 2184ef7a..8298a68f 100644
--- a/demos/README.md
+++ b/demos/README.md
@@ -17,17 +17,15 @@ The **demos** tutorials directory contains full end-to-end use-case applications
## Image Classification
-The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.
+The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates an end-to-end solution for image recognition: the application uses TensorFlow, Keras, Horovod, and Nuclio to build and train an ML model that identifies (recognizes) and classifies images.
+The application consists of four MLRun and Nuclio functions for performing the following operations:
-This example is using TensorFlow, Horovod, and Nuclio demonstrating end to end solution for image classification,
-it consists of 4 MLRun and Nuclio functions:
+1. Import an image archive from an Amazon Simple Storage Service (Amazon S3) bucket to the platform's data store.
+2. Tag the images based on their name structure.
+3. Train the image-classification ML model by using [TensorFlow](https://www.tensorflow.org/) and [Keras](https://keras.io/); use [Horovod](https://eng.uber.com/horovod/) to perform distributed training over either GPUs or CPUs.
+4. Automatically deploy a Nuclio model-serving function from [Jupyter Notebook](image-classification/nuclio-serving-tf-images.ipynb) or from a [Dockerfile](image-classification/inference-docker).
-1. import an image archive from S3 to the cluster file system
-2. Tag the images based on their name structure
-3. Distrubuted training using TF, Keras and Horovod
-4. Automated deployment of Nuclio model serving function (form [Notebook](nuclio-serving-tf-images.ipynb) and from [Dockerfile](./inference-docker))
-
-The Example also demonstrate an [automated pipeline](mlrun_mpijob_pipe.ipynb) using MLRun and KubeFlow pipelines
+This demo also provides an example of an [automated pipeline](image-classification/02-create_pipeline.ipynb) using [MLRun](https://github.com/mlrun/mlrun) and [Kubeflow pipelines](https://github.com/kubeflow/pipelines).
## Predictive Infrastructure Monitoring
diff --git a/demos/gpu/README.ipynb b/demos/gpu/README.ipynb
index 36596ff6..ca362b65 100644
--- a/demos/gpu/README.ipynb
+++ b/demos/gpu/README.ipynb
@@ -25,14 +25,16 @@
"- A **horovod** directory with applications that use Uber's [Horovod](https://eng.uber.com/horovod/) distributed deep-learning framework, which can be used to convert a single-GPU TensorFlow, Keras, or PyTorch model-training program to a distributed program that trains the model simultaneously over multiple GPUs.\n",
" The objective is to speed up your model training with minimal changes to your existing single-GPU code and without complicating the execution.\n",
" Horovod code can also run over CPUs with only minor modifications.\n",
- " The Horovod tutorials include the following:\n",
- " - Benchmark tests (**benchmark-tf.ipynb**, which executes **tf_cnn_benchmarks.py**).\n",
- " - Note that under the demo folder you will find an image classificaiton demo that is also running with Horovod and can be set to run with GPU \n",
+ " For more information and examples, see the [Horovod GitHub repository](https://github.com/horovod/horovod).\n",
+ " \n",
+ " The Horovod GPU tutorials include benchmark tests (**benchmark-tf.ipynb**, which executes **tf_cnn_benchmarks.py**). \n",
+ " In addition, the image-classification demo ([**demos/image-classification/**](../image-classification/01-image-classification.ipynb)) demonstrates how to use Horovod for image recognition, and can be configured to run over GPUs.\n",
"\n",
"- A **rapids** directory with applications that use NVIDIA's [RAPIDS](https://rapids.ai/) open-source libraries suite for executing end-to-end data science and analytics pipelines entirely on GPUs.\n",
+ "\n",
" The RAPIDS tutorials include the following:\n",
"\n",
- " - Demo applications that use the [cuDF](https://rapidsai.github.io/projects/cudf/en/latest/index.html) RAPIDS GPU DataFrame library to perform batching and aggregation of data that's read from a Kafaka stream, and then write the results to a Parquet file. \n",
+ " - Demo applications that use the [cuDF](https://rapidsai.github.io/projects/cudf/en/latest/index.html) RAPIDS GPU DataFrame library to perform batching and aggregation of data that's read from a Kafka stream, and then write the results to a Parquet file. \n",
" The **nuclio-cudf-agg.ipynb** demo implements this by using a Nuclio serverless function while the **python-agg.ipynb** demo implements this by using a standalone Python function.\n",
" - Benchmark tests that compare the performance of RAPIDS cuDF to pandas DataFrames (**benchmark-cudf-vs-pd.ipynb**)."
]
diff --git a/demos/gpu/README.md b/demos/gpu/README.md
index a6fc7725..3432aea1 100644
--- a/demos/gpu/README.md
+++ b/demos/gpu/README.md
@@ -16,17 +16,14 @@ The **demos/gpu** directory includes the following:
Horovod code can also run over CPUs with only minor modifications.
For more information and examples, see the [Horovod GitHub repository](https://github.com/horovod/horovod).
- The Horovod tutorials include the following:
-
- - An image-recognition demo application for execution over GPUs (**image-classification**).
- - A slightly modified version of the GPU image-classification demo application for execution over CPUs (**cpu/image-classification**).
- - Benchmark tests (**benchmark-tf.ipynb**, which executes **tf_cnn_benchmarks.py**).
+ The Horovod GPU tutorials include benchmark tests (**benchmark-tf.ipynb**, which executes **tf_cnn_benchmarks.py**).
+ In addition, the image-classification demo ([**demos/image-classification/**](../image-classification/01-image-classification.ipynb)) demonstrates how to use Horovod for image recognition, and can be configured to run over GPUs.
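+
+ A minimal sketch of the single-GPU-to-Horovod conversion (following the canonical pattern from the Horovod README; the ResNet50 model and SGD optimizer are placeholder choices):
+
+ ```python
+ import tensorflow as tf
+ import horovod.tensorflow.keras as hvd
+
+ hvd.init()  # Initialize Horovod (one process per worker)
+
+ # Pin each worker process to a single GPU (no-op on CPU-only clusters)
+ gpus = tf.config.experimental.list_physical_devices("GPU")
+ if gpus:
+     tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")
+
+ model = tf.keras.applications.ResNet50(weights=None)
+
+ # Scale the learning rate by the number of workers and wrap the optimizer
+ # so that gradients are averaged across all workers on every step
+ opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.001 * hvd.size()))
+ model.compile(loss="sparse_categorical_crossentropy", optimizer=opt)
+
+ # Broadcast the initial variables from rank 0 so all workers start in sync
+ callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
+ ```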
- A **rapids** directory with applications that use NVIDIA's [RAPIDS](https://rapids.ai/) open-source libraries suite for executing end-to-end data science and analytics pipelines entirely on GPUs.
The RAPIDS tutorials include the following:
- - Demo applications that use the [cuDF](https://rapidsai.github.io/projects/cudf/en/latest/index.html) RAPIDS GPU DataFrame library to perform batching and aggregation of data that's read from a Kafaka stream, and then write the results to a Parquet file.
+ - Demo applications that use the [cuDF](https://rapidsai.github.io/projects/cudf/en/latest/index.html) RAPIDS GPU DataFrame library to perform batching and aggregation of data that's read from a Kafka stream, and then write the results to a Parquet file.
The **nuclio-cudf-agg.ipynb** demo implements this by using a Nuclio serverless function, while the **python-agg.ipynb** demo implements this by using a standalone Python function; a minimal cuDF sketch appears after this list.
- Benchmark tests that compare the performance of RAPIDS cuDF to pandas DataFrames (**benchmark-cudf-vs-pd.ipynb**).
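+
+ A minimal sketch of the cuDF batching-and-aggregation flow described above (the column names and values are hypothetical, and consuming the messages from Kafka is omitted — assume the batch has already been parsed into Python lists):
+
+ ```python
+ import cudf
+
+ # A consumed Kafka batch, already parsed into columns (hypothetical data)
+ gdf = cudf.DataFrame({"driver_id": [1, 1, 2], "speed": [45.0, 55.0, 60.0]})
+
+ # Aggregate on the GPU, then write the results to a Parquet file
+ agg = gdf.groupby("driver_id").agg({"speed": ["mean", "max"]})
+ agg.to_parquet("/tmp/speed_agg.parquet")
+ ```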
diff --git a/demos/image-classification/README.ipynb b/demos/image-classification/README.ipynb
new file mode 100644
index 00000000..d809b468
--- /dev/null
+++ b/demos/image-classification/README.ipynb
@@ -0,0 +1,71 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Image Classification Using Distributed Training\n",
+ "\n",
+ "- [Overview](#image-classif-demo-overview)\n",
+ "- [Notebooks and Code](#image-classif-demo-nbs-n-code)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " \n",
+ "## Overview\n",
+ "\n",
+ "This demo demonstrates an end-to-end solution for image recognition: the application uses TensorFlow, Keras, Horovod, and Nuclio to build and train an ML model that identifies (recognizes) and classifies images. \n",
+ "The application consists of four MLRun and Nuclio functions for performing the following operations:\n",
+ "\n",
+ "1. Import an image archive from from an Amazon Simple Storage (S3) bucket to the platform's data store.\n",
+ "2. Tag the images based on their name structure.\n",
+ "3. Train the image-classification ML model by using [TensorFlow](https://www.tensorflow.org/) and [Keras](https://keras.io/); use [Horovod](https://eng.uber.com/horovod/) to perform distributed training over either GPUs or CPUs.\n",
+ "4. Automatically deploy a Nuclio model-serving function from [Jupyter Notebook](nuclio-serving-tf-images.ipynb) or from a [Dockerfile](./inference-docker).\n",
+ "\n",
+ "
\n",
+ "\n",
+ "This demo also provides an example of an [automated pipeline](image-classification/02-create_pipeline.ipynb) using [MLRun](https://github.com/mlrun/mlrun) and [Kubeflow pipelines](https://github.com/kubeflow/pipelines)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " \n",
+ "## Notebooks and Code\n",
+ "\n",
+ "- [**01-image-classification.ipynb**](01-image-classification.ipynb) — all-in-one: import, tag, launch train, deploy, and serve\n",
+ "- [**horovod-training.py**](horovod-training.py) — train function code\n",
+ "- [**nuclio-serving-tf-images.ipynb**](nuclio-serving-tf-images.ipynb) — serve function development and test\n",
+ "- [**02-create_pipeline.ipynb**](02-create_pipeline.ipynb) — auto-generate a Kubeflow pipeline workflow\n",
+ "- **inference-docker/** — build and serve functions using a Dockerfile:\n",
+ " - [**main.py**](./inference-docker/main.py) — function code\n",
+ " - [**Dockerfile**](./inference-docker/Dockerfile) — a Dockerfile"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.8"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/demos/image-classification/README.md b/demos/image-classification/README.md
index c6a573fc..98d67746 100644
--- a/demos/image-classification/README.md
+++ b/demos/image-classification/README.md
@@ -1,24 +1,30 @@
# Image Classification Using Distributed Training
-This example is using TensorFlow, Horovod, and Nuclio demonstrating end to end solution for image classification,
-it consists of 4 MLRun and Nuclio functions:
+- [Overview](#image-classif-demo-overview)
+- [Notebooks and Code](#image-classif-demo-nbs-n-code)
-1. import an image archive from S3 to the cluster file system
-2. Tag the images based on their name structure
-3. Distrubuted training using TF, Keras and Horovod
-4. Automated deployment of Nuclio model serving function (form [Notebook](nuclio-serving-tf-images.ipynb) and from [Dockerfile](./inference-docker))
+<a id="image-classif-demo-overview"></a>
+## Overview
-
+This demo demonstrates an end-to-end solution for image recognition: the application uses TensorFlow, Keras, Horovod, and Nuclio to build and train an ML model that identifies (recognizes) and classifies images.
+The application consists of four MLRun and Nuclio functions for performing the following operations:
+
+1. Import an image archive from an Amazon Simple Storage Service (Amazon S3) bucket to the platform's data store.
+2. Tag the images based on their name structure.
+3. Train the image-classification ML model by using [TensorFlow](https://www.tensorflow.org/) and [Keras](https://keras.io/); use [Horovod](https://eng.uber.com/horovod/) to perform distributed training over either GPUs or CPUs.
+4. Automatically deploy a Nuclio model-serving function from [Jupyter Notebook](nuclio-serving-tf-images.ipynb) or from a [Dockerfile](./inference-docker).
-The Example also demonstrate an [automated pipeline](mlrun_mpijob_pipe.ipynb) using MLRun and KubeFlow pipelines
+
-## Notebooks & Code
+This demo also provides an example of an [automated pipeline](02-create_pipeline.ipynb) using [MLRun](https://github.com/mlrun/mlrun) and [Kubeflow pipelines](https://github.com/kubeflow/pipelines).
-* [All-in-one: Import, tag, launch training, deploy serving](01-image-classification.ipynb)
-* [Training function code](horovod-training.py)
-* [Serving function development and testing](nuclio-serving-tf-images.ipynb)
-* [Auto generation of KubeFlow pipelines workflow](02-create_pipeline.ipynb)
-* [Building serving function using Dockerfile](./inference-docker)
- * [function code](./inference-docker/main.py)
- * [Dockerfile](./inference-docker/Dockerfile)
+<a id="image-classif-demo-nbs-n-code"></a>
+## Notebooks and Code
+- [**01-image-classification.ipynb**](01-image-classification.ipynb) — all-in-one: import, tag, launch training, deploy, and serve
+- [**horovod-training.py**](horovod-training.py) — training-function code
+- [**nuclio-serving-tf-images.ipynb**](nuclio-serving-tf-images.ipynb) — serving-function development and testing
+- [**02-create_pipeline.ipynb**](02-create_pipeline.ipynb) — auto-generate a Kubeflow pipeline workflow
+- **inference-docker/** — build and run the serving function from a Dockerfile:
+ - [**main.py**](./inference-docker/main.py) — function code
+ - [**Dockerfile**](./inference-docker/Dockerfile) — the Dockerfile
diff --git a/getting-started/frames.ipynb b/getting-started/frames.ipynb
index 59115e0a..f66bdd5d 100644
--- a/getting-started/frames.ipynb
+++ b/getting-started/frames.ipynb
@@ -28,13 +28,13 @@
"The `Client` class features the following object methods for supporting basic data operations; the type of data is derived from the backend type (`tsdb` — TSDB table / `kv` — NoSQL table / `stream` — data stream):\n",
"\n",
"- `create` — creates a new TSDB table or stream (\"backend data\").\n",
- "- `delete` — deletes a table or stream or specific NoSQL (\"KV\") table items.\n",
+ "- `delete` — deletes a table or stream.\n",
"- `read` — reads data from a table or stream into pandas DataFrames.\n",
"- `write` — writes data from pandas DataFrames to a table or stream.\n",
"- `execute` — executes a command on a table or stream.\n",
" Each backend may support multiple commands.\n",
"\n",
- "For a detailed description of the Frames API, see the [Frames documentation](https://github.com/v3io/frames/blob/development/README.md). \n",
+ "For a detailed description of the Frames API, see the [Frames API reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/frames/). \n",
"For more help and usage details, use the internal API help — `.?` in Jupyter Notebook or `print(..__doc__)`. \n",
"For example, the following command returns information about the read operation for a client object named `client`:\n",
"```\n",
@@ -111,7 +111,7 @@
"outputs": [],
"source": [
"# Relative path to the NoSQL table within the parent platform data container\n",
- "table = os.path.join(os.getenv(\"V3IO_USERNAME\") + \"/examples/bank\")\n",
+ "table = os.path.join(os.getenv(\"V3IO_USERNAME\"), \"examples/bank\")\n",
"\n",
"# Full path to the NoSQL table for SQL queries (platform Presto data-path syntax);\n",
"# use the same data container as used for the Frames client (\"users\")\n",
@@ -324,7 +324,7 @@
"metadata": {},
"outputs": [],
"source": [
- "out = client.write(\"kv\", table=table, dfs=df)"
+ "client.write(\"kv\", table=table, dfs=df)"
]
},
{
@@ -380,175 +380,175 @@
" \n",
" \n",
" no \n",
- " primary \n",
+ " tertiary \n",
" 0 \n",
- " no \n",
+ " yes \n",
" unknown \n",
- " 323 \n",
- " single \n",
+ " 397 \n",
+ " married \n",
" no \n",
- " 11262 \n",
- " aug \n",
+ " 14220 \n",
+ " sep \n",
" cellular \n",
" 1 \n",
" yes \n",
- " 368 \n",
- " technician \n",
- " 26 \n",
- " 60 \n",
+ " 2962 \n",
+ " retired \n",
+ " 9 \n",
+ " 71 \n",
" -1 \n",
" \n",
" \n",
" no \n",
- " secondary \n",
+ " tertiary \n",
" 0 \n",
- " no \n",
+ " yes \n",
" unknown \n",
- " 14 \n",
- " married \n",
+ " 95 \n",
+ " single \n",
" no \n",
- " 17555 \n",
+ " 11797 \n",
" aug \n",
" cellular \n",
- " 14 \n",
+ " 2 \n",
" no \n",
- " 1776 \n",
+ " 3177 \n",
" management \n",
- " 26 \n",
- " 43 \n",
+ " 11 \n",
+ " 32 \n",
" -1 \n",
" \n",
" \n",
- " no \n",
- " primary \n",
- " 4 \n",
" yes \n",
- " success \n",
- " 146 \n",
- " married \n",
+ " tertiary \n",
+ " 0 \n",
+ " yes \n",
+ " unknown \n",
+ " 197 \n",
+ " divorced \n",
" no \n",
- " 12519 \n",
- " apr \n",
+ " 13204 \n",
+ " nov \n",
" cellular \n",
" 2 \n",
" no \n",
- " 602 \n",
- " blue-collar \n",
- " 17 \n",
- " 50 \n",
- " 147 \n",
+ " 3329 \n",
+ " management \n",
+ " 20 \n",
+ " 34 \n",
+ " -1 \n",
" \n",
" \n",
" no \n",
" secondary \n",
" 0 \n",
- " yes \n",
+ " no \n",
" unknown \n",
- " 60 \n",
+ " 223 \n",
" married \n",
" no \n",
- " 14440 \n",
- " nov \n",
+ " 16873 \n",
+ " oct \n",
" cellular \n",
" 1 \n",
" no \n",
- " 3910 \n",
+ " 64 \n",
" admin. \n",
- " 21 \n",
- " 49 \n",
+ " 7 \n",
+ " 56 \n",
" -1 \n",
" \n",
" \n",
" no \n",
- " tertiary \n",
+ " secondary \n",
" 0 \n",
" no \n",
" unknown \n",
- " 420 \n",
+ " 113 \n",
" married \n",
" no \n",
- " 15520 \n",
- " nov \n",
- " cellular \n",
+ " 11084 \n",
+ " jun \n",
+ " unknown \n",
" 1 \n",
" no \n",
- " 1778 \n",
- " management \n",
- " 18 \n",
- " 56 \n",
+ " 670 \n",
+ " blue-collar \n",
+ " 11 \n",
+ " 40 \n",
" -1 \n",
" \n",
" \n",
- " no \n",
- " secondary \n",
+ " yes \n",
+ " tertiary \n",
" 0 \n",
- " no \n",
+ " yes \n",
" unknown \n",
- " 29 \n",
- " married \n",
+ " 117 \n",
+ " single \n",
" no \n",
- " 12186 \n",
- " jun \n",
- " unknown \n",
- " 3 \n",
+ " 16874 \n",
+ " may \n",
+ " cellular \n",
+ " 2 \n",
" no \n",
- " 272 \n",
- " management \n",
- " 20 \n",
- " 46 \n",
+ " 3485 \n",
+ " entrepreneur \n",
+ " 15 \n",
+ " 25 \n",
" -1 \n",
" \n",
" \n",
" no \n",
" secondary \n",
" 0 \n",
- " no \n",
+ " yes \n",
" unknown \n",
- " 272 \n",
- " single \n",
+ " 66 \n",
+ " married \n",
" no \n",
- " 10177 \n",
+ " 10910 \n",
" may \n",
" cellular \n",
- " 4 \n",
+ " 2 \n",
" no \n",
- " 1211 \n",
- " admin. \n",
- " 5 \n",
- " 66 \n",
+ " 4394 \n",
+ " blue-collar \n",
+ " 15 \n",
+ " 43 \n",
" -1 \n",
" \n",
" \n",
" no \n",
" tertiary \n",
- " 1 \n",
- " no \n",
- " failure \n",
- " 172 \n",
- " married \n",
+ " 3 \n",
+ " yes \n",
+ " success \n",
+ " 638 \n",
+ " single \n",
" no \n",
- " 15834 \n",
- " apr \n",
+ " 13711 \n",
+ " may \n",
" cellular \n",
- " 3 \n",
+ " 1 \n",
" no \n",
- " 1805 \n",
- " retired \n",
- " 5 \n",
- " 70 \n",
- " 186 \n",
+ " 1779 \n",
+ " technician \n",
+ " 14 \n",
+ " 32 \n",
+ " 175 \n",
" \n",
""
],
"text/plain": [
- "[('no', 'primary', 0, 'no', 'unknown', 323, 'single', 'no', 11262, 'aug', 'cellular', 1, 'yes', 368, 'technician', 26, 60, -1),\n",
- " ('no', 'secondary', 0, 'no', 'unknown', 14, 'married', 'no', 17555, 'aug', 'cellular', 14, 'no', 1776, 'management', 26, 43, -1),\n",
- " ('no', 'primary', 4, 'yes', 'success', 146, 'married', 'no', 12519, 'apr', 'cellular', 2, 'no', 602, 'blue-collar', 17, 50, 147),\n",
- " ('no', 'secondary', 0, 'yes', 'unknown', 60, 'married', 'no', 14440, 'nov', 'cellular', 1, 'no', 3910, 'admin.', 21, 49, -1),\n",
- " ('no', 'tertiary', 0, 'no', 'unknown', 420, 'married', 'no', 15520, 'nov', 'cellular', 1, 'no', 1778, 'management', 18, 56, -1),\n",
- " ('no', 'secondary', 0, 'no', 'unknown', 29, 'married', 'no', 12186, 'jun', 'unknown', 3, 'no', 272, 'management', 20, 46, -1),\n",
- " ('no', 'secondary', 0, 'no', 'unknown', 272, 'single', 'no', 10177, 'may', 'cellular', 4, 'no', 1211, 'admin.', 5, 66, -1),\n",
- " ('no', 'tertiary', 1, 'no', 'failure', 172, 'married', 'no', 15834, 'apr', 'cellular', 3, 'no', 1805, 'retired', 5, 70, 186)]"
+ "[('no', 'tertiary', 0, 'yes', 'unknown', 397, 'married', 'no', 14220, 'sep', 'cellular', 1, 'yes', 2962, 'retired', 9, 71, -1),\n",
+ " ('no', 'tertiary', 0, 'yes', 'unknown', 95, 'single', 'no', 11797, 'aug', 'cellular', 2, 'no', 3177, 'management', 11, 32, -1),\n",
+ " ('yes', 'tertiary', 0, 'yes', 'unknown', 197, 'divorced', 'no', 13204, 'nov', 'cellular', 2, 'no', 3329, 'management', 20, 34, -1),\n",
+ " ('no', 'secondary', 0, 'no', 'unknown', 223, 'married', 'no', 16873, 'oct', 'cellular', 1, 'no', 64, 'admin.', 7, 56, -1),\n",
+ " ('no', 'secondary', 0, 'no', 'unknown', 113, 'married', 'no', 11084, 'jun', 'unknown', 1, 'no', 670, 'blue-collar', 11, 40, -1),\n",
+ " ('yes', 'tertiary', 0, 'yes', 'unknown', 117, 'single', 'no', 16874, 'may', 'cellular', 2, 'no', 3485, 'entrepreneur', 15, 25, -1),\n",
+ " ('no', 'secondary', 0, 'yes', 'unknown', 66, 'married', 'no', 10910, 'may', 'cellular', 2, 'no', 4394, 'blue-collar', 15, 43, -1),\n",
+ " ('no', 'tertiary', 3, 'yes', 'success', 638, 'single', 'no', 13711, 'may', 'cellular', 1, 'no', 1779, 'technician', 14, 32, 175)]"
]
},
"execution_count": 5,
@@ -651,6 +651,26 @@
" \n",
" \n",
" \n",
+ " 1821 \n",
+ " 51 \n",
+ " 21244 \n",
+ " 2 \n",
+ " cellular \n",
+ " 4 \n",
+ " no \n",
+ " 166 \n",
+ " unknown \n",
+ " no \n",
+ " housemaid \n",
+ " yes \n",
+ " married \n",
+ " aug \n",
+ " -1 \n",
+ " unknown \n",
+ " 0 \n",
+ " no \n",
+ " \n",
+ " \n",
" 2624 \n",
" 53 \n",
" 22370 \n",
@@ -691,26 +711,6 @@
" no \n",
" \n",
" \n",
- " 1821 \n",
- " 51 \n",
- " 21244 \n",
- " 2 \n",
- " cellular \n",
- " 4 \n",
- " no \n",
- " 166 \n",
- " unknown \n",
- " no \n",
- " housemaid \n",
- " yes \n",
- " married \n",
- " aug \n",
- " -1 \n",
- " unknown \n",
- " 0 \n",
- " no \n",
- " \n",
- " \n",
" 871 \n",
" 31 \n",
" 26965 \n",
@@ -751,6 +751,26 @@
" no \n",
" \n",
" \n",
+ " 650 \n",
+ " 33 \n",
+ " 23663 \n",
+ " 2 \n",
+ " cellular \n",
+ " 16 \n",
+ " no \n",
+ " 199 \n",
+ " tertiary \n",
+ " yes \n",
+ " housemaid \n",
+ " no \n",
+ " single \n",
+ " apr \n",
+ " 146 \n",
+ " failure \n",
+ " 2 \n",
+ " no \n",
+ " \n",
+ " \n",
" 3830 \n",
" 57 \n",
" 27069 \n",
@@ -790,26 +810,6 @@
" 0 \n",
" no \n",
" \n",
- " \n",
- " 650 \n",
- " 33 \n",
- " 23663 \n",
- " 2 \n",
- " cellular \n",
- " 16 \n",
- " no \n",
- " 199 \n",
- " tertiary \n",
- " yes \n",
- " housemaid \n",
- " no \n",
- " single \n",
- " apr \n",
- " 146 \n",
- " failure \n",
- " 2 \n",
- " no \n",
- " \n",
" \n",
"\n",
""
@@ -817,25 +817,25 @@
"text/plain": [
" age balance campaign contact day default duration education \\\n",
"idx \n",
+ "1821 51 21244 2 cellular 4 no 166 unknown \n",
"2624 53 22370 1 unknown 15 no 106 tertiary \n",
"4014 41 21515 1 unknown 5 no 87 secondary \n",
- "1821 51 21244 2 cellular 4 no 166 unknown \n",
"871 31 26965 2 cellular 21 no 654 primary \n",
"1483 43 27733 7 unknown 3 no 164 tertiary \n",
+ "650 33 23663 2 cellular 16 no 199 tertiary \n",
"3830 57 27069 3 unknown 20 no 174 tertiary \n",
"2989 42 42045 2 cellular 8 no 205 tertiary \n",
- "650 33 23663 2 cellular 16 no 199 tertiary \n",
"\n",
" housing job loan marital month pdays poutcome previous y \n",
"idx \n",
+ "1821 no housemaid yes married aug -1 unknown 0 no \n",
"2624 yes entrepreneur no married may -1 unknown 0 no \n",
"4014 yes admin. no married jun -1 unknown 0 no \n",
- "1821 no housemaid yes married aug -1 unknown 0 no \n",
"871 no housemaid no single apr -1 unknown 0 yes \n",
"1483 yes technician no single jun -1 unknown 0 no \n",
+ "650 yes housemaid no single apr 146 failure 2 no \n",
"3830 no technician yes married jun -1 unknown 0 no \n",
- "2989 no entrepreneur no married aug -1 unknown 0 no \n",
- "650 yes housemaid no single apr 146 failure 2 no "
+ "2989 no entrepreneur no married aug -1 unknown 0 no "
]
},
"execution_count": 6,
@@ -889,7 +889,8 @@
}
],
"source": [
- "dfs = client.read(backend=\"kv\", table=table, filter=\"balance > 20000\", iterator=True)\n",
+ "dfs = client.read(backend=\"kv\", table=table, filter=\"balance > 20000\",\n",
+ " iterator=True)\n",
"for df in dfs:\n",
" print(df.head())"
]
@@ -927,6 +928,7 @@
"- [Create a TSDB Table](#frames-tsdb-create)\n",
"- [Write to the TSDB Table](#frames-tsdb-write)\n",
"- [Read from the TSDB Table](#frames-tsdb-read)\n",
+ " - [Conditional Read](#frames-tsdb-read-conditional)\n",
"- [Delete the TSDB Table](#frames-tsdb-delete)"
]
},
@@ -948,7 +950,7 @@
"outputs": [],
"source": [
"# Relative path to the TSDB table within the parent platform data container\n",
- "tsdb_table = os.path.join(os.getenv(\"V3IO_USERNAME\") + \"/examples/tsdb_tab\")"
+ "tsdb_table = os.path.join(os.getenv(\"V3IO_USERNAME\"), \"examples/tsdb_tab\")"
]
},
{
@@ -973,8 +975,8 @@
"metadata": {},
"outputs": [],
"source": [
- "# Create a new TSDB table; ingestion rate = one sample per minute (\"1/m\")\n",
- "client.create(backend=\"tsdb\", table=tsdb_table, attrs={\"rate\": \"1/m\"})"
+ "# Create a new TSDB table; ingestion rate = one sample per hour (\"1/h\")\n",
+ "client.create(backend=\"tsdb\", table=tsdb_table, attrs={\"rate\": \"1/h\"})"
]
},
{
@@ -990,16 +992,16 @@
"You can add labels to TSDB table items in one of two ways; you can also combine these methods:\n",
"\n",
"- Use the `labels` dictionary parameter of the `write` method to add labels to all the written metric-sample table items (DataFrame rows) — `{: [, : , ...]}`. \n",
- " For example, `{\"node\": \"11\"}` in the following code example.\n",
- " Note that the values of the metric labels must be of type string.\n",
+ " For example, `{\"node\": \"11\", \"os\": \"linux\"}`.\n",
+ " Note that the label values must be provided as strings.\n",
"- Define DataFrame index columns for the labels.\n",
" All DataFrame index columns except for the sample-time index column are automatically converted into labels for the respective table items.\n",
- " > **Note:** If you wish to use regular columns in your DataFrames as TSDB labels, convert these columns to index columns.\n",
- " > The following example converts the `symbol` and `exchange` columns to index columns that will be used as TSDB labels (in addition to the `time` index column): \n",
+ " > **Note:** If you wish to use regular columns in your DataFrames as metric labels, convert these columns to index columns.\n",
+ " > The following example converts the `symbol` and `exchange` columns to index columns that will be used as metric labels (in addition to the `time` index column): \n",
" > ```python\n",
- " > df.index.name=\"time\" # Ensure that the sample-time index column is named \"time\"\n",
+ " > df.index.name=\"time\" # Name the sample-time index column \"time\"\n",
" > df.reset_index(level=0, inplace=True) # Reset the DataFrame indexes\n",
- " > df = df.set_index([\"time\", \"symbol\", \"exchange\"]) # Convert the \"time\" column and additional TSDB-label columns to DataFrame indexes\n",
+ " > df = df.set_index([\"time\", \"symbol\", \"exchange\"]) # Define the time and label columns as index columns\n",
" > ```"
]
},
@@ -1007,71 +1009,71 @@
"cell_type": "code",
"execution_count": 11,
"metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\n",
- "DatetimeIndex: 60 entries, 2019-12-04 14:05:00-05:00 to 2019-12-04 19:00:00-05:00\n",
- "Freq: 300S\n",
- "Data columns (total 3 columns):\n",
- "cpu 60 non-null float64\n",
- "mem 60 non-null float64\n",
- "disk 60 non-null float64\n",
- "dtypes: float64(3)\n",
- "memory usage: 1.9 KB\n",
- "None cpu mem disk\n",
- "2019-12-04 14:05:00-05:00 -0.902722 -1.481140 0.388379\n",
- "2019-12-04 14:10:00-05:00 -1.442563 -1.527384 -0.063397\n",
- "2019-12-04 14:15:00-05:00 -1.635814 -3.987430 1.085080\n",
- "2019-12-04 14:20:00-05:00 -0.320096 -4.944848 1.271489\n",
- "2019-12-04 14:25:00-05:00 0.475710 -6.538720 0.503685\n"
- ]
- }
- ],
+ "outputs": [],
"source": [
- "# Prepare metric samples to ingets to the TSDB table\n",
"import numpy as np\n",
"from datetime import datetime, timedelta\n",
"\n",
- "end = datetime.now().replace(minute=0, second=0, microsecond=0)\n",
- "rng = pd.date_range(end=end, periods=60, freq=\"300s\", tz=\"EST\")\n",
- "df = pd.DataFrame(np.random.randn(len(rng), 3), index=rng, columns=[\"cpu\", \"mem\", \"disk\"])\n",
- "df = df.cumsum()\n",
- "print(df.info(), df.head())"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Ingest data into the TSDB table\n",
- "client.write(backend=\"tsdb\", table=tsdb_table, dfs=df, labels={\"node\": \"11\"})"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- " \n",
- "### Read from the TSDB Table\n",
"\n",
- "Use the `read` method of the Frames client with the `tsdb` backend to read data from your TSDB table (i.e., query the database). \n",
- "You can define the target TSDB table either in the `table` parameter of the `read` method or within the query string set in the optional `query` parameter, as demonstrated in the following example.\n",
- "When the query includes the target table, the `table` parameter (if set) is ignored. \n",
- "You can set the optional `multi_index` parameter to `True` to return labels as index columns, as demonstrated in the following example.\n",
- "By default, only the metric sample-time primary-key attribute is returned as an index column. \n",
- "See the [Frames documentation](https://github.com/v3io/frames/blob/master/README.md) for more information about the supported parameters of the `read` method for the `tsdb` backend."
+ "# Genearte a DataFrame with TSDB metric samples and a \"time\" index column\n",
+ "def gen_df_w_tsdb_data(num_items=24, freq=\"1H\", end=None, start=None,\n",
+ " start_delta=None, tz=None, normalize=False, zero=False,\n",
+ " attrs=[\"cpu\", \"mem\", \"disk\"]):\n",
+ " if (start is None and start_delta is not None and end is not None):\n",
+ " start = end - timedelta(days=start_delta)\n",
+ " if (zero):\n",
+ " if (end is not None):\n",
+ " end = end.replace(minute=0, second=0, microsecond=0)\n",
+ " if (start is not None):\n",
+ " start = start.replace(minute=0, second=0, microsecond=0)\n",
+ " # If `start`, `end`, `num_items` (date_range() `periods`), and `freq`\n",
+ " # are set, ignore `freq`\n",
+ " if (freq is not None and start is not None and end is not None and\n",
+ " num_items is not None):\n",
+ " freq = None\n",
+ " times = pd.date_range(periods=num_items, freq=freq, start=start, end=end,\n",
+ " tz=tz, normalize=normalize)\n",
+ " data = np.random.rand(num_items, len(attrs)) * 100\n",
+ " df = pd.DataFrame(data, index=times, columns=attrs)\n",
+ " df.index.name = \"time\"\n",
+ " return df\n"
]
},
{
"cell_type": "code",
- "execution_count": 13,
- "metadata": {},
+ "execution_count": 12,
+ "metadata": {
+ "collapsed": true,
+ "jupyter": {
+ "outputs_hidden": true
+ }
+ },
"outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "** dfs[0] **\n",
+ "\n",
+ "DatetimeIndex: 24 entries, 2019-12-03 11:00:00+00:00 to 2019-12-10 11:00:00+00:00\n",
+ "Data columns (total 3 columns):\n",
+ "cpu 24 non-null float64\n",
+ "mem 24 non-null float64\n",
+ "disk 24 non-null float64\n",
+ "dtypes: float64(3)\n",
+ "memory usage: 768.0 bytes\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "None"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
{
"data": {
"text/html": [
@@ -1093,26 +1095,12 @@
" \n",
" \n",
" \n",
- " \n",
- " avg(mem) \n",
- " max(mem) \n",
- " min(mem) \n",
- " avg(cpu) \n",
- " max(cpu) \n",
- " min(cpu) \n",
- " avg(disk) \n",
- " max(disk) \n",
- " min(disk) \n",
+ " cpu \n",
+ " mem \n",
+ " disk \n",
" \n",
" \n",
" time \n",
- " node \n",
- " \n",
- " \n",
- " \n",
- " \n",
- " \n",
- " \n",
" \n",
" \n",
" \n",
@@ -1120,45 +1108,1594 @@
" \n",
" \n",
" \n",
- " 2019-12-04 18:09:37 \n",
- " 11 \n",
- " -1.48114 \n",
- " -1.48114 \n",
- " -1.48114 \n",
- " -0.902722 \n",
- " -0.902722 \n",
- " -0.902722 \n",
- " 0.388379 \n",
- " 0.388379 \n",
- " 0.388379 \n",
+ " 2019-12-03 11:00:00+00:00 \n",
+ " 57.181841 \n",
+ " 40.317855 \n",
+ " 22.423267 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652173913+00:00 \n",
+ " 21.630993 \n",
+ " 54.356134 \n",
+ " 86.894053 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304347826+00:00 \n",
+ " 9.511790 \n",
+ " 10.029074 \n",
+ " 61.113160 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956521739+00:00 \n",
+ " 60.610728 \n",
+ " 52.688734 \n",
+ " 22.355693 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608695652+00:00 \n",
+ " 32.611691 \n",
+ " 2.353238 \n",
+ " 59.946857 \n",
" \n",
" \n",
"\n",
""
],
"text/plain": [
- " avg(mem) max(mem) min(mem) avg(cpu) max(cpu) \\\n",
- "time node \n",
- "2019-12-04 18:09:37 11 -1.48114 -1.48114 -1.48114 -0.902722 -0.902722 \n",
- "\n",
- " min(cpu) avg(disk) max(disk) min(disk) \n",
- "time node \n",
- "2019-12-04 18:09:37 11 -0.902722 0.388379 0.388379 0.388379 "
+ " cpu mem disk\n",
+ "time \n",
+ "2019-12-03 11:00:00+00:00 57.181841 40.317855 22.423267\n",
+ "2019-12-03 18:18:15.652173913+00:00 21.630993 54.356134 86.894053\n",
+ "2019-12-04 01:36:31.304347826+00:00 9.511790 10.029074 61.113160\n",
+ "2019-12-04 08:54:46.956521739+00:00 60.610728 52.688734 22.355693\n",
+ "2019-12-04 16:13:02.608695652+00:00 32.611691 2.353238 59.946857"
]
},
- "execution_count": 13,
"metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "# Read time-series aggregates from the TSDB table as a data stream; use concat to assemble the DataFrames\n",
- "query_str= \"select avg(*), max(*), min(*) from '\" + tsdb_table + \"'\"\n",
- "tsdf = client.read(backend=\"tsdb\", query=query_str, step=\"60m\", start=\"now-7d\", end=\"now\", multi_index=True)\n",
- "tsdf.head()"
- ]
- },
- {
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "** dfs[1] **\n",
+ "\n",
+ "DatetimeIndex: 24 entries, 2019-12-03 11:00:00+00:00 to 2019-12-10 11:00:00+00:00\n",
+ "Data columns (total 3 columns):\n",
+ "cpu 24 non-null float64\n",
+ "mem 24 non-null float64\n",
+ "disk 24 non-null float64\n",
+ "dtypes: float64(3)\n",
+ "memory usage: 768.0 bytes\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "None"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " cpu \n",
+ " mem \n",
+ " disk \n",
+ " \n",
+ " \n",
+ " time \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 11:00:00+00:00 \n",
+ " 94.437214 \n",
+ " 49.207328 \n",
+ " 13.501160 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652173913+00:00 \n",
+ " 58.218118 \n",
+ " 36.327243 \n",
+ " 49.152192 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304347826+00:00 \n",
+ " 51.480895 \n",
+ " 70.583209 \n",
+ " 69.679659 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956521739+00:00 \n",
+ " 5.339464 \n",
+ " 43.764870 \n",
+ " 5.459282 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608695652+00:00 \n",
+ " 3.510461 \n",
+ " 3.805860 \n",
+ " 40.373110 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " cpu mem disk\n",
+ "time \n",
+ "2019-12-03 11:00:00+00:00 94.437214 49.207328 13.501160\n",
+ "2019-12-03 18:18:15.652173913+00:00 58.218118 36.327243 49.152192\n",
+ "2019-12-04 01:36:31.304347826+00:00 51.480895 70.583209 69.679659\n",
+ "2019-12-04 08:54:46.956521739+00:00 5.339464 43.764870 5.459282\n",
+ "2019-12-04 16:13:02.608695652+00:00 3.510461 3.805860 40.373110"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "** dfs[2] **\n",
+ "\n",
+ "DatetimeIndex: 24 entries, 2019-12-03 11:00:00+00:00 to 2019-12-10 11:00:00+00:00\n",
+ "Data columns (total 3 columns):\n",
+ "cpu 24 non-null float64\n",
+ "mem 24 non-null float64\n",
+ "disk 24 non-null float64\n",
+ "dtypes: float64(3)\n",
+ "memory usage: 768.0 bytes\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "None"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " cpu \n",
+ " mem \n",
+ " disk \n",
+ " \n",
+ " \n",
+ " time \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 11:00:00+00:00 \n",
+ " 72.253130 \n",
+ " 48.638564 \n",
+ " 88.366517 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652173913+00:00 \n",
+ " 87.686276 \n",
+ " 27.742501 \n",
+ " 61.147908 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304347826+00:00 \n",
+ " 86.861995 \n",
+ " 94.197867 \n",
+ " 79.770651 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956521739+00:00 \n",
+ " 96.169328 \n",
+ " 97.307368 \n",
+ " 3.146355 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608695652+00:00 \n",
+ " 20.436635 \n",
+ " 1.412696 \n",
+ " 29.898394 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " cpu mem disk\n",
+ "time \n",
+ "2019-12-03 11:00:00+00:00 72.253130 48.638564 88.366517\n",
+ "2019-12-03 18:18:15.652173913+00:00 87.686276 27.742501 61.147908\n",
+ "2019-12-04 01:36:31.304347826+00:00 86.861995 94.197867 79.770651\n",
+ "2019-12-04 08:54:46.956521739+00:00 96.169328 97.307368 3.146355\n",
+ "2019-12-04 16:13:02.608695652+00:00 20.436635 1.412696 29.898394"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "** dfs[3] **\n",
+ "\n",
+ "DatetimeIndex: 24 entries, 2019-12-03 11:00:00+00:00 to 2019-12-10 11:00:00+00:00\n",
+ "Data columns (total 3 columns):\n",
+ "cpu 24 non-null float64\n",
+ "mem 24 non-null float64\n",
+ "disk 24 non-null float64\n",
+ "dtypes: float64(3)\n",
+ "memory usage: 768.0 bytes\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "None"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " cpu \n",
+ " mem \n",
+ " disk \n",
+ " \n",
+ " \n",
+ " time \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 11:00:00+00:00 \n",
+ " 48.479839 \n",
+ " 24.805339 \n",
+ " 50.948250 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652173913+00:00 \n",
+ " 45.921080 \n",
+ " 83.194919 \n",
+ " 61.947495 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304347826+00:00 \n",
+ " 94.934040 \n",
+ " 17.558910 \n",
+ " 59.574690 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956521739+00:00 \n",
+ " 3.711991 \n",
+ " 34.290391 \n",
+ " 63.072701 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608695652+00:00 \n",
+ " 44.258770 \n",
+ " 39.525828 \n",
+ " 95.296697 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " cpu mem disk\n",
+ "time \n",
+ "2019-12-03 11:00:00+00:00 48.479839 24.805339 50.948250\n",
+ "2019-12-03 18:18:15.652173913+00:00 45.921080 83.194919 61.947495\n",
+ "2019-12-04 01:36:31.304347826+00:00 94.934040 17.558910 59.574690\n",
+ "2019-12-04 08:54:46.956521739+00:00 3.711991 34.290391 63.072701\n",
+ "2019-12-04 16:13:02.608695652+00:00 44.258770 39.525828 95.296697"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Prepare DataFrames with randomly generated metric samples\n",
+ "end_t = datetime.now()\n",
+ "start_delta = 7 # start time = ent_t - 7 days\n",
+ "dfs = []\n",
+ "for i in range(4):\n",
+ " # Generate a new DataFrame with TSDB metrics\n",
+ " dfs.append(gen_df_w_tsdb_data(end=end_t, start_delta=7, zero=True))\n",
+ " # Display DataFrame info & head (optional - for testing)\n",
+ " print(\"\\n** dfs[\" + str(i) + \"] **\")\n",
+ " display(dfs[i].info(), dfs[i].head())\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Write to a TSDB table\n",
+ "\n",
+ "# Prepare metric labels to write\n",
+ "labels = [\n",
+ " {\"node\": \"11\", \"os\": \"linux\"},\n",
+ " {\"node\": \"2\", \"os\": \"windows\"},\n",
+ " {\"node\": \"11\", \"os\": \"windows\"},\n",
+ " {\"node\": \"2\", \"os\": \"linux\"}\n",
+ "]\n",
+ "\n",
+ "# Write the contents of the prepared DataFrames to a TSDB table. Use multiple\n",
+ "# write commands with the `labels` parameter to set different label values.\n",
+ "num_dfs = len(dfs)\n",
+ "for i in range(num_dfs):\n",
+ " client.write(\"tsdb\", table=tsdb_table, dfs=dfs[i], labels=labels[i])\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " \n",
+ "### Read from the TSDB Table\n",
+ "\n",
+ "- [Overview and Basic Examples](#frames-tsdb-read-basic)\n",
+ "- [Conditional Read](#frames-tsdb-read-conditional)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " \n",
+ "#### Overview and Basic Examples\n",
+ "\n",
+ "Use the `read` method of the Frames client with the `tsdb` backend to read data from your TSDB table (i.e., query the database). \n",
+ "You can perform one of two types of queries (but you cannot mix the two); note that you also cannot mix raw sample-data queries and aggregation queries:\n",
+ "\n",
+ "- **A non-SQL query** — set the `table` parameter to the path to the TSDB table, and optionally set additional method parameters to configure the query.\n",
+ " `columns` defines the query metrics (default = all); `aggregators` defines aggregation functions (\"aggregators\") to execute for all the configured metrics; `filter` restricts the query by using a platform [filter expression](https://www.iguazio.com/docs/reference/latest-release/expressions/condition-expression/#filter-expression); and `group by` allows grouping the results by specific metric labels.\n",
+ "- **An SQL query** \\[Tech Preview\\] — set the `query` parameter to an SQL query string of the following format:\n",
+ " ```\n",
+ " select from '' [where ] [group by ]\n",
+ " ```\n",
+ " > **Note:**\n",
+ " > - In SQL queries, the path to the TSDB table is set in the `FROM` clause of the `query` string and not in the `read` method's `table` parameter.\n",
+ " > - The `where` filter expression is similar to that passed to the `filter` parameter for a non-SQL query, except it's in SQL format, so the expression isn't embedded within quotation marks and comparisons are done by using the '`=`' operator instead of the '`==`' operator.\n",
+ " > - The `select` clause can optionally include a comma-separated list of either over-time aggregators (such as `avg` or `sum`) or cross-series aggregators (such as `avg_all` or `sum_all`), but you cannot mix these aggregation types.\n",
+ " > The aggregation functions receive a metric-name parameter (for example, `avg(cpu)`, `avg_all(cpu)`, or `avg(*)` for all metrics).\n",
+ " > Cross-series aggregations functions can also optionally receive an interpolation function — `next` (default) | `prev` | `linear` | `none` — in which case the metric name is passed as a parameter of the interpolation function (and not as a direct parameter of the aggregation function); the interpolation function can also optionally receive an interpolation-tolerance string of the format `\"[0-9]+[mhd]\"` (for example, `avg_all(prev(cpu,'1h'))`).\n",
+ "\n",
+ "For both types of queries, you can also optionally set additional parameters.\n",
+ "`start` and `end` define the query's time range — the metric-sample timestamps to which to apply the query (the default end time is `\"now\"` and the default start time is 1 hour before the end time); `step` defines the interval for aggregation or raw-data downsampling (default = the query's time range); and`aggregationWindow` defines the aggregation time frame for over-time aggregation (default = `step`). \n",
+ "You can set the optional `multi_index` parameter to `True` to return labels as index columns, as demonstrated in the following examples.\n",
+ "By default, only the metric sample-time primary-key attribute is returned as an index column. \n",
+ "See the [Frames API reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/frames/tsdb/read/) for more information about the `read` parameters that are supported for the `tsdb` backend."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " cpu \n",
+ " disk \n",
+ " mem \n",
+ " \n",
+ " \n",
+ " time \n",
+ " node \n",
+ " os \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 11:00:00.000 \n",
+ " 11 \n",
+ " windows \n",
+ " 72.253130 \n",
+ " 88.366517 \n",
+ " 48.638564 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652 \n",
+ " 11 \n",
+ " windows \n",
+ " 87.686276 \n",
+ " 61.147908 \n",
+ " 27.742501 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304 \n",
+ " 11 \n",
+ " windows \n",
+ " 86.861995 \n",
+ " 79.770651 \n",
+ " 94.197867 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956 \n",
+ " 11 \n",
+ " windows \n",
+ " 96.169328 \n",
+ " 3.146355 \n",
+ " 97.307368 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608 \n",
+ " 11 \n",
+ " windows \n",
+ " 20.436635 \n",
+ " 29.898394 \n",
+ " 1.412696 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 23:31:18.260 \n",
+ " 11 \n",
+ " windows \n",
+ " 37.375834 \n",
+ " 2.454959 \n",
+ " 92.302583 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 06:49:33.913 \n",
+ " 11 \n",
+ " windows \n",
+ " 58.476529 \n",
+ " 86.797440 \n",
+ " 31.443326 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 14:07:49.565 \n",
+ " 11 \n",
+ " windows \n",
+ " 69.632766 \n",
+ " 98.109378 \n",
+ " 19.366588 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 21:26:05.217 \n",
+ " 11 \n",
+ " windows \n",
+ " 75.110088 \n",
+ " 90.717712 \n",
+ " 4.499338 \n",
+ " \n",
+ " \n",
+ " 2019-12-06 04:44:20.869 \n",
+ " 11 \n",
+ " windows \n",
+ " 23.327160 \n",
+ " 93.248763 \n",
+ " 83.516672 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " cpu disk mem\n",
+ "time node os \n",
+ "2019-12-03 11:00:00.000 11 windows 72.253130 88.366517 48.638564\n",
+ "2019-12-03 18:18:15.652 11 windows 87.686276 61.147908 27.742501\n",
+ "2019-12-04 01:36:31.304 11 windows 86.861995 79.770651 94.197867\n",
+ "2019-12-04 08:54:46.956 11 windows 96.169328 3.146355 97.307368\n",
+ "2019-12-04 16:13:02.608 11 windows 20.436635 29.898394 1.412696\n",
+ "2019-12-04 23:31:18.260 11 windows 37.375834 2.454959 92.302583\n",
+ "2019-12-05 06:49:33.913 11 windows 58.476529 86.797440 31.443326\n",
+ "2019-12-05 14:07:49.565 11 windows 69.632766 98.109378 19.366588\n",
+ "2019-12-05 21:26:05.217 11 windows 75.110088 90.717712 4.499338\n",
+ "2019-12-06 04:44:20.869 11 windows 23.327160 93.248763 83.516672"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Read all metrics from the TSDB table (start=\"0\"; default `end` time = \"now\")\n",
+ "# into a single DataFrame (default `Iterator`=False) and display the first 10\n",
+ "# items; show metric labels as index columns (multi_index=True)\n",
+ "df = client.read(backend=\"tsdb\", table=tsdb_table, start=\"0\", multi_index=True)\n",
+ "display(df.head(10))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " mem \n",
+ " cpu \n",
+ " disk \n",
+ " \n",
+ " \n",
+ " time \n",
+ " node \n",
+ " os \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 11:00:00.000 \n",
+ " 11 \n",
+ " windows \n",
+ " 48.638564 \n",
+ " 72.253130 \n",
+ " 88.366517 \n",
+ " \n",
+ " \n",
+ " 2019-12-03 18:18:15.652 \n",
+ " 11 \n",
+ " windows \n",
+ " 27.742501 \n",
+ " 87.686276 \n",
+ " 61.147908 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 01:36:31.304 \n",
+ " 11 \n",
+ " windows \n",
+ " 94.197867 \n",
+ " 86.861995 \n",
+ " 79.770651 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 08:54:46.956 \n",
+ " 11 \n",
+ " windows \n",
+ " 97.307368 \n",
+ " 96.169328 \n",
+ " 3.146355 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 16:13:02.608 \n",
+ " 11 \n",
+ " windows \n",
+ " 1.412696 \n",
+ " 20.436635 \n",
+ " 29.898394 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 23:31:18.260 \n",
+ " 11 \n",
+ " windows \n",
+ " 92.302583 \n",
+ " 37.375834 \n",
+ " 2.454959 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 06:49:33.913 \n",
+ " 11 \n",
+ " windows \n",
+ " 31.443326 \n",
+ " 58.476529 \n",
+ " 86.797440 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 14:07:49.565 \n",
+ " 11 \n",
+ " windows \n",
+ " 19.366588 \n",
+ " 69.632766 \n",
+ " 98.109378 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 21:26:05.217 \n",
+ " 11 \n",
+ " windows \n",
+ " 4.499338 \n",
+ " 75.110088 \n",
+ " 90.717712 \n",
+ " \n",
+ " \n",
+ " 2019-12-06 04:44:20.869 \n",
+ " 11 \n",
+ " windows \n",
+ " 83.516672 \n",
+ " 23.327160 \n",
+ " 93.248763 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " mem cpu disk\n",
+ "time node os \n",
+ "2019-12-03 11:00:00.000 11 windows 48.638564 72.253130 88.366517\n",
+ "2019-12-03 18:18:15.652 11 windows 27.742501 87.686276 61.147908\n",
+ "2019-12-04 01:36:31.304 11 windows 94.197867 86.861995 79.770651\n",
+ "2019-12-04 08:54:46.956 11 windows 97.307368 96.169328 3.146355\n",
+ "2019-12-04 16:13:02.608 11 windows 1.412696 20.436635 29.898394\n",
+ "2019-12-04 23:31:18.260 11 windows 92.302583 37.375834 2.454959\n",
+ "2019-12-05 06:49:33.913 11 windows 31.443326 58.476529 86.797440\n",
+ "2019-12-05 14:07:49.565 11 windows 19.366588 69.632766 98.109378\n",
+ "2019-12-05 21:26:05.217 11 windows 4.499338 75.110088 90.717712\n",
+ "2019-12-06 04:44:20.869 11 windows 83.516672 23.327160 93.248763"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Read the full table contents, as in the previous example but use an SQL query\n",
+ "query_str = f\"select * from '{tsdb_table}'\"\n",
+ "df = client.read(backend=\"tsdb\", query=query_str, start=\"0\", multi_index=True)\n",
+ "display(df.head(10))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " avg(mem) \n",
+ " max(mem) \n",
+ " min(mem) \n",
+ " avg(cpu) \n",
+ " max(cpu) \n",
+ " min(cpu) \n",
+ " avg(disk) \n",
+ " max(disk) \n",
+ " min(disk) \n",
+ " \n",
+ " \n",
+ " time \n",
+ " node \n",
+ " os \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " 11 \n",
+ " windows \n",
+ " 35.270953 \n",
+ " 35.270953 \n",
+ " 35.270953 \n",
+ " 12.590616 \n",
+ " 12.590616 \n",
+ " 12.590616 \n",
+ " 46.874492 \n",
+ " 46.874492 \n",
+ " 46.874492 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " 11 \n",
+ " windows \n",
+ " 68.894958 \n",
+ " 68.894958 \n",
+ " 68.894958 \n",
+ " 94.696647 \n",
+ " 94.696647 \n",
+ " 94.696647 \n",
+ " 30.485137 \n",
+ " 30.485137 \n",
+ " 30.485137 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " 11 \n",
+ " windows \n",
+ " 17.774372 \n",
+ " 17.774372 \n",
+ " 17.774372 \n",
+ " 62.101750 \n",
+ " 62.101750 \n",
+ " 62.101750 \n",
+ " 1.101887 \n",
+ " 1.101887 \n",
+ " 1.101887 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " 11 \n",
+ " windows \n",
+ " 81.269016 \n",
+ " 81.269016 \n",
+ " 81.269016 \n",
+ " 86.975761 \n",
+ " 86.975761 \n",
+ " 86.975761 \n",
+ " 11.483470 \n",
+ " 11.483470 \n",
+ " 11.483470 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " 11 \n",
+ " linux \n",
+ " 20.302082 \n",
+ " 20.302082 \n",
+ " 20.302082 \n",
+ " 60.791782 \n",
+ " 60.791782 \n",
+ " 60.791782 \n",
+ " 39.986561 \n",
+ " 39.986561 \n",
+ " 39.986561 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " 11 \n",
+ " linux \n",
+ " 69.135578 \n",
+ " 69.135578 \n",
+ " 69.135578 \n",
+ " 7.081664 \n",
+ " 7.081664 \n",
+ " 7.081664 \n",
+ " 5.506636 \n",
+ " 5.506636 \n",
+ " 5.506636 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " 11 \n",
+ " linux \n",
+ " 0.030557 \n",
+ " 0.030557 \n",
+ " 0.030557 \n",
+ " 35.653121 \n",
+ " 35.653121 \n",
+ " 35.653121 \n",
+ " 86.742539 \n",
+ " 86.742539 \n",
+ " 86.742539 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " 11 \n",
+ " linux \n",
+ " 69.652420 \n",
+ " 69.652420 \n",
+ " 69.652420 \n",
+ " 42.516088 \n",
+ " 42.516088 \n",
+ " 42.516088 \n",
+ " 37.099772 \n",
+ " 37.099772 \n",
+ " 37.099772 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " 2 \n",
+ " linux \n",
+ " 59.308987 \n",
+ " 59.308987 \n",
+ " 59.308987 \n",
+ " 3.156173 \n",
+ " 3.156173 \n",
+ " 3.156173 \n",
+ " 29.013744 \n",
+ " 29.013744 \n",
+ " 29.013744 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " 2 \n",
+ " linux \n",
+ " 45.498953 \n",
+ " 45.498953 \n",
+ " 45.498953 \n",
+ " 16.176501 \n",
+ " 16.176501 \n",
+ " 16.176501 \n",
+ " 3.935345 \n",
+ " 3.935345 \n",
+ " 3.935345 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " 2 \n",
+ " linux \n",
+ " 20.137033 \n",
+ " 20.137033 \n",
+ " 20.137033 \n",
+ " 19.118773 \n",
+ " 19.118773 \n",
+ " 19.118773 \n",
+ " 86.545337 \n",
+ " 86.545337 \n",
+ " 86.545337 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " 2 \n",
+ " linux \n",
+ " 84.036040 \n",
+ " 84.036040 \n",
+ " 84.036040 \n",
+ " 8.025058 \n",
+ " 8.025058 \n",
+ " 8.025058 \n",
+ " 77.700351 \n",
+ " 77.700351 \n",
+ " 77.700351 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " 2 \n",
+ " windows \n",
+ " 65.290912 \n",
+ " 65.290912 \n",
+ " 65.290912 \n",
+ " 0.297331 \n",
+ " 0.297331 \n",
+ " 0.297331 \n",
+ " 48.721278 \n",
+ " 48.721278 \n",
+ " 48.721278 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " 2 \n",
+ " windows \n",
+ " 73.327954 \n",
+ " 73.327954 \n",
+ " 73.327954 \n",
+ " 12.638177 \n",
+ " 12.638177 \n",
+ " 12.638177 \n",
+ " 90.991214 \n",
+ " 90.991214 \n",
+ " 90.991214 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " 2 \n",
+ " windows \n",
+ " 83.916436 \n",
+ " 83.916436 \n",
+ " 83.916436 \n",
+ " 37.458137 \n",
+ " 37.458137 \n",
+ " 37.458137 \n",
+ " 70.774701 \n",
+ " 70.774701 \n",
+ " 70.774701 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " 2 \n",
+ " windows \n",
+ " 41.901232 \n",
+ " 41.901232 \n",
+ " 41.901232 \n",
+ " 60.144993 \n",
+ " 60.144993 \n",
+ " 60.144993 \n",
+ " 81.769719 \n",
+ " 81.769719 \n",
+ " 81.769719 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " avg(mem) max(mem) min(mem) avg(cpu) \\\n",
+ "time node os \n",
+ "2019-12-09 12:20:00 11 windows 35.270953 35.270953 35.270953 12.590616 \n",
+ "2019-12-09 20:20:00 11 windows 68.894958 68.894958 68.894958 94.696647 \n",
+ "2019-12-10 03:20:00 11 windows 17.774372 17.774372 17.774372 62.101750 \n",
+ "2019-12-10 10:20:00 11 windows 81.269016 81.269016 81.269016 86.975761 \n",
+ "2019-12-09 12:20:00 11 linux 20.302082 20.302082 20.302082 60.791782 \n",
+ "2019-12-09 20:20:00 11 linux 69.135578 69.135578 69.135578 7.081664 \n",
+ "2019-12-10 03:20:00 11 linux 0.030557 0.030557 0.030557 35.653121 \n",
+ "2019-12-10 10:20:00 11 linux 69.652420 69.652420 69.652420 42.516088 \n",
+ "2019-12-09 12:20:00 2 linux 59.308987 59.308987 59.308987 3.156173 \n",
+ "2019-12-09 20:20:00 2 linux 45.498953 45.498953 45.498953 16.176501 \n",
+ "2019-12-10 03:20:00 2 linux 20.137033 20.137033 20.137033 19.118773 \n",
+ "2019-12-10 10:20:00 2 linux 84.036040 84.036040 84.036040 8.025058 \n",
+ "2019-12-09 12:20:00 2 windows 65.290912 65.290912 65.290912 0.297331 \n",
+ "2019-12-09 20:20:00 2 windows 73.327954 73.327954 73.327954 12.638177 \n",
+ "2019-12-10 03:20:00 2 windows 83.916436 83.916436 83.916436 37.458137 \n",
+ "2019-12-10 10:20:00 2 windows 41.901232 41.901232 41.901232 60.144993 \n",
+ "\n",
+ " max(cpu) min(cpu) avg(disk) max(disk) \\\n",
+ "time node os \n",
+ "2019-12-09 12:20:00 11 windows 12.590616 12.590616 46.874492 46.874492 \n",
+ "2019-12-09 20:20:00 11 windows 94.696647 94.696647 30.485137 30.485137 \n",
+ "2019-12-10 03:20:00 11 windows 62.101750 62.101750 1.101887 1.101887 \n",
+ "2019-12-10 10:20:00 11 windows 86.975761 86.975761 11.483470 11.483470 \n",
+ "2019-12-09 12:20:00 11 linux 60.791782 60.791782 39.986561 39.986561 \n",
+ "2019-12-09 20:20:00 11 linux 7.081664 7.081664 5.506636 5.506636 \n",
+ "2019-12-10 03:20:00 11 linux 35.653121 35.653121 86.742539 86.742539 \n",
+ "2019-12-10 10:20:00 11 linux 42.516088 42.516088 37.099772 37.099772 \n",
+ "2019-12-09 12:20:00 2 linux 3.156173 3.156173 29.013744 29.013744 \n",
+ "2019-12-09 20:20:00 2 linux 16.176501 16.176501 3.935345 3.935345 \n",
+ "2019-12-10 03:20:00 2 linux 19.118773 19.118773 86.545337 86.545337 \n",
+ "2019-12-10 10:20:00 2 linux 8.025058 8.025058 77.700351 77.700351 \n",
+ "2019-12-09 12:20:00 2 windows 0.297331 0.297331 48.721278 48.721278 \n",
+ "2019-12-09 20:20:00 2 windows 12.638177 12.638177 90.991214 90.991214 \n",
+ "2019-12-10 03:20:00 2 windows 37.458137 37.458137 70.774701 70.774701 \n",
+ "2019-12-10 10:20:00 2 windows 60.144993 60.144993 81.769719 81.769719 \n",
+ "\n",
+ " min(disk) \n",
+ "time node os \n",
+ "2019-12-09 12:20:00 11 windows 46.874492 \n",
+ "2019-12-09 20:20:00 11 windows 30.485137 \n",
+ "2019-12-10 03:20:00 11 windows 1.101887 \n",
+ "2019-12-10 10:20:00 11 windows 11.483470 \n",
+ "2019-12-09 12:20:00 11 linux 39.986561 \n",
+ "2019-12-09 20:20:00 11 linux 5.506636 \n",
+ "2019-12-10 03:20:00 11 linux 86.742539 \n",
+ "2019-12-10 10:20:00 11 linux 37.099772 \n",
+ "2019-12-09 12:20:00 2 linux 29.013744 \n",
+ "2019-12-09 20:20:00 2 linux 3.935345 \n",
+ "2019-12-10 03:20:00 2 linux 86.545337 \n",
+ "2019-12-10 10:20:00 2 linux 77.700351 \n",
+ "2019-12-09 12:20:00 2 windows 48.721278 \n",
+ "2019-12-09 20:20:00 2 windows 90.991214 \n",
+ "2019-12-10 03:20:00 2 windows 70.774701 \n",
+ "2019-12-10 10:20:00 2 windows 81.769719 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Read over-time aggregates with a 1-hour aggregation step for all metric\n",
+ "# samples created in the last 2 days; use an SQL query (see `query`)\n",
+ "query_str = f\"select avg(*), max(*), min(*) from '{tsdb_table}'\"\n",
+ "df = client.read(backend=\"tsdb\", query=query_str, step=\"1h\", start=\"now-1d\",\n",
+ " end=\"now\", multi_index=True)\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " avg(cpu) \n",
+ " avg(disk) \n",
+ " avg(mem) \n",
+ " max(cpu) \n",
+ " max(disk) \n",
+ " max(mem) \n",
+ " min(cpu) \n",
+ " min(disk) \n",
+ " min(mem) \n",
+ " \n",
+ " \n",
+ " time \n",
+ " os \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " windows \n",
+ " 6.443973 \n",
+ " 47.797885 \n",
+ " 50.280933 \n",
+ " 12.590616 \n",
+ " 48.721278 \n",
+ " 65.290912 \n",
+ " 0.297331 \n",
+ " 46.874492 \n",
+ " 35.270953 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " windows \n",
+ " 53.667412 \n",
+ " 60.738175 \n",
+ " 71.111456 \n",
+ " 94.696647 \n",
+ " 90.991214 \n",
+ " 73.327954 \n",
+ " 12.638177 \n",
+ " 30.485137 \n",
+ " 68.894958 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " windows \n",
+ " 49.779943 \n",
+ " 35.938294 \n",
+ " 50.845404 \n",
+ " 62.101750 \n",
+ " 70.774701 \n",
+ " 83.916436 \n",
+ " 37.458137 \n",
+ " 1.101887 \n",
+ " 17.774372 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " windows \n",
+ " 73.560377 \n",
+ " 46.626594 \n",
+ " 61.585124 \n",
+ " 86.975761 \n",
+ " 81.769719 \n",
+ " 81.269016 \n",
+ " 60.144993 \n",
+ " 11.483470 \n",
+ " 41.901232 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 12:20:00 \n",
+ " linux \n",
+ " 31.973978 \n",
+ " 34.500153 \n",
+ " 39.805535 \n",
+ " 60.791782 \n",
+ " 39.986561 \n",
+ " 59.308987 \n",
+ " 3.156173 \n",
+ " 29.013744 \n",
+ " 20.302082 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " linux \n",
+ " 11.629083 \n",
+ " 4.720991 \n",
+ " 57.317266 \n",
+ " 16.176501 \n",
+ " 5.506636 \n",
+ " 69.135578 \n",
+ " 7.081664 \n",
+ " 3.935345 \n",
+ " 45.498953 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:20:00 \n",
+ " linux \n",
+ " 27.385947 \n",
+ " 86.643938 \n",
+ " 10.083795 \n",
+ " 35.653121 \n",
+ " 86.742539 \n",
+ " 20.137033 \n",
+ " 19.118773 \n",
+ " 86.545337 \n",
+ " 0.030557 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:20:00 \n",
+ " linux \n",
+ " 25.270573 \n",
+ " 57.400062 \n",
+ " 76.844230 \n",
+ " 42.516088 \n",
+ " 77.700351 \n",
+ " 84.036040 \n",
+ " 8.025058 \n",
+ " 37.099772 \n",
+ " 69.652420 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " avg(cpu) avg(disk) avg(mem) max(cpu) \\\n",
+ "time os \n",
+ "2019-12-09 12:20:00 windows 6.443973 47.797885 50.280933 12.590616 \n",
+ "2019-12-09 20:20:00 windows 53.667412 60.738175 71.111456 94.696647 \n",
+ "2019-12-10 03:20:00 windows 49.779943 35.938294 50.845404 62.101750 \n",
+ "2019-12-10 10:20:00 windows 73.560377 46.626594 61.585124 86.975761 \n",
+ "2019-12-09 12:20:00 linux 31.973978 34.500153 39.805535 60.791782 \n",
+ "2019-12-09 20:20:00 linux 11.629083 4.720991 57.317266 16.176501 \n",
+ "2019-12-10 03:20:00 linux 27.385947 86.643938 10.083795 35.653121 \n",
+ "2019-12-10 10:20:00 linux 25.270573 57.400062 76.844230 42.516088 \n",
+ "\n",
+ " max(disk) max(mem) min(cpu) min(disk) \\\n",
+ "time os \n",
+ "2019-12-09 12:20:00 windows 48.721278 65.290912 0.297331 46.874492 \n",
+ "2019-12-09 20:20:00 windows 90.991214 73.327954 12.638177 30.485137 \n",
+ "2019-12-10 03:20:00 windows 70.774701 83.916436 37.458137 1.101887 \n",
+ "2019-12-10 10:20:00 windows 81.769719 81.269016 60.144993 11.483470 \n",
+ "2019-12-09 12:20:00 linux 39.986561 59.308987 3.156173 29.013744 \n",
+ "2019-12-09 20:20:00 linux 5.506636 69.135578 7.081664 3.935345 \n",
+ "2019-12-10 03:20:00 linux 86.742539 20.137033 19.118773 86.545337 \n",
+ "2019-12-10 10:20:00 linux 77.700351 84.036040 8.025058 37.099772 \n",
+ "\n",
+ " min(mem) \n",
+ "time os \n",
+ "2019-12-09 12:20:00 windows 35.270953 \n",
+ "2019-12-09 20:20:00 windows 68.894958 \n",
+ "2019-12-10 03:20:00 windows 17.774372 \n",
+ "2019-12-10 10:20:00 windows 41.901232 \n",
+ "2019-12-09 12:20:00 linux 20.302082 \n",
+ "2019-12-09 20:20:00 linux 45.498953 \n",
+ "2019-12-10 03:20:00 linux 0.030557 \n",
+ "2019-12-10 10:20:00 linux 69.652420 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Perform a similar query as in the previous example but use a non-SQL query\n",
+ "# and group the results by the `os` label\n",
+ "df = client.read(backend=\"tsdb\", table=tsdb_table, aggregators=\"avg, max, min\",\n",
+ " step=\"1h\", group_by=\"os\", start=\"now-1d\", end=\"now\",\n",
+ " multi_index=True)\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " \n",
+ "#### Conditional Read\n",
+ "\n",
+ "The following examples demonstrate how to use a query filter to conditionally read only a subset of the data from a TSDB table. \n",
+ "\n",
+ "- In non-SQL queries, this is done by setting the value of the `filter` parameter to a [platform filter expression](https://www.iguazio.com/docs/reference/latest-release/expressions/condition-expression/#filter-expression).\n",
+ "- In SQL queries, this is done by setting the `query` parameter to a query string that includes a `FROM` clause with a platform filter expression expressed as an SQL expression.\n",
+ " Note that the comparison operator for such queries is `=`, as opposed to `==` in non-SQL queries."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " count(cpu) \n",
+ " count(disk) \n",
+ " count(mem) \n",
+ " sum(cpu) \n",
+ " sum(disk) \n",
+ " sum(mem) \n",
+ " \n",
+ " \n",
+ " time \n",
+ " node \n",
+ " os \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-03 \n",
+ " 11 \n",
+ " linux \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 78.812834 \n",
+ " 109.317319 \n",
+ " 94.673989 \n",
+ " \n",
+ " \n",
+ " 2019-12-04 \n",
+ " 11 \n",
+ " linux \n",
+ " 4.0 \n",
+ " 4.0 \n",
+ " 4.0 \n",
+ " 195.020956 \n",
+ " 146.568450 \n",
+ " 90.351534 \n",
+ " \n",
+ " \n",
+ " 2019-12-05 \n",
+ " 11 \n",
+ " linux \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 224.546236 \n",
+ " 182.191266 \n",
+ " 107.405772 \n",
+ " \n",
+ " \n",
+ " 2019-12-06 \n",
+ " 11 \n",
+ " linux \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 150.109953 \n",
+ " 154.702766 \n",
+ " 184.915614 \n",
+ " \n",
+ " \n",
+ " 2019-12-07 \n",
+ " 11 \n",
+ " linux \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 53.839369 \n",
+ " 101.296601 \n",
+ " 145.521524 \n",
+ " \n",
+ " \n",
+ " 2019-12-08 \n",
+ " 11 \n",
+ " linux \n",
+ " 4.0 \n",
+ " 4.0 \n",
+ " 4.0 \n",
+ " 206.398742 \n",
+ " 254.195212 \n",
+ " 223.195701 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 \n",
+ " 11 \n",
+ " linux \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 3.0 \n",
+ " 127.556590 \n",
+ " 145.346681 \n",
+ " 163.567794 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 \n",
+ " 11 \n",
+ " linux \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 78.169209 \n",
+ " 123.842311 \n",
+ " 69.682977 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " count(cpu) count(disk) count(mem) sum(cpu) \\\n",
+ "time node os \n",
+ "2019-12-03 11 linux 2.0 2.0 2.0 78.812834 \n",
+ "2019-12-04 11 linux 4.0 4.0 4.0 195.020956 \n",
+ "2019-12-05 11 linux 3.0 3.0 3.0 224.546236 \n",
+ "2019-12-06 11 linux 3.0 3.0 3.0 150.109953 \n",
+ "2019-12-07 11 linux 3.0 3.0 3.0 53.839369 \n",
+ "2019-12-08 11 linux 4.0 4.0 4.0 206.398742 \n",
+ "2019-12-09 11 linux 3.0 3.0 3.0 127.556590 \n",
+ "2019-12-10 11 linux 2.0 2.0 2.0 78.169209 \n",
+ "\n",
+ " sum(disk) sum(mem) \n",
+ "time node os \n",
+ "2019-12-03 11 linux 109.317319 94.673989 \n",
+ "2019-12-04 11 linux 146.568450 90.351534 \n",
+ "2019-12-05 11 linux 182.191266 107.405772 \n",
+ "2019-12-06 11 linux 154.702766 184.915614 \n",
+ "2019-12-07 11 linux 101.296601 145.521524 \n",
+ "2019-12-08 11 linux 254.195212 223.195701 \n",
+ "2019-12-09 11 linux 145.346681 163.567794 \n",
+ "2019-12-10 11 linux 123.842311 69.682977 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Read over-time aggregates with a 1-day aggregation step for all metric\n",
+ "# samples in the table with the `os` label \"linux\" and the `node` label 11.\n",
+ "df = client.read(backend=\"tsdb\", table=tsdb_table, aggregators=\"count,sum\",\n",
+ " step=\"1d\", start=\"0\", filter=\"os=='linux' and node=='11'\",\n",
+ " multi_index=True)\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " count(mem) \n",
+ " sum(mem) \n",
+ " \n",
+ " \n",
+ " time \n",
+ " node \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 2019-12-09 13:05:00 \n",
+ " 2 \n",
+ " 1.0 \n",
+ " 65.290912 \n",
+ " \n",
+ " \n",
+ " 2019-12-09 20:20:00 \n",
+ " 2 \n",
+ " 1.0 \n",
+ " 73.327954 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 03:35:00 \n",
+ " 2 \n",
+ " 1.0 \n",
+ " 83.916436 \n",
+ " \n",
+ " \n",
+ " 2019-12-10 10:50:00 \n",
+ " 2 \n",
+ " 1.0 \n",
+ " 41.901232 \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " count(mem) sum(mem)\n",
+ "time node \n",
+ "2019-12-09 13:05:00 2 1.0 65.290912\n",
+ "2019-12-09 20:20:00 2 1.0 73.327954\n",
+ "2019-12-10 03:35:00 2 1.0 83.916436\n",
+ "2019-12-10 10:50:00 2 1.0 41.901232"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Read over-time aggregates with an half-hour step for mem` metric samples\n",
+ "# created yesterday with the `os` label \"windows\" and the `node` label 2, and\n",
+ "# group the results by the `node` label; use an SQL query\n",
+ "query_str = f\"select count(mem), sum(mem) from '{tsdb_table}' \" + \\\n",
+ " \"where os='windows' and node='2' group by node\"\n",
+ "df = client.read(backend=\"tsdb\", query=query_str, step=\"15m\",\n",
+ " start=\"now-1d\", multi_index=True)\n",
+ "display(df)"
+ ]
+ },
+ {
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -1170,7 +2707,7 @@
},
{
"cell_type": "code",
- "execution_count": 14,
+ "execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
@@ -1210,12 +2747,12 @@
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# Relative path to the stream within the parent platform data container\n",
- "strm = os.path.join(os.getenv(\"V3IO_USERNAME\") + \"/examples/somestream\")"
+ "strm = os.path.join(os.getenv(\"V3IO_USERNAME\"), \"examples/somestream\")"
]
},
{
@@ -1234,7 +2771,7 @@
},
{
"cell_type": "code",
- "execution_count": 16,
+ "execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
@@ -1268,7 +2805,7 @@
},
{
"cell_type": "code",
- "execution_count": 17,
+ "execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
@@ -1299,7 +2836,7 @@
},
{
"cell_type": "code",
- "execution_count": 18,
+ "execution_count": 24,
"metadata": {},
"outputs": [
{
@@ -1336,7 +2873,7 @@
"Index: []"
]
},
- "execution_count": 18,
+ "execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
@@ -1365,7 +2902,7 @@
},
{
"cell_type": "code",
- "execution_count": 19,
+ "execution_count": 25,
"metadata": {},
"outputs": [
{
@@ -1408,7 +2945,7 @@
},
{
"cell_type": "code",
- "execution_count": 20,
+ "execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
diff --git a/getting-started/read-external-db.ipynb b/getting-started/read-external-db.ipynb
index 614a262e..c60934d4 100644
--- a/getting-started/read-external-db.ipynb
+++ b/getting-started/read-external-db.ipynb
@@ -1504,7 +1504,7 @@
"### Use Pandas Streaming Capabilities to oIngest Large Datasets \n",
"Many Pandas inputs/outputs including SQL, CSV, as well as Iguazio Frames support chunking. \n",
"\n",
- "With chunking feature, the driver forms a continious iterator in order to reading/writing data chunk by chunk. This requires to specify the `chunksize` (number of rows) which enables a DataFrame iterator. This iterator can be passed as is to a DataFrame writer like Iguazio Frames. \n",
+ "With chunking feature, the driver forms a continuous iterator in order to reading/writing data chunk by chunk. This requires to specify the `chunksize` (number of rows) which enables a DataFrame iterator. This iterator can be passed as is to a DataFrame writer like Iguazio Frames. \n",
"\n",
"The following example will stream data from MySQL to Iguazio NoSQL API."
]
diff --git a/igz-tutorials-get.sh b/igz-tutorials-get.sh
index f58e796d..93b091d3 100644
--- a/igz-tutorials-get.sh
+++ b/igz-tutorials-get.sh
@@ -42,6 +42,7 @@ cp -rf ${TEMP_DIR}/getting-started ${DEST_DIR}
cp -rf ${TEMP_DIR}/demos ${DEST_DIR}
cp -rf ${TEMP_DIR}/*.ipynb ${DEST_DIR}
cp -rf ${TEMP_DIR}/assets ${DEST_DIR}
+cp -rf ${TEMP_DIR}/experiment-tracking ${DEST_DIR}
cp -f ${TEMP_DIR}/README.md ${TEMP_DIR}/LICENSE ${DEST_DIR}
echo "Deleting temporary ${DEST_DIR} directory ..."
rm -rf ${TEMP_DIR}