diff --git a/docs/docs/_static/integrations/mlflow.gif b/docs/docs/_static/integrations/mlflow.gif deleted file mode 100644 index 9291fd6e579b1..0000000000000 Binary files a/docs/docs/_static/integrations/mlflow.gif and /dev/null differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow.gif b/docs/docs/_static/integrations/mlflow/mlflow.gif new file mode 100644 index 0000000000000..9ae103daa2d20 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow.gif differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow_chat_trace_quickstart.png b/docs/docs/_static/integrations/mlflow/mlflow_chat_trace_quickstart.png new file mode 100644 index 0000000000000..137180f3275b0 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow_chat_trace_quickstart.png differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow_query_trace_quickstart.png b/docs/docs/_static/integrations/mlflow/mlflow_query_trace_quickstart.png new file mode 100644 index 0000000000000..020bed1792c96 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow_query_trace_quickstart.png differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow_run_quickstart.png b/docs/docs/_static/integrations/mlflow/mlflow_run_quickstart.png new file mode 100644 index 0000000000000..87f50bf6cac29 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow_run_quickstart.png differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow_settings_quickstart.png b/docs/docs/_static/integrations/mlflow/mlflow_settings_quickstart.png new file mode 100644 index 0000000000000..ba5d36a2464c5 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow_settings_quickstart.png differ diff --git a/docs/docs/_static/integrations/mlflow/mlflow_traces_list_quickstart.png b/docs/docs/_static/integrations/mlflow/mlflow_traces_list_quickstart.png new file mode 100644 index 0000000000000..b6c9dd30efe54 Binary files /dev/null and b/docs/docs/_static/integrations/mlflow/mlflow_traces_list_quickstart.png differ diff --git a/docs/docs/community/integrations.md b/docs/docs/community/integrations.md index 9fdfcd31bad7e..da3c4d36624e6 100644 --- a/docs/docs/community/integrations.md +++ b/docs/docs/community/integrations.md @@ -25,6 +25,10 @@ We support [a huge number of LLMs](../module_guides/models/llms/modules.md). Check out our [one-click observability](../module_guides/observability/index.md) page for full tracing integrations. 
+## Experiment Tracking + +- [MLflow](../../examples/observability/mlflow) + ## Structured Outputs - [Guidance](integrations/guidance.md) diff --git a/docs/docs/examples/llm/databricks.ipynb b/docs/docs/examples/llm/databricks.ipynb index 79cc044b078ca..4c655caad0f90 100644 --- a/docs/docs/examples/llm/databricks.ipynb +++ b/docs/docs/examples/llm/databricks.ipynb @@ -85,8 +85,8 @@ "source": [ "\n", "```bash\n", - "export DATABRICKS_API_KEY=\n", - "export DATABRICKS_API_BASE=\n", + "export DATABRICKS_TOKEN=\n", + "export DATABRICKS_SERVING_ENDPOINT=\n", "```\n", "\n", "Alternatively, you can pass your API key and serving endpoint to the LLM when you init it:" ] }, @@ -102,7 +102,7 @@ "llm = Databricks(\n", " model=\"databricks-dbrx-instruct\",\n", " api_key=\"your_api_key\",\n", - " api_base=\"https://[your-work-space].cloud.databricks.com/serving-endpoints/[your-serving-endpoint]\",\n", + " api_base=\"https://[your-work-space].cloud.databricks.com/serving-endpoints/\",\n", ")" ] }, diff --git a/docs/docs/examples/observability/MLflow.ipynb b/docs/docs/examples/observability/MLflow.ipynb index 152601d298b7f..7ebb3c4980331 100644 --- a/docs/docs/examples/observability/MLflow.ipynb +++ b/docs/docs/examples/observability/MLflow.ipynb @@ -4,9 +4,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Introduction to Using LlamaIndex with MLflow\n", + "# MLflow Tracing and E2E Integration with LlamaIndex\n", - "Welcome to this interactive tutorial designed to introduce you to [LlamaIndex](https://www.llamaindex.ai/) and its integration with [MLflow](https://mlflow.org/docs/latest/index.html#). This tutorial is structured as a notebook to provide a hands-on, practical learning experience with the simplest and most core features of LlamaIndex." + "Welcome to this interactive tutorial for the LlamaIndex integration with [MLflow](https://mlflow.org/docs/latest/index.html#). This tutorial provides a hands-on learning experience with LlamaIndex and MLflow's core features.\n", + "\n", + "![mlflow-tracing](../../../_static/integrations/mlflow/mlflow.gif)" ] }, { @@ -20,42 +22,77 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### What you will learn\n", + "## Why use LlamaIndex with MLflow?\n", + "\n", + "The integration of LlamaIndex with MLflow provides a seamless experience for developing and managing LlamaIndex applications:\n", + "\n", + "* **MLflow Tracing** is a powerful observability tool for monitoring and debugging what happens inside your LlamaIndex models, helping you identify potential bottlenecks or issues quickly.\n", + "* **MLflow Experiment** allows you to track your indices/engines/workflows within MLflow and manage the many moving parts that comprise your LlamaIndex project, such as prompts, LLMs, tools, global configurations, and more.\n", + "* **MLflow Model** packages your LlamaIndex application with its dependency versions, input and output interfaces, and other essential metadata.\n", + "* **MLflow Evaluate** facilitates efficient performance assessment of your LlamaIndex application, enabling robust performance analytics and quick iteration."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## What you will learn\n", "By the end of this tutorial you will have:\n", "\n", "* Created an MVP VectorStoreIndex in LlamaIndex.\n", - "* Logged that index to the MLflow tracking server.\n", - "* Registered that index to the MLflow model registry.\n", - "* Loaded the model and performed inference.\n", - "* Explored the MLflow UI to learn about logged artifacts.\n", + "* Performed inference using the index as a query engine and inspected it with MLflow Tracing.\n", + "* Logged the index to an MLflow Experiment.\n", + "* Explored the MLflow UI to learn how MLflow Model packages your LlamaIndex application.\n", "\n", - "These basics will familiarize you with the LlamaIndex user journey in MLlfow." + "These basics will familiarize you with the LlamaIndex user journey in MLflow. If you want to learn more about the integration with more advanced use cases (e.g. a tool-calling agent), please refer to [this advanced tutorial](https://mlflow.org/blog/mlflow-llama-index-workflow)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "\n", + "1. Install MLflow and LlamaIndex:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install mlflow>=2.18 llama-index>=0.10.44 -q" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Setup\n", + "2. Open a separate terminal and run `mlflow ui --port 5000` to start the MLflow UI, if you haven't already. If you are running this notebook in a cloud environment, refer to the [How to Run Tutorial](https://www.mlflow.org/docs/latest/getting-started/running-notebooks.html) guide to learn different setups for MLflow.\n", "\n", - "First, we must ensure we have the required dependecies and environment variables. By default, LlamaIndex uses OpenAI as the source for LLMs and embeding models, so we'll do the same. Let's start by installing the requisite libraries and providing an OpenAI API key.\n" + "3. Create an MLflow Experiment and connect the notebook to it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], + "outputs": [], "source": [ - "%pip install mlflow>=2.15 llama-index>=0.10.44 -q" + "import mlflow\n", + "\n", + "mlflow.set_experiment(\"llama-index-tutorial\")\n", + "mlflow.set_tracking_uri(\n", + " \"http://localhost:5000\"\n", + ") # Or your remote tracking server URI" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "4. Set the OpenAI API key as an environment variable. If you are using a different LLM provider, set the corresponding environment variable instead." ] }, { @@ -67,123 +104,124 @@ "import os\n", "from getpass import getpass\n", "\n", - "from llama_index.core import Document, VectorStoreIndex\n", - "from llama_index.core.llms import ChatMessage\n", - "\n", - "import mlflow\n", - "\n", "os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API key: \")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Enable MLflow Tracing\n", + "MLflow Tracing for LlamaIndex can be enabled with just one line of code."
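As a minimal sketch (assuming a recent MLflow version where these standard autolog-style flags are available), the same call in the next cell can also be tuned or switched off later:

```python
import mlflow

# Enable MLflow Tracing for LlamaIndex; flag availability depends on your MLflow version
mlflow.llama_index.autolog(log_traces=True, silent=False)

# Tracing can be turned off again at any point
# mlflow.llama_index.autolog(disable=True)
```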
+ ] + }, + { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "assert (\n", - " \"OPENAI_API_KEY\" in os.environ\n", - "), \"Please set the OPENAI_API_KEY environment variable.\"" + "mlflow.llama_index.autolog()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Create a Index \n", - "\n", - "[Vector store indexes](https://docs.llamaindex.ai/en/stable/module_guides/storing/vector_stores/) are one of the core components in LlamaIndex. They contain embedding vectors of ingested document chunks (and sometimes the document chunks as well). These vectors enable various types of inference, such as query engines, chat engines, and retrievers, each serving different purposes in LlamaIndex.\n", - "\n", - "1. **Query Engine:**\n", - " - **Usage:** Perform straightforward queries to retrieve relevant information based on a user’s question.\n", - " - **Scenario:** Ideal for fetching concise answers or documents matching specific queries, similar to a search engine.\n", - "\n", - "2. **Chat Engine:**\n", - " - **Usage:** Engage in conversational AI tasks that require maintaining context and history over multiple interactions.\n", - " - **Scenario:** Suitable for interactive applications like customer support bots or virtual assistants, where conversation context is important.\n", - "\n", - "3. **Retriever:**\n", - " - **Usage:** Retrieve documents or text segments that are semantically similar to a given input.\n", - " - **Scenario:** Useful in retrieval-augmented generation (RAG) systems to fetch relevant context or background information, enhancing the quality of generated responses in tasks like summarization or question answering.\n", - "\n", - "By leveraging these different types of inference, LlamaIndex allows you to build robust AI applications tailored to various use cases, enhancing interaction between users and large language models.\n", - "\n", - "\n", + "## Create an Index \n", "\n", + "[Vector store indexes](https://docs.llamaindex.ai/en/stable/module_guides/storing/vector_stores/) are one of the core components in LlamaIndex. They contain embedding vectors of ingested document chunks (and sometimes the document chunks as well). These vectors can be leveraged for inference tasks using different **engine** types in LlamaIndex.\n", "\n", + "1. **Query Engine**: Perform straightforward queries to retrieve relevant information based on a user’s question. Ideal for fetching concise answers or documents matching specific queries, similar to a search engine.\n", "\n", - "\n", - "\n", - "\n", - "\n" + "2. **Chat Engine**: Engage in conversational AI tasks that require maintaining context and history over multiple interactions. Suitable for interactive applications like customer support bots or virtual assistants, where conversation context is important." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "------------- Example Document used to Enrich LLM Context -------------\n", - "Doc ID: e4c638ce-6757-482e-baed-096574550602\n", - "Text: Context LLMs are a phenomenal piece of technology for knowledge\n", - "generation and reasoning. They are pre-trained on large amounts of\n", - "publicly available data. How do we best augment LLMs with our own\n", - "private data? We need a comprehensive toolkit to help perform this\n", - "data augmentation for LLMs. Proposed Solution That's where LlamaIndex\n", - "comes in. 
Ll...\n", - "\n", - "------------- Example Query Engine -------------\n", - "LlamaIndex is a \"data framework\" designed to assist in building LLM apps by offering tools such as data connectors for various data sources, ways to structure data for easy use with LLMs, an advanced retrieval/query interface, and integrations with different application frameworks. It caters to both beginner and advanced users, providing a high-level API for simple data ingestion and querying, as well as lower-level APIs for customization and extension of different modules to suit individual needs.\n", - "\n", - "------------- Example Chat Engine -------------\n", - "LlamaIndex is a data framework designed to assist in building LLM apps by providing tools such as data connectors for various data sources, ways to structure data for easy use with LLMs, an advanced retrieval/query interface, and integrations with different application frameworks. It caters to both beginner and advanced users with a high-level API for easy data ingestion and querying, as well as lower-level APIs for customization and extension of different modules to suit specific needs.\n", - "\n", - "------------- Example Retriever -------------\n", - "[NodeWithScore(node=TextNode(id_='d18bb1f1-466a-443d-98d9-6217bf71ee5a', embedding=None, metadata={'filename': 'README.md', 'category': 'codebase'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={: RelatedNodeInfo(node_id='e4c638ce-6757-482e-baed-096574550602', node_type=, metadata={'filename': 'README.md', 'category': 'codebase'}, hash='3183371414f6a23e9a61e11b45ec45f808b148f9973166cfed62226e3505eb05')}, text='Context\\nLLMs are a phenomenal piece of technology for knowledge generation and reasoning.\\nThey are pre-trained on large amounts of publicly available data.\\nHow do we best augment LLMs with our own private data?\\nWe need a comprehensive toolkit to help perform this data augmentation for LLMs.\\n\\nProposed Solution\\nThat\\'s where LlamaIndex comes in. LlamaIndex is a \"data framework\" to help\\nyou build LLM apps. It provides the following tools:\\n\\nOffers data connectors to ingest your existing data sources and data formats\\n(APIs, PDFs, docs, SQL, etc.)\\nProvides ways to structure your data (indices, graphs) so that this data can be\\neasily used with LLMs.\\nProvides an advanced retrieval/query interface over your data:\\nFeed in any LLM input prompt, get back retrieved context and knowledge-augmented output.\\nAllows easy integrations with your outer application framework\\n(e.g. with LangChain, Flask, Docker, ChatGPT, anything else).\\nLlamaIndex provides tools for both beginner users and advanced users.\\nOur high-level API allows beginner users to use LlamaIndex to ingest and\\nquery their data in 5 lines of code. 
Our lower-level APIs allow advanced users to\\ncustomize and extend any module (data connectors, indices, retrievers, query engines,\\nreranking modules), to fit their needs.', mimetype='text/plain', start_char_idx=1, end_char_idx=1279, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.850998849877966)]\n" - ] - } - ], + "outputs": [], "source": [ - "print(\n", - " \"------------- Example Document used to Enrich LLM Context -------------\"\n", - ")\n", - "llama_index_example_document = Document.example()\n", - "print(llama_index_example_document)\n", - "\n", - "index = VectorStoreIndex.from_documents([llama_index_example_document])\n", + "from llama_index.core import Document, VectorStoreIndex\n", + "from llama_index.core.llms import ChatMessage\n", "\n", - "print(\"\\n------------- Example Query Engine -------------\")\n", + "# Create an index with a single dummy document\n", + "llama_index_example_document = Document.example()\n", + "index = VectorStoreIndex.from_documents([llama_index_example_document])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "## Query the Index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "Let's use this index to perform inference via a query engine." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query_response = index.as_query_engine().query(\"What is llama_index?\")\n", - "print(query_response)\n", + "print(query_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "In addition to the response printed out, you should also see the MLflow Trace UI in the output cell. This provides a detailed yet intuitive visualization of the execution flow of the query engine, helping you understand the internal workings and debug any issues that may arise.\n", + "\n", + "![](../../../_static/integrations/mlflow/mlflow_query_trace_quickstart.png)\n", "\n", - "print(\"\\n------------- Example Chat Engine -------------\")\n", + "Let's make another query, this time with a chat engine, to see the difference in the execution flow." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "chat_response = index.as_chat_engine().chat(\n", " \"What is llama_index?\",\n", " chat_history=[\n", " ChatMessage(role=\"system\", content=\"You are an expert on RAG!\")\n", " ],\n", ")\n", - "print(chat_response)\n", + "print(chat_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "![](../../../_static/integrations/mlflow/mlflow_chat_trace_quickstart.png)\n", + "\n", "\n", + "As shown in the traces, the primary difference is that the query engine executes a static workflow (RAG), while the chat engine uses an agentic workflow to dynamically pull the necessary context from the index.\n", "\n", - "print(\"\\n------------- Example Retriever -------------\")\n", - "retriever_response = index.as_retriever().retrieve(\"What is llama_index?\")\n", - "print(retriever_response)" + "You can also check the logged traces in the MLflow UI by navigating to the experiment you created earlier and selecting the `Trace` tab. If you don't want to show the traces in the output cell and only want to record them in MLflow, run `mlflow.tracing.disable_notebook_display()` in the notebook."
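If you prefer to inspect traces programmatically rather than in the UI, a minimal sketch is shown below, assuming an MLflow version (>= 2.15) where `mlflow.search_traces` is available:

```python
import mlflow

# Pull the most recent traces for the active experiment into a pandas DataFrame
traces = mlflow.search_traces(max_results=5)

# Column names vary slightly across MLflow versions (e.g. request, response, execution_time_ms)
print(traces.columns.tolist())
print(traces.head())
```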
] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Log the Index with MLflow\n", + "## Save the Index with MLflow\n", "\n", - "The below code logs a LlamaIndex model with MLflow, allowing you to persist and manage it across different environments. By using MLflow, you can track, version, and reproduce your model reliably. The script logs parameters, an example input, and registers the model under a specific name. The `model_uri` provides a unique identifier for retrieving the model later. This persistence is essential for ensuring consistency and reproducibility in development, testing, and production. Managing the model with MLflow simplifies loading, deployment, and sharing, maintaining an organized workflow.\n", + "The code below logs a LlamaIndex model with MLflow, tracking its parameters and an example input while registering it with a unique `model_uri`. This ensures consistent, reproducible model management across development, testing, and production, and simplifies deployment and sharing.\n", "\n", - "Key Parameters\n", + "Key Parameters:\n", "\n", "* ``engine_type``: defines the pyfunc and spark_udf inference type\n", "* ``input_example``: defines the the input signature and infers the output signature via a prediction\n", @@ -194,62 +232,25 @@ "cell_type": "code", "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "2024/07/24 17:58:27 INFO mlflow.llama_index.serialize_objects: API key(s) will be removed from the global Settings object during serialization to protect against key leakage. At inference time, the key(s) must be passed as environment variables.\n", - "/Users/michael.berk/opt/anaconda3/envs/mlflow-dev/lib/python3.8/site-packages/_distutils_hack/__init__.py:26: UserWarning: Setuptools is replacing distutils.\n", - " warnings.warn(\"Setuptools is replacing distutils.\")\n", - "Successfully registered model 'my_llama_index_vector_store'.\n", - "Created version '1' of model 'my_llama_index_vector_store'.\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "643e7b6936674e469f98d94004f3424a", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Downloading artifacts: 0%| | 0/12 [00:00" - ] - }, - "execution_count": null, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import os\n", - "import subprocess\n", - "\n", - "from IPython.display import IFrame\n", - "\n", - "# Start the MLflow UI in a background process\n", - "mlflow_ui_command = [\"mlflow\", \"ui\", \"--port\", \"5000\"]\n", - "subprocess.Popen(\n", - " mlflow_ui_command,\n", - " stdout=subprocess.PIPE,\n", - " stderr=subprocess.PIPE,\n", - " preexec_fn=os.setsid,\n", - ")" + "Finally, let's explore the MLflow UI to see what we have logged so far. You can access the UI by opening `http://localhost:5000` in your browser, or run the following cell to display it inside the notebook."
] }, { @@ -376,8 +310,7 @@ "metadata": {}, "outputs": [], "source": [ - "# Wait for the MLflow server to start then run the following command\n", - "# Note that cached results don't render, so you need to run this to see the UI\n", + "# Directly renders MLflow UI within the notebook for easy browsing:)\n", "IFrame(src=\"http://localhost:5000\", width=1000, height=600)" ] }, @@ -388,6 +321,11 @@ "Let's navigate to the experiments tab in the top left of the screen and click on our most recent\n", "run, as shown in the image below.\n", "\n", + "![](../../../_static/integrations/mlflow/mlflow_run_quickstart.png)\n", + "\n", + "\n", + "The Run page shows the overall metadata about your experiment. You can further navigate to the `Artifacts` tab to see the logged artifacts (models).\n", + "\n", "MLflow logs artifacts associated with your model and its environment during the MLflow run. \n", "Most of the logged files, such as the `conda.yaml`, `python_env.yml`, and \n", "`requirements.txt` are standard to all MLflow logging and facilitate reproducibility between\n", @@ -398,42 +336,19 @@ "\n", "By storing these objects, MLflow is able to recreate the environment in which you logged your model.\n", "\n", - "![mlflow_ui_run](mlflow_ui_run.png)\n", - "\n", - "**Important:** MLflow will not serialize API keys. Those must be present in your model loading \n", - "environment as environment variables. \n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We also created a record of the model in the model registry. By simply specifying \n", - "`registered_model_name` and `input_example` when logging the model, we get robust signature\n", - "inference and an instance in the model registry, as shown below.\n", + "![](../../../_static/integrations/mlflow/mlflow_settings_quickstart.png)\n", "\n", - "![mlflow_ui_registered_model](mlflow_ui_registered_model.png)\n" + "**Important:** MLflow will not serialize API keys. Those must be present in your model loading environment as environment variables. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Finally, let's explore the traces we logged. In the `Experiments` tab we can click on `Tracing` to \n", - "view the logged traces for our two inference calls. Tracing effectively shows a callback-based\n", - "stacktrace for what ocurred in our inference system. \n", + "Finally, you can see the full list of traces that were logged during the tutorial by navigating to the `Tracing` tab. 
By clicking on each row, you can see the detailed trace view similar to the one shown in the output cell earlier.\n", "\n", - "![mlflow_tracing_quickstart](mlflow_tracing_quickstart.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If we click on our first trace, we can see some really cool details about our inputs, outputs,\n", - "and the duration of each step in the chain.\n", "\n", - "![mlflow_single_trace_quickstart](mlflow_single_trace_quickstart.png)" + "![](../../../_static/integrations/mlflow/mlflow_traces_list_quickstart.png)\n" ] }, { diff --git a/docs/docs/examples/observability/mlflow_single_trace_quickstart.png b/docs/docs/examples/observability/mlflow_single_trace_quickstart.png deleted file mode 100644 index aa859074e0c62..0000000000000 Binary files a/docs/docs/examples/observability/mlflow_single_trace_quickstart.png and /dev/null differ diff --git a/docs/docs/examples/observability/mlflow_tracing_quickstart.png b/docs/docs/examples/observability/mlflow_tracing_quickstart.png deleted file mode 100644 index dbdf29ccbe218..0000000000000 Binary files a/docs/docs/examples/observability/mlflow_tracing_quickstart.png and /dev/null differ diff --git a/docs/docs/examples/observability/mlflow_ui_registered_model.png b/docs/docs/examples/observability/mlflow_ui_registered_model.png deleted file mode 100644 index 31682f53d22bf..0000000000000 Binary files a/docs/docs/examples/observability/mlflow_ui_registered_model.png and /dev/null differ diff --git a/docs/docs/examples/observability/mlflow_ui_run.png b/docs/docs/examples/observability/mlflow_ui_run.png deleted file mode 100644 index ad58d8c2aeae2..0000000000000 Binary files a/docs/docs/examples/observability/mlflow_ui_run.png and /dev/null differ diff --git a/docs/docs/examples/workflow/function_calling_agent.ipynb b/docs/docs/examples/workflow/function_calling_agent.ipynb index ff1f29c72d4b2..bf433819ee2bc 100644 --- a/docs/docs/examples/workflow/function_calling_agent.ipynb +++ b/docs/docs/examples/workflow/function_calling_agent.ipynb @@ -42,46 +42,6 @@ "Set up tracing to visualize each step in the workflow." 
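One lightweight way to set up such tracing is with MLflow — a minimal sketch, assuming `mlflow` is installed and using a placeholder experiment name:

```python
import mlflow

# Experiment name is a placeholder; point this at your own tracking server/experiment
mlflow.set_experiment("function-calling-agent-workflow")

# One-line instrumentation for LlamaIndex
mlflow.llama_index.autolog()
```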
] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!pip install \"llama-index-core>=0.10.43\" \"openinference-instrumentation-llama-index>=2.2.2\" \"opentelemetry-proto>=1.12.0\" opentelemetry-exporter-otlp opentelemetry-sdk" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from opentelemetry.sdk import trace as trace_sdk\n", - "from opentelemetry.sdk.trace.export import SimpleSpanProcessor\n", - "from opentelemetry.exporter.otlp.proto.http.trace_exporter import (\n", - " OTLPSpanExporter as HTTPSpanExporter,\n", - ")\n", - "from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n", - "\n", - "\n", - "# Add Phoenix API Key for tracing\n", - "PHOENIX_API_KEY = \"\"\n", - "os.environ[\"OTEL_EXPORTER_OTLP_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n", - "\n", - "# Add Phoenix\n", - "span_phoenix_processor = SimpleSpanProcessor(\n", - " HTTPSpanExporter(endpoint=\"https://app.phoenix.arize.com/v1/traces\")\n", - ")\n", - "\n", - "# Add them to the tracer\n", - "tracer_provider = trace_sdk.TracerProvider()\n", - "tracer_provider.add_span_processor(span_processor=span_phoenix_processor)\n", - "\n", - "# Instrument the application\n", - "LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)" - ] - }, { "cell_type": "markdown", "metadata": {}, diff --git a/docs/docs/module_guides/observability/index.md b/docs/docs/module_guides/observability/index.md index 4a59ef45d66b2..75ca875699fd6 100644 --- a/docs/docs/module_guides/observability/index.md +++ b/docs/docs/module_guides/observability/index.md @@ -79,6 +79,37 @@ llama_index.core.set_global_handler( ![](../../_static/integrations/arize_phoenix.png) + +### MLflow + +[MLflow](https://mlflow.org/docs/latest/llms/tracing/index.html) is an open-source MLOps/LLMOps platform that focuses on the full lifecycle of machine learning projects, ensuring that each phase is manageable, traceable, and reproducible. +**MLflow Tracing** is an OpenTelemetry-based tracing capability and supports one-click instrumentation for LlamaIndex applications. + +#### Usage Pattern + +Since MLflow is open-source, you can start using it without any account creation or API key setup. Jump straight into the code after installing the MLflow package! + +```python +import mlflow + +mlflow.llama_index.autolog() # Enable mlflow tracing +``` + +![](../../_static/integrations/mlflow/mlflow.gif) + +#### Guides + +The MLflow LlamaIndex integration also provides experiment tracking, evaluation, dependency management, and more. Check out the [MLflow documentation](https://mlflow.org/docs/latest/llms/llama-index/index.html) for more details. + +#### Support Table + +MLflow Tracing supports the full range of LlamaIndex features. Some newer features, such as [AgentWorkflow](https://www.llamaindex.ai/blog/introducing-agentworkflow-a-powerful-system-for-building-ai-agent-systems), require MLflow >= 2.18.0. 
+ +| Streaming | Async | Engine | Agents | Workflow | AgentWorkflow | +| --- | --- | --- | --- | --- | --- | +| ✅ | ✅ | ✅ | ✅ | ✅ (>= 2.18) | ✅ (>= 2.18) | + + ### OpenLLMetry [OpenLLMetry](https://github.com/traceloop/openllmetry) is an open-source project based on OpenTelemetry for tracing and monitoring @@ -546,39 +577,6 @@ import llama_index.core llama_index.core.set_global_handler("simple") ``` -### MLflow - -[MLflow](https://mlflow.org/docs/latest/index.html) is an open-source platform, purpose-built to assist machine learning practitioners and teams in handling the complexities of the machine learning process. MLflow focuses on the full lifecycle for machine learning projects, ensuring that each phase is manageable, traceable, and reproducible. - -##### Install - -```shell -pip install mlflow>=2.15 llama-index>=0.10.44 -``` - -#### Usage Pattern - -```python -import mlflow - -mlflow.llama_index.autolog() # Enable mlflow tracing - -with mlflow.start_run() as run: - mlflow.llama_index.log_model( - index, - artifact_path="llama_index", - engine_type="query", # Logged engine type for inference - input_example="hi", - registered_model_name="my_llama_index_vector_store", - ) - model_uri = f"runs:/{run.info.run_id}/llama_index" - -predictions = mlflow.pyfunc.load_model(model_uri).predict("hi") -print(f"Query engine prediction: {predictions}") -``` - -![](../../_static/integrations/mlflow.gif) - #### Guides - [MLflow](https://mlflow.org/docs/latest/llms/llama-index/index.html)
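For quick reference, the logging-and-loading flow described in the notebook's "Save the Index with MLflow" section can be sketched end to end as follows — a sketch only, assuming an LLM API key is set in the environment; the registered model name is illustrative:

```python
import mlflow
from llama_index.core import Document, VectorStoreIndex

mlflow.llama_index.autolog()  # enable mlflow tracing

# Build a small index from the bundled example document
index = VectorStoreIndex.from_documents([Document.example()])

with mlflow.start_run() as run:
    mlflow.llama_index.log_model(
        index,
        artifact_path="llama_index",
        engine_type="query",  # how the pyfunc/spark_udf wrapper runs inference
        input_example="hi",
        registered_model_name="my_llama_index_vector_store",
    )
    model_uri = f"runs:/{run.info.run_id}/llama_index"

# Reload the logged index as a generic pyfunc model and query it
predictions = mlflow.pyfunc.load_model(model_uri).predict("hi")
print(f"Query engine prediction: {predictions}")
```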