Integration with Databricks serving endpoint #902

Open · azuretime opened this issue Dec 12, 2024 · 1 comment
Labels
question Further information is requested

Comments

azuretime commented Dec 12, 2024

Did you check the docs?

  • I have read all the NeMo-Guardrails docs

Is your feature request related to a problem? Please describe.

Hi, the docs have no detail on how to use NeMo-Guardrails with a Databricks serving endpoint. What is the correct way to use the endpoints with a Databricks access token?

Describe the solution you'd like

I have an existing RAG chain and want to use NeMo-Guardrails to filter input. Models in my notebook are loaded with Databricks(endpoint_name="endpoint_name", max_tokens=..., temperature=...) and ChatDatabricks(endpoint_name="endpoint_name", max_tokens=..., temperature=...). The models I use include Llama 3 and Databricks DBRX serving endpoints.

from langchain_community.chat_models import ChatDatabricks
from langchain_community.llms import Databricks
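
For concreteness, the instantiation looks roughly like this (the values shown are placeholders, not my real settings):

llm = Databricks(endpoint_name="endpoint_name", max_tokens=500, temperature=0.1)
chat_model = ChatDatabricks(endpoint_name="endpoint_name", max_tokens=500, temperature=0.1)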

I know it's possible to use RunnableRails or "guardrails | some_chain" (the LangChain integration), but I want the self check input step to run before the retrieval step inside the chain. That is, if self check input decides the request should be blocked (Yes), the chain should reply with a default answer without retrieving any context. So how can I load the LLM from "endpoint_name" to check input inside the chain? A sketch of what I'm after follows below.
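
To illustrate, here is a minimal sketch of the behavior I want, assuming RunnableRails short-circuits the wrapped chain when the input rail blocks, and using a stand-in retriever in place of my real one:

from langchain_community.chat_models import ChatDatabricks
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

# Stand-in for my real retriever, only to keep the sketch self-contained.
retriever = RunnableLambda(lambda question: "retrieved context for: " + question)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using the context.\nContext: {context}\nQuestion: {question}"
)
# Placeholder endpoint name, not my real one.
chat_model = ChatDatabricks(endpoint_name="endpoint_name")

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | chat_model
    | StrOutputParser()
)

# ./config holds a config.yml with the self check input rail enabled.
guardrails = RunnableRails(RailsConfig.from_path("./config"))

# Desired behavior: if self check input answers "Yes" (block), the wrapped
# chain, including the retrieval step, is never invoked, and a default
# refusal message is returned instead.
chain_with_rails = guardrails | rag_chain
print(chain_with_rails.invoke("a user question"))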

Describe alternatives you've considered

Also, is something like the method below possible?
In config.yml:

models:
  - type: main
    engine: "databricks_endpoint"
    model: "meta-llama-3-8b-instruct"
    headers:
      Authorization: "Bearer <your-access-token>"
    parameters:
      endpoint_url: https://..../serving-endpoints/meta-llama-3-8b-instruct/invocations
      task: "chat"
      model_kwargs:
        temperature: 0.1
        max_length: 500

Additional context

No response

azuretime added the enhancement (New feature or request) and status: needs triage (New issues that have not yet been reviewed or categorized) labels on Dec 12, 2024
Pouyanpi (Collaborator) commented Jan 7, 2025

@azuretime, thanks for opening this issue. Are you trying to use Databricks?

I assume you have already checked this guide, so to start, let's change the engine to databricks, along the lines of the sketch below.
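
Something like this should work, assuming the parameters map onto the langchain_community Databricks wrapper (the parameter names here are illustrative, not verified against your endpoint):

models:
  - type: main
    engine: databricks
    parameters:
      endpoint_name: meta-llama-3-8b-instruct
      temperature: 0.1
      max_tokens: 500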

Also, you cannot pass headers; instead, you need to set an environment variable (see the sketch below). For the possible list of parameters, you can consult this link or, better, their GitHub repo.
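
For example, the standard Databricks SDK authentication variables look like this (values are placeholders):

import os

# Assumption: the Databricks wrapper picks up the standard SDK variables.
os.environ["DATABRICKS_HOST"] = "https://<your-workspace-instance>"
os.environ["DATABRICKS_TOKEN"] = "<your-access-token>"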

Does this address your issue?

Pouyanpi added the question (Further information is requested) label and removed the enhancement (New feature or request) and status: needs triage (New issues that have not yet been reviewed or categorized) labels on Jan 7, 2025