Merge branch 'rioxarray'
tgoelles committed Jun 5, 2024
2 parents 228b023 + a14a7f9 commit f1bb934
Showing 16 changed files with 467 additions and 44 deletions.
2 changes: 1 addition & 1 deletion kedro-airflow/README.md
@@ -42,7 +42,7 @@ The Airflow DAG configuration can be customized by editing this file.
### Step 3: Package and install the Kedro pipeline in the Airflow executor's environment

After generating and deploying the DAG file, you will then need to package and install the Kedro pipeline into the Airflow executor's environment.
Please visit the guide to [deploy Kedro as a Python package](https://docs.kedro.org/en/stable/deployment/single_machine.html#package-based) for more details.
Please visit the guide to [Apache Airflow deployment](https://docs.kedro.org/en/stable/deployment/airflow.html) for more details.

### FAQ

2 changes: 2 additions & 0 deletions kedro-airflow/RELEASE.md
@@ -1,4 +1,6 @@
# Upcoming Release

# Release 0.9.0
* Sort DAGs to make sure `kedro airflow create` is deterministic.
* Option to group MemoryDatasets in the same Airflow task (breaking change for custom template via `--jinja-file`).
* Include the environment name in the DAG file name when different from the default.
2 changes: 1 addition & 1 deletion kedro-airflow/kedro_airflow/__init__.py
@@ -1,3 +1,3 @@
"""Kedro plugin for running a project with Airflow."""

__version__ = "0.8.0"
__version__ = "0.9.0"
2 changes: 1 addition & 1 deletion kedro-datasets/README.md
@@ -3,7 +3,7 @@
<!-- Note that the contents of this file are also used in the documentation, see docs/source/index.md -->

[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python Version](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11-blue.svg)](https://pypi.org/project/kedro-datasets/)
[![Python Version](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue.svg)](https://pypi.org/project/kedro-datasets/)
[![PyPI Version](https://badge.fury.io/py/kedro-datasets.svg)](https://pypi.org/project/kedro-datasets/)
[![Code Style: Black](https://img.shields.io/badge/code%20style-black-black.svg)](https://github.com/ambv/black)

18 changes: 14 additions & 4 deletions kedro-datasets/RELEASE.md
@@ -2,11 +2,19 @@

## Major features and improvements

| * Added the following new experimental datasets:

| Type | Description | Location |
| -------------------------- | ---------------------------------------------------- | --------------------------------------- |
| `rioxarray.GeotiffDataset` | A dataset for loading and saving geotiff data files. | `kedro_datasets_experimental.rioxarray` |
* Added the following new **experimental** datasets:

| Type | Description | Location |
|-------------------------------------|-----------------------------------------------------------|-----------------------------------------|
| `langchain.ChatAnthropicDataset` | A dataset for loading a ChatAnthropic langchain model. | `kedro_datasets_experimental.langchain` |
| `langchain.ChatCohereDataset` | A dataset for loading a ChatCohere langchain model. | `kedro_datasets_experimental.langchain` |
| `langchain.OpenAIEmbeddingsDataset` | A dataset for loading an OpenAIEmbeddings langchain model. | `kedro_datasets_experimental.langchain` |
| `langchain.ChatOpenAIDataset` | A dataset for loading a ChatOpenAI langchain model. | `kedro_datasets_experimental.langchain` |
| `rioxarray.GeotiffDataset` | A dataset for loading and saving geotiff raster data. | `kedro_datasets_experimental.rioxarray` |


# Release 3.0.1

## Bug fixes and other changes

@@ -18,6 +26,8 @@
Many thanks to the following Kedroids for contributing PRs to this release:

* [Charles Guan](https://github.com/charlesbmi)
* [Thomas Gölles](https://github.com/tgoelles)


# Release 3.0.0

@@ -11,8 +11,11 @@ kedro_datasets_experimental
:toctree:
:template: autosummary/class.rst


kedro_datasets_experimental.langchain.ChatAnthropicDataset
kedro_datasets_experimental.langchain.ChatCohereDataset
kedro_datasets_experimental.langchain.ChatOpenAIDataset
kedro_datasets_experimental.langchain.OpenAIEmbeddingsDataset
kedro_datasets_experimental.langchain.ChatAnthropicDataset
kedro_datasets_experimental.rioxarray.GeotiffDataset

3 changes: 2 additions & 1 deletion kedro-datasets/docs/source/conf.py
@@ -106,6 +106,8 @@
"deltalake.table.Metadata",
"DataCatalog",
"ibis.backends.BaseBackend",
"langchain_openai.chat_models.base.ChatOpenAI",
"langchain_openai.embeddings.base.OpenAIEmbeddings",
),
"py:data": (
"typing.Any",
@@ -221,7 +223,6 @@

# -- Kedro specific configuration -----------------------------------------
KEDRO_MODULES = [
"kedro_datasets",
"kedro_datasets_experimental"
]

2 changes: 1 addition & 1 deletion kedro-datasets/kedro_datasets/__init__.py
@@ -1,7 +1,7 @@
"""``kedro_datasets`` is where you can find all of Kedro's data connectors."""

__all__ = ["KedroDeprecationWarning"]
__version__ = "3.0.0"
__version__ = "3.0.1"

import sys
import warnings
19 changes: 19 additions & 0 deletions kedro-datasets/kedro_datasets_experimental/langchain/__init__.py
@@ -0,0 +1,19 @@
"""Provides interface to langchain model API objects."""
from typing import Any

import lazy_loader as lazy

# https://github.com/pylint-dev/pylint/issues/4300#issuecomment-1043601901
ChatOpenAIDataset: Any
OpenAIEmbeddingsDataset: Any
ChatAnthropicDataset: Any
ChatCohereDataset: Any

__getattr__, __dir__, __all__ = lazy.attach(
    __name__,
    submod_attrs={
        "_openai": ["ChatOpenAIDataset", "OpenAIEmbeddingsDataset"],
        "_anthropic": ["ChatAnthropicDataset"],
        "_cohere": ["ChatCohereDataset"],
    },
)
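
The `lazy.attach` call above defers importing `_openai`, `_anthropic`, and `_cohere` until one of the listed names is first accessed. As a rough, stdlib-only sketch of that idea (this is not `lazy_loader`'s actual implementation; `attach_lazy` is a hypothetical helper, and the standard-library `json` package stands in for the dataset package so the snippet runs anywhere):

```python
import importlib
from typing import Any, Callable


def attach_lazy(
    package_name: str, submod_attrs: dict[str, list[str]]
) -> tuple[Callable[[str], Any], Callable[[], list[str]], list[str]]:
    """Hypothetical helper: build module-level hooks that import submodules on demand."""
    # Map each exported attribute to the submodule that defines it.
    attr_to_submod = {
        attr: submod for submod, attrs in submod_attrs.items() for attr in attrs
    }
    exported = sorted(attr_to_submod)

    def __getattr__(name: str) -> Any:
        # Module-level __getattr__ (PEP 562) runs only when normal lookup fails.
        if name not in attr_to_submod:
            raise AttributeError(f"No {package_name} attribute {name}")
        submodule = importlib.import_module(f"{package_name}.{attr_to_submod[name]}")
        return getattr(submodule, name)

    def __dir__() -> list[str]:
        return exported

    return __getattr__, __dir__, exported


# Stand-in for a package __init__.py: "json" plays the role of the package.
lazy_getattr, lazy_dir, lazy_all = attach_lazy(
    "json",
    {"decoder": ["JSONDecoder"], "encoder": ["JSONEncoder"]},
)
decoder_cls = lazy_getattr("JSONDecoder")  # json.decoder is imported at this point
```

With this shape, a submodule whose third-party dependency is not installed should only fail when one of its own names is requested, not when the package itself is imported.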
78 changes: 78 additions & 0 deletions kedro-datasets/kedro_datasets_experimental/langchain/_anthropic.py
@@ -0,0 +1,78 @@
"""Defines an interface to common Anthropic models."""

from typing import Any, NoReturn

from kedro.io import AbstractDataset, DatasetError
from langchain_anthropic import ChatAnthropic


class ChatAnthropicDataset(AbstractDataset[None, ChatAnthropic]):
    """``ChatAnthropicDataset`` loads a ChatAnthropic `langchain <https://python.langchain.com/>`_ model.

    Example usage for the :doc:`YAML API <kedro:data/data_catalog_yaml_examples>`:

    catalog.yml:

    .. code-block:: yaml

       claude_instant_1:
         type: langchain.ChatAnthropicDataset
         kwargs:
           model: "claude-instant-1"
           temperature: 0.0
         credentials: anthropic

    credentials.yml:

    .. code-block:: yaml

       anthropic:
         anthropic_api_url: <anthropic-api-base>
         anthropic_api_key: <anthropic-api-key>

    Example usage for the
    `Python API <https://kedro.readthedocs.io/en/stable/data/\
    advanced_data_catalog_usage.html>`_:

    .. code-block:: pycon

        >>> from kedro_datasets_experimental.langchain import ChatAnthropicDataset
        >>> llm = ChatAnthropicDataset(
        ...     credentials={
        ...         "anthropic_api_url": "xxx",
        ...         "anthropic_api_key": "xxx",
        ...     },
        ...     kwargs={
        ...         "model": "claude-instant-1",
        ...         "temperature": 0.0,
        ...     }
        ... ).load()
        >>>
        >>> # See: https://python.langchain.com/docs/integrations/chat/anthropic
        >>> llm.invoke("Hello world!")
    """

    def __init__(self, credentials: dict[str, str], kwargs: dict[str, Any] = None):
        """Constructor.

        Args:
            credentials: must contain `anthropic_api_url` and `anthropic_api_key`.
            kwargs: keyword arguments passed to the ChatAnthropic constructor.
        """
        self.anthropic_api_url = credentials["anthropic_api_url"]
        self.anthropic_api_key = credentials["anthropic_api_key"]
        self.kwargs = kwargs or {}

    def _describe(self) -> dict[str, Any]:
        return {**self.kwargs}

    def _save(self, data: None) -> NoReturn:
        raise DatasetError(f"{self.__class__.__name__} is a read only data set type")

    def _load(self) -> ChatAnthropic:
        return ChatAnthropic(
            anthropic_api_url=self.anthropic_api_url,
            anthropic_api_key=self.anthropic_api_key,
            **self.kwargs,
        )
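
The class above follows a read-only pattern: credentials are stored at construction, a client is built on each load, and saving always raises. A dependency-free sketch of that pattern, which assumes nothing about Kedro or LangChain internals (`FakeChatModel`, `ReadOnlyModelDataset`, and `ReadOnlyError` are hypothetical stand-ins for `ChatAnthropic`, `ChatAnthropicDataset`, and `DatasetError`):

```python
from dataclasses import dataclass
from typing import Any, NoReturn, Optional


class ReadOnlyError(Exception):
    """Hypothetical stand-in for kedro.io.DatasetError."""


@dataclass
class FakeChatModel:
    """Hypothetical stand-in for a chat model client such as ChatAnthropic."""

    api_url: str
    api_key: str
    model: str = "default"
    temperature: float = 1.0


class ReadOnlyModelDataset:
    """Builds a model client on load and refuses to save."""

    def __init__(
        self, credentials: dict[str, str], kwargs: Optional[dict[str, Any]] = None
    ) -> None:
        # A missing key raises KeyError here, at construction time.
        self.api_url = credentials["api_url"]
        self.api_key = credentials["api_key"]
        self.kwargs = kwargs or {}

    def describe(self) -> dict[str, Any]:
        # Only the non-secret keyword arguments are exposed.
        return {**self.kwargs}

    def load(self) -> FakeChatModel:
        return FakeChatModel(api_url=self.api_url, api_key=self.api_key, **self.kwargs)

    def save(self, data: None) -> NoReturn:
        raise ReadOnlyError(f"{self.__class__.__name__} is a read only data set type")


dataset = ReadOnlyModelDataset(
    credentials={"api_url": "xxx", "api_key": "xxx"},
    kwargs={"model": "demo-model", "temperature": 0.0},
)
llm = dataset.load()
```

Because the description is built from `kwargs` alone, the API key in this sketch never appears in anything that prints the dataset description.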
76 changes: 76 additions & 0 deletions kedro-datasets/kedro_datasets_experimental/langchain/_cohere.py
@@ -0,0 +1,76 @@
"""
Cohere dataset definition.
"""

from typing import Any, NoReturn

from kedro.io import AbstractDataset, DatasetError
from langchain_cohere import ChatCohere


class ChatCohereDataset(AbstractDataset[None, ChatCohere]):
    """``ChatCohereDataset`` loads a ChatCohere `langchain <https://python.langchain.com/>`_ model.

    Example usage for the :doc:`YAML API <kedro:data/data_catalog_yaml_examples>`:

    catalog.yml:

    .. code-block:: yaml

       command:
         type: langchain.ChatCohereDataset
         kwargs:
           model: "command"
           temperature: 0.0
         credentials: cohere

    credentials.yml:

    .. code-block:: yaml

       cohere:
         cohere_api_url: <cohere-api-base>
         cohere_api_key: <cohere-api-key>

    Example usage for the
    `Python API <https://kedro.readthedocs.io/en/stable/data/\
    advanced_data_catalog_usage.html>`_:

    .. code-block:: pycon

        >>> from kedro_datasets_experimental.langchain import ChatCohereDataset
        >>> llm = ChatCohereDataset(
        ...     credentials={
        ...         "cohere_api_key": "xxx",
        ...         "cohere_api_url": "xxx",
        ...     },
        ...     kwargs={
        ...         "model": "command",
        ...         "temperature": 0.0,
        ...     }
        ... ).load()
        >>>
        >>> # See: https://python.langchain.com/v0.1/docs/integrations/chat/cohere/
        >>> llm.invoke("Hello world!")
    """

    def __init__(self, credentials: dict[str, str], kwargs: dict[str, Any] = None):
        """Constructor.

        Args:
            credentials: must contain `cohere_api_url` and `cohere_api_key`.
            kwargs: keyword arguments passed to the underlying constructor.
        """
        self.cohere_api_url = credentials["cohere_api_url"]
        self.cohere_api_key = credentials["cohere_api_key"]
        self.kwargs = kwargs or {}

    def _describe(self) -> dict[str, Any]:
        return {**self.kwargs}

    def _save(self, data: None) -> NoReturn:
        raise DatasetError(f"{self.__class__.__name__} is a read only data set type")

    def _load(self) -> ChatCohere:
        return ChatCohere(cohere_api_key=self.cohere_api_key, base_url=self.cohere_api_url, **self.kwargs)
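
In the YAML example in the docstring above, `credentials: cohere` is a name rather than a value; the catalog is expected to swap it for the matching block from `credentials.yml` before the constructor is called. A simplified sketch of that substitution using plain dictionaries (`resolve_credentials` is a hypothetical helper, not Kedro's API, and the real lookup likely handles more cases):

```python
from typing import Any


def resolve_credentials(
    entry: dict[str, Any], credentials: dict[str, dict[str, str]]
) -> dict[str, Any]:
    """Hypothetical helper: replace a credentials name with its mapping."""
    resolved = dict(entry)  # shallow copy so the catalog entry is not mutated
    name = resolved.get("credentials")
    if isinstance(name, str):
        if name not in credentials:
            raise KeyError(f"Unable to find credentials '{name}'")
        resolved["credentials"] = credentials[name]
    return resolved


# Dictionaries mirroring catalog.yml and credentials.yml from the docstring.
catalog_entry = {
    "type": "langchain.ChatCohereDataset",
    "kwargs": {"model": "command", "temperature": 0.0},
    "credentials": "cohere",
}
credentials_file = {
    "cohere": {
        "cohere_api_url": "<cohere-api-base>",
        "cohere_api_key": "<cohere-api-key>",
    }
}

resolved_entry = resolve_credentials(catalog_entry, credentials_file)
# Everything except "type" is what would be passed to the dataset constructor.
init_args = {key: value for key, value in resolved_entry.items() if key != "type"}
```

The resulting `init_args` has the same two keys, `credentials` and `kwargs`, that the constructor above accepts.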