Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move requirements to pyproject.toml files on starters #246

Merged
merged 26 commits into from
Nov 13, 2024

Conversation

lrcouto
Copy link
Contributor

@lrcouto lrcouto commented Oct 30, 2024

Motivation and Context

Part of kedro-org/kedro#4116.

Moves project template dependencies to pyproject.toml to allow for Kedro projects created with starter templates or the example pipeline to be created with package managers like uv, pdm or rye

How has this been tested?

Can be tested by creating a new kedro project with a starter and using the --checkout flag to get the templates from this branch. For example,

uvx kedro new --starter=spaceflights-pandas --checkout=starters-template-compatibility

Should go through character creation normally, and managing dependencies with commands like uv add and uv remove should work correctly, altering the pyproject.toml file and creating the uv.lock file on the project directory.

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Assigned myself to the PR
  • Added tests to cover my changes

@lrcouto lrcouto changed the title Change pyproject files on spaceflights Move requirements to pyproject.toml files on starters Oct 30, 2024
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
This reverts commit 088365d.

Signed-off-by: Laura Couto <[email protected]>
@lrcouto lrcouto force-pushed the starters-template-compatibility branch from 919a5f8 to a87df2e Compare October 31, 2024 00:50
@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 31, 2024

The linter on this project starts to pop out errors when the project dependencies are put directly on the pyproject.toml files. I don't know if that has to do with ruff resolving dependencies differently when they're on pyproject.toml, but it decided that Dict and Tuple from the typing library should not be used, as well as other errors, like this:

src/project_dummy/pipeline_registry.py:2:1: UP035 typing.Dict is deprecated, use dict instead
src/project_dummy/pipeline_registry.py:8:29: UP006 [*] Use dict instead of Dict for type annotation
src/project_dummy/pipelines/data_science/nodes.py:2:1: UP035 typing.Dict is deprecated, use dict instead
src/project_dummy/pipelines/data_science/nodes.py:2:1: UP035 typing.Tuple is deprecated, use tuple instead
src/project_dummy/pipelines/data_science/nodes.py:10:48: UP006 [*] Use dict instead of Dict for type annotation
src/project_dummy/pipelines/data_science/nodes.py:10:57: UP006 [*] Use tuple instead of Tuple for type annotation

AFAIK those have been deprecated for a little while? I've changed everything to comply with the linter.

@astrojuanlu

This comment was marked as resolved.

@lrcouto

This comment was marked as resolved.

@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 31, 2024

PR with the linting updates at #249

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, thanks @lrcouto!

@lrcouto lrcouto marked this pull request as ready for review November 5, 2024 20:28
"notebook",
"kedro~={{ cookiecutter.kedro_version }}",
"kedro[jupyter]",
"kedro-datasets[spark, pandas, spark.SparkDataset, pandas.ParquetDataset]>=1.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also update this requirement to >=3.0 like in all the other starters, to use the "new" way we define the dependency.

@ankatiyar
Copy link
Contributor

Should the requirements.txt files from the starters also be removed?
I also think the docs will need to be updated to remove the pip install -r requirements.txt instructions after this is merged. Overall looks good to me!

@astrojuanlu
Copy link
Member

I think @lrcouto was advocating for doing this in 2 steps, given that there were some unintended breakage when removing requirements.txt, at the cost of introducing some redundancy

@lrcouto
Copy link
Contributor Author

lrcouto commented Nov 7, 2024

I think @lrcouto was advocating for doing this in 2 steps, given that there were some unintended breakage when removing requirements.txt, at the cost of introducing some redundancy

Yes, this. Removing requirements.txt (or making it install the dependencies from pyproject.toml) will be a bit tricky.

Removing it would affect documentation, docstrings and example material that we already have and uses the pip install -r requirements.txt workflow. It is also the most common way to install python dependencies.

Changing the content to "-e ." like it was suggested on the issue works when running a project locally, but breaks our end-to-end tests, so I'd need to investigate it a little further to figure out a way around it. It will also require changes in the tools-related functions that write the dependencies (it'll probaly need those anyway).

As for keeping the redundancy, the biggest disadvantage I see is that if we have to change any dependencies, we have to make sure that we do it on both files.

Copy link
Contributor

@ElenaKhaustova ElenaKhaustova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left minor suggestion on updating requirements.txt for consistency. Overall looks good to me. 👍

Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
Signed-off-by: Laura Couto <[email protected]>
This reverts commit 088365d.

Signed-off-by: Laura Couto <[email protected]>
…g/kedro-starters into starters-template-compatibility
Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏾

@@ -3,5 +3,5 @@ jupyterlab>=3.0
notebook
kedro~={{ cookiecutter.kedro_version }}
kedro[jupyter]
kedro-datasets[spark, pandas, spark.SparkDataset, pandas.ParquetDataset]>=1.0
kedro-datasets[spark, pandas, spark.SparkDataset, pandas.ParquetDataset]>=3.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to update this to match with the 3.0 syntax e.g. pandas-parquetdataset instead of pandas.ParquetDataset

Copy link
Contributor

@ElenaKhaustova ElenaKhaustova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@lrcouto lrcouto merged commit 1cbb5da into main Nov 13, 2024
27 checks passed
@lrcouto lrcouto deleted the starters-template-compatibility branch November 13, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants