build(python)!: Streamline optional dependency definitions in pyproject.toml #17168

Merged
ritchie46 merged 4 commits into main from extras on Jun 26, 2024


stinodego
Member

@stinodego stinodego commented Jun 24, 2024

Credit to @alexander-beedie for the initial work in #17064

Closes #17143

Changes

  • Breaking:

    • fastexcel group was renamed to calamine to align with the engine name in the API.
    • async group now installs gevent instead of nest-asyncio. The gevent group was removed. nest-asyncio is installed as part of the new database group.
    • matplotlib group was renamed to graph, since it enables the LazyFrame.show_graph functionality.
  • Non-breaking:

    • The adbc and plot groups now also install pandas/pyarrow, which are required for them to work.
    • New groups excel and database install all the engines for their respective functionality.

Rationale

Here's the logic by which the new extras group structure was constructed:

  1. Extras are named after the functionality they enable, not after the specific dependency. Theoretically, we may switch out dependencies while keeping our public API the same.
  2. Some functionality is tied to a specific dependency, e.g. to_numpy is tied to NumPy, and the xlsx2csv dependency is tied to a specific engine for read_excel. These get their own key.
  3. Keys may include other keys, e.g. pandas will require PyArrow. This is defined as polars[pyarrow] so that we need to define the minimum dependency version only once.
  4. Different engines for the same functionality are grouped under a single key for that functionality. This is offered as a convenience; for production use it is recommended to pick a single engine and specify that as an extra.
  5. An all group is included for convenience. Again, this is not recommended for production use.
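To illustrate point 3, here is a minimal Python sketch of how a self-referencing entry such as polars[pyarrow] expands into concrete packages. The EXTRAS table is a hypothetical, abbreviated stand-in and not the actual contents of pyproject.toml; the resolve helper is likewise made up for illustration, and version pins are omitted.

```python
import re

# Hypothetical, abbreviated extras table. The real pyproject.toml may list
# different packages and will carry minimum-version pins.
EXTRAS = {
    "pyarrow": ["pyarrow"],
    "pandas": ["pandas", "polars[pyarrow]"],  # key that includes another key
    "calamine": ["fastexcel"],
    "xlsx2csv": ["xlsx2csv"],
    "excel": ["polars[calamine,xlsx2csv]"],   # convenience group of engines
}

# Matches a self-reference such as "polars[pyarrow]" or "polars[a,b]".
_SELF_REF = re.compile(r"^polars\[(?P<keys>[^\]]+)\]$")


def resolve(extra, table=EXTRAS, _seen=None):
    """Expand an extra into the set of third-party packages it pulls in."""
    seen = set() if _seen is None else _seen
    if extra in seen:  # guard against accidental cycles
        return set()
    seen.add(extra)
    packages = set()
    for requirement in table[extra]:
        match = _SELF_REF.match(requirement)
        if match:
            # Follow the reference instead of repeating the package list.
            for key in match.group("keys").split(","):
                packages |= resolve(key.strip(), table, seen)
        else:
            packages.add(requirement)
    return packages


print(sorted(resolve("pandas")))
print(sorted(resolve("excel")))
```

Because pandas refers to polars[pyarrow] rather than naming pyarrow directly, any version constraint on PyArrow would live in one place only.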

Users are advised to explicitly list the extras they rely on when installing Polars, e.g. polars[numpy] when relying on Polars' NumPy interop. This ensures that compatible versions of those dependencies are installed.
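As a rough sketch of how a downstream project might check at startup that the packages behind its chosen extras are importable, the helper below uses only the standard library. The function name missing_modules is hypothetical and is not part of Polars.

```python
from importlib.util import find_spec


def missing_modules(modules):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if find_spec(name) is None]


# Example: a project installed with polars[numpy] could verify NumPy is present.
missing = missing_modules(["numpy"])
if missing:
    print(f"Missing optional dependencies: {', '.join(missing)}")
```

This only checks that a module can be found, not that its version satisfies the minimum that Polars declares for the extra.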

Example

Before:

pip install 'polars[fastexcel,gevent,matplotlib]'

After:

pip install 'polars[calamine,async,graph]'
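For projects with many requirement strings, the renames above could be applied mechanically. The following is a minimal sketch, assuming the mapping implied by this PR's description and example (fastexcel to calamine, gevent to async, matplotlib to graph); the migrate_extras helper is hypothetical and does not ship with Polars.

```python
# Old extra name -> new extra name, taken from the changes listed in this PR.
# Note: the old "async" extra is left untouched here. It used to install
# nest-asyncio, which now comes with the "database" group, so whether to
# switch is a judgment call for each project.
RENAMED = {"fastexcel": "calamine", "gevent": "async", "matplotlib": "graph"}


def migrate_extras(requirement: str) -> str:
    """Rewrite the extras in a requirement string like 'polars[a,b]>=1.0'."""
    name, bracket, rest = requirement.partition("[")
    if not bracket:
        return requirement  # no extras present, nothing to do
    extras, _, tail = rest.partition("]")
    migrated = []
    for extra in extras.split(","):
        new = RENAMED.get(extra.strip(), extra.strip())
        if new not in migrated:  # avoid duplicates after renaming
            migrated.append(new)
    return f"{name}[{','.join(migrated)}]{tail}"


print(migrate_extras("polars[fastexcel,gevent,matplotlib]"))
```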

@github-actions github-actions bot added the breaking, build, and python labels Jun 24, 2024
@stinodego stinodego marked this pull request as ready for review June 24, 2024 22:23
@stinodego stinodego added this to the 1.0.0 milestone Jun 24, 2024

codecov bot commented Jun 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.94%. Comparing base (4731834) to head (41a6241).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17168      +/-   ##
==========================================
- Coverage   80.94%   80.94%   -0.01%     
==========================================
  Files        1464     1464              
  Lines      191928   191928              
  Branches     2742     2742              
==========================================
- Hits       155349   155347       -2     
- Misses      36070    36072       +2     
  Partials      509      509              


@stinodego stinodego marked this pull request as draft June 24, 2024 22:34
@stinodego stinodego marked this pull request as ready for review June 24, 2024 23:49
@stinodego stinodego added the reference label Jun 25, 2024
@stinodego stinodego changed the title from "build(python)!: Streamline optional 'extra' dependency groups" to "build(python)!: Streamline optional dependency definitions in pyproject.toml" Jun 25, 2024
@ritchie46 ritchie46 merged commit 65b7c1a into main Jun 26, 2024
22 checks passed
@ritchie46 ritchie46 deleted the extras branch June 26, 2024 06:48
alexander-beedie pushed a commit to alexander-beedie/polars that referenced this pull request Jun 26, 2024
Labels
  • breaking: Change that breaks backwards compatibility
  • build: Changes that affect the build system or external dependencies
  • python: Related to Python Polars
  • reference: Reference issue for recurring topics
Development

Successfully merging this pull request may close these issues.

Mention required feature flags for plotting / convert to pandas without PyArrow if possible
2 participants