Add DataFrame fill_null #14765

Closed
kosiew opened this issue Feb 19, 2025 · 0 comments · Fixed by #14769
Labels
enhancement New feature or request

kosiew commented Feb 19, 2025

Is your feature request related to a problem or challenge?

Libraries such as PySpark provide an operation that fills nulls across an entire DataFrame, optionally limited to a subset of columns. It would be nice to have a similar feature in DataFusion and datafusion-python.

Describe the solution you'd like

Given a DataFrame with null values in several columns, I want to replace the nulls in each of those columns with a provided value, but only if that value can be cast to the column's type. If it cannot be cast, that column should be left unchanged (a no-op). The user should also be able to limit which columns this applies to.
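To make the requested semantics concrete, here is a rough pure-Python sketch. It is not DataFusion code: the dict-of-columns "frame", `try_cast`, and this `fill_null` signature are all hypothetical stand-ins, and Python's own type constructors are used as a crude substitute for a real cast.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Hypothetical model of a DataFrame: column name -> (column type, values).
Frame = Dict[str, Tuple[type, List[Any]]]


def try_cast(value: Any, target: type) -> Optional[Any]:
    """Return value converted to target, or None if the conversion fails."""
    try:
        return target(value)
    except (TypeError, ValueError):
        return None


def fill_null(frame: Frame, value: Any, columns: Optional[Iterable[str]] = None) -> Frame:
    """Fill nulls (None) per column; skip columns the value cannot be cast to."""
    selected = set(frame) if columns is None else set(columns)
    out: Frame = {}
    for name, (dtype, values) in frame.items():
        cast = try_cast(value, dtype) if name in selected else None
        if cast is None:
            # Column not selected, or cast failed: no-op for this column.
            out[name] = (dtype, list(values))
        else:
            out[name] = (dtype, [cast if v is None else v for v in values])
    return out


frame: Frame = {
    "a": (int, [1, None, 3]),
    "b": (str, ["x", None]),
    "c": (float, [None, 2.5]),
}

filled_all = fill_null(frame, 0)                    # 0 casts to int, str and float
filled_text = fill_null(frame, "n/a")               # only the str column accepts "n/a"
filled_subset = fill_null(frame, 0, columns=["a"])  # limited to column "a"
```

In this sketch a failed cast silently skips the column, which matches the "otherwise no-op" wording above; whether a real implementation should skip or raise an error is a separate design question.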

Describe alternatives you've considered

Instead of a built-in fill_null, you can use conditional expressions or functions like coalesce (or nvl) to replace nulls. NaNs are not nulls, so replacing them requires a conditional expression rather than coalesce.
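For reference, the coalesce workaround looks roughly like this in SQL. The sketch runs against Python's stdlib `sqlite3` only so that it is self-contained; the table `t` and its columns are made up for illustration, and the same `COALESCE` expression is the kind of thing you would write per column in DataFusion.

```python
import sqlite3

# In-memory stand-in engine; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (None, None), (3, "z")],
)

# Each column needs its own COALESCE with a type-appropriate default.
rows = conn.execute(
    "SELECT COALESCE(a, 0) AS a, COALESCE(b, 'missing') AS b "
    "FROM t ORDER BY rowid"
).fetchall()
conn.close()
```

The drawback is visible even here: every column has to be listed by hand with a default of the right type, which is exactly what a built-in fill_null would take care of.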

Additional context

This is a repost from apache/datafusion-python#922, prompted by this PR comment

@kosiew kosiew added the enhancement New feature or request label Feb 19, 2025
@kosiew kosiew changed the title Add DataFrame fill_nan/fill_null Add DataFrame fill_null Feb 19, 2025