Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More dask fixes #16

Closed
wants to merge 109 commits into from
Closed

More dask fixes #16

wants to merge 109 commits into from

Conversation

lithomas1
Copy link

Reference issue

What does this implement/fix?

Additional information

Add a new optional parameter `callback` to least_squares.
A user-defined function that is called on each iteration, can stop
optimization by returning True or raising StopIteration.
Same signature and API as in eg. scipy.optimize.differential_evolution.
Only implemented for trf and dogleg methods so far.
@lithomas1 lithomas1 marked this pull request as ready for review November 23, 2024 18:52
Comment on lines +336 to +338
# Trick to get the array-api-compat namespace for dask
# (otherwise the "naked" dask.array asarray does not respect
# the input dtype)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed this in dask/dask#11288?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed that patch really quick - but at a quick glance it looks like that only fixes the issue when the input is a dask array.

When the input is a numpy array it looks like we go into the from_array codepath which is not passed the dtype option?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my patch didn't fix it then this should be a new bug report for Dask - we shouldn't have to work around that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll poke around more inside dask to double check if this is still an issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be fixed in 2024.12.1 at least - gonna take this out.
Maybe I was on too old of a dask version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in dask/dask#11586

Comment on lines 202 to 205
# use array-api-compat namespace for dask
# since dask asarray never makes a copy
# which makes xp_copy silently a no-op
xp = array_namespace(x)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we adjust xp_copy to special-case and copy for Dask instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are currently passing in xp into xp_copy in these tests which is the problem.

i.e. xp_copy(arr, xp=<bad dask.array xp>)

If we don't pass in xp_copy, I've changed array_namespace in our vendored array-api-compat to use the wrapped dask.array namespace by default (will upstream this later).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be easier to just do these copies NumPy side so we don't need to involve xp_copy at all:

x = np.random.rand(100)
x2 = x.copy()
x = xp.asarray(x)
x2 = xp.asarray(x2)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also drop the xp argument to xp_copy (which would be a smaller diff).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy with that!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, no - x2 should be an array from the unwrapped namespace before it is passed into func.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've changed all of these back to using numpy, let me know if there's any more left.

@lithomas1
Copy link
Author

Going to hold off on this until
data-apis/array-api-compat#209 is resolved.

Looks like dasks is switching to always copying in asarray?

windows-server-2003 and others added 15 commits December 12, 2024 08:12
…is slow due to a bug with std::priority_queue
Remove trailing spaces, unused variables and imports, change reference of Fibonacci heap to priority queue
Add missing spaces between tests.
model the structure on scipy.ndimage:

- add `_delegators.py` with *_signature functions
- add _signal_api.py to collect "bare" imports from _private modules
- add _support_alternative_backends.py to decorate "bare" functions
- in __init__.py, import decorated names from _support_alternative_backends.py
@lithomas1
Copy link
Author

stats is now fully passing as well.

if is_dask_namespace(xp) or is_jax_namespace(xp):
# TODO: verify for jax
return xp.where(cond, f(arrays[0], arrays[1]), f2(arrays[0], arrays[1]) if not fillvalue else fillvalue)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @mdhaber was also interested in changes to _lazywhere.

(Sorry for the slow ping!)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me if you need something that works with Dask and JAX.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f2 can be None. IMHO _lazywhere should be moved to array-api-extras before such backend-specific hacks take place.

Copy link
Author

@lithomas1 lithomas1 Dec 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was try to hack around _lazywhere since IIRC @mdhaber mentioned _lazywhere didn't seem to be much of an optimization compared to just doing the regular where, and so might be removed.

Happy to wait for your jax/array-api-extra changes to land, though.

Thanks for the pointer about f2, though.
I'll patch in a later commit, but strange that it wasn't hit in tests.

Comment on lines -202 to +203
x2 = xp_copy(x, xp=xp)
x2 = xp.asarray(x_np.copy())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you remind me why xp_copy doesn't work for dask?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think xp_copy does an asarray(..., copy=True) to make a copy, but dask silently ignores the copy flag in its asarray implementation.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this may be for older Dask only? data-apis/array-api-compat#211 seemed to suggest that copies are now always made even when copy=None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test-specific issue where xp is the naked dask module. The wrapper returned by array_namespace has a functioning copy=True.

scipy/sparse/linalg/_propack/PROPACK Outdated Show resolved Hide resolved
lucascolley and others added 3 commits January 3, 2025 20:30
TST: array types: enforce namespace in tests
…hinx` 0.17 (scipy#22161)

* DOC, MAINT: Drop `convert_notebooks.py`

* DOC: Use Markdown notebooks with JupyterLite

* MAINT: Pin jupyterlite-sphinx to >=0.17.1

* DOC: Use the `NotebookLite` directive instead

* DOC: Add a "Download" button for the notebooks

[docs only]
@lithomas1
Copy link
Author

@lucascolley
This should be close to green now, and ready for re-integration back into your main PR.
I've added skips for all the other modules (e.g. cluster, ndimage)

Are you able to do one last rebase to pick up the lazywhere changes for jax on scipy main?

@lucascolley lucascolley requested a review from tupui as a code owner January 3, 2025 23:51
@lucascolley lucascolley removed the request for review from tupui January 3, 2025 23:53
@lucascolley
Copy link
Owner

oops, the rebase went wrong for some reason. Let me try to fix it.

@lucascolley
Copy link
Owner

I'll open a new PR to SciPy

@lucascolley lucascolley closed this Jan 4, 2025
@lithomas1 lithomas1 deleted the dask branch January 21, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.