RFC: add copy #495

Open
jakirkham opened this issue Oct 12, 2022 · 7 comments
Labels
API extension Adds new functions or objects to the API. Needs Discussion Needs further discussion. RFC Request for comments. Feature requests and proposed changes. topic: Creation Array creation.

@jakirkham
Member

Some array libraries implement a .copy() method (like NumPy). While there are some indirect ways to get at this now (asarray, reshape, etc.), the API currently lacks a way to do this directly and without other potential side-effects. Adding a method (unlike a function) would ensure the new array simply has the original array's type. Curious if there is appetite for including this in the API.

Related is a question of what interplay there is with __copy__ and/or __deepcopy__ (if any).

@rgommers rgommers added the API extension Adds new functions or objects to the API. label Oct 12, 2022
@rgommers
Member

Thanks @jakirkham. A couple of thoughts:

  • NumPy has both a function and a method. The function has an extra keyword (subok)
  • Either way, the new keywords don't seem appropriate, so it would simply be copy(x) or x.copy()
  • At that point, it's the same as copy.copy and copy.deepcopy I'd think (at least for NumPy)
  • PyTorch calls it clone, it doesn't have copy
  • Is there a reason to add it? Autograd or compilers related perhaps - better than the stdlib function?
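As a rough illustration of the "same as copy.copy and copy.deepcopy" point for NumPy specifically, the sketch below checks that each spelling returns an array that does not share memory with the original. It assumes NumPy is installed and only covers a plain integer array; other libraries, or object-dtype arrays, may behave differently.

```python
import copy

import numpy as np

x = np.arange(5)

# Four ways of asking NumPy for a copy of x.
candidates = {
    "x.copy()": x.copy(),
    "np.copy(x)": np.copy(x),
    "copy.copy(x)": copy.copy(x),
    "copy.deepcopy(x)": copy.deepcopy(x),
}

for name, y in candidates.items():
    # Each result should own its data rather than view x's buffer.
    assert not np.shares_memory(x, y), name
    y[0] = -1  # mutate the copy only

# The original is untouched by any of the mutations above.
assert x[0] == 0
```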

@asmeurer
Member

Isn't this asarray(x, copy=True)? Does that have side-effects?

@jakirkham
Member Author

To summarize the ask: library functions handling general arrays may want to copy because they need an array they can mutate safely without affecting user-provided data.

We concluded that we want a function (not a method) and one needs to do a namespace lookup (x.__array_namespace__) to figure it out (as the type is likely not known by the library).
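A minimal sketch of that namespace-lookup pattern, using the existing asarray(x, copy=True) spelling, might look like the following. To keep it runnable without any array library, ToyArray and toy_namespace are hypothetical stand-ins for an array type and its namespace, and copy_for_mutation is an illustrative name, not something the standard or any library defines.

```python
import types


class ToyArray:
    """Hypothetical stand-in for an array-API-compatible array."""

    def __init__(self, data):
        self.data = list(data)

    def __array_namespace__(self, api_version=None):
        # Real libraries return their array API namespace here.
        return toy_namespace


def _toy_asarray(obj, copy=None):
    # Toy version of asarray: copy=True always returns fresh storage.
    if isinstance(obj, ToyArray):
        return ToyArray(obj.data) if copy else obj
    return ToyArray(obj)


toy_namespace = types.SimpleNamespace(asarray=_toy_asarray)


def copy_for_mutation(x):
    """Return an array the caller may mutate without touching x."""
    # The library does not know x's type, so it asks x for its namespace.
    xp = x.__array_namespace__()
    return xp.asarray(x, copy=True)


user = ToyArray([1, 2, 3])
work = copy_for_mutation(user)
work.data[0] = 99  # only the working copy changes
```

With a real array library, only copy_for_mutation would be needed; the toy classes exist purely so the sketch can run on its own.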

There is a separate question of what we call it: either add a new function (copy, clone) or use an existing one (asarray(x, copy=True)). We would also want to spell out that some libraries (Dask, JAX, maybe others) may not actually copy the underlying data.

__copy__ and __deepcopy__ are different enough (Dask would copy graphs, JAX something similar, etc.) that it is worth specifying this path may not copy the array data (if that is the user's concern) and that using the function above (name to be decided) would be preferred for data copying.
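To make the distinction concrete, here is a toy sketch of a lazy array whose __copy__ duplicates only its graph while still pointing at the same buffer. LazyArray is entirely hypothetical and is not how Dask or JAX are implemented; it only shows why copy.copy need not imply a copy of the data.

```python
import copy


class LazyArray:
    """Hypothetical lazy array: a task graph plus a reference to a buffer."""

    def __init__(self, buffer, graph=None):
        self.buffer = buffer            # underlying data (kept by reference)
        self.graph = dict(graph or {})  # cheap metadata, copied per instance

    def __copy__(self):
        # Copies the graph but keeps pointing at the same buffer.
        return LazyArray(self.buffer, self.graph)


a = LazyArray([1.0, 2.0, 3.0], {"op": "load"})
b = copy.copy(a)

b.graph["op"] = "scale"  # independent graph: a.graph is unchanged
b.buffer[0] = -1.0       # shared buffer: the change is visible through a
```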

@asmeurer
Member

asmeurer commented Oct 20, 2022

Right now asarray says that copy=True MUST copy the data, but maybe it should say that a library may skip the copy if it knows it solely owns the data and disallows mutation. Or can there be situations where a library thinks it solely owns the data but actually doesn't, so it really has to do a real memory copy?

@rgommers
Member

rgommers commented Nov 4, 2022

@asmeurer good point. Maybe something like "must ensure that the returned array does not share data with another array, either by copying the data to a new memory location or in some other way (e.g., this property is guaranteed by design)".

@rgommers rgommers added this to the v2023 milestone Mar 9, 2023
@kgryte kgryte modified the milestones: v2023, v2024 Jan 25, 2024
@kgryte kgryte changed the title Adding .copy() method RFC: add copy() Apr 4, 2024
@kgryte kgryte added RFC Request for comments. Feature requests and proposed changes. topic: Creation Array creation. Needs Discussion Needs further discussion. labels Apr 4, 2024
@kgryte kgryte changed the title RFC: add copy() RFC: add copy Apr 18, 2024
@lucascolley
Member

IMO this doesn't need to be added to the standard. In SciPy we use a private function called xp_copy a lot, which is just an alias for asarray(x, copy=True). If anyone would like that alias added to array-api-extra, feel free to open an issue.

@kgryte
Contributor

kgryte commented Jan 23, 2025

I've opened #886 to hopefully satisfy the concerns raised here and elsewhere: getting an array which can be mutated without affecting user data, while allowing libraries such as JAX to avoid performing an explicit copy, which is unnecessary when provided a known array type.


5 participants