Fixing issues related to dtype=object arrays in interpolation routines #1655

leftaroundabout · 2024-08-29T16:41:31Z

This is a cleaned-up version of parts of #1649. See that PR for discussion.

With old versions of NumPy, ODL relied on NumPy's ability to automatically represent ragged arrays as arrays of arrays (i.e., of objects). This was used in particular for meshgrids, which are a kind of discretization supported by the interpolation classes in odl.discr.

Current NumPy no longer converts to dtype=object automatically, and for good reasons: the conversion is error-prone (the shape becomes ambiguous, since it is unclear whether it should describe the nested arrays or only their outer structure), and performance and memory locality suffer. #1633 addressed this by explicitly generating an object array specifically for the inputs that specify a meshgrid, but further testing (#1648) revealed that this was not sufficient: the dtype=object property would percolate into the interpolation calculations, where it caused new failures because implicit conversions were required (as well as performance degradation).
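To illustrate the shape ambiguity and the percolation described above, here is a minimal sketch using plain NumPy. The variable names are made up for illustration and are not taken from the ODL code.

```python
import numpy as np

# Two coordinate vectors of a sparse meshgrid. Their shapes differ,
# so they cannot be stacked into one rectangular float array.
x = np.linspace(0.0, 1.0, 3).reshape(3, 1)
y = np.linspace(0.0, 1.0, 4).reshape(1, 4)

# Explicit construction: allocate the object array first, then fill it.
mesh = np.empty(2, dtype=object)
mesh[0] = x
mesh[1] = y

# Only the outer structure is visible in the container's shape.
outer_shape = mesh.shape                 # shape of the container
inner_shapes = [m.shape for m in mesh]   # shapes of the nested arrays

# Arithmetic on the container keeps dtype=object ...
shifted = mesh - 0.5
# ... while each element taken on its own is an ordinary float array.
first = mesh[0]

print(outer_shape, inner_shapes, shifted.dtype, first.dtype)
```

Any calculation that operates on the container rather than on its elements therefore stays in dtype=object unless it is converted explicitly.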

This PR goes into the details of the interpolation routines and ensures that linear arrays are stored with a primitive dtype. It fixes the discretization tests with NumPy 1.19, though there are still some implicit conversions that the even stricter NumPy 1.26 does not accept, as well as other tests that currently fail for unrelated reasons.
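The general idea of storing coordinate vectors with a primitive dtype can be sketched as follows. `to_primitive` is a hypothetical helper written for this illustration, not a function from ODL.

```python
import numpy as np


def to_primitive(coord_vecs, dtype=float):
    """Return each coordinate vector as an array with a primitive dtype.

    Hypothetical helper for illustration only.
    """
    return [np.asarray(vec, dtype=dtype) for vec in coord_vecs]


# Rows that ended up typed as object, for example after being taken
# out of an object container.
rows = np.empty(2, dtype=object)
rows[0] = np.array([0.0, 0.5, 1.0], dtype=object)
rows[1] = np.array([0.0, 1.0], dtype=object)

vecs = to_primitive(rows)
print([v.dtype for v in vecs])
```

After the conversion, ufuncs such as `np.floor` operate on the rows with their native float loops instead of going through Python objects.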

…riate. One of the tests samples from an integral grid at non-integral points. This failed after the explicit conversion introduced in c90044a. Falling back to `float` fixes the test case, though perhaps it would be better to select a dedicated meshing dtype.
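A small sketch of why float weights matter when an integer grid is sampled at non-integral points. This is a plain 1D linear interpolation in NumPy, not the ODL implementation, and the truncating cast is only an assumed failure mode used for contrast.

```python
import numpy as np

grid = np.arange(5)                 # integer grid points 0, 1, 2, 3, 4
values = grid.astype(float) ** 2    # samples of f(x) = x**2 on the grid
x = np.array([0.5, 2.25, 3.75])     # non-integral evaluation points

# Index of the left neighbour of each point, kept inside the grid.
left = np.clip(np.searchsorted(grid, x, side="right") - 1, 0, len(grid) - 2)

# Fractional position of each point within its grid cell, as float.
w = (x - grid[left]) / (grid[left + 1] - grid[left])

# Assumed failure mode: storing the weights in the grid's integer
# dtype truncates every fractional weight.
w_int = w.astype(grid.dtype)

# Linear interpolation with the float weights.
interpolated = (1 - w) * values[left] + w * values[left + 1]
print(w, w_int, interpolated)
```

The result with float weights agrees with `np.interp(x, grid, values)`, whereas the integer weights would collapse every point onto its left neighbour.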

pep8speaks · 2024-08-29T16:41:38Z

Checking new PR...

In the file odl/discr/discr_utils.py:

Line 629:20: E225 missing whitespace around operator
Line 629:19: E128 continuation line under-indented for visual indent

leftaroundabout added 6 commits August 29, 2024 18:24

Ensure the distances to computed linear-interpolation weights from ar…

68da3d4

…e stored as arrays. Without this, NumPy implicitly generates arrays but makes them ragged (dtype=object), which is bad for performance and disabled in newer versions.

Failure to convert to array implies it is not a suitable input array.

101ed50

In older NumPy, this would silently create an array-of-object, but that has a wrong shape too.

Ensure interpolation weight is computed with the correct array type.

ba5df4e

Ensure mesh coordinate calculations are not carried out with dtype=ob…

b80017c

…ject. This would previously happen because meshgrids are passed in as ragged arrays, and NumPy does not convert the rows to float dtype as should happen.

Add warning when falling back to float for interpolation coefficients.

2cb4a95

When this happens, it is likely that the user did something like starting from an integer mesh, but in that case linear interpolation does not seem very appropriate.
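A fallback with a warning, as described in the last commit, could look roughly like the sketch below. The function name `weight_dtype`, the warning category, and the message text are assumptions made for this illustration; the actual ODL code may differ.

```python
import warnings

import numpy as np


def weight_dtype(coord_dtype):
    """Pick a dtype for interpolation weights (hypothetical helper)."""
    coord_dtype = np.dtype(coord_dtype)
    if np.issubdtype(coord_dtype, np.floating):
        # Floating-point coordinates can hold fractional weights as-is.
        return coord_dtype
    # Anything else (e.g. an integer mesh) cannot, so warn and use float.
    warnings.warn(
        f"Coordinates have dtype {coord_dtype}; falling back to float "
        "for interpolation weights.",
        RuntimeWarning,
        stacklevel=2,
    )
    return np.dtype(float)
```

With this sketch, `weight_dtype(np.float32)` returns `float32` silently, while `weight_dtype(np.int64)` emits a `RuntimeWarning` and returns `float64`.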

leftaroundabout mentioned this pull request Aug 29, 2024

Fixing issues related to dtype=object arrays in interpolation routines #1649

Closed

JevgenijaAksjonova approved these changes Aug 30, 2024

JevgenijaAksjonova merged commit f720076 into odlgroup:master Aug 30, 2024

leftaroundabout mentioned this pull request Aug 30, 2024

NumPy dtype=object problem in linear resampling operator #1648

Closed
