fix: handle sparse matrix reading for big-endian byte order #163

wcxve · 2025-02-21T16:41:03Z

Summary by Sourcery

This pull request addresses an issue with reading sparse matrices in big-endian byte order. It introduces a utility function to convert arrays to native byte order and applies it when reading response matrices. Additionally, it includes new tests to ensure correct handling of big-endian files and different endianness.

Bug Fixes:

Fixes an issue where sparse matrices were not being read correctly when using big-endian byte order by converting the matrix to native byte order before creating the sparse matrix.

Enhancements:

Adds a utility function to convert arrays to native byte order.

Tests:

Adds tests to ensure that the response matrix can be converted to a BCSR matrix in JAX for big-endian files.
Adds tests to ensure that ResponseData objects are created correctly for different endianness.

sourcery-ai · 2025-02-21T16:41:10Z

Reviewer's Guide by Sourcery

This pull request fixes an issue where sparse matrices with big-endian byte order were not being read correctly. The code now converts the byte order to be compatible with scipy.sparse. Additionally, tests were added to verify that response files with big-endian byte order are loaded correctly and that ResponseData handles different endianness.

Sequence diagram for reading response data with byte order conversion

sequenceDiagram
    participant OGIPReader
    participant ResponseData
    participant to_native_byteorder
    participant coo_array

    OGIPReader->>OGIPReader: _read_response(file, response_id)
    OGIPReader->>to_native_byteorder: reduced_matrix = to_native_byteorder(reduced_matrix)
    to_native_byteorder-->>OGIPReader: returns reduced_matrix
    OGIPReader->>coo_array: sparse_matrix = coo_array(reduced_matrix, ...)
    coo_array-->>OGIPReader: returns sparse_matrix
    OGIPReader-->>OGIPReader: returns response_data

Updated class diagram for ResponseData

classDiagram
    class ResponseData {
        photon_egrid: np.ndarray
        channel_emin: np.ndarray
        channel_emax: np.ndarray
        response_matrix: scipy.sparse.coo_array
        channel: np.ndarray
        __init__(
            photon_egrid: Sequence[float],
            channel_emin: Sequence[float],
            channel_emax: Sequence[float],
            response_matrix: NDArray | sparray,
            channel: Sequence[str]
        )
    }
    note for ResponseData "The constructor now converts the byte order of response_matrix to native byte order if it's not already in native byte order."

Class diagram for to_native_byteorder function

classDiagram
    class to_native_byteorder {
        +NDArray to_native_byteorder(arr: NDArray)
    }
    note for to_native_byteorder "Converts a NumPy array to native byte order."

File-Level Changes

Change	Details	Files
Addressed an issue where sparse matrices with big-endian byte order were not being read correctly by converting the byte order to be compatible with `scipy.sparse`.	Added a utility function `to_native_byteorder` to convert arrays to native byte order. Applied `to_native_byteorder` to the response matrix in `ResponseData` to ensure compatibility with `scipy.sparse`. Applied `to_native_byteorder` when reading the response matrix in `_read_response` in `ogip.py`.	`src/elisa/util/misc.py` `src/elisa/data/base.py` `src/elisa/data/ogip.py`
Added tests to verify that response files with big-endian byte order are loaded correctly and that `ResponseData` handles different endianness.	Added a parameterized test `test_load_response` to load and verify response files with different endianness. Added a test `test_response` to verify that `ResponseData` handles different endianness.	`tests/test_data.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!
Generate a plan of action for an issue: Comment @sourcery-ai plan on
an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey @wcxve - I've reviewed your changes - here's some feedback:

Overall Comments:

Consider adding a helper function to handle byte order conversion to avoid repetition.

Here's what I looked at during the review

🟡 General issues: 1 issue found
🟢 Security: all looks good
🟡 Testing: 1 issue found
🟢 Complexity: all looks good
🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-02-21T16:42:09Z

src/elisa/data/ogip.py

+        # if the byte order is not native,
+        # then convert it to be compatible with scipy.sparse
+        if reduced_matrix.dtype.byteorder != '=':
+            order = reduced_matrix.dtype.newbyteorder()
+            reduced_matrix = reduced_matrix.astype(order)


suggestion: Consider centralizing duplicated byte order conversion logic.

The same pattern for converting non‐native byte order appears in both this file and in 'base.py'. Extracting this logic into a shared utility function could improve maintainability and reduce code duplication.

sourcery-ai · 2025-02-21T16:42:09Z

tests/test_data.py

+    ResponseData(photon_egrid, channel_emin, channel_emax, matrix1, channel)
+    ResponseData(photon_egrid, channel_emin, channel_emax, matrix2, channel)


issue (testing): These tests are missing assertions.

Simply creating the ResponseData object doesn't verify its correctness. Add assertions to check that the data within the object is correctly handled for both little-endian and big-endian input matrices.

wcxve · 2025-02-21T16:58:11Z

@sourcery-ai review

sourcery-ai

Hey @wcxve - I've reviewed your changes - here's some feedback:

Overall Comments:

Consider adding a short explanation in to_native_byteorder about why this conversion is necessary.
The new tests are good, but it's not clear why the existing data files have different endianness.

Here's what I looked at during the review

🟡 General issues: 1 issue found
🟢 Security: all looks good
🟡 Testing: 2 issues found
🟢 Complexity: all looks good
🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-02-21T16:59:29Z

src/elisa/data/base.py

@@ -1138,8 +1139,11 @@ def __init__(
        photon_egrid = np.array(photon_egrid, dtype=np.float64, order='C')
        channel_emin = np.array(channel_emin, dtype=np.float64, order='C')
        channel_emax = np.array(channel_emax, dtype=np.float64, order='C')
-        response_matrix = coo_array(response_matrix)
-        response_matrix.eliminate_zeros()
+        if not isinstance(response_matrix, sparray):


question (bug_risk): Revisit the conditional conversion for response_matrix.

Currently, the conversion to native byte order is skipped if response_matrix is already an instance of sparray. Ensure that sparse matrices are always natively ordered or consider if they might also need conversion to avoid potential incompatibility issues with scipy.sparse.

sourcery-ai · 2025-02-21T16:59:29Z

tests/test_data.py

+    # test if the response matrix can be converted to a BCSR matrix in JAX
+    BCSR.from_scipy_sparse(rsp.matrix)


suggestion (testing): Consider adding assertions on the content of the BCSR matrix.

Converting to BCSR is good, but it doesn't confirm the content is correct. Add assertions to check specific values within the BCSR.from_scipy_sparse(rsp.matrix) result to ensure the conversion and data are as expected for different endianness.

Suggested change

# test if the response matrix can be converted to a BCSR matrix in JAX

BCSR.from_scipy_sparse(rsp.matrix)

# test if the response matrix can be converted to a BCSR matrix in JAX

bcsr_matrix = BCSR.from_scipy_sparse(rsp.matrix)

# Verify that the converted matrix has the same shape as the original

assert bcsr_matrix.shape == rsp.matrix.shape

# Compare dense representations to confirm correct conversion of data

original_dense = rsp.matrix.todense()

bcsr_dense = bcsr_matrix.todense()

assert np.allclose(bcsr_dense, original_dense)

sourcery-ai · 2025-02-21T16:59:29Z

tests/test_data.py

+    mat1 = np.eye(100).astype('<f4')
+    mat2 = np.eye(100).astype('>f4')


suggestion (testing): Consider testing non-identity matrices.

Using identity matrices might not fully exercise the byte-swapping logic. Consider adding tests with more complex matrices to ensure all data is handled correctly.

fix: handle sparse matrix reading for big-endian byte order

431b410

sourcery-ai bot reviewed Feb 21, 2025

View reviewed changes

Improve code

1141ad9

sourcery-ai bot reviewed Feb 21, 2025

View reviewed changes

wcxve added 2 commits February 22, 2025 01:09

Fix test

a732653

Fix test

17de58d

wcxve merged commit 472eee9 into main Feb 21, 2025
14 checks passed

wcxve deleted the fix-sparse branch February 21, 2025 17:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: handle sparse matrix reading for big-endian byte order #163

fix: handle sparse matrix reading for big-endian byte order #163

wcxve commented Feb 21, 2025 •

edited by sourcery-ai bot

Loading

sourcery-ai bot commented Feb 21, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

sourcery-ai bot left a comment

sourcery-ai bot Feb 21, 2025

sourcery-ai bot Feb 21, 2025

wcxve commented Feb 21, 2025

sourcery-ai bot left a comment

sourcery-ai bot Feb 21, 2025

sourcery-ai bot Feb 21, 2025

sourcery-ai bot Feb 21, 2025

		ResponseData(photon_egrid, channel_emin, channel_emax, matrix1, channel)
		ResponseData(photon_egrid, channel_emin, channel_emax, matrix2, channel)

		# test if the response matrix can be converted to a BCSR matrix in JAX
		BCSR.from_scipy_sparse(rsp.matrix)

-    # test if the response matrix can be converted to a BCSR matrix in JAX
-    BCSR.from_scipy_sparse(rsp.matrix)
+    # test if the response matrix can be converted to a BCSR matrix in JAX
+    bcsr_matrix = BCSR.from_scipy_sparse(rsp.matrix)
+    # Verify that the converted matrix has the same shape as the original
+    assert bcsr_matrix.shape == rsp.matrix.shape
+    # Compare dense representations to confirm correct conversion of data
+    original_dense = rsp.matrix.todense()
+    bcsr_dense = bcsr_matrix.todense()
+    assert np.allclose(bcsr_dense, original_dense)

		mat1 = np.eye(100).astype('<f4')
		mat2 = np.eye(100).astype('>f4')

fix: handle sparse matrix reading for big-endian byte order #163

fix: handle sparse matrix reading for big-endian byte order #163

Conversation

wcxve commented Feb 21, 2025 • edited by sourcery-ai bot Loading

Summary by Sourcery

sourcery-ai bot commented Feb 21, 2025 • edited Loading

Reviewer's Guide by Sourcery

Sequence diagram for reading response data with byte order conversion

Updated class diagram for ResponseData

Class diagram for to_native_byteorder function

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

sourcery-ai bot left a comment

Choose a reason for hiding this comment

sourcery-ai bot Feb 21, 2025

Choose a reason for hiding this comment

sourcery-ai bot Feb 21, 2025

Choose a reason for hiding this comment

wcxve commented Feb 21, 2025

sourcery-ai bot left a comment

Choose a reason for hiding this comment

sourcery-ai bot Feb 21, 2025

Choose a reason for hiding this comment

sourcery-ai bot Feb 21, 2025

Choose a reason for hiding this comment

sourcery-ai bot Feb 21, 2025

Choose a reason for hiding this comment

wcxve commented Feb 21, 2025 •

edited by sourcery-ai bot

Loading

sourcery-ai bot commented Feb 21, 2025 •

edited

Loading