-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: handle sparse matrix reading for big-endian byte order #163
Conversation
Reviewer's Guide by SourceryThis pull request fixes an issue where sparse matrices with big-endian byte order were not being read correctly. The code now converts the byte order to be compatible with Sequence diagram for reading response data with byte order conversionsequenceDiagram
participant OGIPReader
participant ResponseData
participant to_native_byteorder
participant coo_array
OGIPReader->>OGIPReader: _read_response(file, response_id)
OGIPReader->>to_native_byteorder: reduced_matrix = to_native_byteorder(reduced_matrix)
to_native_byteorder-->>OGIPReader: returns reduced_matrix
OGIPReader->>coo_array: sparse_matrix = coo_array(reduced_matrix, ...)
coo_array-->>OGIPReader: returns sparse_matrix
OGIPReader-->>OGIPReader: returns response_data
Updated class diagram for ResponseDataclassDiagram
class ResponseData {
photon_egrid: np.ndarray
channel_emin: np.ndarray
channel_emax: np.ndarray
response_matrix: scipy.sparse.coo_array
channel: np.ndarray
__init__(
photon_egrid: Sequence[float],
channel_emin: Sequence[float],
channel_emax: Sequence[float],
response_matrix: NDArray | sparray,
channel: Sequence[str]
)
}
note for ResponseData "The constructor now converts the byte order of response_matrix to native byte order if it's not already in native byte order."
Class diagram for to_native_byteorder functionclassDiagram
class to_native_byteorder {
+NDArray to_native_byteorder(arr: NDArray)
}
note for to_native_byteorder "Converts a NumPy array to native byte order."
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @wcxve - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a helper function to handle byte order conversion to avoid repetition.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟡 Testing: 1 issue found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
src/elisa/data/ogip.py
Outdated
# if the byte order is not native, | ||
# then convert it to be compatible with scipy.sparse | ||
if reduced_matrix.dtype.byteorder != '=': | ||
order = reduced_matrix.dtype.newbyteorder() | ||
reduced_matrix = reduced_matrix.astype(order) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Consider centralizing duplicated byte order conversion logic.
The same pattern for converting non‐native byte order appears in both this file and in 'base.py'. Extracting this logic into a shared utility function could improve maintainability and reduce code duplication.
tests/test_data.py
Outdated
ResponseData(photon_egrid, channel_emin, channel_emax, matrix1, channel) | ||
ResponseData(photon_egrid, channel_emin, channel_emax, matrix2, channel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (testing): These tests are missing assertions.
Simply creating the ResponseData
object doesn't verify its correctness. Add assertions to check that the data within the object is correctly handled for both little-endian and big-endian input matrices.
@sourcery-ai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @wcxve - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a short explanation in
to_native_byteorder
about why this conversion is necessary. - The new tests are good, but it's not clear why the existing data files have different endianness.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟡 Testing: 2 issues found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@@ -1138,8 +1139,11 @@ def __init__( | |||
photon_egrid = np.array(photon_egrid, dtype=np.float64, order='C') | |||
channel_emin = np.array(channel_emin, dtype=np.float64, order='C') | |||
channel_emax = np.array(channel_emax, dtype=np.float64, order='C') | |||
response_matrix = coo_array(response_matrix) | |||
response_matrix.eliminate_zeros() | |||
if not isinstance(response_matrix, sparray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (bug_risk): Revisit the conditional conversion for response_matrix.
Currently, the conversion to native byte order is skipped if response_matrix is already an instance of sparray. Ensure that sparse matrices are always natively ordered or consider if they might also need conversion to avoid potential incompatibility issues with scipy.sparse.
tests/test_data.py
Outdated
# test if the response matrix can be converted to a BCSR matrix in JAX | ||
BCSR.from_scipy_sparse(rsp.matrix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding assertions on the content of the BCSR matrix.
Converting to BCSR is good, but it doesn't confirm the content is correct. Add assertions to check specific values within the BCSR.from_scipy_sparse(rsp.matrix)
result to ensure the conversion and data are as expected for different endianness.
# test if the response matrix can be converted to a BCSR matrix in JAX | |
BCSR.from_scipy_sparse(rsp.matrix) | |
# test if the response matrix can be converted to a BCSR matrix in JAX | |
bcsr_matrix = BCSR.from_scipy_sparse(rsp.matrix) | |
# Verify that the converted matrix has the same shape as the original | |
assert bcsr_matrix.shape == rsp.matrix.shape | |
# Compare dense representations to confirm correct conversion of data | |
original_dense = rsp.matrix.todense() | |
bcsr_dense = bcsr_matrix.todense() | |
assert np.allclose(bcsr_dense, original_dense) |
mat1 = np.eye(100).astype('<f4') | ||
mat2 = np.eye(100).astype('>f4') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider testing non-identity matrices.
Using identity matrices might not fully exercise the byte-swapping logic. Consider adding tests with more complex matrices to ensure all data is handled correctly.
Summary by Sourcery
This pull request addresses an issue with reading sparse matrices in big-endian byte order. It introduces a utility function to convert arrays to native byte order and applies it when reading response matrices. Additionally, it includes new tests to ensure correct handling of big-endian files and different endianness.
Bug Fixes:
Enhancements:
Tests: