Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow providers to return bigints #1745

Merged
merged 3 commits into from
Feb 3, 2025
Merged

Allow providers to return bigints #1745

merged 3 commits into from
Feb 3, 2025

Conversation

axelboc
Copy link
Contributor

@axelboc axelboc commented Jan 29, 2025

The mock and h5wasm providers now return bigint values for int-64 datasets (including compounds with int-64 fields). They no longer have to cast int-64 integers to JS numbers (i.e. float 64), which means that the bigint values reach the Scalar, Matrix and Compound visualizations without any loss of precision.

This addresses most of #1679. (I'll close it once H5GroveProvider also returns bigints.)

The preparatory work I did in #1736, #1737, #1738, #1739 and #1743 bore fruit. I didn't end up needing a separate BigIntegerType or even the "generic solution" explained in #1679 (comment). (This generic solution was flawed anyway: allowing providers to declare which value type they want to return for which dataset shape/type is useless, since the viewer wants to know which value types any provider may return, so either way we end up with a type union.)

Summary of the changes

  • Mock datasets with bigint values (e.g. 1n) are now "guessed" to have type IntegerType instead of UnknownType
  • H5WasmApi no longer converts bigints to numbers
  • ScalarValue<NumericType> is now number | bigint
  • ArrayValue<NumericType> is now TypedArray | BigIntTypedArray | number[] | bigint[]
  • The Scalar, Matrix and Compound visualizations now support bigints — they just convert them to string, except when scientific notation is selected (in which case they cast them to numbers and pass them to d3-format).
  • The mapped components of the WebGL visualizations (which don't support bigints) are now responsible for converting bigint signals (including auxiliaries) to numbers — this is done in useToNumArray, which was already responsible for converting booleans to numbers.
  • Since axis and error arrays are now also allowed to contain bigints (because of ArrayValue<NumericType>), and since D3 and most of our code in @h5web/lib doesn't support bigints, those arrays also have to be converted inside mapped components with useToNumArray and useToNumArrays.

How I got everything to work

What mattered the most to not break the entire codebase was to ensure that NumArray didn't go from number[]| TypedArray to number[] | bigint[] | TypedArray | BigIntTypedArray, as NumArray is used throughout @h5web/lib.

The passage between Value<Dataset<Shape, DType>> and NumArray happens in the mapped components, so declaring the props of those components with ArrayValue<NumericArray> instead of NumArray in #1739 was key.

Another trick that makes the implementation a lot simpler in this PR is to ensure that enum and boolean datasets cannot be provided as bigints but only as numbers (even though technically, EnumType and BoolType can have a 64-bit IntegerType as base type). This is enforced by the ScalarValue and ArrayValue types and by the corresponding assertion functions.

Screenshots

image

image

@axelboc axelboc force-pushed the bigint-new branch 3 times, most recently from 355d6ab to 3a3c561 Compare January 29, 2025 15:28
@axelboc axelboc requested a review from loichuder January 29, 2025 15:29
@axelboc
Copy link
Contributor Author

axelboc commented Jan 29, 2025

/approve


default:
return (val) => {
return typeof val === 'bigint' ? val.toString() : formatter(val);
Copy link
Contributor Author

@axelboc axelboc Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a bug fix here. Int-64 values provided as number were being cast to string instead of being formatted with D3.

I'm not 100% sure which is the result we want, actually - e.g. for fixed-point notation, do we want 1.000 or 1? I assumed 1, since decimals don't matter, so maybe I should revert and keep the toString()? In that case, maybe toString() should be the default formatting for all integers with auto and fixed-point notation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it does not make much sense to do more than converting to strings for integers. toString should be enough.

Copy link
Contributor Author

@axelboc axelboc Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring done in 0748093

I've split createNumericFormatter into two:

  • createFloatFormatter which works as before
  • createIntegerFormatter:
    • with "auto" and "fixed-point" notation: casts numbers/bigints to string with toString()
    • with "scientific" notation: casts bigints to numbers with Number, and then formats numbers with D3 (same format as before: '.3e')

Copy link
Member

@loichuder loichuder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 💪


default:
return (val) => {
return typeof val === 'bigint' ? val.toString() : formatter(val);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it does not make much sense to do more than converting to strings for integers. toString should be enough.

Base automatically changed from bigint-prep to main February 3, 2025 08:36
@axelboc axelboc force-pushed the bigint-new branch 3 times, most recently from d507cfe to 6852665 Compare February 3, 2025 09:48
@axelboc axelboc requested a review from loichuder February 3, 2025 09:55
@axelboc axelboc merged commit a31fc50 into main Feb 3, 2025
8 checks passed
@axelboc axelboc deleted the bigint-new branch February 3, 2025 10:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants