
fix: adjust _convert_weight_to_int4pack_cpu input weights for pytorch>=2.5 #286

Merged: 1 commit into huggingface:main on Aug 20, 2024

Conversation

@dvrogozh (Contributor) commented on Aug 16, 2024:

Fixes: #274

PyTorch 2.5 changed the input weights of _convert_weight_to_int4pack_cpu from [n][k] int32 to [n][k / 2] uint8. This PR adjusts the quanto code accordingly.

See: pytorch/pytorch#129940
See: pytorch/pytorch@6f662e9

CC: @dacorvo
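
For context, here is a minimal sketch of what such a version-gated packing call can look like. This is not the actual quanto patch: the pack_int4_pairs helper, the assumed nibble order, and the inner_k_tiles default are made up for illustration.

```python
# Minimal sketch of a version-gated call to torch._convert_weight_to_int4pack,
# reflecting the layout change described above. Helper name and nibble order are
# assumptions, not quanto's actual code.
import torch
from packaging import version


def pack_int4_pairs(w_int4: torch.Tensor) -> torch.Tensor:
    """Pack an [n, k] tensor of int4 values (stored as int32) into [n, k // 2] uint8."""
    w = (w_int4 & 0xF).to(torch.uint8)
    # Assumed nibble order: even columns in the low nibble, odd columns in the high nibble.
    return w[:, ::2] | (w[:, 1::2] << 4)


def convert_for_tinygemm(w_int4: torch.Tensor, inner_k_tiles: int = 8) -> torch.Tensor:
    if version.parse(torch.__version__).release >= version.parse("2.5.0").release:
        # PyTorch >= 2.5: _convert_weight_to_int4pack_cpu expects [n, k // 2] uint8.
        return torch._convert_weight_to_int4pack(pack_int4_pairs(w_int4), inner_k_tiles)
    # PyTorch < 2.5: the CPU kernel expected the unpacked [n, k] int32 layout.
    return torch._convert_weight_to_int4pack(w_int4, inner_k_tiles)
```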

Commit: fix: adjust _convert_weight_to_int4pack_cpu input weights for pytorch>=2.5
Signed-off-by: Dmitry Rogozhkin <[email protected]>
@dacorvo (Collaborator) left a comment:


@dvrogozh thank you very much for this pull request. I did not look into the details of the new TinyGemm packing in pytorch, but I think it is now device agnostic. This is interesting because with 2.4 the weights are repacked when moving from cpu to cuda, which is slow (see #270). We could avoid the unpack/pack at line 133 in packed.py with pytorch 2.5 (maybe in an upcoming pull request?).
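
Assuming the packed layout really is device agnostic from 2.5 onward, the device move could reduce to a plain copy. The sketch below uses a hypothetical helper, not quanto's packed.py code, and it omits the pre-2.5 unpack/repack path because that depends on quanto internals.

```python
import torch
from packaging import version


def move_packed_weight(packed: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Hypothetical helper illustrating the idea above; not quanto's actual code."""
    if version.parse(torch.__version__).release >= version.parse("2.5.0").release:
        # If the TinyGemm packed layout no longer depends on the device, the packed
        # bytes can simply be copied to the target device.
        return packed.to(device)
    # With 2.4 the cpu and cuda layouts differ, so the weights have to be unpacked on
    # the source device and repacked on the target one (the slow path from #270).
    raise NotImplementedError("pre-2.5: unpack and repack for the target device")
```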

@dacorvo merged commit c02750b into huggingface:main on Aug 20, 2024
13 checks passed
@dvrogozh (Contributor, Author) commented:

> I did not look into the details of the new TinyGemm packing in pytorch, but I think it is now device agnostic.

Yes. This seems to be the intent of the change on the pytorch side, as discussed in the pytorch PR description, pytorch/pytorch#129940.

> We could avoid the unpack/pack at line 133 in packed.py with pytorch 2.5 (maybe in an upcoming pull request?).

It seems so. Worth trying out. I will see whether I can get a cuda build working today to give it a try.

Successfully merging this pull request may close these issues.

RuntimeError: _convert_weight_to_int4pack_cpu : expect weight to be kByte