[GPTQ UX] Add scheme arg with QuantizationScheme support #2286
Merged
Conversation
rahul-tuli requested review from Satrat, bfineran, dsikka, horheynm, and dbogunowicz on May 15, 2024 13:37
bfineran suggested changes on May 15, 2024:
@rahul-tuli the target UX here doesn't take a full scheme; instead, the goal is to have `targets` and `scheme` separate (with targets added to the scheme later), i.e.:
```yaml
GPTQModifier:
  ignore: ["LlamaRotaryEmbedding", "LlamaRMSNorm", "SiLUActivation", "MatMulLeftInput_QK", "MatMulRightInput_QK", "MatMulLeftInput_PV", "MatMulRightInput_PV", "MatMulOutput_QK", "MatMulOutput_PV", "lm_head", "Embedding"]
  sequential_update: True
  dampening_frac: 0.001
  block_size: 128
  targets: ["Linear"]
  scheme:
    input_activations: null
    output_activations: null
    weights:
      num_bits: 8
      type: "int"
      symmetric: true
      strategy: "tensor"
      group_size: 128
```
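For reference, a minimal sketch of what this scheme could look like when built programmatically, assuming the `QuantizationScheme` and `QuantizationArgs` models from compressed-tensors (the import path and the `strategy`/`group_size` pairing below are assumptions, not taken from this thread):

```python
# Illustrative sketch only: import path and field names follow the
# compressed-tensors quantization models as assumed here.
from compressed_tensors.quantization import QuantizationArgs, QuantizationScheme

# Weight-only quantization args corresponding to the `weights` block above;
# strategy "group" is paired with group_size here, whereas the YAML above
# shows "tensor".
weights = QuantizationArgs(
    num_bits=8,
    type="int",
    symmetric=True,
    strategy="group",
    group_size=128,
)

# In the proposed UX, `targets` lives outside the scheme in the recipe and
# is attached to the scheme later by the modifier.
scheme = QuantizationScheme(
    targets=["Linear"],
    weights=weights,
    input_activations=None,
    output_activations=None,
)
```

Keeping `targets` at the modifier level presumably lets the same scheme definition be reused against different sets of target modules.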
Base automatically changed from preserve-mask-structure-test to gptq-ux-config-groups on May 17, 2024 16:16
rahul-tuli force-pushed the gptq-ux-config-groups branch from 440661b to 34482c0 on May 20, 2024 18:20
rahul-tuli force-pushed the add-scheme-support branch from 9c084f7 to e6d8734 on May 23, 2024 15:09
rahul-tuli changed the base branch from main to install-compressed-tensors-from-source on May 23, 2024 15:10
rahul-tuli force-pushed the add-scheme-support branch from e6d8734 to f721988 on May 23, 2024 15:11
bfineran previously approved these changes on May 23, 2024
Satrat previously approved these changes on May 23, 2024
rahul-tuli dismissed stale reviews from Satrat and bfineran on May 24, 2024 13:57: the base branch was changed.
dsikka approved these changes on May 24, 2024
horheynm approved these changes on May 24, 2024
This PR adds support for a `scheme` arg in GPTQ; this arg can be set to a single `QuantizationScheme` object.
recipe:
test script:
test command:
Output:
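The recipe, test script, test command, and output attached to the PR are not reproduced above. As a minimal sketch of the usage the description implies, assuming a `GPTQModifier` that accepts the new `scheme` keyword (the module paths and the other argument names below are assumptions, not verified against the PR's code):

```python
# Hypothetical usage sketch; the GPTQModifier import path and keyword
# names are assumptions based on the PR description, not verified code.
from compressed_tensors.quantization import QuantizationArgs, QuantizationScheme
from sparseml.modifiers.quantization.gptq import GPTQModifier

# A single QuantizationScheme object passed via the new `scheme` arg
scheme = QuantizationScheme(
    targets=["Linear"],
    weights=QuantizationArgs(
        num_bits=4,
        type="int",
        symmetric=True,
        strategy="group",
        group_size=128,
    ),
)

modifier = GPTQModifier(
    ignore=["lm_head", "Embedding"],
    sequential_update=True,
    dampening_frac=0.001,
    block_size=128,
    scheme=scheme,
)
```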