Odd repetition and other issues recently - Incorrect ftype? #1192

Open
icsy7867 opened this issue Oct 28, 2024 · 3 comments

@icsy7867

Describe the Issue
Apologies, I am by no means an expert, and I am still learning.

Recently, after upgrading KoboldCPP, I have been seeing some strange repetition and other issues, with responses that frequently don't quite make sense given the context.

I am well within the context limit (currently testing Qwen 2.5 - 32B using Q4 quantization), but the responses frequently contradict what is in the context right before them. While doing some digging, I noticed this line in the startup output:

llm_load_print_meta: model ftype      = Q3_K - Large

even though this is a Q4 model. Being curious, I downloaded llama.cpp and ran the same model with a similar setup, and it immediately detected the correct Q4 ftype.
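One way to cross-check what either loader prints is to read the quantization tag straight from the model file. The sketch below assumes the model is a GGUF file with the v2-or-later header layout and that the quantizer wrote a `general.file_type` key. The `FTYPE_NAMES` table is a partial mapping recalled from llama.cpp's `llama_ftype` enum and should be verified against the version in use. `read_file_type` and `describe_file` are hypothetical helper names, not part of KoboldCPP or llama.cpp.

```python
import struct

# GGUF metadata value-type codes mapped to struct formats for fixed-size scalars.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9

# Partial, assumed mapping of file-type codes to names (check against llama.h).
FTYPE_NAMES = {0: "F32", 1: "F16", 2: "Q4_0", 3: "Q4_1", 7: "Q8_0",
               8: "Q5_0", 9: "Q5_1", 10: "Q2_K", 11: "Q3_K_S", 12: "Q3_K_M",
               13: "Q3_K_L", 14: "Q4_K_S", 15: "Q4_K_M", 16: "Q5_K_S",
               17: "Q5_K_M", 18: "Q6_K"}


def _read(f, fmt):
    """Read one little-endian scalar described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    """Read a metadata value of the given type, recursing into arrays."""
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_file_type(f):
    """Return general.file_type from an open binary GGUF stream, or None."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    _read(f, "<Q")             # tensor count (not needed here)
    kv_count = _read(f, "<Q")  # number of metadata key/value pairs
    for _ in range(kv_count):
        key = _read_string(f)
        value = _read_value(f, _read(f, "<I"))
        if key == "general.file_type":
            return value
    return None


def describe_file(path):
    """Return (code, name) for the file type stored in the GGUF at `path`."""
    with open(path, "rb") as fh:
        code = read_file_type(fh)
    return code, FTYPE_NAMES.get(code, "unknown")
```

Calling `describe_file("model.gguf")` (the path is a placeholder) on the downloaded file shows which tag is actually stored in it. If the stored tag is a Q4 variant while the startup log says `Q3_K - Large`, the mismatch is in how the loader reports it rather than in the file.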

Additional Information:
Running in Podman, using a Quadro P6000, 24GB VRAM.

For running koboldcpp I am using:

--usecublas --flashattention --gpulayers 999 --contextsize 12000

Seems to work otherwise, with no visible errors. Just curious whether someone knows of a magic flag or something to try? Or maybe I am being stupid and missing something.

@icsy7867
Author

I also tested versions 1.74 and 1.73, but both always identify the ftype as Q3_K - Large.
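For comparing several versions, it can help to pull the reported ftype out of each saved startup log instead of reading them by eye. This is a minimal sketch that assumes each log contains a line in the same format as the one quoted above. `extract_ftype` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   llm_load_print_meta: model ftype      = Q3_K - Large
FTYPE_LINE = re.compile(r"llm_load_print_meta:\s*model ftype\s*=\s*(.+?)\s*$")


def extract_ftype(log_text):
    """Return the ftype string reported in a startup log, or None if absent."""
    for line in log_text.splitlines():
        match = FTYPE_LINE.search(line)
        if match:
            return match.group(1)
    return None
```

Running this over the captured output of each KoboldCPP version and of llama.cpp gives one value per run, which makes it easy to see whether the reported ftype changes between builds.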

@LostRuins
Owner

I don't think the ftype detection has anything to do with the output quality - if it was wrong it would output complete garbage.

@icsy7867
Author

icsy7867 commented Nov 2, 2024

Fair enough! Thank you for the response. I am still exploring the various flags and options as I figure out the issues. I am pretty sure this is user error somehow...
