CPU usage drops only on the "generating" part. #521

Answered by LostRuins
silmarine asked this question in Q&A

16 GB of RAM is not enough to load 20B models. That's why it's so slow: you are probably hitting swap.

Please try a 7B model instead. This one might be a good fit: https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GGUF/blob/main/airoboros-mistral2.2-7b.Q4_K_S.gguf
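
For a rough sense of why a 20B model can overflow 16 GB while a 7B one does not, here is a back-of-the-envelope sketch. It is not from the thread: the function names, the bits-per-weight figures, and the 4 GiB reserve for the OS and context buffers are all assumptions for illustration, and the thread does not say which quantization the 20B model used.

```python
def estimate_model_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (hypothetical helper)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_ram(params_billion: float, bits_per_weight: float,
                ram_gib: float, reserved_gib: float = 4.0) -> bool:
    """True if the weights fit after setting aside `reserved_gib`.

    `reserved_gib` is an assumed allowance for the OS, other programs,
    and the context buffers; adjust it for your own machine.
    """
    budget = ram_gib - reserved_gib
    return estimate_model_gib(params_billion, bits_per_weight) <= budget


if __name__ == "__main__":
    # Bits-per-weight values are rough assumptions, not measured figures.
    for name, params, bpw in [("20B at ~8 bits", 20, 8.0),
                              ("7B at ~4.5 bits", 7, 4.5)]:
        size = estimate_model_gib(params, bpw)
        verdict = "fits" if fits_in_ram(params, bpw, ram_gib=16) else "likely swaps"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict} in 16 GiB")
```

The result depends heavily on the quantization level, so treat it only as a first check. If the estimate lands near the limit, watching swap usage while the model loads is a more reliable test.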

Answer selected by LostRuins