chat with bob example broken #12
Comments
Hi @nuance1979, you are welcome & thanks for reporting the bug. Could you please let me know what model you are using, so I can debug with the same model?
@nuance1979 Oh, I have just noticed! It was just the
@nuance1979, I think you just need to specify the exact parameters as in that example. Please update from source and give it a try?
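(For readers following along: "specifying the exact parameters" with pyllamacpp 2.x roughly means passing the same sampling settings that llama.cpp's chat.sh passes on the command line. A minimal sketch is below; the keyword names are recalled from the 2.x README and should be treated as assumptions, and the model path is a placeholder.)

```python
# Sketch only -- keyword names assumed from the pyllamacpp 2.x README; verify before use.
from pyllamacpp.model import Model

model = Model(
    model_path="./models/llama-7b.ggmlv3.q4_0.bin",  # placeholder path
    n_ctx=512,                                       # context size (llama.cpp default)
)

# Sampling parameters chosen to mirror typical llama.cpp chat settings (assumed kwargs).
for token in model.generate(
    "Hello, Bob.",
    n_predict=256,
    temp=0.8,
    top_k=40,
    top_p=0.95,
    repeat_penalty=1.1,
):
    print(token, end="", flush=True)
```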
@nuance1979, that's weird, the model seems to be always hallucinating! On my end everything's working as expected (as you can see in my previous comment). Yeah, you are right, I really don't know why it is divided by half; usually it equals the context size! Have you tried other models?
I tried that. I also checked the SHA256SUM of my model file.
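(For anyone repeating that check, here is one way to compute the checksum from Python; the file path is just a placeholder.)

```python
# Compute the SHA-256 of a local model file to compare against a published checksum.
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256sum("./models/llama-7b.ggmlv3.q4_0.bin"))  # placeholder path
```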
Yeah, something is happening, but I honestly have no idea since I couldn't reproduce this issue on my end.
Can you ask a third person to try it? Just to see whether it's a problem on my side.
Sure, let us try that. @ParisNeo is using pyllamacpp as a backend -- could you please let us know if someone on your repo has reported a problem similar to this issue? Thank you!
Hi, no, I didn't have any complaints about the pyllamacpp backend yet. If I have time tomorrow I'll try. I've got to go.
Thanks @ParisNeo, let us know if you find any issues.
Hi @nuance1979, any news on this? Are you still getting the same error? If you know someone else who can test it, please send them a message! Otherwise, I have tried to test it on Colab as well; even though it is slow, it worked as expected.
Yes. Still nonsensical answers.
Sure. I'll ask my friend to test it.
All my tests were done with the original llama 7B model (quantized into q4_0.bin with llama.cpp). But you are testing a different model, right?
OK. I tried your notebook with llama-7b and it reproduces what I saw. Again, I want to emphasize that the same model behaves correctly when I use llama.cpp directly. You can try it yourself with this model link: https://huggingface.co/TheBloke/LLaMa-7B-GGML/resolve/main/llama-7b.ggmlv3.q4_0.bin
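(For readers who want to fetch the same file programmatically rather than via the link, one option is the huggingface_hub helper; the repo id and filename below are taken from the URL above.)

```python
# Download the GGML file referenced above from the Hugging Face Hub.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/LLaMa-7B-GGML",
    filename="llama-7b.ggmlv3.q4_0.bin",
)
print(model_path)  # local path of the cached download
```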
Oh! Are you using the original model? So maybe that's the source of the problem. I will try to test with the original model and see.
I understand the difference between the original llama and instruction-tuned variants. All I'm saying is that the fact that llama.cpp works under the same conditions points to a potential bug in pyllamacpp, and it would be great if you could fix it.
@nuance1979, yeah, you are right. Sorry for that :( Let me know if you have any ideas; any help would be appreciated! Thanks!
If it helps someone, I tried it like this: edit cli.py and make the following changes:
Example output:
I am synced to commit 6d487b9
Thanks @siddhsql! However, you are using
Hi @abdeladim-s, thanks for the update!

I was trying to update to `pyllamacpp==2.4.0`, but found that even the example on the README, which is similar to `llama.cpp`'s `./examples/chat.sh` but not identical, is not working properly. For example, when I copied the example code into a `foo.py` and ran it, I got:

If I go to `llama.cpp`, check out `66874d4`, then `make clean && make && ./examples/chat.sh`, I got:

I just want to get an equivalent of running `llama.cpp`'s `chat.sh` with `pyllamacpp==2.4.0`, no more, no less. How should I do it?
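(For reference, a rough sketch of what a chat.sh-equivalent loop might look like with pyllamacpp 2.4.0. The constructor and generate keyword names are assumptions based on the project's 2.x README and may need adjusting; the prompt text is the chat-with-bob prompt that ships with llama.cpp.)

```python
# Hypothetical sketch of an interactive chat-with-bob loop on top of pyllamacpp 2.x.
# Keyword names (prompt_context, prompt_prefix, prompt_suffix, antiprompt) are assumed
# from the 2.x README and may differ in your installed version.
from pyllamacpp.model import Model

prompt_context = (
    "Transcript of a dialog, where the User interacts with an Assistant named Bob. "
    "Bob is helpful, kind, honest, good at writing, and never fails to answer the "
    "User's requests immediately and with precision.\n\n"
    "User: Hello, Bob.\n"
    "Bob: Hello. How may I help you today?\n"
)

model = Model(
    model_path="./models/llama-7b.ggmlv3.q4_0.bin",  # placeholder path
    n_ctx=512,
    prompt_context=prompt_context,
    prompt_prefix="\nUser: ",
    prompt_suffix="\nBob:",
)

while True:
    try:
        user_input = input("User: ")
    except (EOFError, KeyboardInterrupt):
        break
    for token in model.generate(user_input, antiprompt="User:", n_predict=256):
        print(token, end="", flush=True)
    print()
```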