The 'stop' argument causes 'pyo3_runtime.PanicException' in some cases. #1131

Open
Six6stRINgs opened this issue Feb 21, 2025 · 7 comments

@Six6stRINgs
The bug
Hello, thanks for your great work. Recently I have run into a weird bug and don't know how to fix it.
I want to generate a JSON-formatted string using gen with a regex argument.
Using the stop argument in gen causes a crash in certain cases.

Code

wmeta_key = wordmeta.get_attr_name_list()  # BaseModel from pydantic
input_text = f"task:{task}\nprompt:{prompt}\n```json\n{{\n"
for key in wmeta_key:
    wmeta_value = getattr(wordmeta, key)
    wdecision_value = getattr(wdecision, f"need_{key}")

    if isinstance(wmeta_value, int | float):
        reg = r"^\d+$"
        m_tokens = 10
    elif isinstance(wmeta_value, str):
        reg = None
        m_tokens = 50

    gen_res = gen_decision(
        lm + f"{task} {key}: ",
        need_llm=wdecision_value,
        ori_text=wmeta_value,
        name="res",
        regex=reg,
        stop=["'", '"', ".", "\n"],
        max_tokens=m_tokens,
    )

def gen_decision(lm, need_llm: bool, ori_text: str, **gen_args):
    return lm + gen(**gen_args) if need_llm else ori_text

......

When gen_decision runs, the following error is reported. If I remove the stop argument, the crash does not occur, but then the output generated by the LLM is not satisfactory.

assertion failed: self.state.byte_to_token_idx.len() >= n_bytes
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/work/llm.py", line 65, in gen_decision
    return lm + gen(**gen_args) if need_llm else ori_text
  File "/root/anaconda3/envs/guidance/lib/python3.10/site-packages/guidance/models/_model.py", line 1207, in __add__
    out = lm._run_stateless(value)
  File "/root/anaconda3/envs/guidance/lib/python3.10/site-packages/guidance/models/_model.py", line 1413, in _run_stateless
    for chunk in gen_obj:
  File "/root/anaconda3/envs/guidance/lib/python3.10/site-packages/guidance/models/_model.py", line 431, in __call__
    tokens, mask_fut, backtrack = parser.advance(engine_output)
  File "/root/anaconda3/envs/guidance/lib/python3.10/site-packages/guidance/_parser.py", line 78, in advance
    return self._generator.send(engine_output)
  File "/root/anaconda3/envs/guidance/lib/python3.10/site-packages/guidance/_parser.py", line 153, in _parse
    backtrack, ff_tokens = self.ll_interpreter.commit_token(
pyo3_runtime.PanicException: assertion failed: self.state.byte_to_token_idx.len() >= n_bytes
[ERROR] 2025-02-21-07:57:02 (PID:3017059, Device:0, RankID:-1) ERR99999 UNKNOWN application exception

But when I run some simpler code, it works fine:

lm = create_lm()  # lm = models.Transformers(model_path, device_map=device_map)
print(
    lm
    + "Hello "
    + gen(
        name="res",
        regex=None,
        stop=["'", '"', ";", ":", ",", ".", "\n"],
        max_tokens=50,
        temperature=0.5,
    )
)

Response

Hello I am trying to create a simple program that will take a string and print out the number of times each character appears in the string

To Reproduce
Loader: models.Transformers
Model: Qwen2.5-1.5b-instruct, Qwen1.5-7b
Execute the code above

System info (please complete the following information):

  • OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Ubuntu
  • Guidance Version (guidance.__version__): 0.2.0
  • Device: npu
@Harsha-Nori
Member

Hi @Six6stRINgs, thanks for reporting this... definitely odd! I'll look into reproducing it this week.

@Six6stRINgs
Author

@Harsha-Nori Hello, thanks for your reply. I spent a few days simplifying the code to make this bug easier to reproduce.
Here is the code:

import guidance
from guidance import gen
from pydantic import BaseModel

class bm(BaseModel):
    a: str
    text: str
    ccc: str

    def get_attr_name_list(self) -> list[str]:
        return [attr for attr, _ in self.model_fields.items()]


class bm2(BaseModel):
    ttt: str
    attrr: str
    bo: str

    def get_attr_name_list(self) -> list[str]:
        return [attr for attr, _ in self.model_fields.items()]


if __name__ == "__main__":
    model_path = "/data/weights/qwen2_5-1.5b-instruct/"
    device_map = {"": 2}

    m = bm(
        a="5",
        text="world",
        ccc="hello world",
    )

    m2 = bm2(ttt="123", attrr="2", bo="True")

    model = guidance.models.Transformers(model_path, device_map=device_map)

    bm_list1 = ["a", "text", "ccc"]  # crash at 2nd str
    bm_list2 = ["222", "333", "444"]  # normal
    bm_list3 = ["a", "666", "ccc"]  # normal
    bm_list4 = ["text", "ccc", "a"]  # crash at 1st str

    bm2_list1 = ["ttt", "attrr", "bo"]  # normal
    bm2_list2 = ["t1", "a2", "b3"]  # normal
    bm2_list3 = ["ttt", "attrr", "ta"]  # normal

    bm_keys: list[str] = m.get_attr_name_list()  # crash at 2nd str, same as bm_list1
    bm2_keys: list[str] = m2.get_attr_name_list()  # normal

    # the list/keys from above
    for key in bm_keys:
        print(f"key: {key}")
        res = (
            model
            + f"Hello, {key}: "
            + gen(
                name="res",
                stop=["'", '"', ";", ":", ",", ".", "\n"],
                max_tokens=15,
            )
        )
        print(f"LLM res: {res}")

I ran this code on my Linux server.

The key part is the list used in the for loop.

It seems that certain strings in the list can crash the program: if a string is exactly an attribute name of the BaseModel, the same bug is reported.
In the code above, bm_list1, bm_list4, and bm_keys crash the program when the model begins to generate, while the other lists work fine.

thread '<unnamed>' panicked at parser/src/earley/parser.rs:2043:9:
assertion failed: self.state.byte_to_token_idx.len() >= n_bytes
......

However, the odd part is that bm2_list1 and bm2_keys, which likewise contain the attribute names of their class (bm2), work without error.
If I remove the stop arg, all the lists run fine.
I have no idea what causes this bug T_T.

@Harsha-Nori
Member

Harsha-Nori commented Mar 5, 2025 via email

@hudson-ai
Collaborator

hudson-ai commented Mar 5, 2025

@mmoskal I have a hypothesis -- we crash if stop backtracks during token healing.

Minimal repro:

model += "Hello, text: "
model += gen(stop='"')

If we token heal and allow the first token to be ' "' (which it seems Qwen wants to do in this case), something breaks. Note that we don't get any errors if the above were this instead:

model += "Hello, text: " + gen(stop='"')

@mmoskal
Collaborator

mmoskal commented Mar 5, 2025

Yeah, this crashes:

#[test]
fn test_ll_stop_heal() {
    // https://github.com/guidance-ai/guidance/issues/1131
    check_lark_grammar_prompt(
        r#"
            start: gen
            gen[stop=/"/]: /.*/
        "#,
        "Hello, text: ",
        &["Hello‧,‧ text‧:", " \""],
    );
}

@hudson-ai
Collaborator

@Six6stRINgs thanks for opening the issue and finding this bug!

While we work on a fix, here is a workaround: group the f"Hello, {key}: " prompt with the gen expression.

There is a subtle difference between

model + "Hello" + gen(...)

and

model + ("Hello" + gen(...))

Using the second version here should get around the crash.
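
Applied to the loop from the repro above, the grouped version would look roughly like this (a sketch reusing the model, keys, and stop list from the earlier snippet):

for key in bm_keys:
    # Group the prompt with gen() so both are committed to the model in a
    # single step; this avoids the token-healing backtrack across the
    # prompt boundary that triggers the panic.
    res = model + (
        f"Hello, {key}: "
        + gen(
            name="res",
            stop=["'", '"', ";", ":", ",", ".", "\n"],
            max_tokens=15,
        )
    )
    print(f"LLM res: {res}")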

@mmoskal
Collaborator

mmoskal commented Mar 5, 2025

Keeping this open until llguidance is updated in guidance.

@mmoskal mmoskal reopened this Mar 5, 2025