generated gibberish within the block #48

Open
yangyangyyy123 opened this issue Oct 10, 2024 · 1 comment
yangyangyyy123 commented Oct 10, 2024

The generate() function currently takes only the last position of the output logits to produce the next token, then shifts the entire input window one position forward and again takes only the last position for the following token. https://github.com/karpathy/ng-video-lecture/blob/master/gpt.py#L189
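For reference, the loop in question looks roughly like this. This is a paraphrased sketch rather than a verbatim copy of gpt.py; it assumes a model whose forward() returns (logits, loss), a global block_size, and torch.nn.functional imported as F:

```python
# Paraphrased sketch of the generate() method being discussed (not verbatim gpt.py).
def generate(self, idx, max_new_tokens):
    # idx is a (B, T) tensor of token indices for the current context
    for _ in range(max_new_tokens):
        # crop the context to the last block_size tokens (the sliding window)
        idx_cond = idx[:, -block_size:]
        # forward pass: logits has shape (B, T, vocab_size)
        logits, loss = self(idx_cond)
        # keep ONLY the last time step -- the behavior this issue asks about
        logits = logits[:, -1, :]                            # (B, vocab_size)
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)   # (B, 1)
        # append the sampled token; next iteration the window shifts forward
        idx = torch.cat((idx, idx_next), dim=1)              # (B, T+1)
    return idx
```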

I was curious and looked at the entire output contents. For the input, I fed in the output of a previous run with the current generate() function, so the input token sequence would be completely "based on the behavior of the model itself", so to speak. Then I generated the full list of T tokens from the output, one per position. To my surprise, the output is very much gibberish, and quite different from the input (though I could still see a few matches).

I can't figure out why the current method of taking only the last output position produces seemingly fluent sequences, while the output from the middle of the block doesn't make sense. In the current scheme, the input grows from torch.zeros((1, 1)) up to block_size, so during that period it should be no different from what an output position in the middle of the block sees: that position has masked out all input after it, so it effectively becomes the end of the output window too.
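A minimal sketch of the experiment described above, assuming the model's forward() returns (logits, loss) and a decode() function as in the lecture code; predict_all_positions is a hypothetical helper, not part of the repo:

```python
import torch

@torch.no_grad()
def predict_all_positions(model, context, decode):
    # context: (1, T) tensor of previously generated tokens, with T <= block_size
    logits, _ = model(context)        # (1, T, vocab_size)
    preds = logits.argmax(dim=-1)     # greedy one-step prediction at every position
    # position i predicts the token at position i+1, so compare preds[:, :-1]
    # against context[:, 1:] to see how many in-block predictions match the input
    matches = (preds[:, :-1] == context[:, 1:]).float().mean().item()
    print(f"fraction of in-block predictions matching the input: {matches:.3f}")
    # decoding all T predictions at once is what produced the output inspected above
    print(decode(preds[0].tolist()))
```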

@HangjianQian

On the question of why we take only the last position of the logits: I think all positions of the output logits have meaning. The logits at position i mean: given the input [0:i], what should the following output be? So during training, the model can also learn from the shorter contexts.
A word-level training example, with input "I like shopping online":

input                   | output      | output logits position
I like                  | shopping    | 1
I like shopping         | online      | 2
I like shopping online  | (next word) | 3

During training, the losses at all of these positions are combined in the cross-entropy. During inference, since we only care about the next token, we take only the last element of the logits.
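In code terms, that is roughly the following (a sketch along the lines of gpt.py's forward pass, not a verbatim copy):

```python
import torch
import torch.nn.functional as F

# Training: every position contributes a loss term -- predict token i+1 from tokens [0:i].
def loss_over_all_positions(logits, targets):
    # logits: (B, T, vocab_size); targets: (B, T), i.e. the inputs shifted by one
    B, T, C = logits.shape
    return F.cross_entropy(logits.view(B * T, C), targets.view(B * T))

# Inference: only the prediction for the next token is needed,
# so only the last time step of the logits is kept:
# next_token_logits = logits[:, -1, :]   # (B, vocab_size)
```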
