Why set the label tokens the same as the input token #637

Open · 1 of 2 tasks
kaimoxuan123 opened this issue Aug 19, 2024 · 0 comments


@kaimoxuan123

System Info

CUDA: 12.1

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

Hello, I am studying toxicchat_dataset.py to generate instruction datasets for fine-tuning Llama 3.1.
https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/datasets/toxicchat_dataset.py#L17

```python
combined_tokens = {
    "input_ids": list(prompt_tokens),
    "labels": list(prompt_tokens)
}
return dict(combined_tokens, attention_mask=[1]*len(combined_tokens["input_ids"]))
```
Since the task is next-token prediction, why are the labels set exactly the same as the input_ids? Why don't we shift the input_ids one token to the right to get the labels?
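For reference, Hugging Face transformers causal LM models perform the shift inside the model's forward pass when computing the loss, so datasets conventionally pass labels identical to input_ids. Below is a minimal sketch of that internal convention (the function name is illustrative, not the library's exact code):

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Sketch of the shift-and-cross-entropy done inside HF causal LM heads."""
    # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len)
    # Drop the last logit and the first label so that position i is
    # scored against token i+1 (next-token prediction).
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,  # positions labeled -100 are masked out of the loss
    )
```

If the dataset shifted the labels as well, the targets would end up offset by two positions once the model applied its own shift.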

Error logs

No error logs; the question came from reading the code.

Expected behavior

The labels are the input_ids shifted one token to the right.
