MiniCPM-o-2_6 data processing issue #6793
Labels: solved (This problem has been already solved)
Comments
cc @BUAADreamer
@jinzhuoran Could you try changing it along the lines of this PR and check whether it works correctly?
Thanks for the reply, the problem is solved.
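I'd also like to report another issue: setting
may require some empty-input handling for the audio input.
@jinzhuoran Could you post the relevant code line numbers?
Updating to the latest modeling_minicpmo.py code is enough.
@jinzhuoran Is it working now?
Thanks, it works now.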
Reminder
System Info
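When a single image is passed in, https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/mm_plugin.py#L555 computes valid_image_nums=3. This is probably caused by the condition input_ids_ == processor.tokenizer.slice_end_id at https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/mm_plugin.py#L551: one image is split into multiple slices, so each slice end appears to be counted as an image. As a result, https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/mm_plugin.py#L519 computes the wrong values. For example, samples near the front of a batch are assigned multiple images while samples near the back get empty images, possibly because the parameter at https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/mm_plugin.py#L538 is not used.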
How can this problem be resolved?
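To make the suspected miscount concrete, here is a minimal sketch in plain Python. It is not the code from mm_plugin.py: the token ids IM_END_ID and SLICE_END_ID and the helpers count_end_tokens and split_images are hypothetical names invented for illustration. It only shows how counting slice-end tokens as images can inflate the per-sample image count, so that earlier samples in a batch take images meant for later ones.

```python
# Hypothetical token ids, chosen only for illustration (not the real tokenizer ids).
IM_END_ID = 101
SLICE_END_ID = 102


def count_end_tokens(input_ids, end_ids):
    """Count how many tokens in input_ids are members of end_ids."""
    return sum(1 for tok in input_ids if tok in end_ids)


def split_images(images, counts):
    """Hand out images to samples in order: counts[i] images go to sample i."""
    out, pos = [], 0
    for n in counts:
        out.append(images[pos:pos + n])
        pos += n
    return out


# Two samples, each containing ONE image that was split into a full-image
# span plus two slice spans (0 stands for ordinary text tokens).
sample = [0, IM_END_ID, SLICE_END_ID, SLICE_END_ID, 0]
batch_ids = [sample, sample]
images = ["img_a", "img_b"]

# Counting slice ends as images yields 3 per sample.
inflated = [count_end_tokens(ids, {IM_END_ID, SLICE_END_ID}) for ids in batch_ids]
# Counting only image-level ends yields 1 per sample.
per_image = [count_end_tokens(ids, {IM_END_ID}) for ids in batch_ids]

print(inflated, split_images(images, inflated))
print(per_image, split_images(images, per_image))
```

With the inflated counts, the first sample receives both images and the second receives none, which matches the symptom described above. Using a per-sample image count that is known up front, rather than one derived from slice tokens, avoids this in the sketch.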
Reproduction
Others
No response