Any easier/faster way to obtain data? #8
Comments
@piee-kun
I already have both my data and a DiscordChatExporter capture of the entire 3.5 million message channel. Can that be used?
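For anyone in the same situation, here is a minimal sketch of pulling one author's messages out of an existing export instead of re-mining the server. It assumes the export was made in DiscordChatExporter's JSON format, with a top-level `messages` list whose entries carry `author.name`, `timestamp`, and `content`. The function names and the output columns (`author`, `timestamp`, `content`) are hypothetical and are not confirmed to match the layout Mimicbot expects, so adjust them to whatever the pipeline actually reads.

```python
import csv
from typing import Dict, List


def extract_author_messages(export: Dict, author_name: str) -> List[Dict[str, str]]:
    """Return the non-empty messages written by `author_name`.

    `export` is an already-parsed export (for example the result of
    json.load on the export file). The key names used here are an
    assumption about the export layout.
    """
    rows = []
    for message in export.get("messages", []):
        author = message.get("author") or {}
        content = (message.get("content") or "").strip()
        # Skip other authors and messages with no text (e.g. attachment-only).
        if author.get("name") == author_name and content:
            rows.append(
                {
                    "author": author_name,
                    "timestamp": message.get("timestamp", ""),
                    "content": content,
                }
            )
    return rows


def write_rows(rows: List[Dict[str, str]], out_path: str) -> None:
    """Write the filtered rows to a CSV file with hypothetical column names."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["author", "timestamp", "content"])
        writer.writeheader()
        writer.writerows(rows)
```

Usage would look like `rows = extract_author_messages(json.load(open("export.json", encoding="utf-8")), "your-username")` followed by `write_rows(rows, "messages.csv")`, where both file names are placeholders. Parsing a 3.5 million message export this way loads the whole file into memory, so a streaming JSON parser may be needed on smaller machines.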
@piee-kun
The pipeline will then train the model with your
Thanks so much for the info and for taking time out of your day to help me. I'll get back to you with the results.
Well, I've formatted the data properly and now I'm stuck on step 2 of initializing. I left it running for basically the entire day and nothing came of it.
@piee-kun I'm sorry to hear that. Could you please share the logs and anything else you think may be useful for figuring out what is wrong? In its current state Mimicbot is not very fault tolerant, so I can imagine a couple of things that could go wrong, but your logs will help a lot.
Where are the logs stored? I see none generated.
@piee-kun Hi, sorry for the late reply. Unfortunately the loading bar is broken and will never move, even though training is actually making progress. That is due to the utilization of mimicbot/mimicbot_cli/train.py Lines 436 to 437 in 039c95b
The error is telling you that after the first epoch, on the first save, the repo name is invalid. I am not sure whether you explicitly set the repo name. If you did, try changing the name to satisfy the requirements listed in the error. If you did not, I suggest explicitly setting a name (it is prompted at the beginning of the Let me know how it goes.
I'm trying to use a dataset of 100k of my own messages among 3.5 million other messages. The data collection takes over 5 hours, and I'm not letting it finish due to time constraints. Is there any faster way to obtain training data than mining it from the server itself?