Crash when combining --unique --count-only --matches with large files #57
Comments
No idea. How huge is the JSON file? Could you provide the output with |
File size is 300 MB
|
Ok, looks like a memory corruption in MoarVM :-(. Multi-threading is hard! Does it also crash with OOC, which version of Rakudo are you using? |
|
Hmmm. Also: you don't appear to be using any JSON-specific functionality. So I tried this on an 800 MB text file I have for this purpose. It passes, but on macOS with Apple Silicon. Are you by chance running on Intel hardware? If so, could you try running it with If that makes a difference, then please upgrade to 2024.07. We identified issues in the expression JIT compiler: 1. it caused instability, and 2. overall, it slowed down execution on Intel hardware. So it has been disabled since the 2024.07 release. |
My bad for the arg I have a shortcut for |
Hmm.. then I would suggest you upgrade to 2024.07 anyway: there's been some other work on MoarVM as well, and maybe, just maybe it got fixed. |
I will upgrade for sure, but with a different JSON file of 200 MB it does indeed seem to work fine. With another file of just 56 MB it keeps crashing. |
I'm attaching the small file I use to reproduce the bug |
I've managed to get an error that is a bit more useful:
There must be a link with paraseq! |
If you run this command several times, you will get a different number of matches each time:
but if you add |
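One way to make the "different number of matches on each run" observation easy to check is a small shell helper that repeats a command and counts how many distinct outputs it produced. This is only a sketch: the function name `count_distinct_outputs` is hypothetical, and the `rak` invocation in the comment reuses the command and file path from the original report rather than the (unquoted) command referred to above.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print the number of
# distinct outputs seen. 1 means every run agreed; anything higher
# suggests nondeterminism such as a race between threads.
count_distinct_outputs() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Reduce each run (stdout and stderr) to one checksum line, so
        # multi-line output still counts as a single result.
        "$@" 2>&1 | cksum || true
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}

# Example (assumed invocation, taken from the original report):
#   count_distinct_outputs 10 rak keyword --count-only --unique --matches /tmp/small-file.txt
```

A result above 1 does not say where the race is, only that repeated runs disagree, which is usually enough to confirm that a change made the behaviour stable.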
Interesting. The |
Indeed. But |
OOC, why the |
Sadly this doesn't tell it what the exception was. This happens on |
I've just uploaded a 0.3.11 release of |
This gives me all unique matched patterns with different case, which is useful for me. |
Managed to get this:
Useful? |
VERY useful. But first some sleep :-) |
An update from my side: started working on golfing the issue, not a lot of luck so far. |
OK, thanks. Let me know if you need anything else. |
I've just uploaded 0.3.13, which should hopefully fix your query, albeit at the expense of being less asynchronous. I don't fully understand yet how this happens, but this should be a lot more stable. Please let me know if this didn't fix it. Also, today I uncovered an issue in Rakudo with the use of |
The good news is that the latest version has fixed the issue on both of my test files, but it's quite slow. |
Doubtful any time soon. It's not really a technical issue. The workaround is to make sure there is an unshared lexical Ideally we would like to have each scope have its own lexical if /foo/ { say $/ } # or $0 or $<bar> which, apart from being a common pattern, is currently also cemented in roast. |
Status update: I think I've been able to reproduce your original problem with a 10 line script. So there's progress :-) |
Glad to hear that. |
Just to be clear: in order to prevent regex multithreading issues, you just need to add |
Yes. |
Hi Liz,
Thanks for implementing the feature request! After doing some testing, I've run into a bug when you do this:
rak keyword --count-only --unique --matches /tmp/huge-file.json
=> this crashes instantly
rak keyword --count-only --unique --matches /tmp/small-file.txt
=> this works
rak keyword --count-only /tmp/huge-file.json
=> this works
I'm using rak version
Any idea why it crashes only with huge files?
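When comparing variants like the three above, it can help to record how each run ended, since a process killed by a signal (for example a segfault in the VM) looks different to the shell than one that exits with an ordinary error. The following is a minimal sketch; the helper name `describe_exit` is hypothetical, and it assumes a shell that reports death by signal as 128 plus the signal number, as sh, dash and bash do.

```shell
#!/bin/sh
# Hypothetical helper: run a command, discard its output, and print a
# one-line description of how it ended.
describe_exit() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        # Assumes the 128 + signal number convention (139 would be SIGSEGV).
        echo "killed by signal $((status - 128))"
    else
        echo "failed with exit status $status"
    fi
}

# Example, using the commands from the report:
#   describe_exit rak keyword --count-only --unique --matches /tmp/huge-file.json
#   describe_exit rak keyword --count-only /tmp/huge-file.json
```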