Windows ramplot not valid #434
Comments
Why does it still create a temp file on the drive when plotting in RAM?
It writes to the final plot file during the plotting process, so the final write takes less time, as it can commit some of the data as it goes.
Any ideas regarding the crashing of plots higher than C1, and them being invalid even though the log says the plot completed?
No one? I've got one. It's just poorly optimized. I read somewhere that bladebit does not like 2 NUMA nodes. With 1 node it worked at first, but it still crashed after 2 invalid plots. All ramplots are invalid, C0 through C5. Even diskplot is off. Cheap NVMe (read ±400, write 750 MB): 121 min. Better NVMe (±700 read, ±1.1 to 1.4 GB a second): 62 min, so that scales fairly decently. On a datacenter stripe array that reads 5.5 GB a second and writes around 7 GB a second (the drives are sustained-write optimized) it takes 76 minutes. There is absolutely no rhyme or reason to these times. I get the gnarly feeling that it is mostly AMD optimized. I could be wrong, but other diskplotters with similar read and write speeds that are AMD based hit sub-20 to low-20 minutes. Some are even faster....
This is incorrect. bladebit's The only things that changed in the latest v3 were minor things to accommodate compressed plotting. There could potentially be a bug there in phase 3, but we never encountered any during testing. If you can provide plot IDs and compression levels for the plots you created that were invalid, I could reproduce them locally to see if I encounter the problem as well. If you have some full logs of ramplot's faulty plots, please post a few as well and I can take a look.
Are the corrupt ones only in Windows, by the way?
Yes sir, they are Windows based. Where could I find these? Do you mean the actual IDs of the plots? If so, I will post them. As for logs: at first it just crapped out at the start, and after reading about the NUMA dislike I changed it to no more than 16 threads, thus using 1 NUMA node, and that worked at first.... Later it produces no more than 2 or 3 plots before crashing. That's just it: the logs state the plot completed successfully with no errors, but the farmer node does not recognise the plots as valid.
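On where to find the plot ID and compression level: in the log posted in this issue, the ID printed on the "Generating plot" line also appears in the plot file name, next to a `c01` part that matches the compression level. A minimal sketch for pulling both out of a file name follows. The pattern is an assumption based only on that one file name, and `describe_plot` is a hypothetical helper, not part of bladebit.

```python
import re
from pathlib import PureWindowsPath

# Pattern assumed from the single file name shown in this issue's log:
# plot-k32-c01-2023-10-16-06-17-<64 hex characters>.plot (optionally .tmp)
PLOT_NAME = re.compile(
    r"plot-k(?P<k>\d+)-c(?P<level>\d+)-"
    r"\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-"
    r"(?P<plot_id>[0-9a-f]{64})\.plot(?:\.tmp)?$"
)


def describe_plot(path):
    """Return (plot_id, compression_level, k) parsed from a plot file name.

    Returns None when the name does not follow the assumed pattern.
    """
    name = PureWindowsPath(path).name  # handles paths such as G:\plot-....plot
    match = PLOT_NAME.match(name)
    if match is None:
        return None
    return match["plot_id"], int(match["level"]), int(match["k"])
```

Running it over the plot file names in the destination folder gives the ID and compression level pairs to post. File names that use a different layout simply return `None`.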
OK, I think I got the problem of compressed plots not opening sorted. It was my bad and not the harvester: decompression was turned off. I discovered this after reading the debug logs. I started a series of 10 C5 plots. The 1st C5 did seize up at the start and never got to allocating buffers. I deleted it and, surprisingly, it started with the 2nd plot... I've got an autoremover, so I hope it will crank out 10 plots. Oh no, sorry, 9: the first one did choke at the start.
So it is hit and miss. It never makes it past the third plot: it makes 1, I delete the next one where it keeps hanging, and then it will start the next. It is not set and forget but rather set and forget about it. As if plotting on its own when automated did not take long enough, I now have to keep an eye out for when it crashes, because it will!! Otherwise it runs all night burning electricity on a plot that is never going to finish. Maybe code in a sort of counter that exits the current plot when the phase time equals or exceeds some number x and then starts the next one in the batch. It's a fairly safe assumption that when a phase, or rather a step within it (let's say propagating, sorting or computing Fx), is not completed after, say, 1800 seconds, it most likely never will be. It could be as simple as defining a subroutine or function for the counter; that is a one-time definition that you can call anytime it's needed. It most definitely will put some overhead on the process, but I think a counter function or sub will be negligible compared to the alternative: running a routine all night that is never going to finish. And I know for a fact that this is a problem more users have when the plotter craps out. It would be quite useful to know that, in the event that it does crash while plotting, the program will stop, clean up after itself and go about its business plotting. This can be applied in the same way to RAM, disk and CUDA plotting alike, I think. And quite frankly, I think my old coding teacher would have killed me for overlooking such an obvious single point of failure and not coding in a safety to handle it.
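The timeout idea above can also be tried outside the plotter. The following is a rough sketch of an external watchdog, not a change to bladebit itself: it runs any command, echoes its output, and kills the process when no new output line has appeared for `stall_seconds`. The function name and parameters are made up for illustration, and it assumes the plotter prints its progress lines promptly when its output is piped.

```python
import subprocess
import sys
import threading
import time


def run_with_stall_timeout(cmd, stall_seconds, poll_seconds=0.1):
    """Run cmd and kill it if it prints nothing for stall_seconds.

    Returns (exit_code, stalled); stalled is True when the watchdog
    killed the process.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    last_output = [time.monotonic()]  # list so the reader thread can update it

    def pump():
        # Echo each output line and remember when it arrived.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            sys.stdout.write(line)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    stalled = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > stall_seconds:
            stalled = True
            proc.kill()  # give up on this plot
            break
        time.sleep(poll_seconds)

    proc.wait()
    reader.join(timeout=5)
    return proc.returncode, stalled
```

A batch script could call this once per plot with the command line it already uses and `stall_seconds=1800`, then delete the leftover `.plot.tmp` file and move on when `stalled` is true. For scale, the longest single step in the C1 log in this issue is a 306.08 second sort, so 1800 seconds without any output would be well outside what that run showed.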
OK, it plowed through 7 plots uninterrupted. That's a plus. Now I can dig into GPU plotting, as the riser and bracket for my server finally got delivered in the mail this afternoon. But I am still going to check whether I can add a boundary of some sort to define the scope of "normal" operating parameters. I still think that 30 minutes for allocating buffers and resources will more than suffice, at least for RAM and CUDA plotting. No need to reinvent the wheel; that is well beyond the scope of my coding capabilities and, let's be honest, after my first peek into the code it is a very well designed wheel already. Even the temp cleanup is already arranged in the code, I think, because it literally is done at the end.
After carefully reading your reply stating that it is the most stable out of the 3 (so you do know there are issues with the 3): that is like spinning 3 spinning tops, A at 100 mph, B at 75 mph and C at 50 mph, and stating that A is the most stable (it spins the longest), which it is, due to the higher rpm and the gyroscopic effect. But the end result will be the same for all 3: they end up lying on their side.....
That said, the GPU plotter crashed after 1 plot out of 10. After I deleted that plot, the run did finish. On a run of 25 it crashed after 16. Luckily I caught it, and after checking the logs from the mover I concluded that it hung on allocating resources for over 45, or should I say only 45, minutes.
I have an issue with ramplot. None of the C1 through C4 plots are valid. With C5 it just craps out and hangs; it will not even plot. The generated logs for the plots do not mention any error of any kind, so the plots are presumed valid/succeeded.
Can't seem to find any info on this problem.
Dual Xeons with 512 GB of RAM, so a RAM shortage can be ruled out. Using the bladebit command line it just says system mem and closes. Any help/pointers are much appreciated.
After inspection it just successfully generated C1 plots\the other
Bladebit Chia Plotter
Version : 3.1.0
Git Commit : e9836f8
Compiled With: msvc 19.29.30152
[Global Plotting Config]
Will create 1 plots.
Thread count : 32
Warm start enabled : false
NUMA disabled : false
CPU affinity disabled : false
Farmer public key : ******
Pool contract address : ******
Compression Level : 1
Benchmark mode : disabled
System Memory: 501/511 GiB.
Memory required: 416 GiB.
Allocating buffers.
Generating plot 1 / 1: 26068862a560d7ccd78a902d3166d79a8c992c309de2093b3b7f4aee54123327
Plot temporary file: G:\plot-k32-c01-2023-10-16-06-17-26068862a560d7ccd78a902d3166d79a8c992c309de2093b3b7f4aee54123327.plot.tmp
Running Phase 1
Generating F1...
Finished F1 generation in 34.92 seconds.
Sorting F1...
Finished F1 sort in 162.35 seconds.
Progress update: 0.01
Forward propagating to table 2...
Pairing L/R groups...
Finished pairing L/R groups in 36.5480 seconds. Created 4294967296 pairs.
Average of 236.1406 pairs per group.
Computing Fx...
Finished computing Fx in 36.3710 seconds.
Sorting entries...
Finished sorting in 306.08 seconds.
Finished forward propagating table 2 in 379.74 seconds.
Progress update: 0.06
Forward propagating to table 3...
Pairing L/R groups...
Finished pairing L/R groups in 26.5890 seconds. Created 4294955440 pairs.
Average of 236.1400 pairs per group.
Computing Fx...
Finished computing Fx in 36.5790 seconds.
Sorting entries...
Finished sorting in 244.27 seconds.
Finished forward propagating table 3 in 308.82 seconds.
Progress update: 0.12
Forward propagating to table 4...
Pairing L/R groups...
Finished pairing L/R groups in 25.7270 seconds. Created 4294967296 pairs.
Average of 236.1406 pairs per group.
Computing Fx...
Finished computing Fx in 29.9160 seconds.
Sorting entries...
Finished sorting in 233.35 seconds.
Finished forward propagating table 4 in 289.77 seconds.
Progress update: 0.2
Forward propagating to table 5...
Pairing L/R groups...
Finished pairing L/R groups in 25.9710 seconds. Created 4294967296 pairs.
Average of 236.1406 pairs per group.
Computing Fx...
Finished computing Fx in 38.4280 seconds.
Sorting entries...
Finished sorting in 233.17 seconds.
Finished forward propagating table 5 in 298.95 seconds.
Progress update: 0.28
Forward propagating to table 6...
Pairing L/R groups...
Finished pairing L/R groups in 26.3800 seconds. Created 4294967296 pairs.
Average of 236.1406 pairs per group.
Computing Fx...
Finished computing Fx in 36.7030 seconds.
Sorting entries...
Finished sorting in 239.34 seconds.
Finished forward propagating table 6 in 303.30 seconds.
Progress update: 0.36
Forward propagating to table 7...
Pairing L/R groups...
Finished pairing L/R groups in 26.5250 seconds. Created 4294967296 pairs.
Average of 236.1406 pairs per group.
Computing Fx...
Finished computing Fx in 34.5100 seconds.
Finished forward propagating table 7 in 62.39 seconds.
Progress update: 0.42
Finished Phase 1 in 1840.25 seconds.
Running Phase 2
Prunning table 6...
Finished prunning table 6 in 0.95 seconds.
Progress update: 0.43
Prunning table 5...
Finished prunning table 5 in 41.85 seconds.
Progress update: 0.48
Prunning table 4...
Finished prunning table 4 in 40.06 seconds.
Progress update: 0.51
Prunning table 3...
Finished prunning table 3 in 43.31 seconds.
Progress update: 0.55
Finished Phase 2 in 126.63 seconds.
Running Phase 3
Compressing tables 2 and 3...
Finished compressing tables 2 and 3 in 122.39 seconds
Progress update: 0.73
Table 2 now has 3439886779 / 4294955440 entries ( 80.09% ).
Compressing tables 3 and 4...
Finished compressing tables 3 and 4 in 122.18 seconds
Progress update: 0.79
Table 3 now has 3466099063 / 4294967296 entries ( 80.70% ).
Compressing tables 4 and 5...
Finished compressing tables 4 and 5 in 123.88 seconds
Progress update: 0.85
Table 4 now has 3533016750 / 4294967296 entries ( 82.26% ).
Compressing tables 5 and 6...
Finished compressing tables 5 and 6 in 130.37 seconds
Progress update: 0.92
Table 5 now has 3713709595 / 4294967296 entries ( 86.47% ).
Compressing tables 6 and 7...
Finished compressing tables 6 and 7 in 156.02 seconds
Progress update: 0.98
Table 6 now has 4294967296 / 4294967296 entries ( 100.00% ).
Finished Phase 3 in 654.84 seconds.
Running Phase 4
Writing P7.
Finished writing P7 in 1.48 seconds.
Writing C1 table.
Finished writing C1 table in 0.01 seconds.
Writing C2 table.
Finished writing C2 table in 0.00 seconds.
Writing C3 table.
Finished writing C3 table in 1.68 seconds.
Finished Phase 4 in 3.16 seconds.
Writing final plot tables to disk
G:\plot-k32-c01-2023-10-16-06-17-26068862a560d7ccd78a902d3166d79a8c992c309de2093b3b7f4aee54123327.plot.tmp -> G:\plot-k32-c01-2023-10-16-06-17-26068862a560d7ccd78a902d3166d79a8c992c309de2093b3b7f4aee54123327.plot
Final plot table pointers:
Table 1: 4096 ( 0x0000000000001000 )
Table 2: 4096 ( 0x0000000000001000 )
Table 3: 14001424784 ( 0x00000003428cc990 )
Table 4: 28090921184 ( 0x000000068a5970e0 )
Table 5: 42452428634 ( 0x00000009e25ca75a )
Table 6: 57548442509 ( 0x0000000d66278b8d )
Table 7: 75007232909 ( 0x0000001176c78b8d )
C 1 : 92723973005 ( 0x0000001596c78b8d )
C 2 : 92725690997 ( 0x0000001596e1c275 )
C 3 : 92725691173 ( 0x0000001596e1c325 )
Final plot table sizes:
Table 1: 0.00 MiB
Table 2: 13352.80 MiB
Table 3: 13436.79 MiB
Table 4: 13696.20 MiB
Table 5: 14396.68 MiB
Table 6: 16650.00 MiB
Table 7: 16896.00 MiB
C 1 : 1.64 MiB
C 2 : 0.00 MiB
C 3 : 1228.80 MiB
Finished writing tables to disk in 20.77 seconds.
Finished plotting in 2645.66 seconds (44.09 minutes).
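Since a log like the one above can read as complete even when the farmer rejects the plot, it may help to at least separate logs that ran to the end from ones that stopped early. A small sketch, assuming only the "Finished Phase N" and "Finished plotting" line formats visible in this log; `summarize_log` and `looks_complete` are hypothetical helper names.

```python
import re

# Line formats assumed from the log pasted in this issue.
PHASE_LINE = re.compile(r"Finished Phase (\d) in ([\d.]+) seconds")
TOTAL_LINE = re.compile(r"Finished plotting in ([\d.]+) seconds")


def summarize_log(text):
    """Return ({phase_number: seconds}, total_seconds or None) for one plot log."""
    phases = {int(n): float(s) for n, s in PHASE_LINE.findall(text)}
    total = TOTAL_LINE.search(text)
    return phases, (float(total.group(1)) if total else None)


def looks_complete(text):
    """True if all four phases and the final 'Finished plotting' line are present."""
    phases, total = summarize_log(text)
    return set(phases) == {1, 2, 3, 4} and total is not None
```

For the C1 log above, the four phases add up to 2624.88 seconds, and together with the 20.77 second final write that comes to 2645.65 seconds, which matches the reported 2645.66 seconds within rounding. A log that stops at "Allocating buffers" would be flagged as incomplete. This only checks that the run finished; it says nothing about whether the plot is valid.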
C5:
Bladebit Chia Plotter
Version : 3.1.0
Git Commit : e9836f8
Compiled With: msvc 19.29.30152
[Global Plotting Config]
Will create 1 plots.
Thread count : 32
Warm start enabled : false
NUMA disabled : false
CPU affinity disabled : false
Farmer public key : ******
Pool contract address : ******
Compression Level : 5
Benchmark mode : disabled
System Memory: 501/511 GiB.
Memory required: 416 GiB.
Allocating buff