-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ncdf4-based nc writing for disaggregation.R to avoid memory spike #630
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite nice now! And I also like that writing .nc files is less convoluted without relying on terra
, which essentially also uses ncdf4
. This way we also have a better understanding of what actually gets written in which way.
Like discussed earlier with @pascal-sauer, I wonder, however, whether it would not be more straightforward to implement this in magclass
directly. Also, considering that other scripts like ./extra/disaggregation_LUH2.R
also rely on magclass::write.magpie()
for writing data rich .nc
files.
From a topical prespective the disaggregation script might not be the right place to define the writing function for .nc
files. It should mostly contain processes and tools directly related to the disaggregation, particularly as we have the luxury of helper packages like maglcass
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @pascal-sauer for this incredible and in the end quite extensive work!
🐦 Description of this PR 🐦
disaggregation.R was frequently causing crashes due to memory requirements. It seems there was a very brief memory spike when terra::writeCDF was called by write.magpie, this is however not visible via sacct/maxrss when running via maxmem (where it does not crash). It is a known issue that sacct/maxrss does not always catch brief memory spikes. Running without maxmem consistently leads to crashes nevertheless, whereas the approach in this PR never crashes when run without maxmem. By adding a bunch of
rm
calls the maxrss was reduced to 3.7GB from 4.9GB. (For completeness: the old raster based approach used 6.4GB and also took 2min/33% longer.)We should also consider if we want to switch to this way of writing nc files in write.magpie in general in order to become independent of terra in that regard, and get more control over the written nc files at the same time.
🔧 Checklist for PR creator 🔧
Label pull request from the label list.
Self-review own code
magpie4
R library has been updated accordingly and backwards compatible where necessary.scenario_config.csv
has been updated accordingly (important ifdefault.cfg
has been updated)Document changes
CHANGELOG.md
goxygen::goxygen()
and verify the modified code is properly documentedPerform test runs
Rscript start.R --> "compilation check"
Rscript start.R --> "test runs"
Rscript start.R --> "test runs"
📉 Performance changes 📈
🚨 Checklist for reviewer 🚨
CHANGELOG
is updated correctly