Pre-initialize output xarray time coordinates and variables #68
Comments
@aufdenkampe, I started working on this late last week by creating a profiling script and notebook as a benchmark. My plan for this week was to pre-init and run the same suite of tests.
These could also serve as a point of comparison for any testing you do with xarray-simlab. I can work on making the profiling script more generalized and user-friendly.
@sjordan29, having a common set of profiling scripts that we use for our various test branches sounds very useful! Thank you! I just created a new Milestone for tracking these collective efforts: Performance enhancements for NSM demo, and I want to create a number of specific issues for us to each test and characterize with our different approaches.
Pasting #57 (comment) (from @xaviernogueira on Dec. 14) here, to keep discussion on this specific issue in one place:
TO DO: hot start dataset; increment_timestep #68
#68 todo: deal with timesteps and potentially change typehints update tests
@aufdenkampe, some findings (also outlined in performance_profiling_tsm.ipynb). I pre-initialized the model xarray and compared it against the baseline model.
When we track dynamic variables, we see some interesting behavior. On the graphs, each panel has the time per iteration on the y-axis and the grid size on the x-axis, and each panel represents a different number of iterations (e.g., the top panel is for 1 iteration, the second panel is for 10 iterations, etc.). For a low number of iterations and 1-1,000 cells, the baseline is faster, but as we get to large numbers of cells and timesteps, the baseline computation time takes off while the updated computation time remains steady. I think the pre-initialized model is slower than the baseline up front because of the additional memory overhead required to store an empty array over both space and time for all dynamic variables.
When we stop tracking dynamic variables and compare the two models, we see better outcomes for our updated pre-initialized model. For single-cell models, it is slightly slower (by <0.002 seconds), but once we get to more than one cell, the pre-initialized model is faster. At high numbers of cells and timesteps, the pre-initialized model is more than 10x faster than the baseline model. I imagine we can get the 1-cell model faster by:
I'd also propose we set the default to not track dynamic variables, so that new users don't incur this computational expense by default (currently the default is to track dynamic variables).
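For context, a minimal sketch of the kind of timing sweep described above might look like the following. The run_model entry point and its arguments (n_cells, n_timesteps, track_dynamic_variables) are assumptions for illustration, not the actual ClearWater-modules API:

```python
# Illustrative profiling sketch (assumed run_model signature, not the real API):
# time a model run across grid sizes and iteration counts, and report the
# mean wall-clock seconds per iteration for each combination.
import time

import pandas as pd


def profile_runs(run_model, grid_sizes=(1, 10, 100, 1_000, 10_000),
                 iteration_counts=(1, 10, 100, 1_000),
                 track_dynamic_variables=False):
    records = []
    for n_cells in grid_sizes:
        for n_timesteps in iteration_counts:
            start = time.perf_counter()
            run_model(
                n_cells=n_cells,
                n_timesteps=n_timesteps,
                track_dynamic_variables=track_dynamic_variables,
            )
            elapsed = time.perf_counter() - start
            records.append({
                "n_cells": n_cells,
                "n_timesteps": n_timesteps,
                "seconds_per_iteration": elapsed / n_timesteps,
            })
    return pd.DataFrame(records)
```

Running such a sweep once with track_dynamic_variables=True and once with False would reproduce the two comparisons described above.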
@sjordan29, those are really interesting results! Thanks for that impressive analysis. Your suggestions for next steps all sound spot-on. The performance advantages are quite impressive for >10,000 time steps and grids of >1,000 cells, which I think is where we need it the most. It might be interesting to try a 10-cell or 100-cell model, just to get an idea of some intermediate differences. For the slowdowns with larger numbers of time steps, I'm thinking that the answer might be:
Great -- I'll run those additional tests to fill in the gaps. Then I think I can clean this branch up and merge to main, and we can pivot to some of the memory management issues.
@aufdenkampe, a couple questions/notes while I get the tests updated.
I like both of your suggestions.
Sounds good. Now that I am thinking of it, I will probably also need to update the example notebooks and documentation. Do you want me to work on a PR to main, or should we keep these speed/memory enhancements on a dev branch for now?
@sjordan29, that all sounds awesome. I'm a fan of keeping feature branches short-lived, so merging early and often. That said, if the example notebooks won't run, then it's probably a good idea to fix those first. More extensive docs could come in a second PR.
Pre-initializing an array at the start of a computation is well known to improve performance, because updating values in place is much more CPU- and memory-efficient than resizing or appending to the array.
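As a rough illustration of that point (a sketch only, not the actual ClearWater-modules code), compare writing each timestep into a pre-allocated (time, cell) array against growing the output with a concatenation every step:

```python
# Sketch: pre-allocate the full (time, cell) output once and write each
# timestep into it, instead of concatenating a new slice onto the output
# every step. Dimension names and sizes here are illustrative.
import numpy as np
import xarray as xr

n_timesteps, n_cells = 10_000, 1_000

# Pre-initialized output: allocate once, then assign in place each timestep.
output = xr.DataArray(
    np.empty((n_timesteps, n_cells)),
    dims=("time", "cell"),
    coords={"time": np.arange(n_timesteps), "cell": np.arange(n_cells)},
)
state = np.zeros(n_cells)
for t in range(n_timesteps):
    state = state + 1.0   # stand-in for the real timestep computation
    output[t, :] = state  # cheap in-place write into existing memory

# The append-style alternative (e.g., xr.concat([output, new_slice], dim="time")
# each step) re-allocates and copies the growing array every iteration, which
# is why per-step cost climbs with the number of timesteps.
```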
This issue separates a sub-thread started in this issue: numba or not? #57 (comment). In that comment, @xaviernogueira describes the results of a simple test in the compute_engine_testing.ipynb notebook (currently in the 57-improve-timestep-performance-further branch) that showed it improved runtime by 4x-5x. See the link above for a discussion of these results.
NOTE that we expect all users to know exactly how many time steps they want to run before starting a simulation, so we do not need to preserve that current flexibility in the modeling system.