Using MBCn Algorithm in Cross-Validation Mode #1937
Replies: 2 comments
-
Any update on this?
-
Sorry for not answering earlier! When implementing sdba I don't think we ever considered the possibility of having holes in the time dimension. Even more so in the current version of
One thing you can try is to remove the years but re-generate a "synthetic" time coordinate so that there are no holes in the end. For example, take data from 1976 to 2005. In sample 1, drop the even years and then patch the time coordinate with something going from 1976 to 1990. Do the same with sample 2, but drop the odd years.
Your discussion made us realize that this was maybe not communicated correctly in the docstring; we should change that. My colleague @coxipi is working on a version that relaxes some of these requirements, but the performance is still not there. MBCn is complex, and achieving a balance between generality (which would allow funky things like what you are trying to do) and performance (so that it can be used with real data) is hard.
Hope this helps!
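The idea of dropping years and then patching the time coordinate could be sketched roughly as below. This is only an illustration on random data: the helper name drop_years_and_repatch, the variable name tas and the daily frequency are assumptions, and no xclim function is called. Note that with a standard calendar the synthetic axis drifts by a few days relative to the original day of year, because the kept years and the synthetic years do not contain the same leap days; converting to a no-leap calendar beforehand would probably be cleaner, but that is not shown here.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series standing in for real data (1976-2005).
time = pd.date_range("1976-01-01", "2005-12-31", freq="D")
tas = xr.DataArray(
    np.random.default_rng(0).normal(280.0, 5.0, time.size),
    dims="time",
    coords={"time": time},
    name="tas",
)


def drop_years_and_repatch(da, keep_odd, start="1976-01-01"):
    """Keep only odd (or even) years, then replace the time coordinate
    with a contiguous daily axis starting at `start` (hypothetical helper)."""
    is_odd = da.time.dt.year % 2 == 1
    keep = is_odd if keep_odd else ~is_odd
    # Remove the unwanted years entirely (this leaves holes in time).
    subset = da.where(keep, drop=True)
    # Synthetic, gap-free daily axis of the same length.
    synthetic = pd.date_range(start, periods=subset.sizes["time"], freq="D")
    return subset.assign_coords(time=synthetic)


sample1 = drop_years_and_repatch(tas, keep_odd=True)   # even years dropped
sample2 = drop_years_and_repatch(tas, keep_odd=False)  # odd years dropped
```

Each sample then has a time axis without holes, at the cost of timestamps that no longer correspond to the real dates.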
-
Setup Information
Xclim version: 0.52.0
Python version: 3.10.14
Operating System: Linux
Context
Hi all,
I'm trying to implement the MBCn algorithm in a cross-validation framework. Specifically, I want to split my historical simulation data into two parts: a training period and a validation period. I've decided to use odd-numbered years for training the algorithm and then adjust the even-numbered years based on the calculated adjustment factors from the training data.
To evaluate the performance of MBCn for simultaneously bias-correcting surface temperature and precipitation, I created six test cases. However, I've encountered differences between the results that I can't fully explain.
Test Cases
Test 1
Training: Full historical time series with even-numbered years masked (using the where operator in xarray).
Application: Full historical time series with odd-numbered years masked.
Test 2
Training: Full historical time series with even-numbered years masked.
Application: Full historical time series with odd-numbered years removed (using where with drop=True in xarray).
Test 3
Training: Full historical time series with even-numbered years masked.
Application: Full historical time series (no years masked or removed).
Test 4
Training: Full historical time series with even-numbered years removed.
Application: Full historical time series with odd-numbered years masked.
Test 5
Training: Full historical time series with even-numbered years removed.
Application: Full historical time series with odd-numbered years removed.
Test 6
Training: Full historical time series with even-numbered years removed.
Application: Full historical time series (no years masked or removed).
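To make the difference between "masked" and "removed" in the test cases above concrete, here is a minimal sketch on random data. The variable name tas and the 1976-2005 daily time axis are assumptions used only for illustration; this is not the actual test script.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series standing in for the historical simulation.
time = pd.date_range("1976-01-01", "2005-12-31", freq="D")
tas = xr.DataArray(
    np.random.default_rng(0).normal(280.0, 5.0, time.size),
    dims="time",
    coords={"time": time},
    name="tas",
)

is_even = tas.time.dt.year % 2 == 0

# "Masked": the time axis keeps its full length, even years become NaN.
masked = tas.where(~is_even)

# "Removed": even years are dropped, so the time axis is shorter
# and has one-year gaps between the remaining odd years.
removed = tas.where(~is_even, drop=True)

print(tas.sizes["time"], masked.sizes["time"], removed.sizes["time"])
```

The masked series has the same length as the original, while the removed series only contains the odd-numbered years.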
Results
It seems that the MBCn.adjust function fails because the application time series is shorter than the training time series. Could this be a limitation of the algorithm? Should this restriction be maintained, or is there a workaround?
Conclusion
Based on my experiments, it seems that for cross-validation, data should be masked rather than removed from the training dataset. However, Test 2 points to a potential limitation in MBCn, where the algorithm cannot adjust a sim time series that is shorter than the hist and ref time series. Interestingly, the reverse situation (a longer sim time series) does not seem to cause issues.
I would appreciate any insights or recommendations on how to address these differences.
Thanks in advance for your help!
Best regards,
Sylvain
Steps To Reproduce
To reproduce my tests: https://cloud.meteo.be/s/egKyBbdgaBFxE4d