First differencing in loss objective #72

ChrisRackauckas · 2018-04-05T13:50:10Z

finmod · 2018-04-07T13:48:03Z

Some additional indications on the algorithm are here: https://arxiv.org/pdf/1709.06383.pdf

ChrisRackauckas · 2018-04-08T05:20:20Z

The true cost function for the continuous case can be tracked down to this: https://journals.aps.org/pre/pdf/10.1103/PhysRevE.84.056214 . It's essentially just adding a differencing term to the cost function. The way they do it though is prone to instability, but we can do the first differences a lot like we did in the maximum likelihood. @Vaibhavdixit02 would you like to take that on?

ChrisRackauckas · 2018-04-26T15:28:30Z

https://github.com/JuliaDiffEq/DiffEqParamEstim.jl/blob/master/test/tests_on_odes/l2loss_test.jl

The test doesn't cover this. I'm getting NaNs when using it. @Vaibhavdixit02

ChrisRackauckas · 2018-05-08T13:51:55Z

Let's finish this up @Vaibhavdixit02

Vaibhavdixit02 · 2018-05-08T13:59:58Z

Okay 👍

finmod · 2018-05-11T12:27:33Z

@ChrisRackauckas @Vaibhavdixit02 Is this a successful closure of the 4DVAR estimation method? Do you plan to revise the docs for L2Loss or produce a docstring with references on L2Loss? I am at a loss myself on what is going on with this new L2Loss!

ChrisRackauckas · 2018-05-11T15:07:30Z

We should do the 4DVAR cost separately because it has a lot of issues in the general case. But let me summarize what's going on here, and how this is related to 4DVAR.

The author defines three versions of the 4DVAR. The strong 4DVAR has two terms:

1. Initial condition term
2. L2 loss term

Until we have SciML/DifferentialEquations.jl#251, (1) is meaningless. But (2) is what we already have, sans the non-diagonal weighting (#82), which only makes sense if you have prior information about noise correlations (in which case, use an SDE?). So strong 4DVAR is essentially what we have, that's clear.

The more interesting part is weak 4DVAR:

1. Initial condition term
2. L2 loss term
3. Model derivative term

The model derivative term is sol[i+1]-sol[i] - f(sol[i],p,t), or using the diffeq interpolants, sol[i+1]-sol[i] - sol(sol.t[i],Val{1}) if dense. This is saying that the derivative in the data should match the Euler approximation of the derivative. Obviously, this is really bad if the data is sparse. So what we've done instead is add a term that better approximates this for sparse or nonlinear data. We added a first differencing term, which is:

(sol[i+1]-sol[i]) - (data[i+1] - data[i]),

saying that the first differences in the solution must match the first differences in the data. The model derivative Euler term is not correct for sparse or nonlinear data, even when the model and parameters are perfect (i.e. the loss is not actually zero at the true parameters, so the estimator is biased). The first differencing term captures the same idea but is non-biased.
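To make the bias argument concrete, here is a minimal sketch in plain Python rather than the package's actual implementation. It assumes a toy decay model u' = -p*u sampled sparsely and without noise, and it writes the step size dt explicitly in the Euler residual, which the shorthand above leaves out. All function names (`l2_term`, `first_difference_term`, `euler_derivative_term`) are hypothetical and chosen only for illustration.

```python
import math

def first_differences(values):
    # Consecutive differences values[i+1] - values[i].
    return [b - a for a, b in zip(values, values[1:])]

def l2_term(sol, data):
    # Plain sum of squared pointwise errors.
    return sum((s - d) ** 2 for s, d in zip(sol, data))

def first_difference_term(sol, data):
    # Squared mismatch between solution and data first differences.
    pairs = zip(first_differences(sol), first_differences(data))
    return sum((ds - dd) ** 2 for ds, dd in pairs)

def euler_derivative_term(traj, ts, f, p):
    # Euler-style model derivative residual (dt written explicitly here).
    total = 0.0
    for i in range(len(traj) - 1):
        dt = ts[i + 1] - ts[i]
        total += (traj[i + 1] - traj[i] - dt * f(traj[i], p, ts[i])) ** 2
    return total

def decay(u, p, t):
    # Toy model: u' = -p * u, with exact solution exp(-p * t) for u(0) = 1.
    return -p * u

p_true = 1.5
ts = [0.0, 0.5, 1.0, 1.5, 2.0]              # sparse sampling times
data = [math.exp(-p_true * t) for t in ts]  # noise-free data
sol = list(data)                            # model solved at the true parameter

# L2 plus first differencing vanishes at the true parameter.
fd_loss = l2_term(sol, data) + first_difference_term(sol, data)

# The Euler residual does not vanish there, and its minimizer is shifted.
euler_loss = euler_derivative_term(data, ts, decay, p_true)
grid = [k / 1000 for k in range(1, 3001)]
best_p = min(grid, key=lambda p: euler_derivative_term(data, ts, decay, p))

print("L2 + first differencing at p_true:", fd_loss)
print("Euler derivative term at p_true:  ", euler_loss)
print("p minimizing the Euler term:      ", best_p)
```

In this toy setup the first differencing loss is exactly zero at the true parameter, while the Euler term is positive there and is minimized at a noticeably smaller p. The gap shrinks as the sampling gets denser.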

This first differencing term is not covered by the L2 loss term because of the nonlinearity of the squaring operation. This is easier to see on stochastic differential equation models, where there are cases in which the L2Loss alone leads to non-identifiable parameters but adding a first differencing term makes them identifiable. To construct such an example, take something like du = -p1*u*dt + p2*dW_t and make the starting ensemble (initial condition) the same distribution as the stationary distribution. If you only check the L2 term, then you're only checking whether the probability distribution is correct, which is not identifiable because the stationary distribution depends on p1 and p2 only through p2^2/p1 (if you increase p1 and p2 together so that this ratio stays fixed, you keep the same stationary distribution). However, the first differences essentially measure the autocorrelation, which differs even when the probability distribution stays the same (scaling p1 and p2 up this way keeps the same stationary probability, but a given trajectory moves around the space at a faster rate, leading to lower autocorrelation and higher differencing terms on average). So this term really is capturing something new, is non-biased, and is cheap to calculate. We did not find it in the literature, but it's clear how it's inspired by the weak 4DVAR.
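The identifiability point can be checked numerically with a short sketch in plain Python. It assumes the standard result that du = -p1*u*dt + p2*dW_t has a stationary variance of p2^2/(2*p1), so the two parameter pairs below scale both parameters to hold that variance at 0.5. It also uses the exact one-step transition of the process and one long trajectory as a stand-in for the ensemble. The helper names are hypothetical.

```python
import math
import random

def simulate_ou(p1, p2, dt, n_steps, rng):
    # Exact discretization of du = -p1*u*dt + p2*dW_t, started from
    # the stationary distribution N(0, p2^2 / (2*p1)).
    var_stat = p2 ** 2 / (2.0 * p1)
    shrink = math.exp(-p1 * dt)
    step_std = math.sqrt(var_stat * (1.0 - shrink ** 2))
    u = rng.gauss(0.0, math.sqrt(var_stat))
    path = [u]
    for _ in range(n_steps):
        u = shrink * u + step_std * rng.gauss(0.0, 1.0)
        path.append(u)
    return path

def sample_variance(path):
    mean = sum(path) / len(path)
    return sum((x - mean) ** 2 for x in path) / (len(path) - 1)

def mean_squared_difference(path):
    # Average squared first difference along the trajectory.
    diffs = [(b - a) ** 2 for a, b in zip(path, path[1:])]
    return sum(diffs) / len(diffs)

rng = random.Random(0)  # fixed seed so the run is repeatable
dt, n_steps = 0.5, 20000

slow = simulate_ou(1.0, 1.0, dt, n_steps, rng)  # p2^2 / (2*p1) = 0.5
fast = simulate_ou(4.0, 2.0, dt, n_steps, rng)  # p2^2 / (2*p1) = 0.5

for name, path in (("slow", slow), ("fast", fast)):
    print(name,
          "variance:", round(sample_variance(path), 3),
          "mean squared first difference:",
          round(mean_squared_difference(path), 3))
```

Both trajectories should show nearly the same sample variance, which is all a distribution-only comparison sees, while the mean squared first difference is clearly larger for the faster pair.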

But just in case, the model derivative term got an issue: #83 .

The last thing is that 4DVAR sometimes adds another approximate term, a polynomial interpolation. It's there to reduce the oscillations which are due to the bias of the model derivative term. A separate issue for that is here: #74. Of course, it's not a high priority if we aren't using the model derivative term.

So that's a full explanation of the cost function and what we did. Issues #82, #83, and #74 could fill out the loss function more, but they are either very model specific (matrix weights are not commonly used; even the full diagonal weighting, which we have, is uncommon) or biased estimators (the model derivative term and the Hermite anti-oscillation term).

None of those possible extra terms could help on the Lorenz tests because of the bias and the lack of true noise (matrix weights are only there to match covariances). But, we did get something in this issue that can help it out, so try out the first differencing term and see how it does.

Vaibhavdixit02 · 2018-05-11T15:57:34Z

@finmod I will definitely add docs for the recent changes as soon as possible. The API isn't different in case you want to use it as before and any previous work will not be affected 👍

ChrisRackauckas · 2018-05-11T15:59:33Z

We need to get benchmarking for it too

ChrisRackauckas changed the title ~~4DVAR Cost function~~ First differencing in loss objective Apr 8, 2018

ChrisRackauckas closed this as completed May 11, 2018
