First differencing in loss objective #72

Closed
ChrisRackauckas opened this issue Apr 5, 2018 · 9 comments

Comments

@ChrisRackauckas
Member

https://d-nb.info/1116709740/34

@finmod
Contributor

finmod commented Apr 7, 2018

Some additional indications on the algorithm are here: https://arxiv.org/pdf/1709.06383.pdf

@ChrisRackauckas
Member Author

The true cost function for the continuous case can be tracked down to this: https://journals.aps.org/pre/pdf/10.1103/PhysRevE.84.056214 . It's essentially just adding a differencing term to the cost function. The way they do it though is prone to instability, but we can do the first differences a lot like we did in the maximum likelihood. @Vaibhavdixit02 would you like to take that on?

@ChrisRackauckas ChrisRackauckas changed the title 4DVAR Cost function First differencing in loss objective Apr 8, 2018
@ChrisRackauckas
Member Author

Let's finish this up @Vaibhavdixit02

@Vaibhavdixit02
Member

Okay 👍

@finmod
Contributor

finmod commented May 11, 2018

@ChrisRackauckas @Vaibhavdixit02 Is this a successful closure of 4DVAR estimation method? Do you plan to revise the docs for L2Loss or produce a docstring with references on L2Loss? I am at a loss myself on what is going on with this new L2Loss!

@ChrisRackauckas
Member Author

We should do the 4DVAR cost separately because it has a lot of issues in the general case. But let me summarize what's going on here, and how this is related to 4DVAR.

The author defines three versions of the 4DVAR. The strong 4DVAR has two terms:

  1. Initial condition term
  2. L2 loss term

Until we have SciML/DifferentialEquations.jl#251, (1) is meaningless. But (2) is what we already have, sans the non-diagonal weighting (#82), which only makes sense if you have prior information about noise correlations (in which case, use an SDE?). So strong 4DVAR is essentially what we have, that's clear.

The more interesting part is weak 4DVAR:

  1. Initial condition term
  2. L2 loss term
  3. Model derivative term.

The model derivative term is sol[i+1]-sol[i] - dt*f(sol[i],p,t) with dt = sol.t[i+1]-sol.t[i], or using the diffeq interpolants, sol[i+1]-sol[i] - dt*sol(sol.t[i],Val{1}) if dense. This says that the difference between consecutive points should match the Euler approximation of the step. Obviously, this is really bad if the data is sparse. So what we've done instead is add a term that handles sparse or nonlinear data better: a first differencing term, which is:

(sol[i+1]-sol[i]) - (data[i+1] - data[i]),

saying that the first differences in the solution must match the first differences in the data. The Euler model derivative term is not correct for sparse or nonlinear data: even when the model and parameters are perfect, its value is not zero, so the minimizer is biased. The first differencing term captures the same idea but is unbiased.
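To make the bias argument concrete, here is a minimal sketch in plain Python (not the package's API; all function names are hypothetical and chosen for illustration). It assumes the scalar test problem u' = -p*u with its exact solution sampled on a sparse grid with unit time steps, and evaluates the L2 term, the first differencing term, and the Euler model derivative term at the true parameter.

```python
import math

def l2_term(sol, data):
    """Sum of squared pointwise errors between solution and data."""
    return sum((s - d) ** 2 for s, d in zip(sol, data))

def first_difference_term(sol, data):
    """Sum of squared mismatches between consecutive differences."""
    return sum(
        ((sol[i + 1] - sol[i]) - (data[i + 1] - data[i])) ** 2
        for i in range(len(sol) - 1)
    )

def euler_derivative_term(sol, ts, f, p):
    """Sum of squared residuals of one explicit Euler step per interval."""
    return sum(
        (sol[i + 1] - sol[i] - (ts[i + 1] - ts[i]) * f(sol[i], p, ts[i])) ** 2
        for i in range(len(sol) - 1)
    )

def decay_rhs(u, p, t):
    """Right-hand side of the assumed test model u' = -p*u."""
    return -p * u

def exact_decay(p, ts, u0=1.0):
    """Exact solution of u' = -p*u sampled at the times in ts."""
    return [u0 * math.exp(-p * t) for t in ts]

ts = [0.0, 1.0, 2.0, 3.0]        # deliberately sparse grid
p_true = 1.0
data = exact_decay(p_true, ts)   # noise-free data from the true model
sol = exact_decay(p_true, ts)    # model solution at the true parameter

l2_at_truth = l2_term(sol, data)
fd_at_truth = first_difference_term(sol, data)
euler_at_truth = euler_derivative_term(sol, ts, decay_rhs, p_true)

# At a wrong parameter the first differencing term is positive.
fd_wrong = first_difference_term(exact_decay(2.0, ts), data)
```

With a perfect model and perfect parameters, the L2 and first differencing terms are exactly zero, while the Euler term stays positive because one Euler step over a long interval does not reproduce the exact solution. In a combined loss the first differencing term would typically be added to the L2 term with some weight, which is a design choice not specified in this thread.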

This first differencing term is not covered by the L2 loss term because of the nonlinearity of the squaring operation. This is easier to see on stochastic differential equation models, where there are cases in which the L2Loss alone leads to non-identifiable parameters but adding a first differencing term makes them identifiable. To construct such an example, take something like du = -p1*u*dt + p2*dW_t and make the starting ensemble (initial condition) the same distribution as the stationary distribution. If you only check the L2 term, then you're only checking whether the probability distribution is correct, which is not identifiable due to the relation between p1 and p2: the stationary variance is p2^2/(2*p1), so increasing p2 while increasing p1 to hold that ratio fixed keeps the same stationary distribution.

However, the first differences essentially measure the autocorrelation, which is different even when the probability distribution stays the same (increasing p1 and p2 together keeps the same stationary probability, but a given trajectory moves around the space at a faster rate, leading to lower autocorrelation and larger differencing terms on average). So this term really is capturing something new, is unbiased, and is cheap to calculate. We did not find it in the literature, but it's clear how it's inspired by the weak 4DVAR.
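The identifiability argument can be checked with the standard closed-form results for the stationary process du = -p1*u*dt + p2*dW_t: the stationary variance is p2^2/(2*p1) and the autocovariance at lag h is that variance times exp(-p1*h). The sketch below is plain Python with hypothetical function names and arbitrarily chosen parameter values; it compares two parameter pairs that share a stationary distribution.

```python
import math

def stationary_variance(p1, p2):
    """Variance of the stationary distribution of du = -p1*u*dt + p2*dW."""
    return p2 ** 2 / (2.0 * p1)

def mean_squared_increment(p1, p2, h):
    """E[(u(t+h) - u(t))^2] in stationarity: 2*Var*(1 - exp(-p1*h))."""
    return 2.0 * stationary_variance(p1, p2) * (1.0 - math.exp(-p1 * h))

h = 0.1                  # assumed sampling interval of the data
slow = (1.0, 1.0)        # (p1, p2), illustrative values
fast = (4.0, 2.0)        # p2 doubled, p1 quadrupled: same p2^2/(2*p1)

var_slow = stationary_variance(*slow)
var_fast = stationary_variance(*fast)
msi_slow = mean_squared_increment(*slow, h)
msi_fast = mean_squared_increment(*fast, h)
```

Both pairs have the same stationary variance, so a loss that only compares the distribution of values cannot separate them, while the expected squared first difference is larger for the faster pair.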

But just in case, the model derivative term got an issue: #83 .

The last thing is that the 4DVAR sometimes adds another approximation term, such as a polynomial interpolation. It's there to reduce the oscillations caused by the bias of the model derivative term. A separate issue for that is here: #74. Of course, it's not a high priority if we aren't using the model derivative term.

So that's a full explanation of the cost function and what we did. Issues #82, #83, and #74 could fill out the loss function more, but they are either very model-specific (matrix weights are not commonly used, and even the full diagonal weighting we have is uncommon) or are biased estimators (the model derivative term and the Hermite anti-oscillation term).

None of those possible extra terms could help on the Lorenz tests because of the bias and the lack of true noise (matrix weights are only there to match covariances). But we did get something out of this issue that can help, so try out the first differencing term and see how it does.

@Vaibhavdixit02
Member

@finmod I will definitely add docs for the recent changes as soon as possible. The API isn't different in case you want to use it as before and any previous work will not be affected 👍

@ChrisRackauckas
Member Author

We need to get benchmarking for it too
