
Correct transformation of second-order tensors in adapters #304

Open
han-ol opened this issue Feb 11, 2025 · 1 comment
Labels
feature New feature or request

Comments

@han-ol (Contributor) commented Feb 11, 2025

Currently, only first-order tensors (i.e., vectors) can be transformed in both the forward and inverse directions through an adapter object.

Can we also transform tensors of other orders, particularly second-order tensors?

The question arises because a PointInferenceNetwork can estimate the covariance matrix of the inference_variables.
Some adapters represent a change of basis and origin; for example, standardize is a linear coordinate transformation from "original coordinates" to "standardized coordinates".
Thus, the inference_variables live in the standardized coordinates, and the covariance matrix needs to be transformed in the inverse direction to relate to the unstandardized coordinates.

When a covariance matrix is estimated in standardized coordinates, inverting the coordinate transformation is different from naively transforming the matrix columns as if they were vectors in standardized coordinates.
Rather, an inverse covariance (precision) matrix $\Sigma^{-1}$ transforms as an order-2 tensor: the basis-change matrix $A$ is applied from both sides, $A^T \Sigma^{-1} A$.
For standardize, $A = \mathrm{diag}(\sigma_i)$ with $\sigma_i$ the standard deviation of each dimension, so $(A^T \Sigma^{-1} A)_{ij} = \sigma_i \sigma_j (\Sigma^{-1})_{ij}$; only when $\Sigma^{-1}$ is diagonal does this reduce to $\Sigma^{-1} \cdot \mathrm{diag}(\sigma_i^2)$.
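The order-2 rule can be sketched numerically. A minimal example, with illustrative scales and a two-dimensional covariance estimated in standardized coordinates (all values made up), contrasting the tensor transform with the naive column-wise one:

```python
import numpy as np

std = np.array([3.0, 0.5])            # per-dimension standardization scales sigma_i

# Covariance estimated in standardized coordinates (illustrative values)
cov_z = np.array([[1.0, 0.3],
                  [0.3, 1.0]])

# Order-2 tensor rule: apply the basis-change matrix from both sides
A = np.diag(std)
cov_x = A @ cov_z @ A.T               # elementwise: sigma_i * sigma_j * cov_z[i, j]

# Naive "vector" transform scales each entry only once
cov_naive = cov_z * std[None, :]      # wrong diagonal and wrong off-diagonals
```

For diagonal $A$ this is just the elementwise product with the outer product of the scales, which is how one would implement it without materializing $A$.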

Solutions to this issue are likely related to a separate issue on keeping track of Jacobians of adapter transforms, as mentioned in #245.

@paul-buerkner (Contributor) commented

For non-linear transformations, I see little hope that we can make this work generally for point estimates. Some combinations of point estimates and transformations do work, such as quantiles and strictly monotonic transforms. But I think that, for now, we should warn somewhere when a point estimate is passed through a non-linear transformation and recommend simply not using these transformations with PointInferenceNetworks.
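The quantile case can be checked numerically. A small sketch (sample size and transform chosen for illustration) of the fact that quantiles commute with strictly increasing transforms:

```python
import numpy as np

# For a strictly increasing transform g, quantile_p(g(X)) == g(quantile_p(X)),
# because monotone maps preserve the ordering of the samples.
rng = np.random.default_rng(1)
x = rng.normal(size=10_001)      # odd size so p=0.9 hits an exact order statistic
p = 0.9

lhs = np.quantile(np.exp(x), p)  # transform first, then take the quantile
rhs = np.exp(np.quantile(x, p))  # take the quantile, then transform
# lhs and rhs agree
```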

Standardize is a bit of a different situation, both because it is linear and because it is so important for the networks to train stably in lots of situations. So the mean and quantiles can just be back-transformed through standardize and all is well.

Now, for a covariance matrix the situation is different, as you highlight above. The same actually holds even for just the variance, since we need to square the scaling.
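A minimal sketch of the difference (all numbers illustrative): the mean is back-transformed through the ordinary inverse of standardize, while the variance ignores the shift and squares the scale.

```python
import numpy as np

# Standardization parameters (illustrative)
loc = np.array([2.0, -1.0])   # per-dimension means used by standardize
std = np.array([3.0, 0.5])    # per-dimension standard deviations

# Point estimates produced in standardized coordinates (illustrative)
mean_z = np.array([0.2, -0.1])
var_z = np.array([1.1, 0.9])

mean_x = mean_z * std + loc   # mean: ordinary inverse of standardize
var_x = var_z * std**2        # variance: squared scaling, no shift
```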

I suggest something like the following solution to this general problem (the covariance matrix is even harder than this):

Every point-estimation loss implemented natively by us (e.g., squared loss) knows which point estimate it produces (e.g., the mean). So we can enrich it with a structure telling us what to do with inverse transforms. This in turn could be a dict of 3 options:

  • Names of transforms that are valid for this point estimate, such that the point estimate can be safely transformed by the default approach (e.g., standardize for the mean).
  • Names of transforms that are invalid for this point estimate and whose correct value cannot be computed in general without additional knowledge (e.g., exp/log for the mean). There, we would probably perform the transformation anyway but warn that it will likely lead to biased estimates.
  • A dict of transforms for which we know the correct inverse transform for a given point estimate, but where that transform differs from what the adapter has implemented. If such a new transform is specified, it replaces the standard adapter (back-)transform. E.g., we want to compute the variance after it was passed through standardize; then our new back-transform would ignore the mean and square the scaling.
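A hypothetical sketch of such a structure, with one registry per natively implemented loss. All names here (`MEAN_RULES`, `backtransform_variance`, the keys) are illustrative, not existing BayesFlow API:

```python
import numpy as np

def backtransform_variance(var_z, *, loc, std):
    """Custom inverse of standardize for a variance estimate:
    ignore the mean shift and square the scaling."""
    return var_z * np.asarray(std) ** 2

# Hypothetical per-point-estimate registries with the three options described above
MEAN_RULES = {
    "valid": {"standardize"},    # default adapter inverse is safe
    "invalid": {"log", "exp"},   # transform anyway, but warn about bias
    "custom": {},                # no replacements needed
}

VARIANCE_RULES = {
    "valid": set(),
    "invalid": {"log", "exp"},
    "custom": {"standardize": backtransform_variance},
}
```

The adapter's inverse pass would then consult the registry of the loss that produced the estimate and pick the default inverse, a warning path, or the custom replacement.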

This approach of course needs a bit of work for each point-estimation loss, but I believe it can be done in a structured and tidy way such that the maintenance effort of this feature would be small.

What do you think @han-ol and @stefanradev93 ?

@paul-buerkner paul-buerkner added the feature New feature or request label Feb 12, 2025