LinearMaps on non-AbstractVectors #44
Comments
What type of linear spaces do you have in mind? And which changes would be required in LinearMaps?
My application is with a 2-d array, but in general it could be a general `AbstractArray`, or indeed a more general type of object, e.g. an ApproxFun or something. KrylovKit indeed looks interesting; I wasn't aware of it. Isn't it a duplication of effort to have solvers both in KrylovKit and IterativeSolvers?
Probably, but it always seemed like my need for more general vectors than `AbstractVector` was not widely shared. I am also working on OptimKit.jl :-).
Well, certainly how you devote your time and effort is none of my (or anyone's) business, and it is certainly more fun to write code than to get into some code's frame of mind and fix somebody else's bugs. I just want to point out that it is annoying to users to have competing implementations of the same algorithm, each with different features, levels of maintenance, etc. I can't speak for IterativeSolvers, but over at https://github.com/JuliaNLSolvers/Optim.jl/ we certainly care about other input types than just Vectors (see e.g. https://github.com/JuliaNLSolvers/Optim.jl/blob/master/test/multivariate/array.jl). I'm not sure it has been tested with non-AbstractArrays, but I'm sure @pkofod would welcome a PR if it doesn't work.

I see that you support Riemannian optimization, great to see that! I added support for that to Optim. You seem to have been more careful about it than I was: I implemented the poor man's version, where you just pepper the unconstrained code with retractions and projections onto the tangent space. Yet that was enough to reach performance parity with other implementations for my use cases (LBFGS on the Stiefel manifold). You might also be interested in linking to https://github.com/JuliaNLSolvers/ManifoldProjections.jl/ (and if that is not enough for your use case, I would be interested in why).
The problem is, once you go away even from `AbstractArray`, there is hardly any common interface left to rely on.

With optimization, that's a different kind of beast. The points over which you try to optimize can take values in an arbitrary set, say a smooth manifold, for the purpose of gradient optimization. The directions on this manifold are tangent vectors, so they are still like vectors, but the points on the manifold are not necessarily so. As such, I don't assume any interface and just enable the user to specify which method is used to take steps (retractions), to compute inner products between directions, to parallel transport directions, and so forth.

Did I really need this? Yes. I am working with something known as tensor networks, where, in the simplest case, my manifold is specified by a single tensor. However, that representation is actually like a fiber bundle, i.e. there is some redundancy in the parameterisation which cannot easily be taken out. Furthermore, several quantities, like the function value and gradient, but even the inner product between tangent vectors, rely on auxiliary quantities that are implicitly determined by the tensor specifying the point on the manifold, so I would like to compute them only once and store them together with the tensor. There are different ways to update the tensor, and so forth. With OptimKit, nothing needs to be known about how the user wants to encode their points.
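To make that idea concrete, here is a minimal sketch of an optimizer that is agnostic about how points are encoded: all geometry comes in as user-supplied functions. The names `optimize`, `retract`, and `inner` are illustrative assumptions, not necessarily OptimKit.jl's actual keywords.

```julia
# A gradient-descent skeleton that never touches the representation of x:
# all geometric operations come in as user-supplied functions.
function optimize(fg, x; retract, inner, stepsize = 0.1, gradtol = 1e-8, maxiter = 1000)
    f, g = fg(x)                       # user returns value and gradient together
    for _ in 1:maxiter
        inner(x, g, g) <= gradtol^2 && break
        x = retract(x, g, -stepsize)   # user decides how to step along a direction
        f, g = fg(x)
    end
    return x, f, g
end

# Usage with a plain array; x could equally be any object the callbacks understand.
fg(x) = (sum(abs2, x), 2 .* x)
xmin, fmin, _ = optimize(fg, randn(5, 5);
                         retract = (x, d, α) -> x .+ α .* d,
                         inner   = (x, d1, d2) -> sum(d1 .* d2))
```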
So what you're saying is that you have a manifold that you'd rather keep entirely intrinsic, instead of seeing it as embedded in a bigger linear space. This makes sense, but in the end you have to have a way to represent the tangent space as a vector, and to move towards that direction. Is there really no way you can see that as an `AbstractVector`?
Not really, though. To give one example, one particular manifold is parameterized via an array, but there are several natural, inequivalent ways to take a step along a given direction. Having a fixed vector-space structure on the parameterization would tie the algorithm to one of them. To experiment with all these different choices, it is much easier to just redefine the retraction function than to define new types, overload `+` in a very unnatural way, and so forth. The manifold is in fact embedded in a larger vector space; it just happens to be the case that that vector space has a dimension which scales exponentially, so one never wants to represent its elements explicitly.
That's fair enough. I guess Optim could change every explicit vector update into a call to a user-overloadable function.
FWIW, I experimented quite a bit with this on the Stiefel manifold, and every retraction pretty much gave the same result, so I ended up with simply a Cholesky-based orthogonalization of x + d, which was the fastest.
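For reference, a minimal sketch of that Cholesky-based approach on the Stiefel manifold (real matrices with orthonormal columns; it assumes x + αd has full column rank, and the function name is made up):

```julia
using LinearAlgebra

# Retract x + α*d back onto the Stiefel manifold by Cholesky-based
# orthonormalization: if y = x + α*d and y'y = U'U (Cholesky), then
# y * inv(U) has exactly orthonormal columns.
function retract_chol(x::AbstractMatrix, d::AbstractMatrix, α::Real)
    y = x .+ α .* d
    F = cholesky(Hermitian(y' * y))   # y'y is SPD when y has full column rank
    return y / F.U                    # right division by the upper triangular factor
end
```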
A more general interface is fine by me. It probably wouldn't buy us anything in the current setup, but it wouldn't hurt either.
Well, it will take more than that to win me over. :-)
I think you may have misunderstood my intentions :) I’m not trying to “win you over” (though I’d welcome collaboration). It’s just that we are going to extract the Manifold Projections out of Optim into a separate package so we can also use them elsewhere. That’s why I’m interested in your API: to see if there’s something to learn.
Regarding the original issue, does it make sense to work on it within LinearMaps? The problem is the ambiguity of `A * X` when `X` is a matrix: that expression is currently being interpreted as "apply `A` to each column of `X`", whereas here one would want `A` to act on `X` as a whole.
Hm, this is tricky. `LinearMap` could parametrically depend on the type (or shape) of its inputs, so that it only dispatches to the block interpretation if it makes sense? There is ambiguity elsewhere in the ecosystem: `A * X`, if `X` is a matrix, is sometimes interpreted as blockwise application of `A` (when `A` is a matrix) or as `A` acting on the full `X` (when `A` is an FFT plan, for instance).
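A sketch of what such shape-aware dispatch could look like; this is a hypothetical design, not LinearMaps.jl's actual implementation, and all names here are made up:

```julia
# A map that records the shape of the objects it acts on, so that
# application can distinguish "X is one input" from "X is a block of inputs".
struct ShapedMap{F,S<:Tuple}
    f::F       # the action on a single input of shape `insize`
    insize::S  # expected input shape, e.g. (n,) or (n, m)
end

function apply(A::ShapedMap, X::AbstractArray)
    if size(X) == A.insize
        return A.f(X)          # act on X as a whole
    elseif ndims(X) == length(A.insize) + 1 && size(X)[1:end-1] == A.insize
        # one extra trailing dimension: interpret blockwise, slice by slice
        return mapslices(A.f, X; dims = 1:length(A.insize))
    else
        throw(DimensionMismatch("cannot apply map of input size $(A.insize) to array of size $(size(X))"))
    end
end

# With insize (n,), apply(A, v) acts on a vector and apply(A, M) columnwise;
# with insize (n, n), apply(A, M) acts on the whole matrix unambiguously.
```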
@pkofod, that is indeed a great idea. The difference is indeed that the manifolds I am facing are typically not embedded in a vector space that you want to deal with directly, as they are exponentially large or truly infinite dimensional. Hence, the last thing I want to do, or can do, is describe my state as a full vector in that space. Instead, I have an intrinsic parameterization, as remarked by @antoine-levitt. But furthermore, this parameterization is not a set of coordinates, as there is some redundancy in the description, and it is not easy to get rid of this, especially not globally or explicitly. The correct mathematical picture is that my parameter space corresponds to the total space of a fiber bundle, whereas my physical manifold is the base space of that fiber bundle, and is itself embedded in an infinite-dimensional vector space.
Regarding the original issue: I think there is much of the rest of the ecosystem that would need to be modified for this to work, and it also doesn't play well with the current interface. What is, say, the `size` of a map acting on such general objects? I always found the restriction to `AbstractVector` unnecessary; in KrylovKit.jl the linear operator is just an arbitrary function.

I certainly hope that the ecosystem transitions to such a much more general approach, as I am not a numerical analyst and writing Krylov methods or optimization algorithms is not my area of expertise, but I was at the point where it was more work to fit my needs into the interface of existing packages than to just write my own package.
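For illustration, the function-based interface looks roughly like this in KrylovKit.jl: the operator is any function, and the initial guess fixes the type of the "vectors" (here 2-D arrays; the particular linear operator is an arbitrary example):

```julia
using KrylovKit

# The "vectors" are n×n arrays; the operator is just a function acting on them.
n = 16
X0 = randn(n, n)
applyA(X) = X .+ 0.1 .* (circshift(X, (1, 0)) .+ circshift(X, (0, 1)))

# Largest-magnitude eigenvalues of the operator, never materializing a matrix.
vals, vecs, info = eigsolve(applyA, X0, 3, :LM)
```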
I don't have a good opinion on that. On the one hand, functions are certainly the most flexible and simple interface; indeed, I only came to LinearMaps because I wanted to wrap my function into a form IterativeSolvers would like. On the other hand, there are a number of variants (e.g. in place or not, is the operator Hermitian, is the objective function differentiable, etc.) that are convenient to abstract away into a type (LinearMaps, or the types in Optim.jl for objective functions). A related question is how to represent the vectors you need, e.g. how do you store a history in GMRES or an approximate Jacobian in Broyden methods, without assuming they are plain `Vector`s?
By Jacobian, do you mean the Hessian (which is indeed the Jacobian of the gradient)? For large-scale optimization problems you anyway don't want to store it as a matrix, no? In limited-memory implementations, you just need a list of previous gradients, so any list (`Vector`) of custom objects is OK, no? The same goes for a history in GMRES (though I am not sure I know what a history in GMRES is, or what the point of keeping it would be).
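To see why a plain list of custom objects suffices: the standard L-BFGS two-loop recursion below only ever takes inner products and linear combinations of the stored pairs, so the entries need not be `AbstractVector`s; they only need `dot`, scalar multiplication, and addition. A textbook sketch (not any particular package's code):

```julia
using LinearAlgebra: dot

# Two-loop recursion: compute the L-BFGS direction from the gradient g and
# lists S, Y of the m most recent steps s_k and gradient differences y_k,
# ordered oldest to newest. Assumes m >= 1.
function lbfgs_direction(g, S::Vector, Y::Vector)
    m = length(S)
    q = g
    α = zeros(m)
    ρ = [1 / dot(Y[i], S[i]) for i in 1:m]
    for i in m:-1:1                    # backward pass, newest pair first
        α[i] = ρ[i] * dot(S[i], q)
        q = q - α[i] * Y[i]
    end
    γ = dot(S[end], Y[end]) / dot(Y[end], Y[end])  # initial Hessian scaling
    r = γ * q
    for i in 1:m                       # forward pass, oldest pair first
        β = ρ[i] * dot(Y[i], r)
        r = r + (α[i] - β) * S[i]
    end
    return -r                          # approximate -H⁻¹g, a descent direction
end
```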
Hessian when you want to solve an optimization problem, Jacobian when you want to solve F(x) = 0. For BFGS you store it as a matrix, but I agree there's no real point in BFGS when you can do LBFGS. I'm more used to storing the history as an N × m matrix (where N is the dimension of the problem and m the history size), because that leads to more locality in memory and more efficient linear algebra, but I remember discussions to the effect that this didn't really matter, so I guess a list of objects is fine too. In GMRES you just store the reduced matrix, so it's less of a problem.
It is true that you miss some opportunity for squeezing out maximal efficiency in Krylov-type algorithms by storing e.g. the Krylov subspace as a list of vectors instead of as a matrix. In particular, you cannot use BLAS level 2 in the orthogonalization step, whereas you could if you had the basis available as a matrix (that is, if you are OK with using classical rather than modified Gram-Schmidt). In KrylovKit.jl, I currently have some custom pure-Julia (multithreaded) methods that try to mimic BLAS level 2 when the vectors are plain arrays.
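The difference shows up directly in the orthogonalization step. With the basis as a matrix, classical Gram-Schmidt is two BLAS level-2 calls; with a list of generic "vectors", only level-1 operations (dot and axpy) remain. A minimal sketch:

```julia
using LinearAlgebra

# Basis as a matrix: classical Gram-Schmidt via two BLAS-2 operations.
function orthogonalize!(w::Vector{Float64}, V::Matrix{Float64})
    c = V' * w                  # gemv: all projection coefficients at once
    mul!(w, V, c, -1.0, 1.0)    # gemv: w ← w - V*c
    return w, c
end

# Basis as a list of generic objects: one dot and one axpy per basis vector.
function orthogonalize!(w, vs::Vector)
    c = [dot(v, w) for v in vs]
    for (v, ci) in zip(vs, c)
        axpy!(-ci, v, w)        # w ← w - ci*v
    end
    return w, c
end
```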
Related to the original question: v0.6 of https://github.com/JeffFessler/LinearMapsAA.jl now provides a type that supports linear maps acting on general arrays rather than just `AbstractVector`s.
It is often useful to define linear operators on arbitrary linear spaces, not necessarily `AbstractVector`s (e.g. the discretization of an n-d PDE). At the moment this is not supported by LinearMaps.
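A minimal example of the limitation (the 2-D Laplacian stencil is just for illustration): the operator naturally acts on n×n arrays, but to use it with LinearMaps one currently has to round-trip through vec and reshape:

```julia
using LinearMaps

# A linear operator that naturally acts on n×n arrays, not vectors:
# the five-point Laplacian stencil with zero boundary values.
n = 32
function laplacian2d(X::AbstractMatrix)
    L = zero(X)
    for j in 2:size(X, 2)-1, i in 2:size(X, 1)-1
        L[i, j] = X[i-1, j] + X[i+1, j] + X[i, j-1] + X[i, j+1] - 4X[i, j]
    end
    return L
end

# Today's workaround: flatten to a length-n² vector at the interface.
A = LinearMap(v -> vec(laplacian2d(reshape(v, n, n))), n^2)
y = A * vec(rand(n, n))   # one would rather write A * X with X an n×n array
```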