-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecating and renaming additional functions #63
Comments
I agree, I think a more concise and easier accessible API would be nice. E.g., I guess (actually similar to I am not sure if the API for all regularized optimal transport problems should be unified, maybe it could become too confusing with all the keyword arguments - not all algorithms would support them and their meaning might change depending on the algorithm. Maybe instead of A different idea would be to adopt the CommonSolve interface used by SciML and also e.g. Roots. Mainly, it would be something along abstract type AbstractOTProblem end
struct OTProblem <: AbstractOTProblem
cost::C
mu::M
nu::N
end
struct RegularizedOTProblem <: AbstractOTProblem
cost::C
mu::M
nu::N
regularization::R
regularization_parameter::P
end
struct L2Regularization end
struct EntropyRegularization end
function EntropicOTProblem(cost, mu, nu, epsilon)
return RegularizedOTProblem(cost, mu, nu, EntropyRegularization(), epsilon)
end
...
struct StabilizedSinkhorn
scaling_factor::S
scaling_steps::Int
....
end
prob = EntropicOTProblem(cost, mu, nu, 0.01)
solve(prob, StabilizedSinkhorn(); ...) I am just not completely sure yet at which point one should be able to specify that only the cost or plan should be computed. As an option for BTW I noticed a bug in the epsilon scaling and think regardless of any additional simplifications of the API it should be merged with the standard stabilized version. I'll prepare a PR. |
Thanks for the input. I agree that the approach in #66 is problematic for not being extensible (I guess my original thinking was more of a band-aid solution).
Yes it would be good to separate the problem parameters from the solver details (instead of leaving them in the kwargs). That's something that I can try and incorporate in the existing PRed branch. Regarding the CommonSolve framework, would you say this should be an additional API to what we have currently (let's say it's PythonOT-esque), or should it replace the current API? I would say it might be good to have the additional API for users who want to work on their own extensions, but keep the simple functionality for standard usage. Another thing I have in mind is differentiability of the output. This isn't possible for a general solver, but it's possible at least for the Sinkhorn algorithm. Hence it would be nice to keep Also I wonder how general the API should be -- for instance, should we try to have a common API that can handle 2-marginal but also multimarginal/barycenter problems? Or should these problems have a separate API? Keen to hear your thoughts on this. |
I would assume that the main API would be the sinkhorn(mu, nu, C, eps) = solve(EntropicOTProblem(C, mu, nu, eps), Sinkhorn()) (or however we want to change the signature of In SciML, users differentiate through
With the |
I like the way POT.py does things. In my view, the more direct the better. I'm not really a fan of separating the |
I don't think we have to adopt the |
I just realized that we haven't settled on some standards regarding order of arguments and such. The new function Anyways, my suggestion is that we settle on cost in the beginning, such that the new versions of Another issue is that Finally, I've realized that |
IIRC I defined such a type in StochasticOptimalTransport. Since we don't need the full Distributions interface but only the probabilities and the support I assume, I don't think it's worth adding any proper Distribution and/or maintaining such a package currently. |
Another option would be to take a look at |
While I like some of the ideas and motivations behind MeasureTheory, I think very strongly that we should not switch to it. Mostly because it is not thorougly tested, it is not well established, it is not well documented, it requires Julia 1.5, it has a huge list of dependencies, and most importantly, I don't see the need for sophisticated measures or distributions, not even from Distributions. We just need something that wraps the support and the probabilities as https://github.com/JuliaOptimalTransport/StochasticOptimalTransport.jl/blob/1f6ee6ad45950423d55dff4f40b4fe334a56547d/src/utils.jl#L1-L10, but we don't need any sophisticated |
Awesome. I think we should go with your implementation, although I think we should just make some small changes, such as add a dimension property, and incorporate the |
I'd comment that in most settings, for the purposes of computing OT we don't even need to know the support, just the probabilities and the cost matrix. Of course, one would then need to precompute the cost matrix (as we currently do), and that is left up to the user. At least for the basic functionality like (un)regularised OT cost/plan computation, I'd say we don't need the supports. We could multiple dispatch though, so something like |
In ML (which is mostly where I apply OT), we usually have the empirical distribution (the support is the data instance in a batch, all with equal probability) and not the cost matrix. Besides coding the package, I also use it. In most cases I have the samples, and I have to create the cost matrix. So I'd really like to have a "We could multiple dispatch though, so something like |
Similar to #45, a lot of functions currently are either named mirroring Python OT's function naming choices, or just ad-hoc as they were added originally. (I agree with
ot_cost
andot_plan
for unregularised OT as per #45, for the record).sinkhorn
routines we should usesinkhorn_cost
andsinkhorn_plan
, etc.quadreg
to? This solves OT regularised with an L2 cost on the transport plan. Currently it's solved using one particular algorithm, but there are other approaches, so I'd prefer not to refer to any particular algorithm in the function name.One possibility is to have a general
ot_reg_cost
andot_reg_plan
wrapper for generalised regularisations. Then the user can specify the regularisation functional and algorithm upon calling. So something likeot_reg_cost(mu, nu, C, 0.05; reg_func = "L2", method = "lorenz")
The text was updated successfully, but these errors were encountered: