
Deprecating and renaming additional functions #63

Closed
zsteve opened this issue May 22, 2021 · 12 comments

Comments

@zsteve
Member

zsteve commented May 22, 2021

Similar to #45, a lot of functions are currently named either to mirror Python OT's naming choices or ad hoc as they were originally added. (For the record, I agree with ot_cost and ot_plan for unregularised OT as per #45.)

  • Along those lines, for the sinkhorn routines we should use sinkhorn_cost and sinkhorn_plan, etc.
  • What to rename quadreg to? This solves OT regularised with an L2 cost on the transport plan. Currently it's solved using one particular algorithm, but there are other approaches, so I'd prefer not to refer to any particular algorithm in the function name.
    One possibility is to have a general ot_reg_cost and ot_reg_plan wrapper for generalised regularisations. Then the user can specify the regularisation functional and algorithm upon calling. So something like
    ot_reg_cost(mu, nu, C, 0.05; reg_func = "L2", method = "lorenz")
@devmotion
Member

I agree, I think a more concise and more easily accessible API would be nice. E.g., I guess (actually similar to sinkhorn and sinkhorn2 in POT) it would be good if there were only one sinkhorn_cost and sinkhorn_plan function instead of one each for the standard algorithm, the stabilized version, and the stabilized version with epsilon scaling.

I am not sure if the API for all regularized optimal transport problems should be unified; maybe it could become too confusing with all the keyword arguments - not all algorithms would support them, and their meaning might change depending on the algorithm. Maybe instead of ot_reg_cost(mu, nu, C, 0.05; reg_func = "L2", method = "lorenz") one could use ot_regularized_cost(mu, nu, C, 0.05; regularization=:L2, method=Lorenz(algorithm_specific_options...), general_options...)? Here, general_options could contain e.g. plan. Additionally, maybe one could even make the regularization parameter a mandatory keyword argument, i.e., one without a default value, to keep the API closer to the one for exact OT problems.

A different idea would be to adopt the CommonSolve interface used by SciML and also e.g. Roots. Mainly, it would be something along

abstract type AbstractOTProblem end

struct OTProblem{C,M,N} <: AbstractOTProblem
    cost::C
    mu::M
    nu::N
end

struct RegularizedOTProblem{C,M,N,R,P} <: AbstractOTProblem
    cost::C
    mu::M
    nu::N
    regularization::R
    regularization_parameter::P
end

struct L2Regularization end
struct EntropyRegularization end

function EntropicOTProblem(cost, mu, nu, epsilon)
    return RegularizedOTProblem(cost, mu, nu, EntropyRegularization(), epsilon)
end
...

struct StabilizedSinkhorn{S}
    scaling_factor::S
    scaling_steps::Int
    ...
end

prob = EntropicOTProblem(cost, mu, nu, 0.01)

solve(prob, StabilizedSinkhorn(); ...)

I am just not completely sure yet at which point one should be able to specify that only the cost or plan should be computed. As an option for solve maybe?
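Purely as a sketch of that option (the keyword name and values are hypothetical, nothing like this exists yet):

```julia
# Hypothetical `output` keyword on solve, selecting what is returned:
solve(prob, StabilizedSinkhorn(); output = :cost)  # only the optimal cost
solve(prob, StabilizedSinkhorn(); output = :plan)  # only the transport plan
```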

BTW I noticed a bug in the epsilon scaling and think regardless of any additional simplifications of the API it should be merged with the standard stabilized version. I'll prepare a PR.

@zsteve
Member Author

zsteve commented May 23, 2021

Thanks for the input. I agree that the approach in #66 is problematic for not being extensible (I guess my original thinking was more of a band-aid solution).

Maybe instead of ot_reg_cost(mu, nu, C, 0.05; reg_func = "L2", method = "lorenz") one could use ot_regularized_cost(mu, nu, C, 0.05; regularization=:L2, method=Lorenz(algorithm_specific_options...), general_options...)?

Yes, it would be good to separate the problem parameters from the solver details (instead of leaving them in the kwargs). That's something I can try to incorporate in the branch from the existing PR.

Regarding the CommonSolve framework, would you say this should be an additional API to what we have currently (let's say it's PythonOT-esque), or should it replace the current API? I would say it might be good to have the additional API for users who want to work on their own extensions, but keep the simple functionality for standard usage. Another thing I have in mind is differentiability of the output. This isn't possible for a general solver, but it's possible at least for the Sinkhorn algorithm. Hence it would be nice to keep sinkhorn(2) as-is.

Also I wonder how general the API should be -- for instance, should we try to have a common API that can handle 2-marginal but also multimarginal/barycenter problems? Or should these problems have a separate API?

Keen to hear your thoughts on this.

@devmotion
Member

Regarding the CommonSolve framework, would you say this should be an additional API to what we have currently (let's say it's PythonOT-esque), or should it replace the current API? I would say it might be good to have the additional API for users who want to work on their own extensions, but keep the simple functionality for standard usage. Another thing I have in mind is differentiability of the output. This isn't possible for a general solver, but it's possible at least for the Sinkhorn algorithm. Hence it would be nice to keep sinkhorn(2) as-is.

I would assume that the main API would be the solve(...) interface since it is more powerful than the current approach and can be extended - you can just add a new problem type and/or algorithm if needed. However, it is still possible to keep convenient functions such as sinkhorn. It would be defined just as

sinkhorn(mu, nu, C, eps) = solve(EntropicOTProblem(C, mu, nu, eps), Sinkhorn())

(or however we want to change the signature of sinkhorn).

In SciML, users differentiate through solve all the time (that's the main point of packages such as DiffEqSensitivity and DiffEqFlux), so this does not depend on the API.

Also I wonder how general the API should be -- for instance, should we try to have a common API that can handle 2-marginal but also multimarginal/barycenter problems? Or should these problems have a separate API?

With the solve interface probably it could just be a special problem such as EnsembleOTProblem or BatchOTProblem or a special algorithm such as EnsembleAlgorithm(algorithm). If there is an internal implementation that works for both cases, with one marginals and multiple marginals, both solve calls could dispatch to it.
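For illustration, a multi-marginal problem could be just another problem type under the AbstractOTProblem abstract type sketched above (names and fields hypothetical):

```julia
# Hypothetical sketch: a multi-marginal OT problem reusing the same
# AbstractOTProblem / solve interface as the 2-marginal case.
struct MultiMarginalOTProblem{C,M} <: AbstractOTProblem
    cost::C        # k-dimensional cost array
    marginals::M   # tuple of k marginal weight vectors
end
```

Dispatching `solve` on the problem type would then let both cases share a common internal implementation where one exists.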

@davibarreira
Member

I like the way POT.py does things. In my view, the more direct the better. I'm not really a fan of separating the solve.

@devmotion
Member

I don't think we have to adopt the solve approach - there might be better choices and, as mentioned above, even if we do one can still keep convenient functions such as sinkhorn etc. However, now that we've moved the Python and POT parts to a separate package, I think the design should be driven mainly by what the best approach is in Julia and not only depend on POT. In particular, we should definitely exploit multiple dispatch which just does not exist (at least not properly) in Python.

@davibarreira
Member

I just realized that we haven't settled on some standards regarding the order of arguments. The new function sinkhorn_divergence(c, mu, nu, epsilon) uses the same order as ot_cost(c, mu, nu), but emd2(mu, nu, C) and sinkhorn2(mu, nu, C, epsilon) start with the distributions instead of the cost. I guess @devmotion and I talked about it in one of the PRs, but I can't find it.

Anyway, my suggestion is that we settle on the cost coming first, so that the new versions of sinkhorn2 and emd2 (as well as all the others) would become, for example, sinkhorn2(C, mu, nu, epsilon). I also suggest that we use C when it's a cost matrix and c when it's a cost function.

Another issue is that mu and nu are sometimes distributions (as in the 1D case) and sometimes only the support (as in the sinkhorn functions). This is confusing, and for consistency we should rename the support case to something such as mu_supp and leave mu for the case when it is a distribution (measure).

Finally, I've realized that Distributions.jl does not have a generic type for empirical distributions: DiscreteNonParametric only works for univariate distributions. So perhaps we can create a type EmpiricalDistributions to add consistency (I actually suggested this on the Distributions.jl GitHub, but I don't know if they are interested).

@devmotion
Member

Finally, I've realized that Distributions.jl does not have a generic type for Empirical Distributions. The DiscreteNonParametric only works for univariate distributions. So perhaps we can create a type EmpiricalDistributions to add consistency (I actually suggested this in the Distributions.jl GitHub, but don't know if they are interested).

IIRC I defined such a type in StochasticOptimalTransport. Since I assume we only need the probabilities and the support rather than the full Distributions interface, I don't think it's worth adding a proper Distribution and/or maintaining such a package currently.
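A minimal sketch of what such a wrapper can look like (hypothetical, not the actual StochasticOptimalTransport definition):

```julia
# Minimal discrete measure wrapper: just support points and probabilities,
# with a length check; no cdf/logpdf/rand machinery.
struct DiscreteMeasure{X<:AbstractVector,P<:AbstractVector{<:Real}}
    support::X
    p::P
    function DiscreteMeasure(support::X, p::P) where {X<:AbstractVector,P<:AbstractVector{<:Real}}
        length(support) == length(p) ||
            throw(ArgumentError("support and probabilities must have equal length"))
        return new{X,P}(support, p)
    end
end

# e.g. a uniform measure on four 2D points:
mu = DiscreteMeasure([randn(2) for _ in 1:4], fill(0.25, 4))
```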

@davibarreira
Member

davibarreira commented Jun 2, 2021

Another option would be to take a look at MeasureTheory.jl. Maybe it covers our needs, but I haven't actually used it. I see that it appeared in the issue you mentioned.

@devmotion
Member

devmotion commented Jun 2, 2021

While I like some of the ideas and motivations behind MeasureTheory, I think very strongly that we should not switch to it. Mostly because it is not thoroughly tested, it is not well established, it is not well documented, it requires Julia 1.5, it has a huge list of dependencies, and, most importantly, I don't see the need for sophisticated measures or distributions, not even from Distributions. We just need something that wraps the support and the probabilities such as https://github.com/JuliaOptimalTransport/StochasticOptimalTransport.jl/blob/1f6ee6ad45950423d55dff4f40b4fe334a56547d/src/utils.jl#L1-L10, but we don't need any sophisticated cdf, logpdf, rand, etc. definitions. For 1D discrete measures, it is nice that we can use DiscreteNonParametric, whose support is already sorted, but IMO we really don't have to operate with Distributions and can use any structure that fits.

@davibarreira
Member

davibarreira commented Jun 2, 2021

Awesome. I think we should go with your implementation, although we should make some small changes, such as adding a dimension property and incorporating DiscreteNonParametric inside, so we can use one larger type and dispatch accordingly. Also, we should maybe rename it from DiscreteMeasure to EmpiricalMeasure or FiniteDiscreteMeasure, because other distributions such as Poisson are discrete but are not supported by algorithms such as Sinkhorn.
P.S.: I find it very odd that Distributions.jl does not have this type of distribution, because it seems very ubiquitous and useful.

@zsteve
Member Author

zsteve commented Jun 4, 2021

I'd comment that in most settings, for the purposes of computing OT we don't even need to know the support, just the probabilities and the cost matrix.

Of course, one would then need to precompute the cost matrix (as we currently do), and that is left up to the user. At least for the basic functionality like (un)regularised OT cost/plan computation, I'd say we don't need the supports.

We could use multiple dispatch, though, so that something like sinkhorn(mu::DiscreteMeasure, nu::DiscreteMeasure) handles the cost matrix computation for the user.
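Such a method could be a thin layer that builds the cost matrix and forwards to the existing matrix-based implementation. A sketch, assuming a DiscreteMeasure type with `support` and `p` fields and an existing `sinkhorn(muprobs, nuprobs, C, eps)` method (both assumptions, since neither exists yet):

```julia
# Hypothetical convenience method: compute the pairwise cost matrix from
# the supports and a cost function c, then forward to the matrix method.
function sinkhorn(mu::DiscreteMeasure, nu::DiscreteMeasure, c, eps; kwargs...)
    C = [c(x, y) for x in mu.support, y in nu.support]
    return sinkhorn(mu.p, nu.p, C, eps; kwargs...)
end
```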

@davibarreira
Member

davibarreira commented Jun 4, 2021

In ML (which is mostly where I apply OT), we usually have the empirical distribution (the support is the data instances in a batch, all with equal probability) and not the cost matrix. Besides coding the package, I also use it: in most cases I have the samples, and I have to create the cost matrix. So I'd really like to have a cost_matrix function ready to go :) Although I can see how this should perhaps be left to the user.
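A hedged sketch of what such a helper could look like (the name cost_matrix and the columns-as-samples convention are assumptions, nothing is settled):

```julia
# Hypothetical cost_matrix helper: pairwise costs between the columns of
# xs and ys under a cost function c, returned as a matrix.
function cost_matrix(c, xs::AbstractMatrix, ys::AbstractMatrix)
    return [c(view(xs, :, i), view(ys, :, j)) for i in axes(xs, 2), j in axes(ys, 2)]
end

# e.g. squared Euclidean cost between two batches of 2D samples:
C = cost_matrix((x, y) -> sum(abs2, x - y), randn(2, 5), randn(2, 7))
```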

"We could multiple dispatch though, so something like sinkhorn(mu::DiscreteMeasure, nu::DiscreteMeasure) handles the cost matrix computation for the user."
Yeah, I totally agree here. We could dispatch a version where the user only provides the probabilities and the cost matrix, but I think we should then rename the arguments, e.g. sinkhorn(C, muprobs, nuprobs), and leave mu and nu for when they are actual measures.
