WIP isolate convergence warning in LogReg #225

Closed
wants to merge 17 commits

Conversation

Badr-MOUFAD
Collaborator

fixes #215

@codecov-commenter

codecov-commenter commented Apr 1, 2022

Codecov Report

Merging #225 (634870b) into main (700c280) will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##             main     #225   +/-   ##
=======================================
  Coverage   86.50%   86.50%           
=======================================
  Files          14       14           
  Lines         963      963           
  Branches      128      128           
=======================================
  Hits          833      833           
  Misses        100      100           
  Partials       30       30           
Flag Coverage Δ
unittests ∅ <ø> (∅)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
celer/homotopy.py 86.82% <ø> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mathurinm
Owner

To clarify: can you get the exact X, y and regularization strength that cause these convergence warnings output by pytest:

celer/tests/test_logreg.py::test_LogisticRegression[True]
  /home/mathurin/workspace/celer/celer/homotopy.py:311: ConvergenceWarning: Objective did not converge: duality gap: 0.841892524980679, tolerance: 0.005545177444479563. Increasing `tol` may make the solver faster without affecting the results much. 
  Fitting data with very small alpha causes precision issues.
    sol = newton_celer(

celer/tests/test_logreg.py::test_LogisticRegression[True]
  /home/mathurin/workspace/celer/celer/homotopy.py:311: ConvergenceWarning: Objective did not converge: duality gap: 23.09072008747797, tolerance: 0.006931471805599453. Increasing `tol` may make the solver faster without affecting the results much. 
  Fitting data with very small alpha causes precision issues.
    sol = newton_celer(

celer/tests/test_logreg.py::test_LogisticRegression[True]
  /home/mathurin/workspace/celer/celer/homotopy.py:311: ConvergenceWarning: Objective did not converge: duality gap: 2.1031969926275593, tolerance: 0.006931471805599453.

The goal is not just to produce a ConvergenceWarning; that is easy (just take a hard problem with low regularization, lots of features and few iterations).

The goal is to understand why the solver fails in the setup of the test. So the first step is to get the exact problem causing the warning (not to generate random X and y and try to get a ConvergenceWarning on it).
Is it clearer now?
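For instance, a minimal sketch of how the offending problem could be captured (the data generation, C value and file names here are placeholders rather than the actual test setup, celer.LogisticRegression is assumed to follow the sklearn-style C parametrization, and the warning class is checked by name since the exact class raised by celer is not asserted here):

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification

from celer import LogisticRegression

# placeholder data: the actual X, y and C from test_LogisticRegression should be copied here
X, y = make_classification(n_samples=50, n_features=100, random_state=0)
C = 1.0

with warnings.catch_warnings(record=True) as records:
    warnings.simplefilter("always")
    LogisticRegression(C=C).fit(X, y)

if any("ConvergenceWarning" in type(w.message).__name__ for w in records):
    # dump the exact problem so it can be reloaded and studied in isolation
    np.save("X_conv_warn.npy", X)
    np.save("y_conv_warn.npy", y)
    print("dumped a problem triggering a ConvergenceWarning, C =", C)
```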

@Badr-MOUFAD
Collaborator Author

@mathurinm, I have two hypotheses regarding the ConvergenceWarning:

  • alpha is too small (how small depends on X, y and tol). Hence, a lower bound should be computed to prevent that.
  • X is not centered. Indeed, the dumped data is drawn from normal(100, 1).

@mathurinm
Owner

What is alpha as a fraction of alpha_max, i.e. norm(X.T @ y, ord=np.inf)?
What makes you think centering plays a role here?
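As a quick check, something along these lines could be used (X and y are loaded from a hypothetical dump of the failing problem, alpha is a placeholder, and a constant factor may apply to alpha_max for the logistic loss):

```python
import numpy as np

X = np.load("X_conv_warn.npy")  # hypothetical dump of the offending design matrix
y = np.load("y_conv_warn.npy")  # hypothetical dump of the offending labels
alpha = 0.01                    # placeholder: regularization strength used in the failing fit

# alpha_max as written above; for the logistic loss a factor 1/2 may be needed
alpha_max = np.linalg.norm(X.T @ y, ord=np.inf)
print(f"alpha / alpha_max = {alpha / alpha_max:.2e}")
```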

@Badr-MOUFAD
Collaborator Author

Badr-MOUFAD commented Apr 3, 2022

  • alpha / alpha_max is on the order of 1e-5.
  • I think centering is not implemented (referring to this line). The dumped data, namely X, is drawn from N(100, 1). I don't get warnings when I test with X drawn from N(0, 1); see the sketch below.
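A small sketch of that comparison (sizes, labels and C are illustrative, and celer.LogisticRegression is assumed to follow the sklearn-style API):

```python
import warnings

import numpy as np
from celer import LogisticRegression

rng = np.random.RandomState(0)
n_samples, n_features = 50, 100
y = (rng.randn(n_samples) > 0).astype(int)

for mean in (100, 0):  # per the observation above: N(100, 1) triggers the warning, N(0, 1) does not
    X = mean + rng.randn(n_samples, n_features)
    with warnings.catch_warnings(record=True) as records:
        warnings.simplefilter("always")
        LogisticRegression(C=1.0).fit(X, y)
    n_warn = sum("ConvergenceWarning" in type(w.message).__name__ for w in records)
    print(f"X ~ N({mean}, 1): {n_warn} ConvergenceWarning(s)")
```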

@mathurinm
Owner

The fact that the data is not centered does not mean that the solver should not converge on it when fit_intercept=False. Fitting an intercept or not just means solving one optimization problem or another.

@Badr-MOUFAD
Collaborator Author

Badr-MOUFAD commented Apr 3, 2022

  • The primal decreases exponentially over the first iterations and slows down (stagnates) later.
  • The (quasi) inverse pattern is observed for the gap.

@mathurinm, also referring to the plot, primal stagnates whereas gap keeps decreasing. It seems like we are stuck at a degenerate point.
Can't we just break when primal stagnates since our objective is to minimize it?

(attached plot: solver logs)

@josephsalmon
Contributor

Some people do indeed, but mostly when the problem tends to degenerate towards probabilities 0 or 1;
see the glmnet paper:
(screenshot of the relevant passage from the glmnet paper)

I have not heard of tricks around 0.5/0.5 (your case here) though.

@mathurinm
Owner

The plot displays incomparable quantities: the gap goes to 0 while the primal converges to a value > 0, hence the primal may keep decreasing at the same speed as the gap without it being visible. You need to subtract the primal limit from the primal objectives if you want comparable quantities (see the sketch at the end of this comment).

It seems like we are stuck at a degenerate point.
Degenerate in which sense?

Stopping based on the primal does not offer guarantees and that is not the way we have chosen in celer.
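A sketch of such a comparison, assuming `primals` and `gaps` have been recorded per iteration and dumped to (hypothetical) .npy files; the primal limit is crudely approximated by the smallest recorded primal:

```python
import numpy as np
import matplotlib.pyplot as plt

primals = np.load("primals.npy")  # hypothetical per-iteration primal objectives
gaps = np.load("gaps.npy")        # hypothetical per-iteration duality gaps

p_star = primals.min()  # crude proxy for the primal limit (see the caveat further down)
subopt = np.maximum(primals - p_star, 1e-16)  # floor to keep the log scale well defined

plt.semilogy(subopt, label="primal suboptimality")
plt.semilogy(gaps, label="duality gap")
plt.xlabel("iteration")
plt.legend()
plt.show()
```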

@Badr-MOUFAD
Collaborator Author

@mathurinm, you are right!
When visualizing the primal residuals, we get (quasi) the same pattern.

Anyway, in LogisticRegression, when using celer as solver instead of proximal Newton, the fit converges in 2 iterations.
See celer/tests/isolate_conv_warn.py.

(attached plot: primal residuals)

@mathurinm
Owner

mathurinm commented Apr 4, 2022

This figure is mathematically impossible: the duality gap is always greater than the primal suboptimality. Can you also look into the large peak for the gap around iteration 65?

@josephsalmon
Contributor

It is possible, but the double-axis scales tricked you. A good reason never to use them.

I agree the peak is huge and looks strange

@mathurinm
Owner

Ah yes thanks Joseph.
@Badr-MOUFAD the double axis is misleading and there is no reason to use it as the two plotted quantities are directly homogeneous and comparable.

Also beware of the way you compute the primal optimum: since we're looking at a convergence issue, the last primal after 100 iterations is not necessarily equal to the optimum up to machine precision.
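For instance, p* could be taken from a dedicated reference run rather than from the plotted one. A sketch, writing the L1-regularized logistic objective in its usual form with labels in {-1, 1} (alpha and the file names are placeholders, and the expression should be matched to celer's exact parametrization):

```python
import numpy as np

X = np.load("X_conv_warn.npy")  # hypothetical dumps of the offending problem
y = np.load("y_conv_warn.npy")  # labels assumed in {-1, 1}
alpha = 0.01                    # placeholder regularization strength

def primal(w):
    """L1-regularized logistic primal: sum of logistic losses plus alpha * ||w||_1."""
    return np.sum(np.log1p(np.exp(-y * (X @ w)))) + alpha * np.abs(w).sum()

# w_ref: coefficients from a reference run with a much tighter tolerance and a much
# larger iteration budget than the plotted run (or from an independent solver).
# Using primal(w_ref) as p* avoids treating the 100th iterate as the optimum.
```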

@Badr-MOUFAD
Collaborator Author

This figure is mathematically impossible,
the duality gap is always greater than the primal suboptimality
Can you also look into the large peak for the gap around iteration 65?

It's true: the scale of the multi-axis plot misled us.
That being said, once that is fixed there is no contradiction: the duality gap curve is always above the residuals.

I don't have a comment on the peak around iteration 65. Indeed, it doesn't break the previous rule.

Finally, I am pretty sure that there is no (small) mistake in the implementation of newton_celer, especially knowing that it was coded by @mathurinm.

mul_axis_scaled_plus__

@Badr-MOUFAD
Collaborator Author

Referring to my knowledge of deep learning, I think that the slowness of the solver might be due to (some sort of) vanishing gradient. Indeed, we compute the gradient of the sigmoid function with data drawn from N(100, 1).
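A small numerical illustration of that saturation (values are illustrative): for inputs of magnitude around 100, the sigmoid is 1.0 to machine precision, so its derivative σ(t)(1 − σ(t)) evaluates to exactly 0 in float64 and such samples contribute essentially nothing to the gradient.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for t in (0.0, 5.0, 50.0, 100.0):
    s = sigmoid(t)
    print(f"t = {t:6.1f}   sigmoid = {s:.16f}   derivative = {s * (1 - s):.3e}")
# at t = 50 and t = 100 the sigmoid rounds to exactly 1.0 in float64,
# so its derivative is 0 and the corresponding gradient terms vanish
```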

@Badr-MOUFAD requested a review from mathurinm on April 6, 2022.
@mathurinm
Owner

Thanks for the reproducing scripts @Badr-MOUFAD

addressed in #227

@mathurinm closed this on Apr 7, 2022.
Successfully merging this pull request may close these issues: warnings in LogisticRegression with prox newton.