Using ElasticNet from sklearn instead of glmnet #1

Closed
Yves33 opened this issue Dec 16, 2024 · 10 comments · Fixed by #2

Yves33 commented Dec 16, 2024

Hi,

I'm interested in running your code on my own patch-seq datasets.
sparse-rrr presently depends on glmnet, which is tricky to install alongside the latest scanpy/anndata/numpy/(...), due to compiler version issues.
From what I understand, glmnet is only used in 3 lines in sparse-RRR.

Would you have any suggestions (or ready-to-run code) on how to use sklearn's ElasticNet (https://scikit-learn.org/1.5/modules/linear_model.html#elastic-net) instead of glmnet?

I'm not advanced enough in maths to understand whether both are identical!

Thanks in advance.

dkobak commented Dec 16, 2024

Thanks for the interest. Unfortunately, sklearn.linear_model.ElasticNet does not support the group lasso penalty for multivariate Y which we use here (via family="mgaussian" in glmnet).
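
(For context, my own summary rather than anything from the repo: with coefficient matrix $\mathbf{W} \in \mathbb{R}^{p \times q}$, the lasso part of ElasticNet penalizes every entry separately, whereas the group lasso used via family="mgaussian" penalizes whole rows, so entire predictors are dropped across all targets at once:)

$$
\text{lasso:}\quad \lambda \sum_{j=1}^{p}\sum_{k=1}^{q} |W_{jk}|,
\qquad
\text{group lasso:}\quad \lambda \sum_{j=1}^{p} \lVert \mathbf{W}_{j\cdot} \rVert_2 .
$$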

dkobak changed the title from "migrating to sklearn" to "Using ElasticNet from sklearn instead of glmnet" on Dec 16, 2024

Yves33 commented Dec 16, 2024

Thanks for your quick reply!
At least now I know it's not worth struggling with the code!

I'll keep maintaining a separate conda env with an appropriate glmnet for sparse-rrr.

@ybernaerts

Hi @Yves33. Thank you for your valuable feedback. I wanted to expand on it.
It does indeed seem that glmnet_python is becoming increasingly hard to use. I recently could not get it to work despite intense efforts across (conda or virtual) environments. It has conflicts with recent scipy versions, and Mac users in particular have a hard time getting it to work (see e.g. here).

I therefore experimented with using ElasticNet() from scikit-learn anyway and found it to work quite well, at least for the demo.ipynb notebook. I found almost exactly the same set of genes selected, and an almost indistinguishable bibiplot visualization of the latent space. For the purpose of selecting predictors (genes) and exploratory visualization, I therefore think it can be very useful as well. In fact, plotting the l2-norm of the rows of $\mathbf{W}$ shows that something very close to a group lasso penalty has been applied, substantiating its usefulness.
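
(A toy version of that check, for anyone who wants to reproduce it; the synthetic data and the alpha/l1_ratio values below are arbitrary placeholders, not the demo's actual settings:)

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression

# Stand-in for the demo data: X ~ predictors (genes), Y ~ multivariate targets
X, Y = make_regression(n_samples=200, n_features=50, n_targets=5, random_state=0)

# ElasticNet supports multi-output Y; coef_ has shape (n_targets, n_features)
W = ElasticNet(alpha=5, l1_ratio=0.9).fit(X, Y).coef_.T

# Row-wise l2 norms: with a group-lasso-like effect, most rows should be
# exactly (or very nearly) zero, i.e. whole genes dropped across all targets
row_norms = np.linalg.norm(W, axis=1)
plt.stem(row_norms)
plt.xlabel('predictor (gene) index')
plt.ylabel(r'$\|\mathbf{w}_j\|_2$')
plt.show()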

In fact, @dkobak, are we sure that ElasticNet does not use the group lasso penalty right now? I don't know if it has been updated, but when we look at the docs:
[screenshot of the loss/objective function from the scikit-learn ElasticNet documentation]

I'm not sure there is much difference from the group lasso penalty as we derive it in the paper.

I might fork the repo soon and try to make things work with ElasticNet() instead of glmnet_py.

Yves33 commented Jan 5, 2025

Thanks, I am eager to look at the scikit-learn version!

dkobak commented Jan 7, 2025

Hi @ybernaerts! Where is this screenshot from? When I look here, https://scikit-learn.org/dev/modules/generated/sklearn.linear_model.ElasticNet.html, I don't see any mention of a multi-task version. However, I now realize that there is this: https://scikit-learn.org/dev/modules/generated/sklearn.linear_model.MultiTaskElasticNet.html.

ybernaerts commented Jan 7, 2025

Hi @dkobak, if you scroll down in your first link (or maybe do Ctrl+F and search for "static path"), you should find my screenshot. But indeed, I assume that when you call ElasticNet() with a multivariate y, what actually gets called is MultiTaskElasticNet() (your second link), as the loss functions in both cases are the same.

dkobak commented Jan 7, 2025

Hmm. I don't think so.

from sklearn.linear_model import ElasticNet, MultiTaskElasticNet
from sklearn.datasets import make_regression
import numpy as np

X, y = make_regression(n_features=20, n_targets=5, random_state=0)

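# ElasticNet: element-wise l1/l2 penalty on each coefficient separately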
regr = ElasticNet(random_state=0, alpha=20, l1_ratio=0.9)
regr.fit(X, y)
with np.printoptions(precision=3, suppress=True):
    print(regr.coef_.T)

print('')

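# MultiTaskElasticNet: mixed l2/l1 penalty on whole rows of the coefficient matrix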
regr = MultiTaskElasticNet(random_state=0, alpha=20, l1_ratio=0.9)
regr.fit(X, y)
with np.printoptions(precision=3, suppress=True):
    print(regr.coef_.T)

Output:

[[ 0.     0.    -0.    -0.    -0.   ]
 [-0.    -0.     0.    -0.    -0.   ]
 [ 0.     0.     0.     0.     0.   ]
 [ 0.     0.     0.     0.     0.   ]
 [10.119  4.043 18.978 16.47   8.537]
 [15.98  15.877  0.159 12.744 17.078]
 [15.196  0.747 -0.     8.916 18.717]
 [-0.    -0.     0.    -0.    -0.   ]
 [-3.892 -0.    -2.409 -0.936 -1.88 ]
 [ 5.325 12.809 14.288  8.653  0.   ]
 [19.951 14.4   24.569 22.724  8.202]
 [10.591 12.958 18.63  -0.    17.25 ]
 [ 5.841  6.238  0.     3.398  4.371]
 [21.441  0.    10.682  4.996  0.849]
 [-0.496 -0.    -1.561 -0.    -1.689]
 [-0.     0.    -0.    -0.     0.   ]
 [ 4.56  22.007 10.129  0.     6.582]
 [14.376 10.734  3.469 16.497  2.723]
 [27.783  2.929 24.296 25.264 30.608]
 [ 0.     0.     0.148  0.     0.   ]]

[[ 0.     0.    -0.    -0.    -0.   ]
 [-0.     0.     0.    -0.    -0.   ]
 [ 0.089  0.016  0.04   0.104  0.101]
 [ 0.533  0.587  0.407  0.529  0.503]
 [14.034  8.761 21.312 19.138 12.501]
 [18.819 18.896  5.47  16.116 19.957]
 [18.132  5.959 -4.405 12.732 20.919]
 [-0.    -0.     0.    -0.    -0.   ]
 [-5.8   -2.232 -4.927 -4.043 -4.482]
 [ 9.537 15.737 16.57  12.105  2.729]
 [22.962 18.073 26.743 25.142 12.353]
 [14.338 16.281 21.038 -2.715 20.039]
 [ 7.708  8.022  3.034  5.919  6.571]
 [22.489  3.422 13.788  9.216  5.614]
 [-3.447 -1.664 -3.889 -2.996 -4.105]
 [-0.     0.    -0.    -0.     0.   ]
 [ 8.806 23.41  13.095  4.281 10.362]
 [16.514 13.335  7.569 18.067  6.648]
 [29.959  7.318 26.831 27.723 32.475]
 [ 0.11   0.097  0.443  0.123  0.173]]

So my impression is that ElasticNet uses the normal element-wise lasso penalty, whereas MultiTaskElasticNet uses the group lasso (whole rows are zeroed out). The good news is that we can use MultiTaskElasticNet for our purposes!
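
(For illustration only: one way MultiTaskElasticNet could slot into an alternating sparse-RRR scheme along the lines of the paper, fitting the row-sparse W by regressing YV on X and then updating the orthonormal V from an SVD. The function name, initialization, and hyperparameters below are placeholders, not the repo's actual implementation.)

import numpy as np
from sklearn.linear_model import MultiTaskElasticNet

def sparse_rrr_sketch(X, Y, rank=2, alpha=1.0, l1_ratio=0.5, n_iter=20):
    """Illustrative alternating scheme for min ||Y - X W V^T||^2 with a
    row-sparse W (multi-task elastic net penalty) and orthonormal V."""
    # Initialize V with the top right singular vectors of Y
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt[:rank].T                                    # shape (n_targets, rank)

    for _ in range(n_iter):
        # W-step: regress the projected targets YV on X; the multi-task
        # (group lasso + ridge) penalty zeroes out whole rows of W
        enet = MultiTaskElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
        W = enet.fit(X, Y @ V).coef_.T                 # shape (n_features, rank)

        # V-step: orthogonal Procrustes update from the SVD of Y^T X W
        U, _, Mt = np.linalg.svd(Y.T @ X @ W, full_matrices=False)
        V = U @ Mt

    return W, V

# Toy usage with random data (placeholder hyperparameters):
# X, Y = np.random.randn(100, 50), np.random.randn(100, 10)
# W, V = sparse_rrr_sketch(X, Y, rank=2, alpha=0.5)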

@ybernaerts

That example is convincing, yet I find it confusing that their docs then show the same loss/objective function...
Anyway, I'll rewrite my branch to use MultiTaskElasticNet() in the demo and update the pull request accordingly.

ybernaerts commented Jan 7, 2025

I have now updated my branch to use MultiTaskElasticNet() for demo purposes. So @dkobak, if you think having an extra demo using scikit-learn would be useful in this repo, consider the pull request :)

dkobak closed this as completed in #2 on Jan 8, 2025

dkobak commented Jan 8, 2025

@ybernaerts Thanks! Merged, and edited your README a little bit.

@Yves33 Thanks for raising this issue!
