Adding Multi-Task ElasticNet support #238

YuhanLiin · 2022-08-14T21:12:05Z

Continuation of #194

…elastic_net

This reverts commit d992c2c.

YuhanLiin · 2022-08-14T21:13:40Z

Running the new multi-task example gives the following output:

intercept:  [182.11111111111111, 35.666666666666664, 55.55555555555556]
params: [[-0.9723742003724933, -0.12992938479472216, 0.20256364290951492],
 [0.017231919622246364, -0.00785311972200309, 0.006638074127064588],
 [0.0269082650844912, 0.021197761913871658, -0.027310155988705367]]
z score: Ok([[-1.0608739975723132, -1.6800812346255631, 2.2563865434388126],
 [0.018800267889146287, -0.10154653698276617, 0.0739424949093583],
 [0.029357297568142263, 0.274102444676571, -0.3042118890982639]], shape=[3, 3], strides=[3, 1], layout=Cc (0x5), const ndim=2)
predicted variance: [-47.348308658688005, -2.278139252532177, -50.96366188947603]

The variance looks pretty high, but I'm not sure if that's an issue.

codecov-commenter · 2022-08-14T21:24:38Z

Codecov Report

Base: 38.68% // Head: 38.59% // Decreases project coverage by -0.09% ⚠️

Coverage data is based on head (9551892) compared to base (44b244c).
Patch coverage: 31.70% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #238      +/-   ##
==========================================
- Coverage   38.68%   38.59%   -0.10%     
==========================================
  Files          93       93              
  Lines        6087     6223     +136     
==========================================
+ Hits         2355     2402      +47     
- Misses       3732     3821      +89

Impacted Files	Coverage Δ
algorithms/linfa-elasticnet/src/hyperparams.rs	`14.58% <0.00%> (ø)`
algorithms/linfa-elasticnet/src/lib.rs	`0.00% <0.00%> (ø)`
algorithms/linfa-elasticnet/src/algorithm.rs	`35.05% <33.33%> (-2.14%)`	⬇️
...rithms/linfa-trees/src/decision_trees/algorithm.rs	`39.73% <0.00%> (+1.78%)`	⬆️


bytesnake

👍 can you point me to the test with high variance?

bytesnake · 2022-08-20T16:15:35Z

algorithms/linfa-elasticnet/src/algorithm.rs

+    penalty: F,
+) -> (Array2<F>, F, u32) {
+    let n_samples = F::cast(x.shape()[0]);
+    let n_features = x.shape()[1];


using ncols / nrows is a bit more expressive

I feel like the current naming better expresses the properties of the dataset (# of rows = # of samples, # of cols = # of features). Plus this naming convention is used basically everywhere in this crate.

nah I meant the method name, not the variable

bytesnake · 2022-08-20T16:22:17Z

algorithms/linfa-elasticnet/src/hyperparams.rs

 #[derive(Clone, Debug, PartialEq)]
-pub struct ElasticNetValidParams<F> {
+pub struct ElasticNetValidParamsBase<F, const MULTI_TASK: bool> {


should the multi task flag not be derived from the dataset?

I should be able to encapsulate the multi-task use case in another Fit impl on the same type.

I just tried to do that and it didn't work. The single and multitarget Fit impls bound the target data type with AsSingleTarget and AsMultiTarget respectively. If I put both impls on a unified param type then I get a conflicting impl error, even though AsSingleTarget and AsMultiTarget are implemented on completely different types. This idea would be doable if we made the target types Array1 and Array2 instead of generics bounded by traits.

The conflicting bounds are probably introduced here: https://github.com/rust-ml/linfa/blob/master/src/dataset/impl_targets.rs#L25-L26. Have you also tried bounding the type with T: AsTargets<Ix = Ix?> directly?

Tried it. Still the same error. Rust compiler probably isn't smart enough to realize that those two bounds are completely disjoint.
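For readers following the thread, here is a minimal sketch of the problem and of the const-generic workaround that the `MULTI_TASK` flag provides. The traits `SingleTarget` and `MultiTarget`, the `Fit` trait, and `ParamsBase` are simplified, hypothetical stand-ins and not linfa's actual definitions. The coherence check does not use trait bounds to prove that two blanket impls are disjoint, so both impls on one type are rejected; splitting the type with a `const bool` parameter gives each impl its own concrete type.

```rust
// Hypothetical stand-ins for the single- and multi-target traits.
trait SingleTarget {
    fn n_samples(&self) -> usize;
}
trait MultiTarget {
    fn n_tasks(&self) -> usize;
}

impl SingleTarget for Vec<f64> {
    fn n_samples(&self) -> usize {
        self.len()
    }
}
impl MultiTarget for Vec<Vec<f64>> {
    fn n_tasks(&self) -> usize {
        self.first().map_or(0, |row| row.len())
    }
}

// Simplified, hypothetical fitting trait.
trait Fit<T> {
    fn fit(&self, targets: &T) -> String;
}

// Putting both impls on ONE type does not compile, because some type
// could in principle implement both traits:
//
//   struct Unified;
//   impl<T: SingleTarget> Fit<T> for Unified { /* ... */ }
//   impl<T: MultiTarget> Fit<T> for Unified { /* ... */ }
//   // error[E0119]: conflicting implementations of trait `Fit<_>`

// Workaround: a const generic flag yields two distinct types.
struct ParamsBase<const MULTI_TASK: bool> {
    penalty: f64,
}
type Params = ParamsBase<false>;
type MultiTaskParams = ParamsBase<true>;

impl<T: SingleTarget> Fit<T> for ParamsBase<false> {
    fn fit(&self, targets: &T) -> String {
        format!(
            "single-task fit: {} samples, penalty {}",
            targets.n_samples(),
            self.penalty
        )
    }
}

impl<T: MultiTarget> Fit<T> for ParamsBase<true> {
    fn fit(&self, targets: &T) -> String {
        format!(
            "multi-task fit: {} tasks, penalty {}",
            targets.n_tasks(),
            self.penalty
        )
    }
}

fn main() {
    let y: Vec<f64> = vec![1.0, 2.0, 3.0];
    let ys: Vec<Vec<f64>> = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    println!("{}", Params { penalty: 0.1 }.fit(&y));
    println!("{}", MultiTaskParams { penalty: 0.1 }.fit(&ys));
}
```

The trade-off is that the caller picks the single-task or multi-task parameter type up front instead of having it inferred from the dataset.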

YuhanLiin · 2022-08-26T06:31:33Z

> 👍 can you point me to the test with high variance?

It's not a test, but the new example I added

bytesnake · 2022-11-09T09:17:21Z

Reviewed the example and made two changes to make the explained variance more usable:

- use more than two samples for validation, otherwise the second class has zero variance
- compare the validation dataset to the estimated values (R2 is not symmetric)

--- a/algorithms/linfa-elasticnet/examples/multitask_elasticnet.rs
+++ b/algorithms/linfa-elasticnet/examples/multitask_elasticnet.rs
@@ -3,7 +3,7 @@ use linfa_elasticnet::{MultiTaskElasticNet, Result};

 fn main() -> Result<()> {
     // load Diabetes dataset
-    let (train, valid) = linfa_datasets::linnerud().split_with_ratio(0.90);
+    let (train, valid) = linfa_datasets::linnerud().split_with_ratio(0.80);

     // train pure LASSO model with 0.1 penalty
     let model = MultiTaskElasticNet::params()
@@ -18,7 +18,7 @@ fn main() -> Result<()> {

     // validate
     let y_est = model.predict(&valid);
-    println!("predicted variance: {}", valid.r2(&y_est)?);
+    println!("predicted variance: {}", y_est.r2(&valid)?);

     Ok(())
 }

which gives

predicted variance: [-4.143623744690414, -0.2630142563112303, -0.2542410304293199]

so worse than taking the average, but the dataset is really small 😅
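To illustrate the two points above, here is a small standalone sketch of the coefficient of determination. It is a plain function written for this note, not linfa's `r2` method, and the sample values are made up. It shows that swapping the arguments changes the score, that a negative score means the prediction is worse than always predicting the mean, and that a reference with zero variance makes the score degenerate.

```rust
/// R2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean
/// of `reference`. Illustrative helper only, not linfa's implementation.
fn r2(reference: &[f64], other: &[f64]) -> f64 {
    assert_eq!(reference.len(), other.len());
    let mean = reference.iter().sum::<f64>() / reference.len() as f64;
    // Residual sum of squares between the two series (symmetric).
    let ss_res: f64 = reference
        .iter()
        .zip(other)
        .map(|(r, o)| (r - o).powi(2))
        .sum();
    // Total sum of squares depends only on `reference` (not symmetric).
    let ss_tot: f64 = reference.iter().map(|r| (r - mean).powi(2)).sum();
    1.0 - ss_res / ss_tot
}

fn main() {
    // Made-up values chosen so the arithmetic is exact.
    let a = [1.0, 2.0, 3.0, 4.0];
    let b = [2.0, 2.5, 2.5, 3.0];

    println!("r2(a, b) = {}", r2(&a, &b)); // 1 - 2.5 / 5.0
    println!("r2(b, a) = {}", r2(&b, &a)); // 1 - 2.5 / 0.5

    // Always predicting the mean of the reference scores exactly zero.
    println!("r2(a, mean) = {}", r2(&a, &[2.5; 4]));

    // A reference with zero variance divides by zero.
    println!("r2(constant, x) = {}", r2(&[3.0, 3.0], &[2.0, 4.0]));
}
```

Because only the denominator depends on which series is treated as the reference, the same pair of vectors can score 0.5 in one order and -4 in the other.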

PABannier and others added 30 commits January 19, 2022 13:55

added block coordinate descent function

81d0f1e

added duality_gap_mtl computation

7e8408e

ENH cd pass to be consistent with bcd

1b69b8e

added prox operator for MTL Enet

e5b76e7

added helper functions for tests

196a2a5

fix failing CD tests

dc91b62

working ent mtl penalties

3bb5333

bcd lower objective test pass

f9f9959

added MultiTaskEnet struct

e23d876

added MTENET documentation

f316c95

added API MTENET

1ae6522

added variance, z-score, conf interval for multitask ENET

f13a150

added multi-task estimators

b347963

added tests for MTL

80d9a02

pass comments

3d981ef

CLN files

36f4b70

cleaner implementation

15b280b

cln tests

d81a5cd

changed map into fold

a21bc1a

added tests for Enet and MTL

7f32afc

added incorrect target shape

6c56e06

WIP: made variance params generic over the number of tasks

a07fb39

added z_score and confidence_95th for MTL

75023d9

map instead of fold

ba3a574

fix confidence interval and z-score

114fc03

converted back fold to map

30744c0

pass comments

3b5f37d

Merge branch 'master' of https://github.com/PABannier/linfa into mtl_elastic_net

c7de3e2

WIP make compute_variance generic over the dimension

bc1ad06

Fix compiler errors

b09e37b

YuhanLiin added 9 commits August 14, 2022 12:14

Merge branch 'master' into mtl_elastic_net

a3cc0b4

Fix multi enet tests

b7667d6

Make compute_intercept generic

d992c2c

Revert "Make compute_intercept generic"

d707040

This reverts commit d992c2c.

Replace for loops in block_coordinate_descent with general_mat_mul calls

fa5ee8c

Bring back generic compute_intercept

693cb85

Replace manual norm calculations with norm trait calls

1d2a68d

Add docs and derives to multi task types

8a0779d

Add example for multitask_elasticnet

6f6f026

YuhanLiin mentioned this pull request Aug 14, 2022

Adding Multi-Task ElasticNet support #194

Draft

8 tasks

Remove bad derives

4cfbbd5

bytesnake reviewed Aug 20, 2022

YuhanLiin added 4 commits August 27, 2022 02:59

Address review comments

5b125f9

Merge branch 'master' into mtl_elastic_net

7e7b63c

Fix CI issues

9f7d40a

Rename shape() calls to nrows and ncols

66895b6

YuhanLiin added 3 commits November 12, 2022 13:02

Merge branch 'master' into mtl_elastic_net

f70f0b7

Fix multitask elasticnet example

c18c4db

Fix docs

9551892

YuhanLiin merged commit 21357e2 into rust-ml:master Nov 12, 2022

YuhanLiin deleted the mtl_elastic_net branch November 12, 2022 19:04
