
NF: Visualize the latent space and pack data_per_streamline in the batch loader. #245

Open
wants to merge 22 commits into master
Conversation

@levje (Collaborator) commented Sep 27, 2024

Description

Following #220, it was requested that we be able to visualize the latent space of the auto-encoder, in reference to FINTA from Legarreta et al. (2021). As in the original paper, we project the latent space coming out of the auto-encoder into 2D using t-SNE, which keeps similar streamlines close together and dissimilar streamlines farther apart.

The class BundlesLatentSpaceVisualizer in latent_streamlines.py contains the bulk of the changes and was written so that it can be reused for other data that needs to be projected and plotted in 2D. Each time we reach an epoch whose loss is the lowest encountered so far, we plot the latent space for that epoch. A rough sketch of the projection idea is shown below.
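
For illustration only, here is a minimal sketch of the kind of 2D projection described above, assuming the latent vectors are available as an (N, D) NumPy array together with integer bundle labels; the function name and arguments are hypothetical and do not reflect the actual BundlesLatentSpaceVisualizer API.

# Hypothetical sketch: project latent vectors to 2D with t-SNE and color
# the points by bundle label. Not the real BundlesLatentSpaceVisualizer code.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE


def plot_latent_space_2d(latent, bundle_labels, out_png):
    """latent: (N, D) array of latent vectors; bundle_labels: (N,) ints.

    Note: t-SNE requires N to be larger than the chosen perplexity.
    """
    # t-SNE keeps similar streamlines close and dissimilar ones far apart.
    proj = TSNE(n_components=2, perplexity=30, init='pca',
                random_state=0).fit_transform(latent)

    fig, ax = plt.subplots(figsize=(8, 8))
    for label in np.unique(bundle_labels):
        mask = bundle_labels == label
        ax.scatter(proj[mask, 0], proj[mask, 1], s=2, label=str(label))
    ax.legend(markerscale=5, fontsize='small')
    fig.savefig(out_png, dpi=150)
    plt.close(fig)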

(A future PR adding hooks throughout the trainer/models, in a similar fashion to LightningAI or torch.nn.Module, would add more flexibility to the library, in my opinion!)

Scripts:

  • ae_train_model.py: the same script as added by [NF] Auto-encoders - streamlines - FINTA #220, with the additional ability to automatically plot/save the latent-space figures during training, enabled by the new --viz_latent_space argument.

Testing data and script

ae_train_model.py \
    $experiments \
    $experiment_name \
    $o_hdf5 \
    target \
    -v INFO \
    --batch_size_training 1200 \
    --batch_size_units nb_streamlines \
    --nb_subjects_per_batch 5 \
    --learning_rate 0.001 \
    --weight_decay 0.13 \
    --optimizer Adam \
    --max_epochs 1000 \
    --max_batches_per_epoch_training 20 \
    --comet_workspace <comet_workspace> \
    --comet_project dwi_ml-ae-fibercup \
    --patience 100 \
    --viz_latent_space \
    --color_by 'dps_bundle_index' \
    --bundles_mapping <file with a mapping to bundles>

Have you

  • Added a description of the content of this PR above
  • Followed proper commit message formatting
  • Added data and a script to test this PR
  • Made sure that PEP8 issues are resolved
  • Tested the code yourself right before pushing
  • Added unit tests to test the code you implemented

People this PR concerns

@arnaudbore @AntoineTheb

@levje levje added the enhancement New feature or request label Sep 27, 2024
@levje levje self-assigned this Sep 27, 2024
@pep8speaks commented Sep 27, 2024

Hello @levje, thank you for updating!

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-10-07 15:45:40 UTC

@levje levje changed the title [WIP] Visualize the latent space from the auto encoder Visualize the latent space from the auto encoder Oct 2, 2024
@levje (Collaborator, Author) commented Oct 2, 2024

@EmmaRenauld, if you have an idea of how I could implement commit f0973ff differently, please let me know. It's a bit thrown together ("bric-à-brac").

@levje levje changed the title Visualize the latent space from the auto encoder NF: Visualize the latent space Oct 6, 2024
@levje levje changed the title NF: Visualize the latent space NF: Visualize the latent space and pack data_per_streamline in the batch loader. Oct 6, 2024
@levje (Collaborator, Author) commented Oct 6, 2024

After a few rounds of cleanup, and after our discussion this Thursday about the utility of visualizing the evolution of the latent space, I stripped the code down to only plot the latent space of the best epoch. So, each time we get a new best epoch, the latent space is plotted and saved. This simplifies the code a lot and makes it more reusable.

To plot when a new best epoch is found, I figured it would be a lot cleaner to simply have a function that can be called from within the BestEpochMonitor; that hook is the newest addition on top of the modifications we talked about (see the sketch below).
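
As a rough illustration of that hook idea (this is not the actual dwi_ml BestEpochMonitor, whose attributes and signatures may differ), a monitor could accept an optional callback and invoke it whenever a new best loss is recorded:

# Hypothetical sketch of a best-epoch monitor with an "on new best" callback.
class BestEpochMonitor:
    def __init__(self, patience, on_new_best=None):
        self.patience = patience
        self.best_value = None
        self.best_epoch = None
        self.n_bad_epochs = 0
        self.on_new_best = on_new_best  # e.g. plot/save the latent space

    def update(self, loss, epoch):
        """Record this epoch's loss. Returns True if patience is exhausted."""
        if self.best_value is None or loss < self.best_value:
            self.best_value = loss
            self.best_epoch = epoch
            self.n_bad_epochs = 0
            if self.on_new_best is not None:
                # Called only when a new best epoch is found.
                self.on_new_best(epoch)
        else:
            self.n_bad_epochs += 1
        return self.n_bad_epochs >= self.patience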

It should also now be fine when no data_per_streamline is specified in the HDF5 file (there will only be one color), and it will also work if you don't specify the bundle_index. @arnaudbore, from what I tested, Fibercup should be working fine now; let me know otherwise.

Finally, just to make it clear, I modified the structure of the HDF5 file to have a group '<subject-id>/target/data_per_streamline/bundle_index'. Each entry/dataset within the data_per_streamline group of the HDF5 is loaded as a numpy array into a dictionary and included in the returned sft (sketched below).
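
Purely as an illustration of that layout (the helper name and the group path argument are assumptions here, not the batch loader's real code), loading every dataset under data_per_streamline could look like this:

# Hypothetical sketch: read all data_per_streamline datasets for one subject
# from an HDF5 file laid out as '<subject-id>/target/data_per_streamline/...'.
import h5py
import numpy as np


def load_data_per_streamline(hdf5_path, subject_id, group='target'):
    """Return a dict {key: np.ndarray} for every dataset in the group."""
    dps = {}
    with h5py.File(hdf5_path, 'r') as f:
        dps_group = f[subject_id][group].get('data_per_streamline', None)
        if dps_group is not None:
            for key, dataset in dps_group.items():
                # Each dataset (e.g. 'bundle_index') becomes a numpy array,
                # ready to be attached to the returned sft.
                dps[key] = np.asarray(dataset)
    return dps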

@arnaudbore (Collaborator) left a comment


Not a fan of having the color class inside the latent space one; also, looking at the output, I feel the colors could be better chosen. Apart from that, LGTM!

Review comments on scripts_python/ae_train_model.py: outdated, resolved.