Unable to find *.pt files for region_4096_pretraining #13

ramprs21 · 2022-08-01T23:04:18Z

Am I right in expecting the patch-level feature *.pt files (433779 files, each containing a 256 x 384 tensor) used for pretraining the second stage of HIPT to be present in the HIPT/3-Self-Supervised-Eval/embeddings_patch_lib/ directory?

Currently, I only see the following pickle files in that directory.

25M     bcss_train_resnet50_trunc.pkl
9.3M    bcss_train_vits_tcga_brca_dino.pkl
4.5M    bcss_val_resnet50_tcga_brca_simclr.pkl
2.3M    bcss_val_resnet50_trunc.pkl
868K    bcss_val_vits_tcga_brca_dino.pkl
19M     breastpathq_train_resnet50_tcga_brca_simclr.pkl
9.4M    breastpathq_train_resnet50_trunc.pkl
3.6M    breastpathq_train_vits_tcga_brca_dino.pkl
1.5M    breastpathq_val_resnet50_tcga_brca_simclr.pkl
744K    breastpathq_val_resnet50_trunc.pkl
280K    breastpathq_val_vits_tcga_brca_dino.pkl
783M    crc100knonorm_train_resnet50_tcga_brca_simclr.pkl
393M    crc100knonorm_train_resnet50_trunc.pkl
149M    crc100knonorm_train_vits_tcga_brca_dino.pkl
57M     crc100knonorm_val_resnet50_tcga_brca_simclr.pkl
29M     crc100knonorm_val_resnet50_trunc.pkl
11M     crc100knonorm_val_vits_tcga_brca_dino.pkl
783M    crc100k_train_resnet50_tcga_brca_simclr.pkl
393M    crc100k_train_resnet50_trunc.pkl
149M    crc100k_train_vits_tcga_brca_dino.pkl
57M     crc100k_val_resnet50_tcga_brca_simclr.pkl
29M     crc100k_val_resnet50_trunc.pkl
11M     crc100k_val_vits_tcga_brca_dino.pkl

Thanks in advance.

Richarizardd · 2022-08-01T23:18:29Z

Hi @ramprs21 - see https://github.com/mahmoodlab/HIPT/tree/master/3-Self-Supervised-Eval/embeddings_slide_lib.

ramprs21 · 2022-08-01T23:26:30Z

Hi @Richarizardd, the *.pt files in 3-Self-Supervised-Eval/embeddings_slide_lib/embeddings_slide_lib/vit256mean_tcga_slide_embeddings do not seem to have the right dimensions (192 instead of 384), so I believe they were computed from the outputs of the 2nd stage. See below:

Python 3.9.10 | packaged by conda-forge | (main, Feb  1 2022, 21:24:11) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> data = torch.load('3-Self-Supervised-Eval/embeddings_slide_lib/embeddings_slide_lib/vit256mean_tcga_slide_embeddings/TCGA-BA-6869-01Z-00-DX1.6e58648e-3309-47bb-b2c7-b71bcd9dc69b.pt')
>>> data.shape
torch.Size([52, 192])

What I am looking for are the inputs to the 2nd stage pre-training, which I believe are a list of *.pt files, each containing a tensor of dimension (256 x 384).
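
The same shape check can be run over a whole directory instead of one file at a time. This is a minimal sketch, assuming each *.pt file holds a single tensor as in the session above; `summarize_pt_shapes` is a hypothetical helper name, not part of the HIPT repository.

```python
from collections import Counter
from pathlib import Path

import torch


def summarize_pt_shapes(directory):
    """Count how many *.pt files in `directory` hold a tensor of each shape.

    Assumes every file was written with torch.save(tensor, path).
    """
    counts = Counter()
    for path in sorted(Path(directory).glob("*.pt")):
        tensor = torch.load(path, map_location="cpu")
        counts[tuple(tensor.shape)] += 1
    return counts
```

Calling `summarize_pt_shapes(...)` on the embeddings directory returns a `Counter` keyed by shape, so files with a last dimension of 192 can be told apart from files with a last dimension of 384 at a glance.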

Richarizardd · 2022-08-02T00:20:20Z

Hi @ramprs21 - apologies for the confusion. The previous link refers to the already pre-extracted "region-level" feature embeddings for each slide in TCGA. Regarding the *.pt files for hierarchical pretraining, it is logistically difficult at the moment to make available all [M x 256 x 384] "patch-level" feature embeddings, where M is the number of regions. Looking into ways to make this more available!
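
For readers who want to regenerate these patch-level features themselves, the first step is cutting each region into non-overlapping patches. The sketch below shows only that tiling step, assuming a 4096 x 4096 region split into 256 x 256 patches (which would give the 256 rows of each [256 x 384] tensor). `region_to_patches` is a hypothetical helper, and the 1st-stage encoder that would map each patch to a 384-dimensional vector is not shown.

```python
import torch


def region_to_patches(region, patch_size=256):
    """Split a [C, H, W] region into [N, C, patch_size, patch_size] patches.

    Patches are non-overlapping and returned in row-major order
    (left to right, then top to bottom). H and W are assumed to be
    multiples of patch_size.
    """
    c, h, w = region.shape
    if h % patch_size or w % patch_size:
        raise ValueError("region height and width must be multiples of patch_size")
    # After both unfolds: [C, H/ps, W/ps, ps, ps]
    tiles = region.unfold(1, patch_size, patch_size).unfold(2, patch_size, patch_size)
    # Move the grid dimensions to the front, then flatten them into N
    return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, patch_size, patch_size)
```

Under these assumptions, a [3, 4096, 4096] region yields 16 x 16 = 256 patches. Passing each patch through the pretrained 1st-stage encoder and stacking the outputs would then give one [256 x 384] tensor per region.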

ramprs21 · 2022-08-04T18:39:20Z

Thank you @Richarizardd. Could you please update here whenever you make the 1st stage features available? Thank you :)

bryanwong17 · 2022-12-15T07:59:51Z

Hi @ramprs21 @Richarizardd, for the hierarchical pretraining (2nd stage), will training be much faster than in the 1st stage, since each region is now represented as a [256,384] tensor which is reshaped into [1,384,16,16] during training?
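
The reshape described in this question can be sketched as follows. It assumes the 256 rows are stored in row-major patch order over a 16 x 16 grid; `tokens_to_grid` is a hypothetical helper for illustration and may differ from how the HIPT code performs this step.

```python
import torch


def tokens_to_grid(features, grid=16):
    """Rearrange [grid*grid, D] patch features into a [1, D, grid, grid] map.

    Assumes row i*grid + j of `features` belongs to the patch at
    grid row i, grid column j.
    """
    n, d = features.shape
    if n != grid * grid:
        raise ValueError(f"expected {grid * grid} rows, got {n}")
    # [N, D] -> [grid, grid, D] -> [D, grid, grid] -> [1, D, grid, grid]
    return features.reshape(grid, grid, d).permute(2, 0, 1).unsqueeze(0)
```

With a [256, 384] input this returns a [1, 384, 16, 16] tensor, i.e. a 16 x 16 "image" with 384 channels, one spatial position per 256 x 256 patch.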
