Releases · MzeroMiko/VMamba
VMamba v0 Segmentation checkpoints
Semantic Segmentation on ADE20K
Backbone | Input | #params | Segmentor | mIoU(SS) | mIoU(MS) | configs/logs/logs(ms)/ckpts
---|---|---|---|---|---|---
Vanilla-VMamba-T | 512x512 | 55M | UperNet@160k | 47.3 | 48.3 | config/log/log(ms)/ckpt
Vanilla-VMamba-S | 512x512 | 76M | UperNet@160k | 49.5 | 50.5 | config/log/log(ms)/ckpt
Vanilla-VMamba-B | 512x512 | 110M | UperNet@160k | 50.0 | 51.3 | config/log/log(ms)/ckpt
VMamba v0 Detection checkpoints
Object Detection on COCO
Backbone | #params | Detector | bboxAP | bboxAP50 | bboxAP75 | segmAP | segmAP50 | segmAP75 | configs/logs/ckpts
---|---|---|---|---|---|---|---|---|---
Vanilla-VMamba-T | 42M | MaskRCNN@1x | 46.5 | 68.5 | 50.7 | 42.1 | 65.5 | 45.3 | config/log/ckpt
Vanilla-VMamba-S | 64M | MaskRCNN@1x | 48.2 | 69.7 | 52.5 | 43.0 | 66.6 | 46.4 | config/log/ckpt
Vanilla-VMamba-B | 96M | MaskRCNN@1x | 48.6 | 70.0 | 53.1 | 43.3 | 67.1 | 46.7 | config/log/ckpt
Vanilla-VMamba-T | 42M | MaskRCNN@3x | 48.5 | 70.0 | 52.7 | 43.2 | 66.9 | 46.4 | config/log/ckpt
Vanilla-VMamba-S | 64M | MaskRCNN@3x | 49.7 | 70.4 | 54.2 | 44.0 | 67.6 | 47.3 | config/log/ckpt
VMamba v0 Classification checkpoints
Checkpoints for VMamba (alias of vssm version 0).
These checkpoints correspond to experiments conducted before 2024/01/19.
name | pretrain | resolution | acc@1 | #params | best epoch | use ema | config
---|---|---|---|---|---|---|---
VMamba-T | ImageNet-1K | 224x224 | 82.2 | 22M | 292 | false | config
VMamba-S | ImageNet-1K | 224x224 | 83.5 | 44M | 238 | true | config
VMamba-B | ImageNet-1K | 224x224 | 83.2 | 75M | 260 | false | config
VMamba-B* | ImageNet-1K | 224x224 | 83.7 | 75M | 241 | true | config
Most backbone models are trained without EMA, which does not improve performance (see Swin-Transformer). We use EMA because our model is still under development and has not yet undergone hyperparameter tuning.
The checkpoint used in object detection and segmentation is VMamba-B with drop path 0.5 and no EMA. VMamba-B* denotes VMamba-B with drop path 0.6 and EMA; it reaches 83.3 without EMA (epoch 262) and 83.7 with EMA (epoch 241).
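For reference, "EMA" here means keeping an exponential moving average of the model weights during training and evaluating the averaged copy. A minimal PyTorch sketch of the idea (the class name and the decay of 0.9999 are illustrative assumptions, not values taken from this repo):

```python
import copy
import torch

class WeightEma:
    """Exponential moving average of model weights (illustrative sketch).

    decay=0.9999 is a common default, not necessarily VMamba's value.
    """
    def __init__(self, model, decay=0.9999):
        self.ema = copy.deepcopy(model).eval()   # averaged copy, used for eval
        self.decay = decay
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # After each optimizer step: ema_w <- decay * ema_w + (1 - decay) * w
        for ema_t, t in zip(self.ema.state_dict().values(),
                            model.state_dict().values()):
            if ema_t.is_floating_point():
                ema_t.mul_(self.decay).add_(t, alpha=1 - self.decay)
            else:
                ema_t.copy_(t)                   # integer buffers are copied as-is
```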
VMamba v2 Segmentation checkpoints
Semantic Segmentation on ADE20K
Backbone | Input | #params | FLOPs | Segmentor | mIoU(SS) | mIoU(MS) | configs/logs/logs(ms)/ckpts
---|---|---|---|---|---|---|---
VMamba-T[s2l5] | 512x512 | 62M | 948G | UperNet@160k | 48.3 | 48.6 | config/log/log(ms)/ckpt
VMamba-S[s2l15] | 512x512 | 82M | 1028G | UperNet@160k | 50.6 | 51.2 | config/log/log(ms)/ckpt
VMamba-B[s2l15] | 512x512 | 122M | 1170G | UperNet@160k | 51.0 | 51.6 | config/log/log(ms)/ckpt
VMamba-T[s1l8] | 512x512 | 62M | 949G | UperNet@160k | 47.9 | 48.8 | config/log/log(ms)/ckpt
VMamba v2 Detection checkpoints
Object Detection on COCO
Backbone | #params | FLOPs | Detector | bboxAP | bboxAP50 | bboxAP75 | segmAP | segmAP50 | segmAP75 | configs/logs/ckpts
---|---|---|---|---|---|---|---|---|---|---
VMamba-T[s2l5] | 50M | 270G | MaskRCNN@1x | 47.4 | 69.5 | 52.0 | 42.7 | 66.3 | 46.0 | config/log/ckpt
VMamba-S[s2l15] | 70M | 384G | MaskRCNN@1x | 48.7 | 70.0 | 53.4 | 43.7 | 67.3 | 47.0 | config/log/ckpt
VMamba-B[s2l15] | 108M | 485G | MaskRCNN@1x | 49.2 | 71.4 | 54.0 | 44.1 | 68.3 | 47.7 | config/log/ckpt
VMamba-B[s2l15] | 108M | 485G | MaskRCNN@1x[bs8] | 49.2 | 70.9 | 53.9 | 43.9 | 67.7 | 47.6 | config/log/ckpt
VMamba-T[s1l8] | 50M | 271G | MaskRCNN@1x | 47.3 | 69.3 | 52.0 | 42.7 | 66.4 | 45.9 | config/log/ckpt
VMamba-T[s2l5] | 50M | 270G | MaskRCNN@3x | 48.9 | 70.6 | 53.6 | 43.7 | 67.7 | 46.8 | config/log/ckpt
VMamba-S[s2l15] | 70M | 384G | MaskRCNN@3x | 49.9 | 70.9 | 54.7 | 44.2 | 68.2 | 47.7 | config/log/ckpt
VMamba-T[s1l8] | 50M | 271G | MaskRCNN@3x | 48.8 | 70.4 | 53.5 | 43.7 | 67.4 | 47.0 | config/log/ckpt
- Models in this subsection are initialized from the models trained in classification.
- We now calculate FLOPs with the algorithm @albertgu provides, which yields larger numbers than the previous calculation (based on the selective_scan_ref function, which ignores the hardware-aware algorithm); a back-of-the-envelope sketch follows below.
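For context, the commonly cited estimate for the selective-scan core counts roughly 9·B·L·D·N multiply-accumulates per forward pass. A rough sketch under that assumption (the function name is ours, and applying the formula unchanged to VMamba's exact kernel is an assumption):

```python
def selective_scan_flops(B, L, D, N, with_D=True):
    """Rough forward-pass FLOPs for a selective scan (approximation).

    B: batch size, L: sequence length, D: channel dim, N: state dim.
    The 9 * B * L * D * N core term follows the estimate discussed in
    https://github.com/state-spaces/mamba/issues/110; exact bookkeeping
    for skip/gate terms differs between implementations.
    """
    flops = 9 * B * L * D * N   # scan recurrence plus its input/output products
    if with_D:
        flops += B * D * L      # the D * u skip connection
    return flops

# e.g. one scan over a 56x56 feature map (L = 3136) with D=96, N=16:
print(selective_scan_flops(B=1, L=56 * 56, D=96, N=16) / 1e9, "GFLOPs")
```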
VMamba v2 Classification checkpoints
Classification on ImageNet-1K
name | pretrain | resolution | acc@1 | #params | FLOPs | TP. | Train TP. | configs/logs/ckpts
---|---|---|---|---|---|---|---|---
VMamba-T[s2l5] | ImageNet-1K | 224x224 | 82.5 | 31M | 4.9G | 1340 | 464 | config/log/ckpt
VMamba-S[s2l15] | ImageNet-1K | 224x224 | 83.6 | 50M | 8.7G | 877 | 314 | config/log/ckpt
VMamba-B[s2l15] | ImageNet-1K | 224x224 | 83.9 | 89M | 15.4G | 646 | 247 | config/log/ckpt
VMamba-T[s1l8] | ImageNet-1K | 224x224 | 82.6 | 30M | 4.9G | 1686 | 571 | config/log/ckpt
VMamba-S[s1l20] | ImageNet-1K | 224x224 | 83.3 | 49M | 8.6G | 1106 | 390 | config/log/ckpt
VMamba-B[s1l20] | ImageNet-1K | 224x224 | 83.8 | 87M | 15.2G | 827 | 313 | config/log/ckpt
- Models in this subsection are trained from scratch with random or manual initialization. The hyperparameters are inherited from Swin, except for drop_path_rate and EMA. All models are trained with EMA except for Vanilla-VMamba-T.
- TP. (throughput) and Train TP. (train throughput) are assessed on an A100 GPU paired with an AMD EPYC 7542 CPU, with batch size 128 (a measurement sketch follows this list). Train TP. is tested with mixed resolution and excludes the time consumed by the optimizers.
- FLOPs and parameters now include the classification head (previous versions excluded it, so the numbers rise a little).
- We calculate FLOPs with the algorithm @albertgu provides, which yields larger numbers than the previous calculation (based on the selective_scan_ref function, which ignores the hardware-aware algorithm).
Checkpoints for nightly builds!
name | pretrain | resolution | acc@1 | #params | FLOPs | best epoch | use ema | config |
---|---|---|---|---|---|---|---|---|
VMamba-T | ImageNet-1K | 224x224 | 82.5 | 32M | 5G | 258 | true | config |
We use EMA because our model is still under development and has not yet undergone hyperparameter tuning.
This is a pre-release.