Allow arbitrary image sizes and upstream changes from Swin-Transformer-Object-Detection #17

vadimkantorov · 2022-01-26T10:28:32Z

In the object detection context, it is useful to allow arbitrary image sizes by computing the attention mask dynamically (this is probably only possible with relative position encoding).
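To make the request concrete, here is a minimal pure-Python sketch of what "dynamic mask computation" could look like: the shifted-window attention mask is rebuilt from the actual feature-map size on each call instead of being precomputed once for a fixed input resolution. The function name `shifted_window_mask` and its parameters are hypothetical, chosen for illustration; the region layout and the `-100.0` fill value follow the usual Swin convention as I understand it, and a real implementation would do this with tensors inside the model's forward pass.

```python
import math


def shifted_window_mask(height, width, window_size, shift_size):
    """Build shifted-window attention masks for one feature-map size.

    Hypothetical helper for illustration only. Returns one mask per window,
    each a (window_size**2) x (window_size**2) nested list holding 0.0 where
    two tokens may attend to each other and -100.0 where they may not.
    """
    # Pad the feature map up to a multiple of the window size, so that any
    # input resolution can be partitioned into whole windows.
    padded_h = math.ceil(height / window_size) * window_size
    padded_w = math.ceil(width / window_size) * window_size

    def band(index, size):
        # Three bands per axis: the body, the part of the last window that
        # stays in place, and the strip that wraps around after the shift.
        if index < size - window_size:
            return 0
        if index < size - shift_size:
            return 1
        return 2

    # Label every position of the padded map with a region id in 0..8.
    labels = [
        [3 * band(y, padded_h) + band(x, padded_w) for x in range(padded_w)]
        for y in range(padded_h)
    ]

    masks = []
    for top in range(0, padded_h, window_size):
        for left in range(0, padded_w, window_size):
            # Flatten the region ids of the tokens inside this window.
            ids = [
                labels[top + dy][left + dx]
                for dy in range(window_size)
                for dx in range(window_size)
            ]
            # Tokens may attend to each other only within the same region.
            masks.append([[0.0 if a == b else -100.0 for b in ids] for a in ids])
    return masks


# The mask is recomputed per input, so differently sized feature maps work.
masks_a = shifted_window_mask(height=7, width=10, window_size=4, shift_size=2)
masks_b = shifted_window_mask(height=8, width=8, window_size=4, shift_size=2)
print(len(masks_a), len(masks_b))
```

Because the mask depends only on the (padded) feature-map size, it could also be cached per size rather than rebuilt on every forward pass; that is a design choice, not something taken from either linked repository.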

These kinds of edits were made in https://github.com/SwinTransformer/Swin-Transformer-Object-Detection and in https://github.com/megvii-research/SOLQ/. It would be nice if you upstreamed these changes. This would make it simpler to try out ESviT checkpoints as pretraining for object detection.

Also, FYI, I created a similar issue in SimMIM: microsoft/SimMIM#13. Overall, having a stable version of swin_transformer.py somewhere (maybe even in the main SwinTransformer/Swin-Transformer repo?) that supports dynamic masking would help a lot :)

Thanks!

sym0926 · 2024-08-14T11:58:52Z

Hi, do you have the checkpoint and training logs? Could you share them with me? I got an error when I tried to download them.

vadimkantorov mentioned this issue Jun 30, 2022

Swin in this repo + dynamic resolution pytorch/vision#6227

Open
