
More about issue 4 #7

Open
MeycL opened this issue Aug 23, 2023 · 4 comments

Comments

@MeycL

MeycL commented Aug 23, 2023

I have the same question as the author of issue 4: DAT sets the image size to 64. If I have a 256×256 image as input, for example, how should I preprocess it? Just resize?

@zhengchen1999
Owner

The image size (e.g., 64) in DAT is the input size used during training; that is, we train DAT on 64×64 input patches. This is only for training convenience. DAT supports inputs of any size. For example, under the SR-x2 task, an input of size 640×380 produces an output of size 1280×760.
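To illustrate the point above: a 2x SR network maps any H×W input to a 2H×2W output, so no resizing to 64×64 is needed at inference. This is a minimal numpy sketch in which a nearest-neighbor pixel repeat stands in for the actual network (the real DAT model is a PyTorch module, not this function):

```python
import numpy as np

def upscale_x2(img):
    """Toy stand-in for a 2x SR model: repeat each pixel along both
    spatial axes, doubling height and width."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

# Arbitrary input size, e.g. the 640x380 example from the comment above
lr = np.zeros((380, 640, 3), dtype=np.uint8)  # H x W x C low-res input
sr = upscale_x2(lr)
print(sr.shape)  # (760, 1280, 3)
```

The shape relationship is the only thing being demonstrated here; DAT learns the actual upsampling, but the input/output size contract is the same.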

@styler00dollar

styler00dollar commented Aug 23, 2023

How important is it to set img_size during training? I have been finetuning the official models for the last few weeks without adjusting this value, and the models still seem to work quite well. Does it matter more if a model is trained from scratch?

Also, as a side note, thanks for providing 2x models and models with different inference requirements. DAT is currently my favorite network.

@zhengchen1999
Owner

zhengchen1999 commented Aug 23, 2023

The img_size equals the training patch size, but matching them is not mandatory.
Regarding "Does it matter more if a model gets trained from scratch?": during training, the patch size does affect model performance. With a patch size of 48×48, the model performs worse than with 64×64. The img_size setting itself does not affect performance.

In fact, img_size exists to simplify the calculation of the attention mask (for SW-SA) during training. Since the input size does not change during training, we cache the mask corresponding to img_size to speed up computation. If the input image size differs from img_size, the mask is recalculated (https://github.com/zhengchen1999/DAT/blob/main/basicsr/archs/dat_arch.py#L398).
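The caching pattern described above can be sketched in numpy. Names like `CachedMask` are illustrative, not DAT's actual API (the real implementation is the PyTorch code at the linked line); the mask construction follows the standard shifted-window scheme, marking window-internal pixel pairs that came from different regions of the cyclic shift:

```python
import numpy as np

def compute_shift_mask(h, w, window=8, shift=4):
    """Build the SW-SA attention mask for an h x w feature map.

    Each pixel gets a region id determined by the cyclic-shift boundaries;
    within a window, pairs of pixels from different regions must not
    attend to each other and are marked True (masked).
    """
    region = np.zeros((h, w), dtype=np.int64)
    cnt = 0
    for hs in (slice(0, -window), slice(-window, -shift), slice(-shift, None)):
        for ws in (slice(0, -window), slice(-window, -shift), slice(-shift, None)):
            region[hs, ws] = cnt
            cnt += 1
    # Partition into non-overlapping windows -> (num_windows, window*window)
    wins = region.reshape(h // window, window, w // window, window)
    wins = wins.transpose(0, 2, 1, 3).reshape(-1, window * window)
    # True where two positions in the same window belong to different regions
    return wins[:, None, :] != wins[:, :, None]

class CachedMask:
    """Precompute the mask once for the training img_size; recompute only
    when an input arrives with a different spatial size."""
    def __init__(self, img_size=64, window=8, shift=4):
        self.window, self.shift = window, shift
        self.size = (img_size, img_size)
        self.mask = compute_shift_mask(img_size, img_size, window, shift)

    def get(self, h, w):
        if (h, w) != self.size:  # input size != img_size: recalculate
            self.size = (h, w)
            self.mask = compute_shift_mask(h, w, self.window, self.shift)
        return self.mask
```

During training every batch hits the cached branch, so the mask is built exactly once; at inference on a differently sized image, `get` recomputes it on the fly, which is the behavior the linked code implements.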

@styler00dollar

Thanks for the quick reply.
