How to load a pretrained model? #25

kamadforge · 2020-11-27T18:24:41Z

Although trainer.py has a flag for a pretrained model, the flag does not seem to be used to load the model, so training proceeds from scratch.
Note: I ended up loading it using the load checkpoint function.

akamaster · 2021-07-18T04:21:52Z

True, this needs to be corrected. Are you willing to make a contribution?

YihaoChan · 2022-01-22T05:53:19Z

Hi, I solved this problem. Hope this helps you.

import torch
from resnet import *
import dill  # needed to save the lambda-based layer, which the standard pickle cannot serialize

# your devices
device_ids = [0, 1]

# the network architecture corresponding to the checkpoint
model = resnet20()

# remember to set map_location
check_point = torch.load('resnet20-12fca82f.th', map_location='cuda:%d' % device_ids[0])

# the checkpoint was saved from a DataParallel model, so wrap the model before loading the weights
model = torch.nn.DataParallel(model, device_ids=device_ids)
model.load_state_dict(check_point['state_dict'])

# save model.module, not model! otherwise the reloaded object is still wrapped in DataParallel
# (its layers sit under .module), which leads to trouble later
torch.save(model.module, 'resnet20_check_point.pth', pickle_module=dill)

# load the converted pretrained model
net = torch.load('resnet20_check_point.pth', map_location='cuda:%d' % device_ids[0])
x = torch.rand(size=(1, 3, 32, 32)).cuda(device_ids[0])
out = net(x)
print(out)
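
An alternative to wrapping the model in DataParallel just to load the weights is to strip the `module.` prefix that DataParallel adds to every key of the state dict. The following is only a minimal sketch: the helper name `strip_module_prefix` is hypothetical, and the example keys are stand-ins for the real entries of check_point['state_dict'].

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with the leading DataParallel prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Keys without the prefix are copied unchanged.
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Stand-in for check_point['state_dict']; real values would be tensors.
example = OrderedDict([("module.conv1.weight", 1), ("module.linear.bias", 2)])
cleaned_example = strip_module_prefix(example)
print(list(cleaned_example))
```

With a real checkpoint, the cleaned dict could be passed to load_state_dict on a plain resnet20(), for example model.load_state_dict(strip_module_prefix(check_point['state_dict'])), which should also work on a single GPU or on CPU.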

YihaoChan · 2022-01-22T05:54:58Z

> It seems that although there is a flag for pre-trained model in the trainer.py, it is not used to load the model and the training proceeds from scratch. Note: I ended up loading it using the load checkpoint function.

A year has passed... I hope you have solved this problem. If you haven't, give my solution a try. Good luck!

zhmzm · 2022-10-17T01:15:23Z

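# model is assumed to be wrapped in torch.nn.DataParallel already,
# because the checkpoint was saved from a DataParallel model
model = model.cuda()
param = torch.load("pretrained_models/resnet56-4bfd9763.th")
model.load_state_dict(param['state_dict'])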

Thanks, this also works.
