Avoid deepspeed plugin converting the whole model #20543
-
I am using the DeepSpeed plugin in Lightning to train my model. I want the first part of my model to be float32 and the second part to be bfloat16 (the optimizer only trains the first part). However, I found that Lightning converts the whole model to float32 if I do not specify the precision. How can I keep my pre-defined model dtypes untouched?
Answered by Boltzmachine · Jan 11, 2025
Replies: 1 comment
-
I found you should implement the dtype conversion in your LightningModule and prevent DeepSpeedStrategy from converting your module:

```python
from lightning.pytorch.plugins import DeepSpeedPrecision
from lightning.pytorch.strategies import DeepSpeedStrategy
from typing_extensions import override


class DeepSpeedPrecisionWithoutModuleConversion(DeepSpeedPrecision):
    @override
    def convert_module(self, module):
        # Skip Lightning's dtype conversion and leave the module untouched
        return module
```

and pass it to the trainer as

```python
trainer = Trainer(
    ...,
    strategy=DeepSpeedStrategy(
        stage=2,
        precision_plugin=DeepSpeedPrecisionWithoutModuleConversion('32-true'),
    ),
)
```
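With the conversion disabled in the precision plugin, the mixed-dtype setup from the question can then be expressed directly in the LightningModule. The following is a minimal sketch under assumptions not stated in the original answer: the two `nn.Linear` blocks and their sizes are placeholders for the real submodules, and the float32/bfloat16 casts happen in `__init__` and `forward`.

```python
import torch
from torch import nn
import lightning.pytorch as pl


class MixedDtypeModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # First part: trained by the optimizer, kept in float32 (default dtype)
        self.trainable_part = nn.Linear(128, 64)
        # Second part: frozen and cast to bfloat16
        self.frozen_part = nn.Linear(64, 10).to(torch.bfloat16)
        for p in self.frozen_part.parameters():
            p.requires_grad = False

    def forward(self, x):
        h = self.trainable_part(x)                    # float32 compute
        out = self.frozen_part(h.to(torch.bfloat16))  # bfloat16 compute
        return out.to(torch.float32)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self(x), y)

    def configure_optimizers(self):
        # Optimizer only sees the float32 parameters
        return torch.optim.AdamW(self.trainable_part.parameters(), lr=1e-4)
```

Because `convert_module` now returns the module unchanged, these dtypes survive trainer setup instead of being overridden by the precision setting.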
0 replies
Answer selected by Boltzmachine