I was reading the Accelerate documentation on FSDP (Documentation Page Link) and noticed that in the section that goes over all the keys and their values in the generated config, the explanations of the following two keys may have been mixed up: fsdp_transformer_layer_cls_to_wrap and fsdp_auto_wrap_policy.
Details
fsdp_transformer_layer_cls_to_wrap: Only applicable for Transformers. When using fsdp_auto_wrap_policy=TRANSFORMER_BASED_WRAP, a user may provide a comma-separated string of transformer layer class names (case-sensitive) to wrap, e.g., BertLayer, GPTJBlock, T5Block, BertLayer,BertEmbeddings,BertSelfOutput. This is important because submodules that share weights (e.g., embedding layers) should not end up in different FSDP wrapped units. Using this policy, wrapping happens for each block containing Multi-Head Attention followed by a couple of MLP layers. Remaining layers, including the shared embeddings, are conveniently wrapped in the same outermost FSDP unit. Therefore, use this for transformer-based models. You can use the model._no_split_modules for Transformer models by answering yes to "Do you want to use the model's _no_split_modules to wrap". It will try to use model._no_split_modules when possible.
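For reference, here is a minimal sketch of roughly what TRANSFORMER_BASED_WRAP corresponds to in plain PyTorch FSDP terms (an illustration only, not Accelerate's actual internals), using BertLayer as the example layer class:

```python
# Illustrative sketch only: roughly what TRANSFORMER_BASED_WRAP amounts to in
# plain PyTorch FSDP terms, using BertLayer as the example layer class.
# Assumes torch.distributed is already initialized (e.g. via torchrun).
import functools

from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy
from transformers import AutoModel
from transformers.models.bert.modeling_bert import BertLayer

model = AutoModel.from_pretrained("bert-base-uncased")

# Each BertLayer (multi-head attention + MLP block) becomes its own FSDP unit;
# remaining modules, including shared embeddings, stay in the outermost unit.
auto_wrap_policy = functools.partial(
    transformer_auto_wrap_policy,
    transformer_layer_cls={BertLayer},
)

sharded_model = FSDP(model, auto_wrap_policy=auto_wrap_policy)
```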
I may be wrong, in which case I have a doubt: if we select model._no_split_modules, is mentioning fsdp_transformer_layer_cls_to_wrap in the config still necessary? Since Accelerate is anyway only going to wrap the layers that contain a multi-head attention block followed by an MLP, is specifically giving the class name even required?
Correct, fsdp_transformer_layer_cls_to_wrap is optional if model._no_split_modules exists; we'll figure that out automatically (so, if the model comes from transformers).
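A rough way to check this yourself (a sketch, assuming a standard transformers model such as BERT) is to inspect the attribute directly:

```python
# Rough sketch: if the model already declares _no_split_modules, the layer class
# names need not be repeated via fsdp_transformer_layer_cls_to_wrap.
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

no_split = getattr(model, "_no_split_modules", None)
if no_split:
    # A list of class names that must not be split across FSDP units;
    # Accelerate can derive the wrap classes from this automatically.
    print("Auto-detected classes:", no_split)
else:
    # Model not from transformers (or attribute missing): set
    # fsdp_transformer_layer_cls_to_wrap explicitly in the config.
    print("Specify fsdp_transformer_layer_cls_to_wrap manually.")
```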