Pytorch parallelization (#5915)
parkersarahl authored Jan 29, 2025
1 parent 3675eee commit a435fc6
---
Title: 'Parallelizing Models'
Description: 'Parallelizing Models in PyTorch allows the training of deep learning models that require more memory than a single GPU can provide.'
Subjects:
- 'Computer Science'
- 'Data Science'
- 'Machine Learning'
Tags:
- 'Algorithms'
- 'Machine Learning'
- 'PyTorch'
CatalogContent:
- 'intro-to-py-torch-and-neural-networks'
- 'paths/build-a-machine-learning-model'
---

**Model parallelization** in PyTorch allows the training of deep learning models that require more memory than a single GPU can provide. The model is divided into different parts (e.g., layers or modules), with each part assigned to a separate GPU. These GPUs perform computations simultaneously, speeding up the processing of large models. They communicate with each other and share data to ensure that the output from one GPU can be used by another when necessary.
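Since splitting a model across devices requires at least two GPUs, a quick check of what PyTorch can see is a useful first step. A minimal sketch (the printed values will vary by machine):

```py
import torch

# Report whether CUDA is available and how many GPUs PyTorch can see;
# splitting a model across devices needs a device count of at least 2
print(torch.cuda.is_available())
print(torch.cuda.device_count())
```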

## Syntax

To utilize model parallelization, the model should be wrapped in a class using the following syntax:

```pseudo
class ModelParallel(nn.Module):
  # Model definition goes here
```

## Example

The following code demonstrates how to assign different layers of a neural network to different GPUs in PyTorch:

```py
import torch
import torch.nn as nn

# Define a model split across two GPUs
class ModelParallel(nn.Module):
  def __init__(self):
    super(ModelParallel, self).__init__()
    self.layer1 = nn.Linear(1000, 500).to('cuda:0')  # First GPU
    self.layer2 = nn.Linear(500, 100).to('cuda:1')  # Second GPU

  def forward(self, x):
    x = x.to('cuda:0')  # Move input to the first GPU
    x = self.layer1(x)
    x = x.to('cuda:1')  # Move the first layer's output to the second GPU
    x = self.layer2(x)
    return x

model = ModelParallel()
x = torch.randn(64, 1000)
output = model(x)
```

Running this code produces a tensor of shape `(64, 100)`. The exact values depend on the random initialization of the model weights and the input data, but the output will look similar to the following:

```shell
tensor([[ 0.1324, -0.2847, ..., 0.5921], # First sample in the batch
[-0.0412, 0.4891, ..., -0.2345], # Second sample in the batch
...
[ 0.2347, -0.1011, ..., 0.4567]]) # 64 rows, each with 100 values
```
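Training a model split this way works much like single-device training, with one extra detail: the targets must live on the same device as the final output so the loss can be computed. The sketch below illustrates this under stated assumptions; the CPU fallback, `MSELoss` criterion, and `SGD` optimizer are illustrative choices, not part of the example above:

```py
import torch
import torch.nn as nn

# Assumption for illustration: fall back to CPU when two GPUs are not available,
# so the sketch still runs on a single-device machine
two_gpus = torch.cuda.device_count() >= 2
dev0 = torch.device('cuda:0' if two_gpus else 'cpu')
dev1 = torch.device('cuda:1' if two_gpus else 'cpu')

class ModelParallel(nn.Module):
  def __init__(self):
    super(ModelParallel, self).__init__()
    self.layer1 = nn.Linear(1000, 500).to(dev0)
    self.layer2 = nn.Linear(500, 100).to(dev1)

  def forward(self, x):
    x = self.layer1(x.to(dev0))  # Run the first stage on the first device
    return self.layer2(x.to(dev1))  # Hand off to the second device

model = ModelParallel()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(64, 1000)
target = torch.randn(64, 100).to(dev1)  # Target must be on the same device as the output

optimizer.zero_grad()
loss = criterion(model(x), target)
loss.backward()  # Autograd propagates gradients across devices automatically
optimizer.step()
```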
