feat: add custom build of infiniband with ubuntu 24.04
Because Microsoft is only providing an image from
2022... :/

Signed-off-by: vsoch <[email protected]>
vsoch committed Jan 9, 2025
1 parent 5d02845 commit a728ce4
Showing 5 changed files with 473 additions and 0 deletions.
4 changes: 4 additions & 0 deletions azure/README.md
@@ -443,6 +443,10 @@ Now let's run lammps!
# This should work (one node with ib and shared memory)
flux run -o cpu-affinity=per-task -N1 -n 96 --env UCX_TLS=ib,sm --env UCX_NET_DEVICES=mlx5_ib0:1 lmp -v x 1 -v y 1 -v z 1 -in in.reaxff.hns -nocite

/opt/hpcx-v2.19-gcc-mlnx_ofed-ubuntu22.04-cuda12-x86_64/hpcx-rebuild/lib:/opt/hpcx-v2.19-gcc-mlnx_ofed-ubuntu22.04-cuda12-x86_64/hcoll/lib
flux run -o cpu-affinity=per-task -N2 -n 192 --env OMPI_MPI_mca_coll_hcoll_enable=0 --env OMPI_MPI_mca_coll_ucc_enable=0 --env UCX_TLS=ib --env UCX_NET_DEVICES=mlx5_ib0:1 lmp -v x 1 -v y 1 -v z 1 -in in.reaxff.hns -nocite


# -x UCC_LOG_LEVEL=debug -x UCC_TLS=ucp
flux run -o cpu-affinity=per-task -N2 -n 192 --env UCC_LOG_LEVEL=info --env UCC_TLS=ucp --env UCC_CONFIG_FILE= -OMPI_MPI_mca_coll_ucc_enable=0 --env UCX_TLS=dc_x --env UCX_NET_DEVICES=mlx5_ib0:1 lmp -v x 1 -v y 1 -v z 1 -in in.reaxff.hns -nocite
```
18 changes: 18 additions & 0 deletions azure/build-ubuntu-24.04/Makefile
@@ -0,0 +1,18 @@
.PHONY: all
all: init fmt validate build

.PHONY: init
init:
	packer init .

.PHONY: fmt
fmt:
	packer fmt .

.PHONY: validate
validate:
	packer validate .

.PHONY: build
build:
	packer build flux-usernetes.pkr.hcl
38 changes: 38 additions & 0 deletions azure/build-ubuntu-24.04/README.md
@@ -0,0 +1,38 @@
# Build Packer Images

Note that I needed to do this build from a cloud shell. First clone the repository and change into the build directory:

```bash
git clone https://github.com/converged-computing/flux-usernetes
cd flux-usernetes/azure/build-ubuntu-24.04
```

Then install packer:

```bash
wget https://releases.hashicorp.com/packer/1.11.2/packer_1.11.2_linux_amd64.zip
unzip packer_1.11.2_linux_amd64.zip
mkdir -p ./bin
mv ./packer ./bin/
export PATH=$(pwd)/bin:$PATH
```
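
Before moving on, it can help to confirm that the shell actually resolves `packer` from the new `./bin` directory. The following is a minimal sketch; `require_cmd` is a hypothetical helper, not something this repository provides.

```bash
# Hypothetical helper (an assumption for illustration, not part of this repo):
# succeed only if the named command can be found on PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "not found on PATH: $1" >&2
    return 1
  fi
}
```

Calling `require_cmd packer` after the `export PATH` line should return silently if the install worked, and print a message to stderr otherwise.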

Get your Azure account information as follows:

```bash
az account show
```

And export variables in the following format. Note that the resource group needs to actually exist - I created mine in the console UI.

```bash
export AZURE_SUBSCRIPTION_ID=xxxxxxxxx
export AZURE_TENANT_ID=xxxxxxxxxxx
export AZURE_RESOURCE_GROUP_NAME=packer-testing
```
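
Since the build depends on all three variables being set, a small pre-flight check can catch a missing export before `make` starts. This is only a sketch: `check_packer_env` is a hypothetical function name, and the check only tests that each variable is non-empty, not that the subscription, tenant, or resource group really exists.

```bash
# Hypothetical pre-flight check (not part of this repo): report every
# required variable that is unset or empty, and return non-zero if any is.
check_packer_env() {
  missing=0
  for name in AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_RESOURCE_GROUP_NAME; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

One way to use it is `check_packer_env && make`, so the build only starts when all three exports are in place.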

Then build!

```bash
make
```