build: add flux usernetes build
Signed-off-by: vsoch <[email protected]>
vsoch committed Jan 8, 2025
1 parent 831034d commit f2159ae
Showing 5 changed files with 508 additions and 1 deletion.
13 changes: 12 additions & 1 deletion azure/README.md
@@ -4,7 +4,7 @@

### 1. Build Images

You can find the instructions for building your bases [here](https://github.com/converged-computing/flux-tutorials/tree/main/tutorial/azure/build) in the flux-tutorials repository. You'll need to create a resource group with your packer image (e.g., packer-testing) and an image (e.g., flux-framework) before continuing here.
You'll need to do the [build](build) first. Before building, create a resource group to hold your packer image (e.g., packer-testing); you will need both that resource group and an image (e.g., flux-usernetes) before continuing here.

### 2. Deploy Terraform

@@ -84,6 +84,7 @@ pssh -h hosts.txt -x "-i ./id_azure" "/bin/bash /tmp/update_brokers.sh flux $lea

Note that if it fails, you need to wait a bit - I usually step away for a second or two to give the VM time to finish setting up.


### 4. Install LAMMPS and OSU

Before we shell in, let's install lammps and the osu benchmarks on "bare metal":
@@ -303,11 +304,21 @@ They are exactly the same, and `UCX_TLS` doesn't seem to matter, but likely you

This is a work in progress - I'm still manually testing with [these scripts](https://github.com/converged-computing/flux-tutorials/tree/add-azure-base/tutorial/azure/install) but it isn't working yet. The container cannot ping hosts outside it, and I don't see vxlan as a loaded module.

```console
script=usernetes
# Copy the install script to every instance in the scale set
for address in $(az vmss list-instance-public-ips -g terraform-testing -n flux | jq -r '.[].ipAddress')
do
    echo "Installing ${script} to $address"
    scp -i ./id_azure ./install/install_${script}.sh azureuser@${address}:/tmp/install_${script}.sh
done
# Run the script on all hosts in parallel (the large -t value keeps pssh from timing out)
pssh -h hosts.txt -t 100000000 -x "-i ./id_azure" "/bin/bash /tmp/install_${script}.sh"
```
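
Since the note above mentions that vxlan does not show up as a loaded module, a quick check on a node might look like the sketch below. The helper name `module_loaded` is hypothetical (it is not part of the install scripts), and it assumes a Linux host where `/proc/modules` lists loaded modules.

```bash
# Hypothetical helper: succeed if the named kernel module appears in a
# /proc/modules-style listing (second argument, defaults to /proc/modules).
module_loaded() {
    local name="$1" listing="${2:-/proc/modules}"
    # The module name is the first whitespace-separated field on each line
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

if module_loaded vxlan; then
    echo "vxlan is loaded"
else
    echo "vxlan is not loaded; 'sudo modprobe vxlan' may load it if the kernel ships the module"
fi
```

This only reports module state; it does not explain why the container cannot reach hosts outside it.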

#### TBA Install Infiniband

At this point we need to expose infiniband on the host to the pods. This took a few steps,
and what I learned (along with the instructions) is in [the repository here](https://github.com/converged-computing/aks-infiniband-install).

### 9. Cleanup

When you are done:
18 changes: 18 additions & 0 deletions azure/build/Makefile
@@ -0,0 +1,18 @@
.PHONY: all
all: init fmt validate build

.PHONY: init
init:
	packer init .

.PHONY: fmt
fmt:
	packer fmt .

.PHONY: validate
validate:
	packer validate .

.PHONY: build
build:
	packer build flux-usernetes.pkr.hcl
38 changes: 38 additions & 0 deletions azure/build/README.md
@@ -0,0 +1,38 @@
# Build Packer Images

Note that I needed to do this build from a cloud shell, so clone the repository and change into the build directory:

```bash
git clone https://github.com/converged-computing/flux-tutorials
cd flux-tutorials/tutorial/azure/build
```

And install packer:

```bash
wget https://releases.hashicorp.com/packer/1.11.2/packer_1.11.2_linux_amd64.zip
unzip packer_1.11.2_linux_amd64.zip
mkdir -p ./bin
mv ./packer ./bin/
export PATH=$(pwd)/bin:$PATH
```

Get your account information for azure as follows:

```bash
az account show
```

And export variables in the following format. Note that the resource group needs to actually exist - I created mine in the console UI.

```bash
export AZURE_SUBSCRIPTION_ID=xxxxxxxxx
export AZURE_TENANT_ID=xxxxxxxxxxx
export AZURE_RESOURCE_GROUP_NAME=packer-testing
```
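
As an alternative to pasting the values by hand, the two IDs can probably be read straight from `az account show`. This is a minimal sketch, assuming the az CLI is installed and already logged in; the function name `load_azure_ids` is made up for illustration, and the commented `az group create` line (with a placeholder region) is only one possible way to create the resource group from the command line rather than the console UI.

```bash
# Hypothetical helper: fill the two ID variables from the active az login.
load_azure_ids() {
    # --query takes a JMESPath expression; -o tsv prints the bare value
    AZURE_SUBSCRIPTION_ID="$(az account show --query id -o tsv)" || return 1
    AZURE_TENANT_ID="$(az account show --query tenantId -o tsv)" || return 1
    export AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID
}

if ! command -v az >/dev/null 2>&1; then
    echo "az CLI not found; export the variables by hand instead" >&2
elif load_azure_ids; then
    echo "Using subscription ${AZURE_SUBSCRIPTION_ID}"
else
    echo "Could not read account details; is 'az login' done?" >&2
fi

# Assumption: pick your own region if you create the group from the CLI.
# az group create --name packer-testing --location <region>
```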

Then build!

```bash
make
```
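
Because the build relies on the three exported variables, a small pre-flight check before running `make` can catch an empty one early. The sketch below is an illustration only: `require_env` is a hypothetical helper, not something the Makefile or packer provides.

```bash
# Hypothetical pre-flight check: report any unset or empty variable by name.
require_env() {
    local missing=0 var
    for var in "$@"; do
        # ${!var} is bash indirect expansion: the value of the variable named by $var
        if [ -z "${!var:-}" ]; then
            echo "Missing required variable: ${var}" >&2
            missing=1
        fi
    done
    return "$missing"
}

if require_env AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_RESOURCE_GROUP_NAME; then
    echo "Environment looks complete; run make"
else
    echo "Export the variables listed above before running make" >&2
fi
```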