Add future reservation support #3227

Merged
merged 1 commit into from
Dec 7, 2024
@@ -179,6 +179,7 @@ No modules.
| <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration. Note: the instance image must support option. | `bool` | `false` | no |
| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | Enables Simultaneous Multi-Threading (SMT) on instance. | `bool` | `false` | no |
| <a name="input_enable_spot_vm"></a> [enable\_spot\_vm](#input\_enable\_spot\_vm) | Enable the partition to use spot VMs (https://cloud.google.com/spot-vms). | `bool` | `false` | no |
| <a name="input_future_reservation"></a> [future\_reservation](#input\_future\_reservation) | If set, will make use of the future reservation for the nodeset. Input can be either the future reservation name or its selfLink in the format 'projects/PROJECT\_ID/zones/ZONE/futureReservations/FUTURE\_RESERVATION\_NAME'.<br/>See https://cloud.google.com/compute/docs/instances/future-reservations-overview | `string` | `""` | no |
| <a name="input_guest_accelerator"></a> [guest\_accelerator](#input\_guest\_accelerator) | List of the type and count of accelerator cards attached to the instance. | <pre>list(object({<br/> type = string,<br/> count = number<br/> }))</pre> | `[]` | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | Defines the image that will be used in the Slurm node group VM instances.<br/><br/>Expected Fields:<br/>name: The name of the image. Mutually exclusive with family.<br/>family: The image family to use. Mutually exclusive with name.<br/>project: The project where the image is hosted.<br/><br/>For more information on creating custom images that comply with Slurm on GCP<br/>see the "Slurm on GCP Custom Images" section in docs/vm-images.md. | `map(string)` | <pre>{<br/> "family": "slurm-gcp-6-8-hpc-rocky-linux-8",<br/> "project": "schedmd-slurm-public"<br/>}</pre> | no |
| <a name="input_instance_image_custom"></a> [instance\_image\_custom](#input\_instance\_image\_custom) | A flag that designates that the user is aware that they are requesting<br/>to use a custom and potentially incompatible image for this Slurm on<br/>GCP module.<br/><br/>If the field is set to false, only the compatible families and project<br/>names will be accepted. The deployment will fail with any other image<br/>family or name. If set to true, no checks will be done.<br/><br/>See: https://goo.gle/hpc-slurm-images | `bool` | `false` | no |
12 changes: 12 additions & 0 deletions community/modules/compute/schedmd-slurm-gcp-v6-nodeset/main.tf
@@ -95,6 +95,7 @@ locals {
spot = var.enable_spot_vm
termination_action = try(var.spot_instance_config.termination_action, null)
reservation_name = local.reservation_name
future_reservation = local.future_reservation
maintenance_interval = var.maintenance_interval
instance_properties_json = jsonencode(var.instance_properties)

@@ -141,6 +142,17 @@ locals {
reservation_name = local.res_match.whole == null ? "" : "${local.res_prefix}${local.res_short_name}${local.res_suffix}"
}

locals {
fr_match = regex("^(?P<whole>projects/(?P<project>[a-z0-9-]+)/zones/(?P<zone>[a-z0-9-]+)/futureReservations/)?(?P<name>[a-z0-9-]+)?$", var.future_reservation)

fr_name = local.fr_match.name
fr_project = coalesce(local.fr_match.project, var.project_id)
fr_zone = coalesce(local.fr_match.zone, var.zone)

future_reservation = var.future_reservation == "" ? "" : "projects/${local.fr_project}/zones/${local.fr_zone}/futureReservations/${local.fr_name}"
}
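
To see how the `fr_match` logic above treats the two accepted input forms, the same pattern can be evaluated in isolation. This is a minimal sketch, not part of the PR: `default_project`, `default_zone`, and the `examples` map are hypothetical stand-ins for `var.project_id`, `var.zone`, and user input. It relies on `regex` returning `null` for named groups that did not participate in the match, which is what the `coalesce` calls in the module appear to depend on.

```hcl
locals {
  # Hypothetical stand-ins for var.project_id and var.zone.
  default_project = "my-project"
  default_zone    = "us-central1-a"

  # Pattern copied from the `fr_match` local in main.tf.
  fr_pattern = "^(?P<whole>projects/(?P<project>[a-z0-9-]+)/zones/(?P<zone>[a-z0-9-]+)/futureReservations/)?(?P<name>[a-z0-9-]+)?$"

  # Hypothetical sample inputs, for illustration only.
  examples = {
    short_name = "my-fr"
    self_link  = "projects/other-project/zones/us-central1-a/futureReservations/my-fr"
  }

  # Fall back to the defaults when the project or zone group is null.
  resolved = {
    for label, input in local.examples : label => format(
      "projects/%s/zones/%s/futureReservations/%s",
      coalesce(regex(local.fr_pattern, input).project, local.default_project),
      coalesce(regex(local.fr_pattern, input).zone, local.default_zone),
      regex(local.fr_pattern, input).name
    )
  }
}

output "resolved" {
  value = local.resolved
}
```

Under these assumptions, `short_name` should expand to a path under `my-project` and `us-central1-a`, while `self_link` should come back unchanged because its own project and zone take precedence.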


# tflint-ignore: terraform_unused_declarations
data "google_compute_reservation" "reservation" {
count = length(local.reservation_name) > 0 ? 1 : 0
24 changes: 24 additions & 0 deletions community/modules/compute/schedmd-slurm-gcp-v6-nodeset/outputs.tf
@@ -54,4 +54,28 @@ output "nodeset" {
condition = !var.enable_placement || !var.dws_flex.enabled
error_message = "Cannot use DWS Flex with `enable_placement`."
}

precondition {
condition = var.reservation_name == "" || var.future_reservation == ""
error_message = "Cannot use reservations and future reservations in the same nodeset."
}

precondition {
condition = !var.enable_placement || var.future_reservation == ""
error_message = "Cannot use `enable_placement` with future reservations."
}

precondition {
condition = var.future_reservation == "" || length(var.zones) == 0
error_message = <<-EOD
If a future reservation is specified, `var.zones` should be empty.
EOD
}

precondition {
condition = var.future_reservation == "" || local.fr_zone == var.zone
error_message = <<-EOD
The zone of the deployment must match that of the future reservation.
EOD
}
}
@@ -463,6 +463,21 @@ variable "reservation_name" {
}
}

variable "future_reservation" {
description = <<-EOD
If set, will make use of the future reservation for the nodeset. Input can be either the future reservation name or its selfLink in the format 'projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME'.
See https://cloud.google.com/compute/docs/instances/future-reservations-overview
EOD
type = string
default = ""
nullable = false

validation {
condition = length(regexall("^(projects/([a-z0-9-]+)/zones/([a-z0-9-]+)/futureReservations/([a-z0-9-]+))?$", var.future_reservation)) > 0 || length(regexall("^([a-z0-9-]+)$", var.future_reservation)) > 0
error_message = "Future reservation must be either the future reservation name or its selfLink in the format 'projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME'."
}
}
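
For a quick check of which inputs the validation above accepts, the condition can be evaluated on its own against a few sample strings. This is a minimal sketch: the `samples` map and its values are hypothetical, and the two patterns are copied from the validation block.

```hcl
locals {
  # Patterns copied from the `future_reservation` validation block.
  self_link_pattern = "^(projects/([a-z0-9-]+)/zones/([a-z0-9-]+)/futureReservations/([a-z0-9-]+))?$"
  name_pattern      = "^([a-z0-9-]+)$"

  # Hypothetical sample inputs, for illustration only.
  samples = {
    unset         = ""
    short_name    = "my-fr"
    self_link     = "projects/my-project/zones/us-central1-a/futureReservations/my-fr"
    singular_zone = "projects/my-project/zone/us-central1-a/futureReservations/my-fr"
    upper_case    = "My_FR"
  }

  # Same boolean expression as the validation condition, applied per sample.
  accepted = {
    for label, input in local.samples : label => (
      length(regexall(local.self_link_pattern, input)) > 0 ||
      length(regexall(local.name_pattern, input)) > 0
    )
  }
}

output "accepted" {
  value = local.accepted
}
```

Under these assumptions, `unset`, `short_name`, and `self_link` should evaluate to `true`, while `singular_zone` (which uses `zone/` instead of `zones/`) and `upper_case` should evaluate to `false`.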

variable "maintenance_interval" {
description = <<-EOD
Sets the maintenance interval for instances in this nodeset.
@@ -336,7 +336,7 @@ limitations under the License.
| <a name="input_metadata"></a> [metadata](#input\_metadata) | Metadata, provided as a map. | `map(string)` | `{}` | no |
| <a name="input_min_cpu_platform"></a> [min\_cpu\_platform](#input\_min\_cpu\_platform) | Specifies a minimum CPU platform. Applicable values are the friendly names of<br/>CPU platforms, such as Intel Haswell or Intel Skylake. See the complete list:<br/>https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform | `string` | `null` | no |
| <a name="input_network_storage"></a> [network\_storage](#input\_network\_storage) | An array of network attached storage mounts to be configured on all instances. | <pre>list(object({<br/> server_ip = string,<br/> remote_mount = string,<br/> local_mount = string,<br/> fs_type = string,<br/> mount_options = string,<br/> client_install_runner = optional(map(string))<br/> mount_runner = optional(map(string))<br/> }))</pre> | `[]` | no |
| <a name="input_nodeset"></a> [nodeset](#input\_nodeset) | Define nodesets, as a list. | <pre>list(object({<br/> node_count_static = optional(number, 0)<br/> node_count_dynamic_max = optional(number, 1)<br/> node_conf = optional(map(string), {})<br/> nodeset_name = string<br/> additional_disks = optional(list(object({<br/> disk_name = optional(string)<br/> device_name = optional(string)<br/> disk_size_gb = optional(number)<br/> disk_type = optional(string)<br/> disk_labels = optional(map(string), {})<br/> auto_delete = optional(bool, true)<br/> boot = optional(bool, false)<br/> })), [])<br/> bandwidth_tier = optional(string, "platform_default")<br/> can_ip_forward = optional(bool, false)<br/> disable_smt = optional(bool, false)<br/> disk_auto_delete = optional(bool, true)<br/> disk_labels = optional(map(string), {})<br/> disk_size_gb = optional(number)<br/> disk_type = optional(string)<br/> enable_confidential_vm = optional(bool, false)<br/> enable_placement = optional(bool, false)<br/> enable_oslogin = optional(bool, true)<br/> enable_shielded_vm = optional(bool, false)<br/> enable_maintenance_reservation = optional(bool, false)<br/> enable_opportunistic_maintenance = optional(bool, false)<br/> gpu = optional(object({<br/> count = number<br/> type = string<br/> }))<br/> dws_flex = object({<br/> enabled = bool<br/> max_run_duration = number<br/> use_job_duration = bool<br/> })<br/> labels = optional(map(string), {})<br/> machine_type = optional(string)<br/> maintenance_interval = optional(string)<br/> instance_properties_json = string<br/> metadata = optional(map(string), {})<br/> min_cpu_platform = optional(string)<br/> network_tier = optional(string, "STANDARD")<br/> network_storage = optional(list(object({<br/> server_ip = string<br/> remote_mount = string<br/> local_mount = string<br/> fs_type = string<br/> mount_options = string<br/> client_install_runner = optional(map(string))<br/> mount_runner = optional(map(string))<br/> })), [])<br/> on_host_maintenance 
= optional(string)<br/> preemptible = optional(bool, false)<br/> region = optional(string)<br/> service_account = optional(object({<br/> email = optional(string)<br/> scopes = optional(list(string), ["https://www.googleapis.com/auth/cloud-platform"])<br/> }))<br/> shielded_instance_config = optional(object({<br/> enable_integrity_monitoring = optional(bool, true)<br/> enable_secure_boot = optional(bool, true)<br/> enable_vtpm = optional(bool, true)<br/> }))<br/> source_image_family = optional(string)<br/> source_image_project = optional(string)<br/> source_image = optional(string)<br/> subnetwork_self_link = string<br/> additional_networks = optional(list(object({<br/> network = string<br/> subnetwork = string<br/> subnetwork_project = string<br/> network_ip = string<br/> nic_type = string<br/> stack_type = string<br/> queue_count = number<br/> access_config = list(object({<br/> nat_ip = string<br/> network_tier = string<br/> }))<br/> ipv6_access_config = list(object({<br/> network_tier = string<br/> }))<br/> alias_ip_range = list(object({<br/> ip_cidr_range = string<br/> subnetwork_range_name = string<br/> }))<br/> })))<br/> access_config = optional(list(object({<br/> nat_ip = string<br/> network_tier = string<br/> })))<br/> spot = optional(bool, false)<br/> tags = optional(list(string), [])<br/> termination_action = optional(string)<br/> reservation_name = optional(string)<br/> startup_script = optional(list(object({<br/> filename = string<br/> content = string })), [])<br/><br/> zone_target_shape = string<br/> zone_policy_allow = set(string)<br/> zone_policy_deny = set(string)<br/> }))</pre> | `[]` | no |
| <a name="input_nodeset"></a> [nodeset](#input\_nodeset) | Define nodesets, as a list. | <pre>list(object({<br/> node_count_static = optional(number, 0)<br/> node_count_dynamic_max = optional(number, 1)<br/> node_conf = optional(map(string), {})<br/> nodeset_name = string<br/> additional_disks = optional(list(object({<br/> disk_name = optional(string)<br/> device_name = optional(string)<br/> disk_size_gb = optional(number)<br/> disk_type = optional(string)<br/> disk_labels = optional(map(string), {})<br/> auto_delete = optional(bool, true)<br/> boot = optional(bool, false)<br/> })), [])<br/> bandwidth_tier = optional(string, "platform_default")<br/> can_ip_forward = optional(bool, false)<br/> disable_smt = optional(bool, false)<br/> disk_auto_delete = optional(bool, true)<br/> disk_labels = optional(map(string), {})<br/> disk_size_gb = optional(number)<br/> disk_type = optional(string)<br/> enable_confidential_vm = optional(bool, false)<br/> enable_placement = optional(bool, false)<br/> enable_oslogin = optional(bool, true)<br/> enable_shielded_vm = optional(bool, false)<br/> enable_maintenance_reservation = optional(bool, false)<br/> enable_opportunistic_maintenance = optional(bool, false)<br/> gpu = optional(object({<br/> count = number<br/> type = string<br/> }))<br/> dws_flex = object({<br/> enabled = bool<br/> max_run_duration = number<br/> use_job_duration = bool<br/> })<br/> labels = optional(map(string), {})<br/> machine_type = optional(string)<br/> maintenance_interval = optional(string)<br/> instance_properties_json = string<br/> metadata = optional(map(string), {})<br/> min_cpu_platform = optional(string)<br/> network_tier = optional(string, "STANDARD")<br/> network_storage = optional(list(object({<br/> server_ip = string<br/> remote_mount = string<br/> local_mount = string<br/> fs_type = string<br/> mount_options = string<br/> client_install_runner = optional(map(string))<br/> mount_runner = optional(map(string))<br/> })), [])<br/> on_host_maintenance 
= optional(string)<br/> preemptible = optional(bool, false)<br/> region = optional(string)<br/> service_account = optional(object({<br/> email = optional(string)<br/> scopes = optional(list(string), ["https://www.googleapis.com/auth/cloud-platform"])<br/> }))<br/> shielded_instance_config = optional(object({<br/> enable_integrity_monitoring = optional(bool, true)<br/> enable_secure_boot = optional(bool, true)<br/> enable_vtpm = optional(bool, true)<br/> }))<br/> source_image_family = optional(string)<br/> source_image_project = optional(string)<br/> source_image = optional(string)<br/> subnetwork_self_link = string<br/> additional_networks = optional(list(object({<br/> network = string<br/> subnetwork = string<br/> subnetwork_project = string<br/> network_ip = string<br/> nic_type = string<br/> stack_type = string<br/> queue_count = number<br/> access_config = list(object({<br/> nat_ip = string<br/> network_tier = string<br/> }))<br/> ipv6_access_config = list(object({<br/> network_tier = string<br/> }))<br/> alias_ip_range = list(object({<br/> ip_cidr_range = string<br/> subnetwork_range_name = string<br/> }))<br/> })))<br/> access_config = optional(list(object({<br/> nat_ip = string<br/> network_tier = string<br/> })))<br/> spot = optional(bool, false)<br/> tags = optional(list(string), [])<br/> termination_action = optional(string)<br/> reservation_name = optional(string)<br/> future_reservation = string<br/> startup_script = optional(list(object({<br/> filename = string<br/> content = string })), [])<br/><br/> zone_target_shape = string<br/> zone_policy_allow = set(string)<br/> zone_policy_deny = set(string)<br/> }))</pre> | `[]` | no |
| <a name="input_nodeset_dyn"></a> [nodeset\_dyn](#input\_nodeset\_dyn) | Defines dynamic nodesets, as a list. | <pre>list(object({<br/> nodeset_name = string<br/> nodeset_feature = string<br/> }))</pre> | `[]` | no |
| <a name="input_nodeset_tpu"></a> [nodeset\_tpu](#input\_nodeset\_tpu) | Define TPU nodesets, as a list. | <pre>list(object({<br/> node_count_static = optional(number, 0)<br/> node_count_dynamic_max = optional(number, 5)<br/> nodeset_name = string<br/> enable_public_ip = optional(bool, false)<br/> node_type = string<br/> accelerator_config = optional(object({<br/> topology = string<br/> version = string<br/> }), {<br/> topology = ""<br/> version = ""<br/> })<br/> tf_version = string<br/> preemptible = optional(bool, false)<br/> preserve_tpu = optional(bool, false)<br/> zone = string<br/> data_disks = optional(list(string), [])<br/> docker_image = optional(string, "")<br/> network_storage = optional(list(object({<br/> server_ip = string<br/> remote_mount = string<br/> local_mount = string<br/> fs_type = string<br/> mount_options = string<br/> client_install_runner = optional(map(string))<br/> mount_runner = optional(map(string))<br/> })), [])<br/> subnetwork = string<br/> service_account = optional(object({<br/> email = optional(string)<br/> scopes = optional(list(string), ["https://www.googleapis.com/auth/cloud-platform"])<br/> }))<br/> project_id = string<br/> reserved = optional(string, false)<br/> }))</pre> | `[]` | no |
| <a name="input_on_host_maintenance"></a> [on\_host\_maintenance](#input\_on\_host\_maintenance) | Instance availability Policy. | `string` | `"MIGRATE"` | no |