Updated Terraform configuration to initialize empty input and output directories #4843

Merged
2 changes: 1 addition & 1 deletion deploy/terraform-custom-datacommons/modules/locals.tf
@@ -56,7 +56,7 @@ locals {
},
{
name = "OUTPUT_DIR"
- value = "gs://${local.dc_gcs_data_bucket_path}/output"
+ value = "gs://${local.dc_gcs_data_bucket_path}/${var.dc_gcs_data_bucket_output_folder}"
},
{
name = "FORCE_RESTART"
16 changes: 15 additions & 1 deletion deploy/terraform-custom-datacommons/modules/main.tf
@@ -113,6 +113,20 @@ resource "google_storage_bucket" "dc_gcs_data_bucket" {
uniform_bucket_level_access = true
}

+ # Input 'folder' for the data loading job. Initialized as an empty blob
+ resource "google_storage_bucket_object" "dc_gcs_data_bucket_input_folder" {
+   name = "${var.dc_gcs_data_bucket_input_folder}/"
+   content = ""
+   bucket = "${google_storage_bucket.dc_gcs_data_bucket.name}"
+ }
+
+ # Output 'folder' for the data loading job. Initialized as an empty blob
+ resource "google_storage_bucket_object" "dc_gcs_data_bucket_output_folder" {
+   name = "${var.dc_gcs_data_bucket_output_folder}/"
+   content = ""
+   bucket = "${google_storage_bucket.dc_gcs_data_bucket.name}"
+ }
+
# Generate a random suffix to append to api keys.
# A deleted API key fully expires 30 days after deletion, and in the 30-day
# window the ID remains taken. This suffix allows terraform to give API
@@ -360,7 +374,7 @@ resource "google_cloud_run_v2_job" "dc_data_job" {

env {
name = "INPUT_DIR"
- value = "gs://${local.dc_gcs_data_bucket_path}/input"
+ value = "gs://${local.dc_gcs_data_bucket_path}/${var.dc_gcs_data_bucket_input_folder}"
Contributor

Do we also need one for OUTPUT_DIR, since we're giving them that variable too?

Contributor Author

Already added that! OUTPUT_DIR in locals.tf now uses dc_gcs_data_bucket_output_folder.

}
}
execution_environment = "EXECUTION_ENVIRONMENT_GEN2"
12 changes: 12 additions & 0 deletions deploy/terraform-custom-datacommons/modules/variables.tf
@@ -73,6 +73,18 @@ variable "dc_gcs_data_bucket_path_override" {
default = ""
}

+ variable "dc_gcs_data_bucket_input_folder" {
Contributor

@kmoscoe, Jan 15, 2025

This is great! I wonder if we need the "dc" prefix, since we don't use it for other services. Also, is it possible to remove "path" from the previous option? It's kind of misleading since it's only the bucket name. Can we also remove the "override"? We don't name any other optional variables that way, so it seems weird here. I would just call it "gcs_data_bucket_name".

Contributor Author

Good call. Removed the dc_ prefix from all of the bucket variables and renamed the bucket name variable to gcs_data_bucket_name.

+   description = "Input data folder in the GCS data bucket"
+   type = string
+   default = "input"
+ }
+
+ variable "dc_gcs_data_bucket_output_folder" {
+   description = "Output data folder in the GCS data bucket"
+   type = string
+   default = "output"
+ }
+
variable "dc_gcs_data_bucket_location" {
description = "Data Commons GCS data bucket location"
type = string
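
For readers applying this change, the two new folder variables can presumably be overridden from a variables file. The fragment below is a minimal sketch, assuming a `terraform.tfvars` file and the variable names exactly as they appear in this diff. The review thread indicates the `dc_` prefix was removed afterwards, so the merged names may differ. The folder names `raw` and `processed` are hypothetical.

```hcl
# terraform.tfvars (hypothetical values, for illustration only)
# Variable names follow this diff; the review thread says the dc_ prefix
# was dropped later, so check variables.tf for the final names.
dc_gcs_data_bucket_input_folder  = "raw"
dc_gcs_data_bucket_output_folder = "processed"
```

With these values, the expressions in the diff would resolve INPUT_DIR to the bucket path followed by `/raw` and OUTPUT_DIR to the bucket path followed by `/processed`, and the two `google_storage_bucket_object` resources would be named `raw/` and `processed/`. Leaving both variables unset keeps the defaults, `input` and `output`, which match the previously hard-coded paths.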