* Cleanup region azure * mend
Showing 20 changed files with 406 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,7 @@ Table of Contents | |
* [EMR Serverless](#emr-serverless) | ||
* [Azure](#azure-1) | ||
* [Login](#login) | ||
* [HDInsight](#hdinsight) | ||
* [AKS](#aks) | ||
* [GCP](#gcp-1) | ||
* [Login](#login-2) | ||
|
@@ -67,6 +68,7 @@ as well. Check code comments for details. | |
| GCP | Dataproc |2.0.27-ubuntu18| 3.1.3 | 1.0.0 | 0.3.3 | -| | ||
| GCP | Dataproc Serverless|1.0.21| 3.2.2 | 1.1.0 | 0.4.1 | gcr.io/${TF_VAR_project_name}/spark-py:pysequila-0.4.1-dataproc-latest | | ||
| Azure | AKS |1.23.12|3.2.2|1.1.0|0.4.1| docker.io/biodatageeks/spark-py:pysequila-0.4.1-aks-latest| | ||
| Azure | HDInsight| 5.0.300.1 | 3.2.2 | 1.1.0 | 0.4.1 |- | | ||
| AWS | EKS|1.23.9 | 3.2.2 | 1.1.0 | 0.4.1 | docker.io/biodatageeks/spark-py:pysequila-0.4.1-eks-latest| | ||
| AWS | EMR Serverless|emr-6.7.0 | 3.2.1 | 1.1.0 | 0.4.1 |- | | ||
|
||
|
@@ -118,8 +120,10 @@ terraform init | |
|
||
## Using SeQuiLa cli Docker image for Azure | ||
```bash | ||
export TF_VAR_region=westeurope | ||
docker pull biodatageeks/sequila-cloud-cli:latest | ||
docker run --rm -it \ | ||
-e TF_VAR_region=${TF_VAR_region} \ | ||
-e TF_VAR_pysequila_version=${TF_VAR_pysequila_version} \ | ||
-e TF_VAR_sequila_version=${TF_VAR_sequila_version} \ | ||
-e TF_VAR_pysequila_image_aks=${TF_VAR_pysequila_image_aks} \ | ||
|
@@ -162,7 +166,7 @@ terraform init | |
|
||
## Azure | ||
* [AKS (Azure Kubernetes Service)](#AKS): :white_check_mark: | ||
|
||
* [HDInsight](#hdinsight): :white_check_mark: | ||
## AWS | ||
* [EMR Serverless](#emr-serverless): :white_check_mark: | ||
* [EKS(Elastic Kubernetes Service)](#EKS): :white_check_mark: | ||
|
@@ -277,6 +281,60 @@ az login | |
az account set --subscription "Azure subscription 1" | ||
``` | ||
|
||
## HDInsight | ||
:bulb: According to the [release notes](https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-50-component-versioning?source=recommendations), | ||
HDInsight 5.0 ships with Apache Spark 3.1.2. In practice, the cluster comes with Spark 3.0.2: | ||
|
||
 | ||
|
||
Because HDInsight is a full-fledged Hadoop cluster, we were able to add Apache Spark 3.2.2 support to the Terraform module using | ||
the [script action](https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-customize-cluster-linux) mechanism. | ||
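
To make the script action idea more concrete, here is a minimal sketch of what such a script could look like. It is not the script shipped with the Terraform module: the function names, the `hadoop3.2` build profile, and the use of the Apache archive as a download source are all assumptions made for illustration. Only the Spark version (3.2.2) and the `/opt/spark` location come from this README.

```bash
# Hypothetical script action sketch (NOT the script used by the module).
# Script actions run on the cluster nodes, so this would execute there.
SPARK_VERSION="${SPARK_VERSION:-3.2.2}"        # version named in this README
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop3.2}"  # assumption: prebuilt Hadoop 3.2 package
SPARK_HOME="${SPARK_HOME:-/opt/spark}"         # matches SPARK_HOME in the submit command

# Build the download URL for a prebuilt Spark package on the Apache archive.
spark_tarball_url() {
  echo "https://archive.apache.org/dist/spark/spark-${1}/spark-${1}-bin-${2}.tgz"
}

# Download the package and unpack it into SPARK_HOME.
install_spark() {
  url="$(spark_tarball_url "$SPARK_VERSION" "$HADOOP_PROFILE")"
  tmp="$(mktemp -d)" || return 1
  curl -fsSL "$url" -o "$tmp/spark.tgz" || return 1
  mkdir -p "$SPARK_HOME" || return 1
  tar -xzf "$tmp/spark.tgz" -C "$SPARK_HOME" --strip-components=1 || return 1
  rm -rf "$tmp"
}

# install_spark   # left commented out so sourcing this sketch has no side effects
```

Keeping the stock Spark installation untouched and unpacking the newer version into a separate directory is what lets the submit command switch versions simply by exporting `SPARK_HOME`.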
|
||
|
||
### Deploy | ||
```bash | ||
export TF_VAR_hdinsight_gateway_password=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16 ; echo '') | ||
export TF_VAR_hdinsight_ssh_password=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 16 ; echo '') | ||
terraform apply -var-file=../../env/azure.tfvars -var-file=../../env/azure-hdinsight.tfvars -var-file=../../env/_all.tfvars | ||
``` | ||
Check the Terraform output variables for the SSH connection string, the credentials, and the `spark-submit` command, e.g.: | ||
```bash | ||
Apply complete! Resources: 0 added, 0 changed, 0 destroyed. | ||
|
||
Outputs: | ||
|
||
hdinsight_gateway_password = "w8aN6oVSJobq7eu4" | ||
hdinsight_ssh_password = "wun6RzBBPWD9z9ke" | ||
pysequila_submit_command = <<EOT | ||
export SPARK_HOME=/opt/spark | ||
spark-submit \ | ||
--master yarn \ | ||
--packages org.biodatageeks:sequila_2.12:1.1.0 \ | ||
--conf spark.pyspark.python=/usr/bin/miniforge/envs/py38/bin/python3 \ | ||
--conf spark.driver.cores=1 \ | ||
--conf spark.driver.memory=1g \ | ||
--conf spark.executor.cores=1 \ | ||
--conf spark.executor.memory=3g \ | ||
--conf spark.executor.instances=1 \ | ||
--conf spark.files=wasb://[email protected]/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta,wasb://[email protected]/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta.fai \ | ||
wasb://[email protected]/jobs/pysequila/sequila-pileup.py | ||
EOT | ||
ssh_command = "ssh [email protected]" | ||
``` | ||
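
The two password exports in the deploy step can be wrapped in a small helper with a sanity check before `terraform apply` runs. The sketch below is one possible variant, not part of the repository: `gen_password` and `check_hdinsight_passwords` are hypothetical names. It reads a bounded chunk of random bytes so the pipeline also exits cleanly under `set -o pipefail`. The check that the two passwords differ is an assumption based on the azurerm provider notes for HDInsight clusters.

```bash
# Hypothetical helper: print N random alphanumeric characters (default 16).
# Reads a bounded 1 KiB chunk instead of streaming /dev/urandom forever.
gen_password() {
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c "${1:-16}"
  echo
}

TF_VAR_hdinsight_gateway_password="$(gen_password 16)"
TF_VAR_hdinsight_ssh_password="$(gen_password 16)"
export TF_VAR_hdinsight_gateway_password TF_VAR_hdinsight_ssh_password

# Return non-zero if either password is empty or both are identical.
check_hdinsight_passwords() {
  [ -n "$TF_VAR_hdinsight_gateway_password" ] || return 1
  [ -n "$TF_VAR_hdinsight_ssh_password" ] || return 1
  [ "$TF_VAR_hdinsight_gateway_password" != "$TF_VAR_hdinsight_ssh_password" ]
}
```

With this in place, the deploy command can be guarded as `check_hdinsight_passwords && terraform apply ...` using the same `-var-file` arguments as above.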
|
||
### Run | ||
1. Use `ssh_command` and `hdinsight_ssh_password` to connect to the head node. | ||
2. Run the command from the `pysequila_submit_command` output. | ||
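
Instead of copying values out of the `terraform apply` log, the same outputs can be read back with `terraform output -raw`, which is available in Terraform 0.14 and later. The helpers below are a sketch: the function names and the default file name `pysequila-submit.sh` are made up for illustration, and both functions assume they are run from the directory where `terraform apply` was executed.

```bash
# Print the SSH connection string from the Terraform outputs.
show_ssh_command() {
  terraform output -raw ssh_command
  echo
}

# Save the generated spark-submit command to a file (hypothetical default name),
# so it can be copied to the head node and run there.
save_submit_command() {
  terraform output -raw pysequila_submit_command > "${1:-pysequila-submit.sh}"
}
```

The saved file contains the `export SPARK_HOME=...` line followed by the `spark-submit` invocation, so it can be executed on the head node with `bash pysequila-submit.sh`.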
|
||
 | ||
 | ||
|
||
### Cleanup | ||
```bash | ||
terraform destroy -var-file=../../env/azure.tfvars -var-file=../../env/azure-hdinsight.tfvars -var-file=../../env/_all.tfvars | ||
``` | ||
|
||
## AKS | ||
### Deploy | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
output "hdinsight_gateway_password" { | ||
value = try(module.hdinsight[0].hdinsight_gateway_password, "No HDInsight setup.") | ||
} | ||
|
||
output "hdinsight_ssh_password" { | ||
value = try(module.hdinsight[0].hdinsight_ssh_password, "No HDInsight setup.") | ||
} | ||
|
||
output "ssh_command" { | ||
value = try(module.hdinsight[0].ssh_command, "No HDInsight setup.") | ||
} | ||
|
||
output "pysequila_submit_command" { | ||
value = try(module.hdinsight[0].pysequila_submit_command, "No HDInsight setup.") | ||
} |
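
Because every output above falls back to the literal string `"No HDInsight setup."` when the module is not instantiated, a script that consumes these outputs may want to detect that case first. A minimal sketch, assuming Terraform 0.14+ for `terraform output -raw`; the function name `hdinsight_deployed` is hypothetical:

```bash
# Succeeds only if the ssh_command output exists and is not the try() fallback.
hdinsight_deployed() {
  value="$(terraform output -raw ssh_command 2>/dev/null)" || return 1
  [ -n "$value" ] && [ "$value" != "No HDInsight setup." ]
}
```

A caller could then write `hdinsight_deployed || echo "HDInsight is not deployed in this workspace"` before trying to connect.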
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
azure-hdinsight-deploy = true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +0,0 @@ | ||
region = "westeurope" | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# hdinsight | ||
|
||
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK --> | ||
## Requirements | ||
|
||
No requirements. | ||
|
||
## Providers | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | n/a | | ||
| <a name="provider_random"></a> [random](#provider\_random) | n/a | | ||
|
||
## Modules | ||
|
||
No modules. | ||
|
||
## Resources | ||
|
||
| Name | Type | | ||
|------|------| | ||
| [azurerm_hdinsight_spark_cluster.sequila](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/hdinsight_spark_cluster) | resource | | ||
| [azurerm_storage_blob.sequila](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_blob) | resource | | ||
| [azurerm_storage_blob.spark](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_blob) | resource | | ||
| [random_string.random-suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | | ||
|
||
## Inputs | ||
|
||
| Name | Description | Type | Default | Required | | ||
|------|-------------|------|---------|:--------:| | ||
| <a name="input_data_files"></a> [data\_files](#input\_data\_files) | Data files to copy to staging bucket | `list(string)` | n/a | yes | | ||
| <a name="input_gateway_password"></a> [gateway\_password](#input\_gateway\_password) | Hadoop gateway password (i.e. Ambari, YARN UI console, etc) | `string` | n/a | yes | | ||
| <a name="input_hdinsight_version"></a> [hdinsight\_version](#input\_hdinsight\_version) | HDInsight version | `string` | `"5.0"` | no | | ||
| <a name="input_node_ssh_password"></a> [node\_ssh\_password](#input\_node\_ssh\_password) | SSH password to all nodes in the cluster | `string` | n/a | yes | | ||
| <a name="input_pysequila_version"></a> [pysequila\_version](#input\_pysequila\_version) | PySeQuiLa version | `string` | n/a | yes | | ||
| <a name="input_region"></a> [region](#input\_region) | Location of the cluster | `string` | n/a | yes | | ||
| <a name="input_resource_group"></a> [resource\_group](#input\_resource\_group) | Azure resource group | `string` | n/a | yes | | ||
| <a name="input_sequila_version"></a> [sequila\_version](#input\_sequila\_version) | SeQuiLa version | `string` | n/a | yes | | ||
| <a name="input_spark_version"></a> [spark\_version](#input\_spark\_version) | Apache Spark version | `string` | `"3.2.2"` | no | | ||
| <a name="input_storage_account_access_key"></a> [storage\_account\_access\_key](#input\_storage\_account\_access\_key) | Storage account access key | `string` | n/a | yes | | ||
| <a name="input_storage_account_name"></a> [storage\_account\_name](#input\_storage\_account\_name) | n/a | `string` | n/a | yes | | ||
| <a name="input_storage_container_id"></a> [storage\_container\_id](#input\_storage\_container\_id) | Azure storage container | `string` | n/a | yes | | ||
| <a name="input_storage_container_name"></a> [storage\_container\_name](#input\_storage\_container\_name) | n/a | `string` | n/a | yes | | ||
|
||
## Outputs | ||
|
||
| Name | Description | | ||
|------|-------------| | ||
| <a name="output_hdinsight_gateway_password"></a> [hdinsight\_gateway\_password](#output\_hdinsight\_gateway\_password) | n/a | | ||
| <a name="output_hdinsight_ssh_password"></a> [hdinsight\_ssh\_password](#output\_hdinsight\_ssh\_password) | n/a | | ||
| <a name="output_pysequila_submit_command"></a> [pysequila\_submit\_command](#output\_pysequila\_submit\_command) | n/a | | ||
| <a name="output_ssh_command"></a> [ssh\_command](#output\_ssh\_command) | n/a | | ||
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK --> |