This Terraform deployment is fully automated. However, due to some AWS limitations, you'll have to bootstrap a few things before booting up your first EMR cluster.
- AWS CLI tool.
- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` AWS account env variables set.
- Terraform
aws configure
aws emr create-default-roles
[NOTE!] This will create the default AWS EMR roles. This deployment depends upon them. You only need to run this once per AWS account.
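A small pre-flight check can catch a missing prerequisite before the bootstrap commands run. The sketch below is only an illustration: `check_prereqs` is a hypothetical helper name, not something shipped with this repository. It verifies that the two tools and the two environment variables listed above are available.

```shell
# Hypothetical helper (not part of the repository): report missing prerequisites.
# Prints one line per missing item and returns 1 if anything is missing.
check_prereqs() {
  missing=0
  # Both CLI tools must be resolvable by the shell.
  for tool in aws terraform; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool"
      missing=1
    fi
  done
  # Both credential variables must be set and non-empty.
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ]; then
    echo "missing env variable: AWS_ACCESS_KEY_ID"
    missing=1
  fi
  if [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "missing env variable: AWS_SECRET_ACCESS_KEY"
    missing=1
  fi
  return "$missing"
}
```

Calling `check_prereqs && aws emr create-default-roles` would then run the bootstrap step only when everything is in place.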
Use Terraform to create the buckets, network, ECR repository, EC2 instance, IAM resources, etc.
- Use the provided example `./accumulator/infra/examples/main.tf`.
- Adjust variable values.
- Set up where to store the Terraform state (locally or in S3).
[NOTE!] If preferred, you can run Terraform directly from the infra folder, setting variables with `-var` or `-var-file`.
- Apply the Terraform changes:
terraform init
terraform plan
terraform apply
- The infra, among other related resources, should now have the following components:
  - ECR: `zk-rollup-docker-registry`, where Sequencer image builds are stored.
  - S3: `emr_input` for EMR incoming data.
  - S3: `emr_output` for EMR result data.
  - S3: `emr_data` for EMR metadata. It contains the code for the mappers/reducers, the bootstrap script, etc.
  - EC2: `nginx` and `sequencer`.
[NOTE!] The Sequencer image should be built and stored in the above-mentioned ECR repository. At the moment it is built by a GitHub Actions workflow. Inspect the `publish` job for further details.
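The init/plan/apply sequence from the Terraform step, combined with the `-var-file` option, could be wrapped as follows. This is a sketch under assumptions: `tf_deploy` is a hypothetical helper, and the variable file name in the usage line is a placeholder, not a file that exists in the repository. Saving the plan with `-out` and applying that saved plan ensures that what gets applied is exactly what was reviewed.

```shell
# Hypothetical wrapper (not part of the repository) around the Terraform steps.
# $1: path to a variable file holding your adjusted values.
tf_deploy() {
  var_file="$1"
  # Each step runs only if the previous one succeeded.
  terraform init &&
    terraform plan -var-file="$var_file" -out=tfplan &&
    terraform apply tfplan
}
```

For example, `tf_deploy my-values.tfvars`, run from the folder that contains your `main.tf`.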
- Now that the infra components are in place, run the steps described in the GitHub Actions workflow. After they run successfully, you should have:
  - A zk rollup image published to the ECR repository (created with Terraform in the 2nd step).
  - The -emr-data bucket, containing `emr_bootstrap_script.sh`, `mapper.js`, `reducer.js` and `compilation/`.
- On the EC2 instance page of the AWS console, look up the instance's IP address and DNS name.
- When Terraform created the EC2 instance, it tried to start `podman-zk-rollup.service`. At that stage, however, neither the image nor the ECR repository existed yet. Therefore, restart the `podman-zk-rollup` service. See below.
- You should be able to use `ec2-instance-dns-name:443` with `grpcurl`.
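As a quick connectivity check, `grpcurl` can list the services exposed on that address. The sketch below assumes the sequencer has gRPC server reflection enabled, which this document does not confirm. Without reflection, `grpcurl` needs the proto files passed explicitly via `-proto` or `-import-path`. `list_sequencer_services` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the repository).
# $1: the DNS name of the EC2 instance, as shown in the AWS console.
list_sequencer_services() {
  host="$1"
  # grpcurl uses TLS by default, which matches the Nginx endpoint on port 443.
  # "list" relies on server reflection (an assumption about the sequencer).
  grpcurl "${host}:443" list
}
```

Usage: `list_sequencer_services ec2-instance-dns-name`.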
The sequencer is launched on a NixOS EC2 machine. This machine runs the Sequencer docker image through a systemd-managed podman service. This service is reverse-proxied by Nginx.
You can force the EC2 machine to load the latest Sequencer image via:
ssh <user>@ec2-instance-dns-name "sudo systemctl restart podman-zk-rollup.service"
You can see the container's logs via:
ssh <user>@ec2-instance-dns-name "journalctl -u podman-zk-rollup.service -f"
To undo everything, follow these steps:
- Delete everything from the `emr-{input,output,data}` buckets.
- Delete all Docker images from ECR.
- Run `terraform destroy`.
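The three teardown steps could be scripted roughly as below. This is a sketch, not a tested procedure: `teardown` is a hypothetical helper, and the ECR repository name and bucket names are passed in as arguments because their exact values depend on your Terraform variables. Note that `aws s3 rm --recursive` removes current objects only, so versioned buckets would need their object versions handled separately.

```shell
# Hypothetical teardown helper (not part of the repository).
# $1: ECR repository name; remaining arguments: the S3 bucket names to empty.
teardown() {
  repo="$1"
  shift
  # Step 1: empty each bucket.
  for bucket in "$@"; do
    aws s3 rm "s3://${bucket}" --recursive || return 1
  done
  # Step 2: delete every image in the ECR repository, if there are any.
  ids=$(aws ecr list-images --repository-name "$repo" \
    --query 'imageIds[*]' --output json) || return 1
  if [ "$ids" != "[]" ]; then
    aws ecr batch-delete-image --repository-name "$repo" --image-ids "$ids" || return 1
  fi
  # Step 3: destroy the Terraform-managed infrastructure (prompts for confirmation).
  terraform destroy
}
```

Usage: `teardown <ecr-repository-name> <input-bucket> <output-bucket> <data-bucket>`, run from the folder where `terraform apply` was executed.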