Juju provider not automatically handling SSH known_hosts when adding Juju machines via SSH. #580

Open
Vultaire opened this issue Sep 18, 2024 · 2 comments
Labels
area/ssh-key hint/main going on main branch

Comments

@Vultaire

Description

I'm unsure whether this is a bug or a feature request. While experimenting with a Juju-on-LXD Terraform script that spins up LXD instances and then adds them to Juju via SSH, I ran into prompts asking me to verify SSH host keys. If a Terraform script like the one below spins up LXD instances and prepopulates their authorized_keys files to allow SSH login, I would expect to be able to add them as Juju machines by SSH address without any interaction.

What actually happens: during "terraform apply", yes/no prompts appear and block deployment of the machines. The prompts are interleaved with other Terraform output, so they not only break automation but also make it hard to notice that Terraform is waiting for a "yes" response for each machine added this way.

The local-exec provisioner can be used to work around this (see Notes & References), but since provisioners are discouraged in Terraform, it seemed worth filing this issue.

Urgency

Casually reporting

Terraform Juju Provider version

0.14.0

Terraform version

v1.9.6-dev (via snap, channel latest/stable)

Juju version

3.5.2

Terraform Configuration(s)

terraform {
  required_providers {
    juju = {
      source  = "juju/juju"
      version = "0.14.0"
    }
    lxd = {
      source = "terraform-lxd/lxd"
      version = "2.3.0"
    }
  }
}

provider "juju" {}

provider "lxd" {}

variable "ssh_public_key" {
  type = string
  default = "~/.ssh/id_rsa.pub"
}

variable "ssh_private_key" {
  type = string
  default = "~/.ssh/id_rsa"
}

resource "lxd_instance" "juju_machine_1" {
  name = "juju-machine-1"
  image = "ubuntu:jammy"
}

resource "lxd_instance_file" "juju_machine_1_keys" {
  instance = lxd_instance.juju_machine_1.name
  source_path = pathexpand(var.ssh_public_key)
  target_path = "/home/ubuntu/.ssh/authorized_keys"
  mode = "0600"
  # These are implicitly set, but then get prompted for replacement on updates
  # if not explicitly set.  Possible LXD provider bug.
  uid = 1000
  gid = 1000
}

resource "juju_model" "machine_test" {
  name = "machine-test"
}

resource "juju_machine" "juju_machine_1" {
  model = juju_model.machine_test.name
  ssh_address = "ubuntu@${lxd_instance.juju_machine_1.ipv4_address}"
  public_key_file = pathexpand(var.ssh_public_key)
  private_key_file = pathexpand(var.ssh_private_key)

  depends_on = [lxd_instance_file.juju_machine_1_keys]
}

resource "juju_application" "ubuntu" {
  name = "ubuntu"
  model = juju_model.machine_test.name
  charm {
    name = "ubuntu"
    channel = "latest/stable"
    base = "ubuntu@22.04"
  }
  units = 1
  placement = juju_machine.juju_machine_1.machine_id
}

Reproduce / Test

terraform init; terraform apply

Debug/Panic Output

Including only the relevant portion:

[...]
2024-09-18T08:27:00.124-0700 [TRACE] provider.terraform-provider-juju_v0.14.0: Calling provider defined Resource Create: @module=sdk.framework tf_req_id=18e41831-6a82-0042-dfd2-8837ee90a85a tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/[email protected]/internal/fwserver/server_createresource.go:100 tf_provider_addr=registry.terraform.io/juju/juju tf_resource_type=juju_machine timestamp=2024-09-18T08:27:00.124-0700
2024-09-18T08:27:00.124-0700 [TRACE] provider.terraform-provider-juju_v0.14.0: ModelUUID cache looking for "machine-test": @caller=github.com/juju/terraform-provider-juju/internal/juju/client.go:243 @module=juju.client machine-test="uuid(b20d4cf9-c128-4531-85b1-0ccec73bb38a) type(iaas)" timestamp=2024-09-18T08:27:00.124-0700
2024-09-18T08:27:00.124-0700 [TRACE] provider.terraform-provider-juju_v0.14.0: Found uuid for "machine-test" in cache: @caller=github.com/juju/terraform-provider-juju/internal/juju/client.go:243 @module=juju.client timestamp=2024-09-18T08:27:00.124-0700
The authenticity of host '10.97.53.11 (10.97.53.11)' can't be established.
ED25519 key fingerprint is SHA256:ABvGgO1SbE7207OAvmrrQbqDcpw3JqBl3AlAB95aMPY.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? 2024-09-18T08:27:03.797-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu (expand)" is waiting for "juju_machine.juju_machine_1"
2024-09-18T08:27:03.797-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu" is waiting for "juju_application.ubuntu (expand)"
2024-09-18T08:27:03.797-0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/juju/juju\"] (close)" is waiting for "juju_application.ubuntu"
2024-09-18T08:27:05.107-0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/juju/juju\"] (close)"
2024-09-18T08:27:08.800-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu" is waiting for "juju_application.ubuntu (expand)"
2024-09-18T08:27:08.800-0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/juju/juju\"] (close)" is waiting for "juju_application.ubuntu"
2024-09-18T08:27:08.800-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu (expand)" is waiting for "juju_machine.juju_machine_1"
2024-09-18T08:27:10.108-0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/juju/juju\"] (close)"
juju_machine.juju_machine_1: Still creating... [10s elapsed]
2024-09-18T08:27:13.803-0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/juju/juju\"] (close)" is waiting for "juju_application.ubuntu"
2024-09-18T08:27:13.803-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu (expand)" is waiting for "juju_machine.juju_machine_1"
2024-09-18T08:27:13.803-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu" is waiting for "juju_application.ubuntu (expand)"
2024-09-18T08:27:15.109-0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/juju/juju\"] (close)"
2024-09-18T08:27:18.806-0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/juju/juju\"] (close)" is waiting for "juju_application.ubuntu"
2024-09-18T08:27:18.807-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu" is waiting for "juju_application.ubuntu (expand)"
2024-09-18T08:27:18.807-0700 [TRACE] dag/walk: vertex "juju_application.ubuntu (expand)" is waiting for "juju_machine.juju_machine_1"
2024-09-18T08:27:20.110-0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/juju/juju\"] (close)"
juju_machine.juju_machine_1: Still creating... [20s elapsed]
[...repeats the last block of messages every 10 seconds...]

Notes & References

Workaround exists:

  • Use a terraform_data resource with the local-exec provisioner to add host keys via ssh-keyscan
  • Add a dependency on the above resource prior to adding the machine

Here is a concrete patch which implements the workaround for the provided example configuration:

diff --git a/sscce.tf b/sscce.tf
--- a/sscce.tf
+++ b/sscce.tf
@@ -45,13 +45,30 @@ resource "juju_model" "machine_test" {
   name = "machine-test"
 }
 
+resource "terraform_data" "accept_ssh_host_keys" {
+  # NOTE: Adding SSH machines like this is not fully automatic: there's an SSH host prompt.
+  # We need to finesse it with some glue.
+
+  # Drop previous host keys, if any
+  provisioner "local-exec" {
+    command = "ssh-keygen -f ${pathexpand("~/.ssh/known_hosts")} -R ${lxd_instance.juju_machine_1.ipv4_address}"
+  }
+
+  # Add new host keys to ~/.ssh/known_hosts
+  provisioner "local-exec" {
+    command = "ssh-keyscan ${lxd_instance.juju_machine_1.ipv4_address} >> ${pathexpand("~/.ssh/known_hosts")}"
+  }
+
+  depends_on = [lxd_instance_file.juju_machine_1_keys]
+}
+
 resource "juju_machine" "juju_machine_1" {
   model = juju_model.machine_test.name
   ssh_address = "ubuntu@${lxd_instance.juju_machine_1.ipv4_address}"
   public_key_file = pathexpand(var.ssh_public_key)
   private_key_file = pathexpand(var.ssh_private_key)
 
-  depends_on = [lxd_instance_file.juju_machine_1_keys]
+  depends_on = [lxd_instance_file.juju_machine_1_keys, terraform_data.accept_ssh_host_keys]
 }
 
 resource "juju_application" "ubuntu" {
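
For readers who would rather not apply the diff, the same workaround can be written as one standalone resource. This is only a sketch, and it makes a few assumptions that are not in the original report: `triggers_replace` is added so the provisioner re-runs when the instance address changes, the two commands are merged into a single `local-exec` that receives its values through `environment`, and `-H` is passed to `ssh-keyscan` to hash the hostname in the new entry. It also assumes `ssh-keygen` and `ssh-keyscan` are available on the machine running Terraform and that `~/.ssh` exists there.

```hcl
resource "terraform_data" "accept_ssh_host_keys" {
  # Assumption: re-run the provisioner whenever the instance address changes.
  triggers_replace = [lxd_instance.juju_machine_1.ipv4_address]

  provisioner "local-exec" {
    # Drop any stale entry for this address (ignoring the error raised when
    # known_hosts does not exist yet), then append the current host keys.
    command = <<-EOT
      ssh-keygen -f "$KNOWN_HOSTS" -R "$HOST_ADDR" || true
      ssh-keyscan -H "$HOST_ADDR" >> "$KNOWN_HOSTS"
    EOT

    environment = {
      HOST_ADDR   = lxd_instance.juju_machine_1.ipv4_address
      KNOWN_HOSTS = pathexpand("~/.ssh/known_hosts")
    }
  }

  depends_on = [lxd_instance_file.juju_machine_1_keys]
}
```

As in the diff, `juju_machine.juju_machine_1` would still need `terraform_data.accept_ssh_host_keys` in its `depends_on`. Like the original workaround, this trusts whatever host key the instance presents at scan time.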
@nvinuesa added the hint/main (going on main branch) and area/ssh-key labels on Sep 27, 2024
@hmlanigan
Member

@Vultaire Have you tried the juju_ssh_key resource? Or cloudinit-userdata in the model config?

@Vultaire
Author

Vultaire commented Jan 7, 2025

Hello Heather,

Sorry for the delay; I just saw the notification today.

I don't think my issue has been clearly understood. I don't believe this is a problem with my SSH key being accepted for login to the machine; rather, the machine's SSH host key isn't accepted automatically, and the verification prompt surfaces while running "terraform apply". Nonetheless, I have also tried the "juju_ssh_key" suggestion.

Starting from a fresh Juju 3.5.5 controller on LXD, I used the Terraform configuration from this ticket without the workaround, with only this addition:

resource "juju_ssh_key" "mykey" {
  model = juju_model.machine_test.name
  payload = file(pathexpand(var.ssh_public_key))
}

The result is:

lxd_instance.juju_machine_1: Creating...
juju_model.machine_test: Creating...
juju_model.machine_test: Creation complete after 1s [id=7d43dbbf-5cbe-4421-8de9-a04a1f7e1211]
juju_ssh_key.mykey: Creating...
lxd_instance.juju_machine_1: Creation complete after 7s [name=juju-machine-1]
lxd_instance_file.juju_machine_1_keys: Creating...
lxd_instance_file.juju_machine_1_keys: Creation complete after 0s
juju_machine.juju_machine_1: Creating...
The authenticity of host '10.213.127.78 (10.213.127.78)' can't be established.
ED25519 key fingerprint is SHA256:MHWUrFkEzCMTRzuUV1ksAGJU3/wkB1stY5BXR7y6/A8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? 

That is: it is the SSH host key verification which is causing problems, not verification of my client SSH key.

It should be easy to copy the provided Terraform configuration as is and reproduce the issue in a local, LXD-backed Juju environment. If you run it and do not answer the prompt, it will eventually time out. It only deploys successfully if you respond to the prompt.

The workaround I provided does address this, but it is a kludge. I would prefer the Juju provider to support this more elegantly. As it stands, SSH machines can only be added in one of two ways: with prompts like this that require user input (breaking automation), or with the SSH client on the machine running Terraform already primed with a known_hosts entry for the target system (which won't work for machines created by the same Terraform scripts). The only alternative I have is to manually insert glue like the workaround under "Notes & References", which fills this gap.
