This is the documentation - and executable code! - for the Service Mesh Academy workshop about certificate management with Vault. The easiest way to use this file is to execute it with demosh.
Things in Markdown comments are safe to ignore when reading this later. When
executing this with demosh, things after the horizontal rule below (which
is just before a commented @SHOW
directive) will get displayed.
@import demosh/demo-tools.sh --> @import demosh/check-requirements.sh -->
BAT_STYLE="grid,numbers"
Welcome to the Service Mesh Academy about Vault and certificates with Linkerd! We'll show off a fairly real-world scenario, where we'll install Linkerd without generating any certificates by hand, and without having Linkerd generate the certificates itself. Instead, we'll use Vault running outside the cluster along with cert-manager running inside the cluster to securely manage all the keys for us.
Note that our goal is not to teach you how to use Vault, in particular: it's to show a practical, relatively low-effort way to actually use external PKI with Linkerd to bootstrap a zero-trust environment in Kubernetes. Many companies have existing external PKI already set up (whether with Vault or something else); being able to make use of it without too much work is a huge win.
In order to demo all this simply, we'll be running Kubernetes in a k3d
cluster. We'll run Vault in Docker to make things easy to demo, but we will
not be running Docker in Kubernetes: it will be a separate Docker container
that happens to be connected to the same Docker network as our k3d
cluster.
The big win of this setup is that you can run it completely on a laptop with no external dependencies. If you want to replicate this with clusters in the cloud, the main thing to worry about is making sure that your Kubernetes cluster has IP connectivity to your Vault instance. Everything else should be pretty much the same.
The way all the pieces fit together here is more complex than normal:
- We'll start by creating our
k3d
cluster. This will be namedpki-cluster
, and we'll tellk3d
to connect it to a network namedpki-network
. - We'll then fire up Vault in a Docker container that's also connected to
pki-network
. (And yes, we'll use Vault in dev mode to make life easier, but that's the only way we'll cheat in this setup.) - We'll then use the
vault
CLI running on our local host machine to configure Vault in Docker.
Taken together, this implies that we'll have to make sure that we can talk to the Vault instance both from inside the Docker network and from our host machine. This mirrors many real-world setups where your Kubernetes cluster is on one network, but you do administration from a different network.
First up, make sure we don't have an old cluster lying around:
k3d cluster delete pki-cluster
Next up, create our new cluster! This command looks complex, but it's actually less terrible than you might think -- most of it is just turning off things we don't need (traefik, local-storage, and metrics-server), and we also expose ports 80 and 443 to our local system to make it easy to try services out.
k3d cluster create pki-cluster \
-p "80:80@loadbalancer" -p "443:443@loadbalancer" \
--network=pki-network \
--k3s-arg '--disable=local-storage,traefik,metrics-server@server:*;agents:*'
Once that's done, let's just make sure we can talk to it:
kubectl get ns
We have a running k3d
cluster, so now let's get Vault going. This is another
complex-looking command:
docker run \
--detach \
--rm --name vault \
-p 8200:8200 \
--network=pki-network \
--cap-add=IPC_LOCK \
hashicorp/vault \
server \
-dev -dev-listen-address 0.0.0.0:8200 \
-dev-root-token-id my-token
Let's break that down.
docker run
: start a container running in Docker--detach
: basically, run the container in the background--rm --name vault
: remove the container when it dies, and name it "vault" so we can find it easily later-p 8200:8200
: expose Vault's API port to our local system--network=pki-network
: connect to the same network as ourk3d
cluster--cap-add=IPC_LOCK
: give the container theIPC_LOCK
capability, which Vault wants
Next is the image name (hashicorp/vault
), and then comes the command line
for Vault itself:
server
is the command to run-dev -dev-listen-address 0.0.0.0:8200
: run Vault in dev mode, binding on all interfaces rather than justlocalhost
-dev-root-token-id my-token
: set the dev-mode root "password" tomy-token
, which we will use to trivially log in later
So, off we go! First make sure we don't have some other vault
container
lying around...
docker kill vault
...then fire up the one we want.
docker run \
--detach \
--rm --name vault \
-p 8200:8200 \
--network=pki-network \
--cap-add=IPC_LOCK \
hashicorp/vault \
server \
-dev -dev-listen-address 0.0.0.0:8200 \
-dev-root-token-id my-token
OK, Vault is running! Let's make sure of that by checking its status from our
local system, using the vault
CLI. We'll start by setting the VAULT_ADDR
environment variable, so that we don't have to include it in every command.
Remember, we'll be running the vault
CLI on our local system, so we can just
do this all using our local shell.
export VAULT_ADDR=http://0.0.0.0:8200/
Then we can run vault status
to make sure all's well.
vault status
So far so good.
Now that we have Vault running, the next step is to set up Vault. We're not going to go deep into the details of the setup, because this is very Vault-specific, but we'll talk a bit about it.
We'll start by authenticating our vault
CLI to the Vault server, using the
dev-root-token-id
that we passed to the server when we started it running.
Remember, this is running on the local host.
vault login my-token
Next up, we need to enable the Vault PKI engine, and configure its maximum allowed expiry time for certificates. Here we're using 90 days (2160 hours).
vault secrets enable pki
vault secrets tune -max-lease-ttl=2160h pki
After that, we need to tell Vault to enable the URLs that cert-manager expects to use when talking to Vault.
vault write pki/config/urls \
issuing_certificates="http://127.0.0.1:8200/v1/pki/ca" \
crl_distribution_points="http://127.0.0.1:8200/v1/pki/crl"
Finally, cert-manager will need to present Vault with a token before Vault will actually do things that cert-manager needs. Vault associates tokens with policies, which are kind of like roles in other systems, so we'll start by creating a policy that allows us to do anything...
echo 'path "pki*" { capabilities = ["create", "read", "update", "delete", "list", "sudo"]}' \
| vault policy write pki_policy -
...and then we'll get a token for that policy.
VAULT_TOKEN=$(vault write -field=token /auth/token/create \
policies="pki_policy" \
no_parent=true no_default_policy=true \
renewable=true ttl=767h num_uses=0)
That's the Vault setup done!
Finally, we'll tell Vault to actually create our Linkerd trust anchor. Note that:
- this certificate only exists within Vault;
- we explicitly give it the common name of
root.linkerd.cluster.local
; - we set its TTL to our maximum of 2160 hours; and
- we tell Vault to generate it using elliptic-curve crypto (
key_type=ec
).
We tell vault write
to only output the certificate, which we save so that we
can inspect it. Note that the certificate contains no private information, so
this is entirely safe.
CERT=$(vault write -field=certificate pki/root/generate/internal \
common_name=root.linkerd.cluster.local \
ttl=2160h key_type=ec)
echo "$CERT" | step certificate inspect -
That's actually all we need there! Now it's on to get cert-manager installed.
We'll start by using Helm to install both cert-manager and trust-manager.
helm repo add --force-update jetstack https://charts.jetstack.io
helm repo update
When we install cert-manager, we'll have it create the cert-manager
namespace, and install the cert-manager CRDs too.
helm install cert-manager jetstack/cert-manager \
-n cert-manager --create-namespace \
--set installCRDs=true --wait
trust-manager will be installed in the cert-manager
namespace, but we'll
explicitly tell it to use the linkerd
namespace as its "trust namespace".
The trust namespace is the single namespace from which trust-manager is
allowed to read information, and we're going to need it to read the Linkerd
identity issuer.
We don't need to create the cert-manager
namespace here (it already exists),
but we do need to create the linkerd
namespace manually so that we can use
it as the trust namespace.
kubectl create namespace linkerd
helm install trust-manager jetstack/trust-manager \
-n cert-manager \
--set app.trust.namespace=linkerd \
--wait
Let's make sure that things look happy.
kubectl get pods -n cert-manager
OK, cert-manager is running! Next step, we need to configure it to produce the certificates we need. This starts with saving the Vault token we got awhile back for cert-manager to use.
kubectl create secret generic \
my-secret-token \
--namespace=cert-manager \
--from-literal=token="$VAULT_TOKEN"
We don't want to actually look into that secret, but we can describe it to make sure that there's some data in it, at least.
kubectl describe secret -n cert-manager my-secret-token
So, yes, there's some data in there. Good sign!
Recall that Linkerd needs two certificates:
- the trust anchor is the root of the heirarchy for Linkerd; and
- the identity issuer is an intermediate CA cert that must be signed by the trust anchor.
We've already told Vault to create the trust anchor for us: next up, we need to configure cert-manager to create the identity issuer certificate. To do this, cert-manager will produce a certificate signing request (CSR), which it will then hand to Vault. Vault will use the CSR to produce a signed identity issuer for cert-manager.
To make all this happen, we use a cert-manager ClusterIssuer resource to tell cert-manager how to talk to Vault. This ClusterIssuer needs three critical bits of information:
- The access token, which we just saved in a Secret.
- The address of the Vault server.
- The URL path to use to ask Vault for a new certificate. For Vault, this is
pki/root/sign-intermediate
.
So the address of the Vault server is the missing bit at the moment: we can't
use 0.0.0.0
as we've been doing from our local host, because cert-manager
needs to talk to Vault from inside the Docker network. That means we need to
figure out the address of the vault
container within that network.
Fortunately, that's not that hard: docker inspect pki-network
will show us
all the details of everything attached to the pki-network
, as JSON, so we
can use jq
to extract the single bit that we need: the IPv4Address
contained in the block that also has a Name
of vault
:
VAULT_DOCKER_ADDRESS=$(docker inspect pki-network \
| jq -r '.[0].Containers | .[] | select(.Name == "vault") | .IPv4Address' \
| cut -d/ -f1)
#@immed
echo Vault is running at ${VAULT_DOCKER_ADDRESS}
Given the right address for Vault, we can assemble the correct YAML:
#@immed
sed -e "s/%VAULT_DOCKER_ADDRESS%/${VAULT_DOCKER_ADDRESS}/g" < k8s/vault-issuer.template > k8s/vault-issuer.yaml
bat k8s/vault-issuer.yaml
Let's go ahead and apply that, then check to make sure it's happy.
kubectl apply -f k8s/vault-issuer.yaml
kubectl get clusterissuers -o wide
Now that cert-manager can sign our certificates, let's go ahead and tell cert-manager how to set things up for Linkerd. First, we'll use a Certificate resource to tell cert-manager how to use the Vault issuer to issue our identity issuer certificate:
bat k8s/identity-issuer-cert.yaml
NOTE that this Certificate goes in the linkerd
namespace, not the
cert-manager
namespace! This is because Linkerd actually needs access to the
identity issuer, so we have cert-manager create it where it will need to be
used.
kubectl apply -f k8s/identity-issuer-cert.yaml
Finally, we'll use a Bundle resource to tell trust-manager to copy only the
public half of the trust anchor into a ConfigMap for Linkerd to use. Note that
Bundles are always cluster-scoped -- but also note that the reason we don't
have to specify namespaces for the source and destination is that
trust-manager can only read from its trust namespace, in this case linkerd
,
and it defaults to writing there too.
bat k8s/trust-anchor-bundle.yaml
kubectl apply -f k8s/trust-anchor-bundle.yaml
Once all that is done, let's take a look at our Certificate and Bundle resources:
kubectl get bundle,certificate -A
We can see that everything looks good there, great!
Let's also take a look to make sure that we actually have the things we expect in the Linkerd namespace. This is a bit ugly if we try to just view the raw certificates, so let's start with a function to make things simpler to view:
inspect_cert () {
sub_selector='\(.extensions.subject_key_id | .[0:16])... \(.subject_dn)'
iss_selector='\(.extensions.authority_key_id | .[0:16])... \(.issuer_dn)'
step certificate inspect --format json \
| jq -r "\"Issuer: $iss_selector\",\"Subject: $sub_selector\""
}
OK. First up, our linkerd-identity-trust-roots
ConfigMap should have the
trust anchor's public half in the ca-bundle.crt
key:
kubectl get configmap \
-n linkerd linkerd-identity-trust-roots \
-o jsonpath='{ .data.ca-bundle\.crt }' \
| inspect_cert
We see that this is indeed a self-signed certificate, great!
Next up: the linkerd-identity-issuer
Secret. It includes three keys:
tls.crt
is the public half of the identity issuer cert;tls.key
is the private key of the identity issuer cert; andca.crt
is the public half of the identity issuer's signer (this should be the same as what's in the bundle above).
Annoyingly, these have an extra layer of base64 encoding applied to them, but
that's OK, we can unwrap that easily enough. So first up, let's look at
tls.crt
:
kubectl get secret \
-n linkerd linkerd-identity-issuer \
-o jsonpath='{ .data.tls\.crt }' \
| base64 -d | inspect_cert
Not a self-signed cert, good! And here's the info for the trust anchor again:
#@immed
kubectl get configmap \
-n linkerd linkerd-identity-trust-roots \
-o jsonpath='{ .data.ca-bundle\.crt }' \
| inspect_cert
So we can see that yes, the identity issuer was issued by the trust anchor.
Just for the fun of it, we can look at ca.crt
in the identity issuer:
kubectl get secret \
-n linkerd linkerd-identity-issuer \
-o jsonpath='{ .data.ca\.crt }' \
| base64 -d | inspect_cert
and yes, that's the same as the trust anchor.
Finally we're ready to deploy Linkerd! We may as well use Helm for this, too. Start by setting up Helm repos:
helm repo add --force-update linkerd https://helm.linkerd.io/stable
helm repo update
...then install the Linkerd CRDs.
helm install linkerd-crds -n linkerd linkerd/linkerd-crds
After that we can actually install Linkerd! Pay attention to these --set
parameters we pass here:
identity.issuer.scheme=kubernetes.io/tls
tells Helm that it should expect the identity issuer to already exist, so don't try to create one, andidentity.externalCA=true
tells Helm that it should expect the trust bundle to already exist, too.
These things, of course, are being handled by cert-manager and trust-manager.
helm install linkerd-control-plane linkerd/linkerd-control-plane \
-n linkerd \
--set identity.issuer.scheme=kubernetes.io/tls \
--set identity.externalCA=true
Once that's done, we can use linkerd check
to validate that everything
worked:
linkerd check
Note that we see a warning for the identity issuer certificate not being valid for at least 60 days. That's expected, since we created that with a 48-hour lifespan!
After all that, we have Vault generating all our certificates, cert-manager and trust-manager handling rotating and distributing them as needed, and Linkerd consuming them for mTLS everywhere.
Critically, Vault is not running in our cluster, and if you look back over this whole process, the private key for the trust anchor has never been revealed outside of Vault. Using an external CA to isolate key generation lets us dramatically increase security of the overall system.
Vault, of course, isn't the only external CA we can use: cert-manager supports a lot of different issuers, including ACME, Vault, Venafi, and many others issuers (see https://cert-manager.io/docs/configuration/external/). We used Vault for this workshop because it's free to use and relatively easy to set up in Docker, but you're encouraged to try other kinds of external CAs.
With that, we're ready to go on and install workloads into our mesh. You can find the source for this demo at
https://github.com/BuoyantIO/service-mesh-academy
in the certificates-with-vault
directory. As always, we welcome feedback!
Join us at https://slack.linkerd.io/ for more.