amazon-cloudwatch-observability fails with open /root/.aws/credentials ignoring the IRSA credentials #1101
Comments
Facing the same issue while running the CloudWatch agent as a DaemonSet.
|
I uninstalled the amazon-cloudwatch-observability EKS add-on and installed using the instructions at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-metrics.html But I can set
The |
Although, using The CloudWatch Container Insights still lacks the "Top 10 Nodes by CPU Utilization", etc. I guess all metrics that have I guess this means that amazon-cloudwatch-agent really needs IMDS, and maybe it should be documented as such. You can't have it without it, can you?
|
Restrict the use of host networking and block access to instance metadata service
But IMDS access seems like a hard requirement for using the CloudWatch agent for Kubernetes Container Insights with enhanced observability ( The alternatives are
|
Hi @ecerulm, we are aware of the current IMDS requirement and are tracking an alternative for it when IMDS is unavailable internally. |
This isn't the case - if you edit the operator's agent resource configuration like so, you will see that it is capable of using IRSA:
$ kubectl -n amazon-cloudwatch edit amazoncloudwatchagents.cloudwatch.aws.amazon.com
Apply this change:
  apiVersion: v1
  items:
  - apiVersion: cloudwatch.aws.amazon.com/v1alpha1
    kind: AmazonCloudWatchAgent
    metadata:
      annotations:
        pulumi.com/patchForce: "true"
      creationTimestamp: "2024-04-01T08:21:38Z"
      generation: 5
      labels:
        app.kubernetes.io/managed-by: amazon-cloudwatch-agent-operator
      name: cloudwatch-agent
      namespace: amazon-cloudwatch
      resourceVersion: "3839446"
      uid: 542fecd4-0368-4ab1-8d8b-e7e5ad47c538
    spec:
      config: '{"agent":{"region":"us-west-2"},"logs":{"metrics_collected":{"app_signals":{"hosted_in":"opal-quokka-6860d02"},"kubernetes":{"cluster_name":"opal-quokka-6860d02","enhanced_container_insights":true}}},"traces":{"traces_collected":{"app_signals":{}}}}'
      env:
+     - name: RUN_WITH_IRSA
+       value: true
      - name: K8S_NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName
|
As I commented at #1101 (comment), even with RUN_WITH_IRSA it still goes to IMDS to obtain the instance ID, etc.
The instance ID, etc. are needed for the metrics of “Kubernetes Container Insights with enhanced observability (enhanced_container_insights)”, and since they can’t be obtained, those metrics are not sent. I don’t think there is any way to pass the instance ID, etc. by any other means today, but @jefchien seems to be indicating that they may be working on some alternative. |
Any ETA on this fix? |
Maybe the agent can grab the instance ID from the |
With this change applied to the agent, the issue still persists. As mentioned in another comment, I created a custom launch template with an increased number of max hops, which solved the issue. I do understand, however, that this may be a security concern and should be avoided, but, as a temporary measure until the addon is fixed, it is acceptable for our use case. |
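For readers who want to see what the hop-limit workaround amounts to, here is a minimal sketch of the instance metadata options involved. The helper name `metadata_options` is hypothetical, and the default of 2 hops is an assumption (the comment above does not state which value was used). The key names match the EC2 `MetadataOptions` structure used by launch templates and by `modify_instance_metadata_options`.

```python
def metadata_options(hop_limit=2):
    """Build EC2 instance metadata options that keep IMDSv2 enforced
    while letting the PUT response travel `hop_limit` network hops.

    hop_limit=2 is an assumed value for illustration; pods that are not
    on the host network sit one hop further away than the node itself.
    """
    if not 1 <= hop_limit <= 64:
        raise ValueError("hop limit must be between 1 and 64")
    return {
        "HttpEndpoint": "enabled",
        "HttpTokens": "required",  # IMDSv2 only
        "HttpPutResponseHopLimit": hop_limit,
    }


if __name__ == "__main__":
    # With boto3 installed (an assumption, not shown here), the dict can be
    # passed as LaunchTemplateData={"MetadataOptions": metadata_options()}
    # or expanded into ec2.modify_instance_metadata_options(InstanceId=..., **opts).
    print(metadata_options())
```

Note that raising the hop limit makes IMDS, and therefore the node role credentials, reachable from every pod on the node, which is the security concern mentioned above.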
The workaround is valid. The "True" value needs to be written starting with an uppercase letter. |
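One likely reason the earlier snippet did not work as written: an unquoted `true` in YAML is parsed as a boolean, whereas the `value` field of a Kubernetes environment variable is a string. The sketch below uses JSON as a stand-in for the parsed manifest (so it needs no YAML library) and a hypothetical helper, `non_string_env_values`, to flag such entries. Whether the agent accepts spellings other than "True" is not established in this thread.

```python
import json


def non_string_env_values(env):
    """Return the names of env entries whose literal `value` is not a string."""
    return [
        entry.get("name", "<unnamed>")
        for entry in env
        if "value" in entry and not isinstance(entry["value"], str)
    ]


# JSON stand-ins for what a YAML parser yields for the two spellings.
unquoted = json.loads('[{"name": "RUN_WITH_IRSA", "value": true}]')
quoted = json.loads('[{"name": "RUN_WITH_IRSA", "value": "True"}]')

if __name__ == "__main__":
    print(non_string_env_values(unquoted))  # flags RUN_WITH_IRSA
    print(non_string_env_values(quoted))    # flags nothing
```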
I used this helm chart to deploy the add-on: Modifying
  apiVersion: cloudwatch.aws.amazon.com/v1alpha1
  kind: AmazonCloudWatchAgent
  metadata:
    name: {{ template "cloudwatch-agent.name" . }}
    namespace: {{ .Release.Namespace }}
  spec:
+   hostNetwork: true
    image: {{ template "cloudwatch-agent.image" . }}
    mode: daemonset
    ...
    env:
+   - name: RUN_WITH_IRSA
+     value: "True"
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    ...
Although this is not required, I configured Gatekeeper to restrict host network access exclusively to CloudWatch Agent pods for enhanced security.
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sallowedhostnetworking
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedHostNetworking
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedhostnetworking
        default allow = false
        allow {
          input.review.object.metadata.labels["app.kubernetes.io/name"] == "cloudwatch-agent"
        }
        violation[{"msg": msg}] {
          not allow
          input.review.object.spec.hostNetwork == true
          msg := "Host network is not allowed for this pod"
        }

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedHostNetworking
metadata:
  name: allowed-host-networking
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
|
Thank you @kwangjong. The official EKS addon on a 1.30 cluster does not work (it produces the local credentials file problem). Your solution based on https://github.com/aws-observability/helm-charts has worked for me. |
Describe the bug
When IMDSv2 is enabled on the worker nodes with hop limit 1, IMDSv2 is not accessible from the pods. In general, I don't want pods to access IMDS since they can get credentials for the node IAM role.
When IMDSv2 is not accessible, it seems that the CloudWatch agent (I'm using the amazon-cloudwatch-observability EKS addon) tries to use credentials from the non-existent file /root/.aws/credentials instead of using the credentials from IRSA. The pod uses a service account with the IRSA annotation and it has the environment variables AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE (injected by IRSA). But I believe the amazon-cloudwatch-agent is ignoring the IRSA credentials (I suspect it's because IMDSv2 is not available, and then it decides it is "onprem").
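To rule out a problem with the IRSA injection itself, the two variables can be checked from any pod that uses the same service account. This is a minimal sketch: the function name `irsa_status` is hypothetical, and it assumes a Python interpreter is available in the pod, which the agent image may not provide. It only confirms that the variables are set and that the token file is readable. It says nothing about how the agent chooses its credential source.

```python
import os
from pathlib import Path


def irsa_status(env=None):
    """Report whether the IRSA variables are set and the token file is readable."""
    env = os.environ if env is None else env
    role_arn = env.get("AWS_ROLE_ARN", "")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE", "")

    token_readable = False
    if token_file:
        try:
            # A projected service account token is a non-empty text file.
            token_readable = bool(Path(token_file).read_text().strip())
        except (OSError, UnicodeDecodeError):
            token_readable = False

    return {
        "role_arn_set": bool(role_arn),
        "token_file_set": bool(token_file),
        "token_readable": token_readable,
    }


if __name__ == "__main__":
    print(irsa_status())
```

If all three values are true while the agent still looks for /root/.aws/credentials, the injection is working and the problem lies in the agent's mode detection, as suspected above.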
I see on startup of the pod
Steps to reproduce
If possible, provide a recipe for reproducing the error.
EKS 1.29
EKS nodes 1.29 bottlerocket
with IMDSv2 (http tokens required, hop limit 1)
amazon-cloudwatch-observability eks addon v1.4.0-eksbuild.1 (default config)
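To confirm that a pod really cannot reach IMDSv2 in this setup, a probe along the following lines can be run inside a pod that is not on the host network. It is a sketch using only the standard library: the function names and the 2-second timeout are assumptions, while the endpoint paths and header names are the documented IMDSv2 ones. With hop limit 1, the token request from such a pod is expected to time out, so the function returns `None`.

```python
import urllib.error
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(base_url=IMDS_BASE, ttl_seconds=60):
    """Build the IMDSv2 session-token request (a PUT with a TTL header)."""
    return urllib.request.Request(
        f"{base_url}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def fetch_instance_id(base_url=IMDS_BASE, timeout=2.0):
    """Return the instance ID via IMDSv2, or None if IMDS cannot be reached."""
    try:
        with urllib.request.urlopen(build_token_request(base_url), timeout=timeout) as resp:
            token = resp.read().decode()
        req = urllib.request.Request(
            f"{base_url}/latest/meta-data/instance-id",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except (urllib.error.URLError, OSError):
        # Covers timeouts, refused connections, and HTTP error responses.
        return None


if __name__ == "__main__":
    print(fetch_instance_id())
```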
What did you expect to see?
I expect the EKS addon configuration to allow me to override "OnPrem" / "EC2"; I don't see that as a possibility in the amazon-cloudwatch-observability addon.
What did you see instead?
I see that it detects "OnPremise" and I believe that in turn forces it to use /root/.aws/credentials when in fact it should be using the credentials from IRSA via the existing environment variables $AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE.
What version did you use?
Version: (e.g., v1.247350.0, etc)
I'm using amazon-cloudwatch-observability EKS addon version v1.4.0-eksbuild.1; I don't know which version of the amazon-cloudwatch-agent is included with that.
What config did you use?
Config: (e.g. the agent json config file)
Environment
EKS 1.29
EKS nodes 1.29 bottlerocket
with IMDSv2 (http tokens required, hop limit 1)
amazon-cloudwatch-observability eks addon v1.4.0-eksbuild.1, (default config)
Additional context
Add any other context about the problem here.