AWXBackup: mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied #1830

Open
3 tasks done
PWeverton opened this issue Apr 17, 2024 · 10 comments · May be fixed by #1854

@PWeverton

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

I upgraded the AWX Operator from 1.1.4 to 2.13.1 and started getting errors when trying to take backups.
Here's an example of the AWXBackup I have:

apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awx-demo
  namespace: awx-test
spec:
  deployment_name: awx-demo
  backup_pvc: 'backup-pvc'
  no_log: false

Once applied, the operator tries to create a folder for the backup on the db-management pod. However, it fails with a "Permission denied" error:

[backup : Set backup directory name] **************************************
task path: /opt/ansible/roles/backup/tasks/postgres.yml:55
ok: [localhost] => {"ansible_facts": {"backup_dir": "/backups/tower-openshift-backup-2024-04-17-141257"}, "changed": false}

TASK [backup : Create directory for backup] ************************************
task path: /opt/ansible/roles/backup/tasks/postgres.yml:59
ansible.cfg.
fatal: [localhost]: FAILED! => {"changed": true, "rc": 1, "return_code": 1, "stderr": "mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied\n", "stderr_lines": ["mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied"]

AWX Operator version

2.13.1

AWX version

24.0.0

Kubernetes platform

kubernetes

Kubernetes/Platform version

microk8s v1.28.8

Modifications

no

Steps to reproduce

Fresh installation and trying to create a backup using AWXBackup CR.

Expected results

Take the backup successfully

Actual results

Failed backup

Additional information

No response

Operator Logs

No response

@jessicamack
Member

Hello @PWeverton, can you read through #1775 and see if it applies to your case? The new postgres image expects to write to your directory as UID 26. Some workarounds for this change are discussed there.

@PWeverton
Author

Hello @jessicamack, thanks for replying.
The operator is failing when it tries to create the directory on the db-management pod, so I don't think the issue you linked is related. However, I tried the steps suggested there anyway.
Here's what I did:

  • Used operator 2.15.0, as that feature was added in this tag.
  • Then adjusted my AWX CR:

apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-app
spec:
  no_log: false
  service_type: nodeport
  postgres_data_volume_init: true
  postgres_init_container_commands: |
    chown 26:0 /var/lib/pgsql/data
    chmod 700 /var/lib/pgsql/data

Even after this change, the permissions issue is still there.

@kurokobo
Contributor

This issue is not addressed by #1805 (postgres_data_volume_init and postgres_init_container_commands), because the situation here is different:

  • It occurs in the ephemeral *-db-management pod, not the main PSQL pod
  • It occurs on the backup PVC, not the main PSQL PVC
  • The current AWX Operator does not implement an init container for the *-db-management pod
  • Since 2.13.0, the image for the *-db-management pod has also been changed to the sclorg one

So we should either implement an init container for the *-db-management pod, with a flag to modify the owner/permissions of the backup PVC, or add a flag to run the *-db-management pod as UID 0.

@rooftopcellist
F.Y.I.
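
To confirm that the backup PVC itself is the problem, independent of the operator, a throwaway pod along these lines may help. This is only a rough sketch: the pod name is hypothetical, the namespace and claim name are taken from the AWXBackup in the original report, and busybox running as UID 26 only approximates how the sclorg image runs (the group membership may differ).

```yaml
# Hypothetical probe pod: checks whether UID 26 can create a directory on the backup PVC.
apiVersion: v1
kind: Pod
metadata:
  name: backup-pvc-write-test   # hypothetical name
  namespace: awx-test           # namespace from the original report
spec:
  restartPolicy: Never
  securityContext:
    runAsUser: 26               # UID the sclorg postgres image is reported to use
  containers:
    - name: probe
      image: busybox
      command:
        - sh
        - -c
        # Print the effective IDs and the numeric owner of /backups, then try a mkdir.
        - >-
          id;
          ls -ldn /backups;
          mkdir /backups/write-test
          && rmdir /backups/write-test
          && echo "UID 26 can write to /backups"
      volumeMounts:
        - name: backup
          mountPath: /backups
  volumes:
    - name: backup
      persistentVolumeClaim:
        claimName: backup-pvc   # claim name from the original report
```

If the pod's log shows the same "Permission denied" from mkdir, the ownership or mode of the volume needs to change before any backup can succeed.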

@PWeverton
Author

hi @kurokobo, any movement here?
Thanks

@ranvit
Contributor

ranvit commented May 10, 2024

I just made PR #1854. With it, I'm able to take successful backups as long as that init container runs once per PVC.
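
For anyone who cannot run a patched operator, a one-time ownership fix can probably be approximated with a standalone Job. The sketch below is not what the PR does internally; it assumes the claim name and namespace from the original report, uses a hypothetical Job name, and assumes the storage backend lets root change ownership (NFS exports with root squashing typically will not).

```yaml
# Hypothetical one-off Job: hands the backup PVC to UID/GID 26 before running AWXBackup.
apiVersion: batch/v1
kind: Job
metadata:
  name: fix-backup-pvc-perms    # hypothetical name
  namespace: awx-test           # namespace from the original report
spec:
  backoffLimit: 0
  ttlSecondsAfterFinished: 300  # clean up the finished Job after five minutes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: chown-backups
          image: busybox
          # Runs as root so it is allowed to change ownership of the volume root.
          securityContext:
            runAsUser: 0
          command: ["sh", "-c", "chown -R 26:26 /backups && chmod -R 770 /backups"]
          volumeMounts:
            - name: backup
              mountPath: /backups
      volumes:
        - name: backup
          persistentVolumeClaim:
            claimName: backup-pvc   # claim name from the original report
```

Since ownership persists on the volume, this should only need to run once per PVC, after which the AWXBackup CR can be applied again.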

@pombaer

pombaer commented May 13, 2024

Please add this change to the next release, since AWXBackup also cannot create the directory in my deployment because of permission issues. Changing the ownership on the NFS server to user ID 26 solved it, but that is a manual configuration step and only a workaround.

@pombaer

pombaer commented May 14, 2024

In case it helps someone: I worked around this problem by creating a CronJob that creates my backup, and added an init container that sets the ownership of the backup folder to 26:26.

@bar0n36

bar0n36 commented May 15, 2024

I hit this issue after upgrading to 2.15.0. Following @pombaer's first suggestion, I added another NFS mount and set the owner UID and GID to 26, then created a new PV/PVC and pointed the backup at that.
For anyone using AWS EFS, you need to create an access point with the correct UID and GID and mount through it for this to work properly.
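
Roughly, the NFS variant could look like the manifests below. Treat this as a sketch: the NFS server, export path, sizes, and resource names are placeholders, the namespace and deployment_name are borrowed from the original report rather than from my setup, and the export is assumed to already be owned by 26:26 on the NFS server.

```yaml
# Placeholder PV backed by an NFS export that is already owned by UID/GID 26 on the server.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-backup-nfs            # hypothetical name
spec:
  capacity:
    storage: 10Gi                 # placeholder size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: nfs.example.com       # placeholder server
    path: /exports/awx-backups    # placeholder export path
---
# PVC bound explicitly to the PV above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup-nfs            # hypothetical name
  namespace: awx-test             # namespace from the original report
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: awx-backup-nfs
  resources:
    requests:
      storage: 10Gi
---
# AWXBackup pointing at the new claim instead of the old one.
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awx-demo-nfs              # hypothetical name
  namespace: awx-test
spec:
  deployment_name: awx-demo       # deployment name from the original report
  backup_pvc: awx-backup-nfs
```

Setting storageClassName to an empty string on both objects keeps a default StorageClass from dynamically provisioning a different volume for the claim.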

@PWeverton PWeverton reopened this Aug 5, 2024
@morley461

morley461 commented Sep 6, 2024

Did this issue get resolved? I have the same issue running 2.19.1

Interestingly, if I change the Postgres image in my awxbackup.yml to

  _postgres_image: docker.io/postgres
  _postgres_image_version: 15-alpine

the "mkdir: cannot create directory '/backups': Permission denied" error goes away and I can take a successful backup. However, this just shifts the problem to the restore, which fails with:

"pg_restore: error: unsupported version (1.15) in file header"

So I went back to the default image (the override is commented out below) and changed the owner of /backups to 26, and it works. My hack was dirty though, so I'm wondering what the correct way to do this will be.

 # _postgres_image: docker.io/postgres
 # _postgres_image_version: 15-alpine

running k3s.
Client Version: v1.29.4+k3s1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4+k3s1

@PWeverton
Author

The PR is still open and it seems it will take a while to be merged.
As a workaround, you can clone the repo and change how the management pod is created. Either of these two options, applied to the following template, works:

roles/backup/templates/management-pod.yml.j2

1 - Add an init container

initContainers:
  - name: init-pvc-chown
    image: busybox
    # _postgres_image runs as uid 26
    command: ["sh", "-c", "chown -R :26 /backups && chmod -R 770 /backups"]
    volumeMounts:
      - name: {{ ansible_operator_meta.name }}-backup
        mountPath: /backups
        readOnly: false

2 - Run the container with privileged user

containers:
  - name: {{ ansible_operator_meta.name }}-db-management
    image: "{{ _postgres_image }}"
    imagePullPolicy: "{{ image_pull_policy }}"
    command: ["sleep", "infinity"]
    securityContext:
      runAsUser: 0
      privileged: true

Once you have one of these in place, build the operator image and set its image URL on your operator deployment.
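
If the operator was installed with kustomize, pointing the deployment at the rebuilt image might look like the sketch below. The registry name and tag are hypothetical, the ref is just the version mentioned earlier in this thread, and the namespace is a placeholder; adjust all of them to your install.

```yaml
# Sketch of a kustomization.yaml that swaps in a custom-built operator image.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Upstream operator manifests; use the tag your patched build is based on.
  - github.com/ansible/awx-operator/config/default?ref=2.19.1
images:
  - name: quay.io/ansible/awx-operator
    newName: registry.example.com/awx-operator   # hypothetical registry/repository
    newTag: 2.19.1-backup-fix                    # hypothetical tag for the patched build
namespace: awx                                   # placeholder; use your operator namespace
```

Applying it with kubectl apply -k should roll the operator deployment over to the patched image, after which a new AWXBackup will use the modified management pod template.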
