awx-ee ansible-runner settings override #80

Open
DrackThor opened this issue Jul 20, 2021 · 2 comments

Comments

@DrackThor

Hi,

I am currently running AWX 19.2.2 with an execution environment built atop awx-ee:0.5.5.
There I'm connecting to a Windows machine and running a PowerShell script.
After exactly 30 minutes the job aborts without any error message.
I assume this has something to do with the idle_timeout setting of ansible-runner.
So far I could not find any other issue regarding timeouts, which is what led me to ansible-runner in the first place.
Is it possible to override env/settings (to increase idle_timeout) when using an awx-ee-based execution environment?

Thanks in advance!
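
For readers unfamiliar with the file the question refers to: ansible-runner reads runtime settings such as idle_timeout from an env/settings file inside its private data directory. The following is a minimal sketch of writing that file, not a confirmed fix for AWX. The helper name `write_runner_settings` and the 7200-second value are hypothetical, and it assumes you control the private data directory, which may not be the case when AWX creates it for each job.

```python
import json
from pathlib import Path


def write_runner_settings(private_data_dir, idle_timeout):
    """Write an env/settings file under an ansible-runner private data dir.

    Hypothetical helper for illustration only. The file is written as JSON,
    which is also valid YAML.
    """
    env_dir = Path(private_data_dir) / "env"
    env_dir.mkdir(parents=True, exist_ok=True)
    settings_path = env_dir / "settings"
    # idle_timeout: seconds without any output before the run is terminated
    settings = {"idle_timeout": idle_timeout}
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings_path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as demo_dir:
        path = write_runner_settings(demo_dir, 7200)  # 7200 is an example value
        print(path.read_text())
```

Whether an AWX-managed job honours a file written this way is not established in this thread.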

@shanemcd
Member

We would need to pass through the idle_timeout kwarg from AWX to Runner. Arguably this issue should live in the AWX repo.
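
As a rough illustration of what such a pass-through could look like on the caller's side (this is not AWX's actual code): `ansible_runner.run()` accepts a `settings` dictionary whose keys mirror env/settings, so an idle timeout could be supplied there. The helper names `build_runner_kwargs` and `run_with_idle_timeout`, and the idea that AWX would expose the value this way, are assumptions.

```python
def build_runner_kwargs(private_data_dir, playbook, idle_timeout):
    """Assemble keyword arguments for ansible_runner.run().

    Hypothetical helper: only the 'settings' pass-through is the point here.
    """
    return {
        "private_data_dir": private_data_dir,
        "playbook": playbook,
        # Keys in 'settings' mirror those of the env/settings file.
        "settings": {"idle_timeout": idle_timeout},
    }


def run_with_idle_timeout(private_data_dir, playbook, idle_timeout):
    """Run a playbook through ansible-runner with a custom idle timeout."""
    # Imported lazily so build_runner_kwargs() works without ansible-runner.
    import ansible_runner

    kwargs = build_runner_kwargs(private_data_dir, playbook, idle_timeout)
    return ansible_runner.run(**kwargs)
```

A caller would invoke something like `run_with_idle_timeout("/path/to/private_data_dir", "site.yml", 7200)`, where the path, playbook name, and value are placeholders.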

@kladiv

kladiv commented Jul 31, 2021

Hi @shanemcd,
I think I'm seeing similar behaviour on AWX 19.2.1 (EE 0.4.0).

I deployed a playbook that runs a task like the one below:

  raw: "ps -ef | grep -w /opt/xensource/sm/LVMoISCSISR | grep -v grep | grep -wq vdi_delete"
  register: quiesce_ps
  failed_when: false
  until: quiesce_ps.rc == 1
  retries: "{{ quiesce_wait_max_retries }}"
  delay: "{{ quiesce_wait_retries_delay }}"
  become: no
  delegate_to: "{{ xen_pool_master_inventory_hostname }}"

or (changed for testing, to check whether I got the same error):

  shell: >-
    RC=0;
    while [ $RC -eq 0 ]; do
      sleep 60;
      ps -ef | grep -w /opt/xensource/sm/LVMoISCSISR | grep -v grep | grep -wq vdi_delete;
      RC=$?;
    done
  register: quiesce_status
  async: 10800 # 3 hrs
  poll: 60 # 1 min
  become: no
  delegate_to: "{{ xen_pool_master_inventory_hostname }}"

Both the until/retries/delay task and the async/poll task fail the job without any error after about 4 hours. Every time I run it, it fails after 4 hours.

image

image

Another playbook task (it exports a large XenServer VM via the command module) fails the job after about 14 hours without any error:

image

I tried changing the AWX Job Timeout from zero (unlimited) to a high value, but I got the same behaviour and job failure.
Could it be related to the EE and Ansible Runner?

Thank you
