Hi, I recently created an issue for my problem (#22556), but I have had no response so far. I also suspect it may be more of a problem on our side than a bug, so this might be a better place to ask.
We recently upgraded from Kasparov to Najdorf-1.3, and one type of our service retirements started to behave strangely.
We have automatic provisioning for K8S clusters. The service template is based on an Ansible Tower job, which creates VMs on OpenStack via Terraform and installs the K8S clusters. Finally, once the MIQ OpenStack provider discovers the new VMs, we link them to the MIQ service.
For retirement we have modified /Cloud/VM/Retirement/StateMachines/VMRetirement/Default by adding an assertion step at the top. This step checks the VM's custom keys for a Terraform ID and skips the rest of the VM retirement state machine if such a key exists on the VM. The service retirement task then calls another Ansible Tower job, which performs the Terraform destroy. This worked fine until we migrated to Najdorf a month ago.
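For context, the check behind that assertion step is roughly the following sketch (simplified; the custom key name `terraform_id` is illustrative, and the `$evm` objects are stubbed here so the logic can be run outside an appliance, where the real `$evm` is provided by Automate):

```ruby
# Minimal stand-ins for the Automate service objects, for illustration only.
# On a real appliance, $evm.root['vm'] is a MiqAeServiceVm exposing custom_get.
class FakeVm
  def initialize(custom_keys)
    @custom_keys = custom_keys
  end

  # Mirrors the service model's custom_get: returns the value of a
  # custom key, or nil when the key is not set on the VM.
  def custom_get(key)
    @custom_keys[key]
  end
end

class FakeEvm
  attr_reader :root

  def initialize(vm)
    @root = { 'vm' => vm }
  end
end

# True when the VM carries a Terraform ID, i.e. the rest of the VM
# retirement state machine should be skipped because teardown is handled
# by the service-level Ansible Tower job running terraform destroy.
def terraform_managed?(evm)
  vm = evm.root['vm']
  !vm.nil? && !vm.custom_get('terraform_id').nil?
end
```

The assertion evaluates this check per VM, so only Terraform-provisioned VMs bypass the stock retirement states; anything else retires normally.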
Now, when a user starts a retirement on a K8S service, VM retirement tasks are spawned and processed, but nothing happens. The retirement request stays in the Active state forever, and the log ends with:
[----] I, [2023-06-07T11:30:12.769456 #2201:93a8] INFO -- automation: Q-task_id([r50662_vm_retire_task_80773]) <AEMethod [/KBCZ-Openstack/Cloud/VM/Retirement/StateMachines/VMRetirement/update_retirement_status]> Ending
[----] I, [2023-06-07T11:30:12.769678 #2201:93a8] INFO -- automation: Q-task_id([r50662_vm_retire_task_80773]) Method exited with rc=MIQ_OK
[----] I, [2023-06-07T11:30:12.770929 #2201:93a8] INFO -- automation: Q-task_id([r50662_vm_retire_task_80773]) Next State=[]
[----] I, [2023-06-07T11:30:12.771374 #2201:93a8] INFO -- automation: Q-task_id([r50662_vm_retire_task_80773]) Followed Relationship [miqaedb:/Cloud/VM/Retirement/StateMachines/VMRetirement/SkipBecauseTerraform#create]
[----] I, [2023-06-07T11:30:12.772595 #2201:93a8] INFO -- automation: Q-task_id([r50662_vm_retire_task_80773]) Followed Relationship [miqaedb:/cloud/VM/Lifecycle/Retirement#create]
Repeated attempts lead to the same result. So I replaced the Cloud VM retirement state machine with my own version, which performs only the FinishRetirement step. Now, when a user tries to retire such a service for the first time, all its VMs are changed to the Retired state in MIQ (although the request still stays in the Active state forever). Then, on a subsequent retirement attempt, the service retirement task is finally spawned in the log, the VMs are physically destroyed via the Ansible Tower Terraform job, and the MIQ service is retired. This second retirement request finishes as a success.
Any help or ideas would be appreciated. Thank you.