Running Ansible task throws shared connection closed error

by jupiter   Last Updated April 16, 2018 07:00 AM

I have the shell script below, "test_files.sh", on all of my remote servers, and I execute it through Ansible from a local box.

#!/bin/bash

new_primary=(476 1309 819 1162 280 133 623 672 1015 1358)
new_secondary=(1426 610 274 130 1570 466 946)

export primary=/data01/primary
export secondary=/data02/secondary

export fol1=/data/snap/$1
export fol2=/data/snap/$2

do_copy() {
  el=$1
  primsec=$2
  scp -C -o StrictHostKeyChecking=no [email protected]:"$fol1"/proc_"$el"_tkk.data "$primsec"/. || scp -C -o StrictHostKeyChecking=no [email protected]:"$fol2"/proc_"$el"_tkk.data "$primsec"/.
}
export -f do_copy

parallel -j 3 do_copy {} $primary ::: ${new_primary[@]} &
parallel -j 3 do_copy {} $secondary ::: ${new_secondary[@]} &
wait
echo "all files done"

Here is the Ansible playbook that I run from the local box to execute the script above:

---
- name: copy files
  hosts: one_box
  serial: 1
  tasks:
      - name: execute the script 
        shell: /home/goldy/test_files.sh 20180415 20180409

      - name: sleep for few seconds
        pause: seconds=20

When I run the playbook, it starts executing the shell script and I can see files being copied to the remote server, but after some time I get the error below and I can't figure out what is wrong:

TASK [execute the script] ******************************************************
fatal: [machine.abc.host.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Shared connection to machine.abc.host.com closed.\r\n", "unreachable": true}

Am I doing anything wrong? I am running Ansible 2.4.3.0. Below is my inventory file:

[one_box]
machine.abc.host.com

And this is how I am running it:

[email protected]_box:/var/lib/jenkins/workspace/copy$ ansible-playbook -e 'host_key_checking=False' test_files.yml -u goldy

Below is the output after running with the verbose option:

TASK [execute the script] *******************************************************************************************************************************************************************
task path: /var/lib/jenkins/workspace/data-copy/test_files.yml:6
Using module file /usr/lib/python2.7/dist-packages/ansible/modules/commands/command.py
<machine.abc.host.com> ESTABLISH SSH CONNECTION FOR USER: goldy
<machine.abc.host.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=goldy -o ConnectTimeout=10 -o ControlPath=/export/home/jenkins/.ansible/cp/bcb67829a1 machine.abc.host.com '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<machine.abc.host.com> (0, '/export/home/goldy\n', '')
<machine.abc.host.com> ESTABLISH SSH CONNECTION FOR USER: goldy
<machine.abc.host.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=goldy -o ConnectTimeout=10 -o ControlPath=/export/home/jenkins/.ansible/cp/bcb67829a1 machine.abc.host.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364 `" && echo ansible-tmp-1523851306.81-45754594854364="` echo /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364 `" ) && sleep 0'"'"''
<machine.abc.host.com> (0, 'ansible-tmp-1523851306.81-45754594854364=/export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364\n', '')
<machine.abc.host.com> PUT /tmp/tmpZ0MgKd TO /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/command.py
<machine.abc.host.com> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=goldy -o ConnectTimeout=10 -o ControlPath=/export/home/jenkins/.ansible/cp/bcb67829a1 '[machine.abc.host.com]'
<machine.abc.host.com> (0, 'sftp> put /tmp/tmpZ0MgKd /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/command.py\n', '')
<machine.abc.host.com> ESTABLISH SSH CONNECTION FOR USER: goldy
<machine.abc.host.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=goldy -o ConnectTimeout=10 -o ControlPath=/export/home/jenkins/.ansible/cp/bcb67829a1 machine.abc.host.com '/bin/sh -c '"'"'chmod u+x /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/ /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/command.py && sleep 0'"'"''
<machine.abc.host.com> (0, '', '')
<machine.abc.host.com> ESTABLISH SSH CONNECTION FOR USER: goldy
<machine.abc.host.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=goldy -o ConnectTimeout=10 -o ControlPath=/export/home/jenkins/.ansible/cp/bcb67829a1 -tt machine.abc.host.com '/bin/sh -c '"'"'/usr/bin/python /export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/command.py; rm -rf "/export/home/goldy/.ansible/tmp/ansible-tmp-1523851306.81-45754594854364/" > /dev/null 2>&1 && sleep 0'"'"''


<machine.abc.host.com> (255, '', 'Shared connection to machine.abc.host.com closed.\r\n')
fatal: [machine.abc.host.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Shared connection to machine.abc.host.com closed.\r\n",
    "unreachable": true
}
        to retry, use: --limit @/var/lib/jenkins/workspace/data-copy/test_files.retry

PLAY RECAP **********************************************************************************************************************************************************************************
machine.abc.host.com : ok=1    changed=0    unreachable=1    failed=0

My guess is this: Ansible runs my shell script on the remote server, the script copies three files at a time, and each file is 11 GB, so the copy takes a while. Perhaps Ansible hits some time limit while the files are still being copied and then fails? I see ControlPersist=60s in the logs above. Is that related? The files definitely won't finish copying within 60 seconds.

Or is it related to connect_timeout = 30 in the ansible.cfg file?
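
One way to test the time-limit guess would be to stop holding a single SSH session open for the whole copy. The sketch below is only an assumption about what might help, not a confirmed fix: it starts the same script with Ansible's async/poll task keywords and then checks on it with the async_status module. The 14400-second limit, the 30-second delay, and the registered variable names copy_job and job_result are illustrative values I made up, not anything taken from my setup.

```yaml
---
- name: copy files
  hosts: one_box
  serial: 1
  tasks:
    - name: start the script in the background
      shell: /home/goldy/test_files.sh 20180415 20180409
      async: 14400        # assumed upper bound on runtime, in seconds
      poll: 0             # return immediately instead of waiting on the SSH session
      register: copy_job  # hypothetical variable name

    - name: wait for the script to finish
      async_status:
        jid: "{{ copy_job.ansible_job_id }}"
      register: job_result  # hypothetical variable name
      until: job_result.finished
      retries: 480          # 480 retries x 30 s delay = 14400 s, matching the async limit
      delay: 30
```

If the play succeeds this way, that would suggest the long-lived SSH session is being dropped while the script runs, rather than the script itself failing.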


