Ansible: Two Errors, One Cause

Learning something new can be frustrating, especially when you run into errors you can’t solve because you haven’t yet built up the necessary background knowledge. You start googling and find people who hit the exact same error, but the discussion either gets very technical (and doesn’t really help you) or ends without any solution. I racked my brain over Ansible (2.1.2.0) and two error messages I got when trying to execute a simple playbook:

The playbook was meant to update three virtual machines hosted on my Mac, each set up with Ubuntu 14.04. All VMs have individual IP addresses and are reachable from within my network. But for some reason it didn’t work. Run from the Mac, it returned messages like this:

fatal: [192.168.0.20]: FAILED! => {"changed": false, "cmd": "apt-get update", "failed": true, "msg": "[Errno 2] No such file or directory", "rc": 2}

And running on a Linux (Ubuntu) client I got:

fatal: [192.168.0.20]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to lock apt for exclusive operation"}

Some of my target hosts seemed to be working while others didn’t:

PLAY RECAP *********************************************************************
192.168.0.19 : ok=1 changed=0 unreachable=0 failed=1
192.168.0.20 : ok=1 changed=0 unreachable=0 failed=1
192.168.0.21 : ok=2 changed=0 unreachable=0 failed=0

Which hosts succeed might change when you run the playbook multiple times. Faced with this, you find yourself googling and browsing a lot of websites that eventually mislead you (unless you are using a rather old version of Ansible). Even more disturbing is that some commands like

ansible all -m ping

might work while pinging via ansible-playbook might not! The ‘no such file or directory’ error typically appeared on my Mac, while ‘failed to lock apt for exclusive operation’ was the specialty of my Ubuntu host.

What’s the problem here?

Have a look at your inventory: this is a file called ‘hosts’, typically found in /usr/local/etc/ansible/ (if you installed Ansible with brew on a Mac) or at /etc/ansible/hosts on a Linux box. Let’s say it looks like this:

[atlanta]
192.168.0.19

[austin]
192.168.0.20

[moscow]
192.168.0.21

This is a simple list of hosts that are the target machines for our Ansible ad-hoc commands or playbooks. There are a lot of variations you can add to these entries, and one of them was responsible for the error messages I was experiencing. My hosts file looked like this:

[atlanta]
192.168.0.19 ansible_connection=local

[austin]
192.168.0.20 ansible_connection=local

[moscow]
192.168.0.21 ansible_connection=local

And this proves how important it is to read the whole manual: with ansible_connection=local, all Ansible commands are executed on the local Ansible host, that is, the machine where Ansible itself runs. The commands never reach the target servers. Now it makes perfect sense that my Mac complained it couldn’t find any file or directory for apt-get (there simply isn’t one), and the “failed to lock apt” error on Linux looks like a resource conflict between Ansible and the apt command it tries to invoke: presumably, with every inventory entry pointing at the same local machine, the tasks end up competing for a single apt lock.
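A quick way to spot this setting in an inventory is a plain grep. The following is only a sketch: the function name find_local_connections is made up for illustration, and the demo runs against a throwaway copy of the inventory shown above rather than your real hosts file.

```shell
#!/bin/sh
# Hypothetical helper: print every non-comment inventory line (with its
# line number) that sets ansible_connection=local.
find_local_connections() {
    # [^#;]* refuses to match across a comment character, so entries
    # that are commented out are not reported.
    grep -n '^[^#;]*ansible_connection=local' "$1"
}

# Demo on a temporary copy of the misconfigured inventory.
inventory=$(mktemp)
cat > "$inventory" <<'EOF'
[atlanta]
192.168.0.19 ansible_connection=local

[austin]
192.168.0.20 ansible_connection=local

[moscow]
192.168.0.21 ansible_connection=local
EOF

# Reports lines 2, 5 and 8, i.e. all three host entries.
find_local_connections "$inventory"
rm -f "$inventory"
```

To check a real inventory, call the function with its path, for example /etc/ansible/hosts on a Linux box.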

The solution was 1) to remove the ansible_connection=local settings from the hosts file and 2) to check the passwordless SSH connections to my target servers. As it turned out, they weren’t passwordless at all; since I don’t want to type a password for each login, I corrected this. The procedure is described elsewhere often enough, so I’m keeping it short:

$ ssh-keygen -t rsa # only if you don't already have an id_rsa.pub in your ~/.ssh directory
$ cat ~/.ssh/id_rsa.pub | ssh username@hostname 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' # mkdir -p covers targets that have no ~/.ssh yet

Then run chmod 600 ~/.ssh/authorized_keys on the target server and log in with ssh username@hostname. On the first connection ssh might ask whether you really want to connect; answering ‘yes’ adds the target host to your list of known hosts.
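Both fixes can be sketched in a few lines of shell. This is an illustration under assumptions, not a polished tool: the function names strip_local_connection and check_passwordless are invented here, while BatchMode and ConnectTimeout are standard OpenSSH client options. The check only makes sense after the first interactive login, because an unknown host key also makes ssh fail in batch mode.

```shell
#!/bin/sh
# Hypothetical helpers for the two fixes described above.

# Step 1: print the inventory with every ansible_connection=local
# setting removed. The original file is left untouched.
strip_local_connection() {
    sed 's/[[:space:]]*ansible_connection=local//g' "$1"
}

# Step 2: succeed only if login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Usage: sh check.sh inventory-file [user@host ...]
# Without arguments the script only defines the functions.
if [ "$#" -ge 1 ]; then
    strip_local_connection "$1"
    shift
    for target in "$@"; do
        if check_passwordless "$target"; then
            echo "$target: passwordless login works"
        else
            echo "$target: needs a password or is unreachable"
        fi
    done
fi
```

Redirect the output of strip_local_connection to a new file and compare it with the original before replacing your inventory.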

After these changes Ansible was finally working on both the Mac and the Ubuntu client. The error messages mentioned above might have several causes; this was one of them. I can’t remember what made me configure it the wrong way, but I guess that’s part of the learning curve.

About Manfred Berndtgen

Manfred Berndtgen, maintainer of this site, is a part-time researcher with enough spare time for doing useless things and sharing them with the rest of the world. His main photographic subjects are made of plants or stones, and since he's learning Haskell everything seems functional to him.