SSH: Could not resolve hostname base.dev.geekmush.org: No address associated with hostname, BUT “ping” works!

January 28, 2014

Lately, I’ve been playing with Docker on my iMac (with Vagrant/Virtualbox) and more recently on a Digital Ocean droplet.

When playing with Docker, you get lots of containers (like a very thinly provisioned VM) being created with random IP addresses.  Using a tool like skydns with Michael Crosby’s skydock can be very helpful!  It watches the Docker events stream and submits new hosts to the skydns server on the fly — AWESOME!

So, on my iMac (Vagrant/VirtualBox), this works as expected.  “docker run -name base -t -d geekmush/base” fires up a new container with <image> loaded in it, and skydock sees the event and tells skydns about the new host “base” (or “base.dev.geekmush.org” in my local dev space that skydns reports for).

Great!  Got a new container running, skydns knows about it, so what’s the problem?

root@geekmush01:~/scripts# ping -c 1 base
PING base.dev.geekmush.org (172.17.0.79) 56(84) bytes of data.
64 bytes from 172.17.0.79: icmp_req=1 ttl=64 time=0.084 ms
--- base.dev.geekmush.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.084/0.084/0.084/0.000 ms

Yay!  Again, what’s the problem???

root@geekmush01:~/scripts# ssh base.dev.geekmush.org
ssh: Could not resolve hostname base.dev.geekmush.org: No address associated with hostname

Normally, this error would make you go running and check your /etc/resolv.conf and /etc/nsswitch.conf files (and it *should*).

HOWEVER, I ran into a new one (for me).  I checked all the usual suspects and was really banging my head on the wall.  “ping base” worked, “ping base.dev.geekmush.org” worked, but the “ssh” version did NOT!  I even strace’d the ssh command and it *looked* good!

(strace abridged) 
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "# /etc/nsswitch.conf\n#\n# Example"..., 4096) = 545
read(3, "", 4096) = 0
    :  (OK, it's reading the correct /etc/nsswitch.conf)
open("/root/.ssh/config", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ssh/ssh_config", O_RDONLY) = 3
read(3, "\n# This is the ssh client system"..., 4096) = 1709
    :  (No funky ssh_config files in the way)
open("/etc/services", O_RDONLY|O_CLOEXEC) = 3
read(3, "# Network services, Internet sty"..., 4096) = 4096
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "# The \"order\" line is only used "..., 4096) = 92
read(3, "", 4096) = 0
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "search dev.geekmush.org\nnameserv"..., 4096) = 137
    : (it's reading the correct /etc/resolv.conf)
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
read(3, "127.0.0.1\tlocalhost geekmush01\n\n"..., 4096) = 215
    : (/etc/hosts file, check!)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libnss_dns.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3>\1\21"..., 832) = 832
    : (well, damn!  it looked it up via DNS ... what the heck???)
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "search dev.geekmush.org\nnameserv"..., 4096) = 137
read(3, "", 4096) = 0
    : (what??  why is it looking it up again, it just *found* it???)

So, now I’m really scratching my head!  Let’s look at the DNS server log …

2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org." from "172.17.42.1:52768"
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org." from "172.17.42.1:52768"
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org.dev.geekmush.org." from "172.17.42.1:34536"
2014/01/29 03:16:19 Error: Service does not exist in registry
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org.dev.geekmush.org." from "172.17.42.1:34536"

Looks good and confirms that the DNS server is receiving our requests (the only error logged is for the doubled-up name).  Odd that we see the FQDN queried again with the search domain from /etc/resolv.conf tacked on the end, but maybe it’s just the ssh client getting desperate (kind of!  this is a hint to us!).
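
That doubled-up name is the resolver’s search list at work.  Here is a rough Python sketch of how a glibc-style stub resolver orders the names it tries, based on the “search” line in /etc/resolv.conf and the “ndots” option (default 1).  The function name candidate_names is made up for this illustration, and the logic is simplified (no retries, no timeouts), so treat it as a mental model rather than the real resolver code.

```python
def candidate_names(name, search, ndots=1):
    """Return the names a stub resolver would try, in order (simplified)."""
    # A trailing dot marks the name as absolute: no search list is applied.
    if name.endswith("."):
        return [name]

    as_is = name + "."
    with_search = [f"{name}.{domain}." for domain in search]

    # Names with at least `ndots` dots are tried as-is first;
    # shorter names go through the search list first.
    if name.count(".") >= ndots:
        return [as_is] + with_search
    return with_search + [as_is]


if __name__ == "__main__":
    for n in ("base", "base.dev.geekmush.org"):
        print(n, "->", candidate_names(n, ["dev.geekmush.org"]))
```

The second candidate only gets queried when the first one did not give the resolver everything it wanted, which is why its appearance in the log is a hint.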

OK, time to get medieval with this son of a gun … tcpdump, report for duty!

tcpdump -i lo port 53  (my skydns is listening on 127.0.0.1, among other addresses)
03:16:19.329048 IP localhost.44883 > localhost.domain: 54926+ A? base.dev.geekmush.org. (39)
03:16:19.330459 IP localhost.domain > localhost.44883: 54926- 1/0/0 A 172.17.0.79 (76)
03:16:19.331190 IP localhost.44883 > localhost.domain: 34613+ AAAA? base.dev.geekmush.org. (39)
03:16:19.331933 IP localhost.domain > localhost.44883: 34613- 0/0/0 (39)
03:16:19.332512 IP localhost.54032 > localhost.domain: 21940+ A? base.dev.geekmush.org.dev.geekmush.org. (56)
03:16:19.333479 IP localhost.domain > localhost.54032: 21940 ServFail- 0/0/0 (56)
03:16:19.333763 IP localhost.54032 > localhost.domain: 6717+ AAAA? base.dev.geekmush.org.dev.geekmush.org. (56)
03:16:19.334214 IP localhost.domain > localhost.54032: 6717- 0/0/0 (56)

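Yep, there’s the lookup and the A record is returned properly and then … waaaaaait a second!  WTF is it doing subsequently looking for an AAAA record (IPv6)?  Dammit, Jim, we aren’t serving IPv6 AAAA records from our skydns, so that answer comes back empty, the resolver moves on to the search-domain version of the name, and that one gets a ServFail.  DOH!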
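
If you want to poke at this outside of ssh, here is a small Python sketch that asks the system resolver the same question two ways: once for any address family (my assumption is that this is roughly what ssh does with “AddressFamily any”) and once for IPv4 only.  The helper names resolve and compare are mine, not part of any tool mentioned here.

```python
import socket


def resolve(host, family=socket.AF_UNSPEC, port=22):
    """Return the sorted unique addresses getaddrinfo() yields for host."""
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def compare(host):
    """Resolve host with any family and with IPv4 only, reporting failures."""
    results = {}
    for label, family in (("any", socket.AF_UNSPEC), ("inet", socket.AF_INET)):
        try:
            results[label] = resolve(host, family)
        except socket.gaierror as exc:
            results[label] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    import sys
    print(compare(sys.argv[1] if len(sys.argv) > 1 else "localhost"))
```

Run it with the container name as the argument while tcpdump is watching port 53, and compare which queries go out for each family.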

OK, now we at least know what the heck ssh’s problem is, let’s smack it around and tell it that we’re not playing with IPv6 right now.

root@geekmush01:~/scripts# vi /etc/ssh/ssh_config 
  Find the "AddressFamily" line, change its value from "any" (or "inet6") to "inet" (uncomment the line if it starts with "#"), and save the file.
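
If you would rather not touch the system-wide file, the same setting can be passed per invocation.  A tiny Python sketch that builds such a command line; ssh_ipv4_argv is a hypothetical helper name, while “-o AddressFamily=inet” (or the shorter “-4”) are regular ssh client options.

```python
def ssh_ipv4_argv(host, *extra):
    """Build an ssh command line that resolves and connects over IPv4 only."""
    # -o AddressFamily=inet is the per-invocation form of the ssh_config line;
    # "ssh -4" is the shorter equivalent.
    return ["ssh", "-o", "AddressFamily=inet", host, *extra]


if __name__ == "__main__":
    # Only prints the command; hand the list to subprocess.run() to execute it.
    print(" ".join(ssh_ipv4_argv("base", "hostname")))
```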

root@geekmush01:~/scripts# ssh base
Last login: Wed Jan 29 03:04:44 2014 from 172.17.42.1
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.8.0-19-generic x86_64)
root@4a5e455692e6:~#
(tcpdump output)
03:43:41.862269 IP localhost.44556 > localhost.domain: 48276+ A? base.dev.geekmush.org. (39)
03:43:41.863197 IP localhost.domain > localhost.44556: 48276- 1/0/0 A 172.17.0.79 (76)

Now, that looks better all the way around!

Just as a side note, I also tried to telnet to port 22 and got the same hostname resolution result!  Turns out that with IPv6 configured on the system, I have to either set up a ~/.telnetrc file (I don’t see an /etc/telnetrc on my Droplet) or just remember to pass the “-4” parameter to the telnet binary.
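
For a quick “is port 22 answering over IPv4?” check without telnet, something like the following Python sketch does roughly the same job as “telnet -4 host 22”.  The function name read_banner_ipv4 is made up for this example, and it only reads whatever arrives in the first packet, so it is a sanity check and nothing more.

```python
import socket


def read_banner_ipv4(host, port=22, timeout=5.0):
    """Connect over IPv4 only and return the first line the server sends."""
    # Restricting getaddrinfo() to AF_INET keeps AAAA lookups out of the picture.
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    family, socktype, proto, _canon, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
        # An SSH server announces itself first with a line starting "SSH-".
        data = sock.recv(256)
    return data.decode("ascii", errors="replace").splitlines()[0] if data else ""


if __name__ == "__main__":
    import sys
    print(read_banner_ipv4(sys.argv[1] if len(sys.argv) > 1 else "localhost"))
```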