SSH: Could not resolve hostname base.geekmush.org: No address associated with hostname, BUT “ping” works!

January 28, 2014

Lately, I’ve been playing with Docker on my iMac (with Vagrant/VirtualBox) and, more recently, on a Digital Ocean droplet.

When playing with Docker, you get lots of containers (think very thinly provisioned VMs) being created with random IP addresses.  Using a tool like skydns with Michael Crosby’s skydock can be very helpful!  Skydock watches the Docker events stream and submits new hosts to the skydns server on the fly — AWESOME!

So, on my iMac (Vagrant/VirtualBox), this works as expected.  “docker run -name base -t -d geekmush/base” fires up a new container from the geekmush/base image, and skydock sees the event and tells skydns about the new host “base” (or “base.dev.geekmush.org”, the local dev domain that skydns serves).

Great!  Got a new container running, skydns knows about it, so what’s the problem?

root@geekmush01:~/scripts# ping -c 1 base
PING base.dev.geekmush.org (172.17.0.79) 56(84) bytes of data.
64 bytes from 172.17.0.79: icmp_req=1 ttl=64 time=0.084 ms
--- base.dev.geekmush.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.084/0.084/0.084/0.000 ms

Yay!  Again, what’s the problem???

root@geekmush01:~/scripts# ssh base.dev.geekmush.org
ssh: Could not resolve hostname base.dev.geekmush.org: No address associated with hostname

Normally, this error would make you go running and check your /etc/resolv.conf and /etc/nsswitch.conf files (and it *should*).

HOWEVER, I ran into a new one (for me).  I checked all the usual suspects and was really banging my head on the wall.  “ping base” worked, “ping base.dev.geekmush.org” worked, but the “ssh” version did NOT!  I even strace’d the ssh command and it *looked* good!
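For anyone playing along at home, the strace invocation was nothing exotic; something along these lines (the exact syscall filter is just illustrative, not necessarily what I ran):

strace -f -e trace=open,read,connect ssh base.dev.geekmush.org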

(strace abridged) 
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "# /etc/nsswitch.conf\n#\n# Example"..., 4096) = 545
read(3, "", 4096) = 0
    :  (OK, it's reading the correct /etc/nsswitch.conf)
open("/root/.ssh/config", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ssh/ssh_config", O_RDONLY) = 3
read(3, "\n# This is the ssh client system"..., 4096) = 1709
    :  (No funky ssh_config files in the way)
open("/etc/services", O_RDONLY|O_CLOEXEC) = 3
read(3, "# Network services, Internet sty"..., 4096) = 4096
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "# The \"order\" line is only used "..., 4096) = 92
read(3, "", 4096) = 0
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "search dev.geekmush.org\nnameserv"..., 4096) = 137
    : (it's reading the correct /etc/resolv.conf)
open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
read(3, "127.0.0.1\tlocalhost geekmush01\n\n"..., 4096) = 215
    : (/etc/hosts file, check!)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libnss_dns.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3>\1\21"..., 832) = 832
    : (well, damn!  it looked it up via DNS ... what the heck???)
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
read(3, "search dev.geekmush.org\nnameserv"..., 4096) = 137
read(3, "", 4096) = 0
    : (what??  why is it looking it up again, it just *found* it???)

So, now I’m really scratching my head!  Let’s look at the DNS server log …

2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org." from "172.17.42.1:52768"
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org." from "172.17.42.1:52768"
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org.dev.geekmush.org." from "172.17.42.1:34536"
2014/01/29 03:16:19 Error: Service does not exist in registry
2014/01/29 03:16:19 Received DNS Request for "base.dev.geekmush.org.dev.geekmush.org." from "172.17.42.1:34536"

Looks good and confirms that the DNS server is receiving our requests.  Odd that we also see the FQDN with the search domain suffixed onto it again, but maybe it’s just the ssh client getting desperate (kind of!  this is a hint to us!).
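(As an aside, you can also poke the DNS server directly and skip the system resolver entirely; dig is handy for that.  Something like this, aimed at the skydns listener on 127.0.0.1:)

dig @127.0.0.1 base.dev.geekmush.org A +short
dig @127.0.0.1 base.dev.geekmush.org AAAA +short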

OK, time to get medieval with this son of a gun … tcpdump, report for duty!

tcpdump -i lo port 53  (my skydns is listening on 127.0.0.1 among other ports)
03:16:19.329048 IP localhost.44883 > localhost.domain: 54926+ A? base.dev.geekmush.org. (39)
03:16:19.330459 IP localhost.domain > localhost.44883: 54926- 1/0/0 A 172.17.0.79 (76)
03:16:19.331190 IP localhost.44883 > localhost.domain: 34613+ AAAA? base.dev.geekmush.org. (39)
03:16:19.331933 IP localhost.domain > localhost.44883: 34613- 0/0/0 (39)
03:16:19.332512 IP localhost.54032 > localhost.domain: 21940+ A? base.dev.geekmush.org.dev.geekmush.org. (56)
03:16:19.333479 IP localhost.domain > localhost.54032: 21940 ServFail- 0/0/0 (56)
03:16:19.333763 IP localhost.54032 > localhost.domain: 6717+ AAAA? base.dev.geekmush.org.dev.geekmush.org. (56)
03:16:19.334214 IP localhost.domain > localhost.54032: 6717- 0/0/0 (56)

Yep, there’s the lookup and the A record is returned properly and then … waaaaaait a second!  WTF is it doing subsequently looking for an AAAA record (IPv6)?  Dammit, Jim, we aren’t serving IPv6 AAAA records from our skydns — DOH!

OK, now we at least know what the heck ssh’s problem is, let’s smack it around and tell it that we’re not playing with IPv6 right now.

root@geekmush01:~/scripts# vi /etc/ssh/ssh_config 
  Edit the "AddressFamily" line, changing it from "any" (or "inet6") to "inet", and save the file.
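For reference, the relevant bit of /etc/ssh/ssh_config ends up looking something like this (a minimal sketch; the rest of the file stays as it was):

Host *
    AddressFamily inet

If you’d rather not touch the system-wide config, passing “-4” to ssh (e.g. “ssh -4 base”) should force IPv4 for just that one connection.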

root@geekmush01:~/scripts# ssh base
Last login: Wed Jan 29 03:04:44 2014 from 172.17.42.1
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.8.0-19-generic x86_64)
root@4a5e455692e6:~#
(tcpdump output)
03:43:41.862269 IP localhost.44556 > localhost.domain: 48276+ A? base.dev.geekmush.org. (39)
03:43:41.863197 IP localhost.domain > localhost.44556: 48276- 1/0/0 A 172.17.0.79 (76)

Now, that looks better all the way around!

Just as a side note, I also tried to telnet to port 22 and got the same hostname resolution error!  Turns out that with IPv6 configured on the system, I have to either set up a ~/.telnetrc file (I don’t see an /etc/telnetrc on my Droplet) or just remember to pass the “-4” parameter to the telnet binary.
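In other words, something along these lines (nothing fancier than the documented “-4” flag):

root@geekmush01:~/scripts# telnet -4 base 22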


Adventures in Docker-land

January 22, 2014

So, after some time away from IT, I’m mucking around in it again to see what seems interesting.

Got a pointer to Docker from a friend and have been up to my ears in it ever since.  It’s in a similar vein to the stuff I got to play with when working with Nebula at NASA, in the sense that it’s a PaaS-enabling environment.  It uses thinly provisioned servers (containers) with AUFS (plus some support for a few other storage backends), so every operation during the image build process lands in a new, thin, copy-on-write (COW) layer.

This gives great advantages for caching and speeding up the build process.  Also, say you have two images that share the same common base build (e.g. an Apache server), but one application needs PHP and the other needs Ruby: build two different images from the same base, and instead of spending 2x the disk space and build time, you spend 1x plus the PHP difference in one image and the Ruby difference in the other — and the build time is severely reduced!
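As a rough sketch of what that looks like in practice (the image and package names below are made up for illustration, not my actual builds), the two Dockerfiles share the same FROM line and differ only in their last layer:

# Dockerfile for the PHP flavor (hypothetical base image and packages)
FROM geekmush/apache
RUN apt-get update && apt-get install -y php5 libapache2-mod-php5

# Dockerfile for the Ruby flavor (hypothetical base image and packages)
FROM geekmush/apache
RUN apt-get update && apt-get install -y ruby1.9.1

Everything up through the shared base is built and stored once; only the PHP and Ruby layers are unique to each image.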

Need 5x Apache+PHP servers?  Just run:

docker run -d -t geekmush/apache-php

five times, or as many times as you want (or loop it; see the one-liner below).  There are lots of other options, and Docker is being developed actively … and they have some great staff and supporters in #docker on irc.freenode.net!
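A quick shell version of that, if you want all five in one go (same image, just looped):

for i in 1 2 3 4 5; do docker run -d -t geekmush/apache-php; done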

It’s not worth trying to describe Docker here when they can do a much better job … go check out their website and the Googles!

I’m going to use my space for notes and tips and tricks that I’ve run into during my adventures this week.

Stay tuned!


Expanding a Linux LVM PV and underlying RAID5

January 8, 2009

Background

Normally, we run our servers on RAID1, adding a pair of disks at a time.  Since we also use DRBD on top of our LVM LVs, we have 3 servers in play (3-way DRBD replication) — this means adding new disks in groups of 6, which is sort of spendy.

So, with our new servers, we are looking into switching from the RAID1s (grouped into a single VG) to a single RAID5 under the LVM PV (in a single VG).  Linux RAID5s can be expanded on the fly, which lets us grow the data disk by adding only 1 disk at a time (3 across the 3 servers).  The bulk of the application servers are not really I/O intensive, so we’re not too worried about the RAID5 performance hit.

Here’s the setup on our test of this process:

  • Ubuntu 8.04 LTS Server
  • AMD 64 x2 5600+ with 8GB
  • 3 x 1TB SATA2 drives, 2 x 750GB SATA2 drives
  • Xen’ified 2.6.18 kernel with Xen 3.3.1-rc4
  • Linux RAID, LVM2, etc, etc.

We configured the partitions on the data disks as 2GB partitions, just so the sync doesn’t take *forever*.

Disk /dev/sde: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00046fcd

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1        1217     9775521   fd  Linux raid autodetect
/dev/sde2            1218        1704     3911827+  82  Linux swap / Solaris
/dev/sde3            1705        1946     1959930   fd  Linux raid autodetect

The initial RAID 5 configuration:

md1 : active raid5 sda3[0] sdc3[2] sdb3[1]
      3919616 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

LVM configuration:

root@xen-80-31:~# pvdisplay /dev/md1
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               testvg
  PV Size               3.74 GB / not usable 3.75 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              956
  Free PE               956
  Allocated PE          0
  PV UUID               h6qBlQ-RCy3-YeE9-zQXw-j1oa-bg8K-2JULo1

root@xen-80-31:~# vgdisplay testvg
  --- Volume group ---
  VG Name               testvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               3.73 GB
  PE Size               4.00 MB
  Total PE              956
  Alloc PE / Size       0 / 0
  Free  PE / Size       956 / 3.73 GB
  VG UUID               iOVFKf-8iSV-k1VK-G37u-Ivns-9uqx-vZCduc

Process to expand the data device

Initial RAID5 configuration:

root@xen-80-31:~# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Thu Jan  8 04:33:16 2009
     Raid Level : raid5
     Array Size : 3919616 (3.74 GiB 4.01 GB)
  Used Dev Size : 1959808 (1914.20 MiB 2006.84 MB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Jan  8 17:52:46 2009
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 8fd9e7d9:0dae82af:b836248b:2f509f91
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3

Add the new 2GB device to the RAID5 — it will show up as a “spare” device initially:

root@xen-80-31:~# mdadm --add /dev/md1 /dev/sdd3
mdadm: added /dev/sdd3
root@xen-80-31:~# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Thu Jan  8 04:33:16 2009
     Raid Level : raid5
     Array Size : 3919616 (3.74 GiB 4.01 GB)
  Used Dev Size : 1959808 (1914.20 MiB 2006.84 MB)
   Raid Devices : 3
  Total Devices : 4
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Jan  8 17:52:46 2009
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 8fd9e7d9:0dae82af:b836248b:2f509f91
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3

       3       8       51        -      spare   /dev/sdd3

Grow the RAID5 to include the new device:

root@xen-80-31:~# lvs
  LV    VG     Attr   LSize Origin Snap%  Move Log Copy%
  test1 testvg -wi-a- 1.00G
root@xen-80-31:~# mdadm --grow /dev/md1 --raid-devices=4
mdadm: Need to backup 384K of critical section..
mdadm: ... critical section passed.

/proc/mdstat:

md1 : active raid5 sdd3[3] sda3[0] sdc3[2] sdb3[1]
      3919616 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [=>...................]  reshape =  7.9% (156192/1959808) finish=1.3min speed=22313K/sec

While the reshape is running, the VG is still active:

root@xen-80-31:~# lvcreate -L 1g -n test2 testvg
  Logical volume "test2" created
root@xen-80-31:~# lvs
  LV    VG     Attr   LSize Origin Snap%  Move Log Copy%
  test1 testvg -wi-a- 1.00G
  test2 testvg -wi-a- 1.00G

After the reshape completes:

root@xen-80-31:~# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Thu Jan  8 04:33:16 2009
     Raid Level : raid5
     Array Size : 5879424 (5.61 GiB 6.02 GB)
  Used Dev Size : 1959808 (1914.20 MiB 2006.84 MB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Thu Jan  8 18:44:32 2009
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 8fd9e7d9:0dae82af:b836248b:2f509f91
         Events : 0.1336

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3

The new space is not yet reflected in the PV or VG, so grow the PV:

root@xen-80-31:~# pvresize /dev/md1
  Physical volume "/dev/md1" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
root@xen-80-31:~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               testvg
  PV Size               5.61 GB / not usable 1.44 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              1435
  Free PE               923
  Allocated PE          512
  PV UUID               h6qBlQ-RCy3-YeE9-zQXw-j1oa-bg8K-2JULo1

root@xen-80-31:~# vgdisplay
  --- Volume group ---
  VG Name               testvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               5.61 GB
  PE Size               4.00 MB
  Total PE              1435
  Alloc PE / Size       512 / 2.00 GB
  Free  PE / Size       923 / 3.61 GB
  VG UUID               iOVFKf-8iSV-k1VK-G37u-Ivns-9uqx-vZCduc
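
To recap, the whole expansion (using the device names from above) boils down to three commands plus a wait:

mdadm --add /dev/md1 /dev/sdd3
mdadm --grow /dev/md1 --raid-devices=4
# wait for the reshape shown in /proc/mdstat to finish, then:
pvresize /dev/md1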

In another test, growing 3 x 200GB partitions by 1 more 200GB partition on our system took around 150 minutes, so the reshaping process is not super speedy.  Even though our test showed that you can still perform I/O against the back-end data store (the RAID5) while it is reshaping, it is probably best to keep I/O to a minimum.

UPDATE: We repeated the test with 3 x 700GB partitions and added a 4th 700GB partition — reshaping time took about 8.5h with no external I/O performed to the LVM/RAID5 device.
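
If you do have to reshape with live I/O on the box, it helps to keep an eye on the progress and, if need be, throttle it.  Something along these lines should do; the speed limits below are illustrative values, not a recommendation:

# watch the reshape progress
watch -n 5 cat /proc/mdstat

# cap the per-device resync/reshape bandwidth (values in KB/sec)
sysctl -w dev.raid.speed_limit_min=1000
sysctl -w dev.raid.speed_limit_max=20000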


Shrinking a Linux LVM on top of an md RAID1

December 23, 2008

If I don’t start blogging some of the stuff I’m doing, I’ll never remember how I did it in the first place!  So, time to come out from the darkness and see if I can keep the discipline to post more.

Problem:  One of our servers has a Linux md RAID1 device (md1) which would keep falling out of sync, especially after a reboot.  Upon nearing the end of the re-sync, it would pause, fail, and start over.  Entries from /var/log/messages:

Dec 22 11:56:52 xen-33-18-02 kernel: md: md1: sync done.
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: soft resetting port
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 22 11:56:56 xen-33-18-02 kernel: ata1.00: configured for UDMA/133
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: EH complete
:
Dec 22 11:56:56 xen-33-18-02 kernel: SCSI device sda: 1465149168 512-byte hdwr sectors (750156 MB)
Dec 22 11:56:56 xen-33-18-02 kernel: sda: Write Protect is off
Dec 22 11:56:56 xen-33-18-02 kernel: SCSI device sda: drive cache: write back
Dec 22 11:57:04 xen-33-18-02 kernel: ata1.00: configured for UDMA/133
Dec 22 11:57:04 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
Dec 22 11:57:04 xen-33-18-02 kernel: sda: Current: sense key: Aborted Command
Dec 22 11:57:04 xen-33-18-02 kernel:     Additional sense: No additional sense information
Dec 22 11:57:04 xen-33-18-02 kernel: end_request: I/O error, dev sda, sector 1465143264
Dec 22 11:57:04 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
:
Dec 22 11:57:14 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
Dec 22 11:57:14 xen-33-18-02 kernel: sda: Current: sense key: Medium Error
Dec 22 11:57:14 xen-33-18-02 kernel:     Additional sense: Unrecovered read error - auto reallocate failed
:

Then, the RAID1 re-sync would start all over again … only to fail a few hours later … and again.

Since there appear to be some bad spots at the end of sda, the solution seemed to be to shrink the partition a bit to stay clear of them.

Our md RAID1 (md1) consists of 2 x 694GB partitions on sda3 and sdb3.  On top of md1 lives an LVM PV, VG, and lots of LVs.  The PV is not 100% allocated, so we have some wiggle room to shrink everything.

Here’s the procedure we followed:

  1. Shrink the LVM PV
    • “pvdisplay” showed the PV was 693.98 GB, so we just trimmed it down to an even 693GB.
    • use “pvresize” to shrink the PV
    • # pvresize -v --setphysicalvolumesize 693G /dev/md1
          Using physical volume(s) on command line
          Archiving volume group "datavg" metadata (seqno 42).
          /dev/md1: Pretending size is 1453326336 not 1455376384 sectors.
          Resizing physical volume /dev/md1 from 177658 to 177407 extents.
          Resizing volume "/dev/md1" to 1453325952 sectors.
          Updating physical volume "/dev/md1"
          Creating volume group backup "/etc/lvm/backup/datavg" (seqno 43).
          Physical volume "/dev/md1" changed
          1 physical volume(s) resized / 0 physical volume(s) not resized
  2. Shrink the md RAID1 device
    • If you do NOT do this and shrink the hard drive partitions beneath the md device … you lose your superblock!  Cool, eh?
    • Check the size of the existing md
    • # mdadm --verbose --detail /dev/md1
      /dev/md1:
              Version : 00.90.03
        Creation Time : Mon Sep 29 23:41:30 2008
           Raid Level : raid1
           Array Size : 727688192 (693.98 GiB 745.15 GB)
        Used Dev Size : 727688192 (693.98 GiB 745.15 GB)
         Raid Devices : 2
        Total Devices : 1
      Preferred Minor : 1
          Persistence : Superblock is persistent
      
          Update Time : Mon Dec 22 22:57:31 2008
                State : clean, degraded
       Active Devices : 1
      Working Devices : 1
       Failed Devices : 0
        Spare Devices : 0
      
                 UUID : d584a880:d77024a2:9ae023b4:27ec0db5
               Events : 0.5501090
      
          Number   Major   Minor   RaidDevice State
             0       8        3        0      active sync   /dev/sda3
             1       0        0        1      removed
    • Hey, what a coincidence … the Array Size is the same as the PV was! 🙂  Now, we need to calculate the new size.  The new size of the PV is 1453325952 sectors, which is 726662976 blocks (even *you* can divide by 2!!).
    • Shrink the md:
    • # mdadm --verbose --grow /dev/md1 --size=726662976
      # mdadm --verbose --detail /dev/md1
      /dev/md1:
         :
           Array Size : 726662976 (693.00 GiB 744.10 GB)
    • Woot! Getting closer!
  3. Now, technically, we should shrink the partition, too.  Here’s where I ran into a bit of trouble (OK, I’m too lazy to do the math).  “fdisk” on Linux doesn’t seem to want to let you specify a size in blocks or sectors, so you have to keep shrinking the ending cylinder number until you get into the range of the new block size.  I’ll leave this as an exercise for the reader (there’s a rough sketch of the arithmetic just below), and feel free to post a comment with the actual procedure.  🙂
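
    Just to sketch the arithmetic (untested here, and assuming the usual 255-head/63-sector fdisk geometry, i.e. 16065 sectors per cylinder): the shrunken partition needs to hold the new md size plus a little slack for the 0.90 superblock near the end of the device, so roughly:

      # new md size: 726662976 1K blocks = 1453325952 512-byte sectors
      # add ~128K (256 sectors) of slack for the md 0.90 superblock,
      # then round up to whole cylinders of 16065 sectors each
      echo $(( (1453325952 + 256 + 16065 - 1) / 16065 ))
      # => minimum partition size in cylinders; new end cylinder = start cylinder + this - 1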

    After these steps were completed, we were able to perform the raid1 re-sync successfully.


How to make fetchmail happy with the server’s SSL cert

June 29, 2007

Have you tried running fetchmail against a POP3S server and gotten these messages over and over?

fetchmail: Server certificate verification error: unable to get local issuer certificate
fetchmail: Server certificate verification error: certificate not trusted
fetchmail: Server certificate verification error: unable to verify the first certificate

Congratulations, you are not alone.  Looking around, I see a lot of people having this problem and the answers are usually not as clear as they could be.  Let’s see if I can make it less clear, too.

  1. Make sure a recent openssl is installed and your fetchmail is linked against it, etc, etc.
  2. Run “openssl s_client -connect pop.gmail.com:995 -showcerts” (hit Enter after the output to exit).
  3. Cut and paste the stuff between the “-----BEGIN CERTIFICATE-----” and “-----END CERTIFICATE-----” lines (inclusive) into a file named pop.gmail.com.pem.
  4. Review the rest of the output for the “issuer=” line (in this case, “Equifax Secure Certificate Authority”).
  5. Grab the “Base-64 encoded X.509” version of the issuer’s cert (here, “Equifax Secure Certificate Authority”) from the CA.
  6. Rename that file with a “.pem” extension.
  7. Make a certs directory somewhere (e.g. /usr/local/etc/fetchmail/certs) and put both files in it.
  8. Run “c_rehash /usr/local/etc/fetchmail/certs”.
  9. Add “sslcertck sslcertpath /usr/local/etc/fetchmail/certs” to the “poll” section for this server in your .fetchmailrc (see the sample poll entry below).
  10. Run “fetchmail -v” and see if the warnings are gone!
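
For what it’s worth, the relevant chunk of a .fetchmailrc might end up looking something like this (the account name, password, and hostnames are placeholders, not my real config):

poll pop.gmail.com with proto POP3
    user "yourname@gmail.com" there with password "secret" is "localuser" here
    ssl sslcertck sslcertpath /usr/local/etc/fetchmail/certs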

You will need to do this for each server that you poll with SSL (both the server’s and its issuer’s PEM).