Shrinking a Linux LVM on top of an md RAID1

If I don’t start blogging some of the stuff I’m doing, I’ll never remember how I did it in the first place!  So, time to try to come out from the darkness and see if I can keep discipline to post more.

Problem:  One of our servers has a Linux md RAID1 device (md1), which would keep falling out of sync, especially after a reboot.  Upon nearing the end of the re-sync, it would pause, fail, and start over.  Entries from /var/log/messages:

Dec 22 11:56:52 xen-33-18-02 kernel: md: md1: sync done.
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: soft resetting port
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 22 11:56:56 xen-33-18-02 kernel: ata1.00: configured for UDMA/133
Dec 22 11:56:56 xen-33-18-02 kernel: ata1: EH complete
:
Dec 22 11:56:56 xen-33-18-02 kernel: SCSI device sda: 1465149168 512-byte hdwr sectors (750156 MB)
Dec 22 11:56:56 xen-33-18-02 kernel: sda: Write Protect is off
Dec 22 11:56:56 xen-33-18-02 kernel: SCSI device sda: drive cache: write back
Dec 22 11:57:04 xen-33-18-02 kernel: ata1.00: configured for UDMA/133
Dec 22 11:57:04 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
Dec 22 11:57:04 xen-33-18-02 kernel: sda: Current: sense key: Aborted Command
Dec 22 11:57:04 xen-33-18-02 kernel:     Additional sense: No additional sense information
Dec 22 11:57:04 xen-33-18-02 kernel: end_request: I/O error, dev sda, sector 1465143264
Dec 22 11:57:04 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
:
Dec 22 11:57:14 xen-33-18-02 kernel: sd 0:0:0:0: SCSI error: return code = 0x08000002
Dec 22 11:57:14 xen-33-18-02 kernel: sda: Current: sense key: Medium Error
Dec 22 11:57:14 xen-33-18-02 kernel:     Additional sense: Unrecovered read error - auto reallocate failed
:

Then, the RAID1 re-sync would start all over again … only to fail a few hours later … and again.

Since there appeared to be some bad spots on sda near the end of the disk, the solution seemed to be to shrink the partition a bit to avoid them.
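
Before committing to that plan, it’s worth confirming the failures really are media errors on the disk rather than a flaky cable or controller.  A quick sketch, assuming smartmontools is installed (the interesting attributes are the reallocated and pending sector counts):

# smartctl -a /dev/sda | egrep -i 'reallocated_sector|current_pending_sector|offline_uncorrectable'

Non-zero and growing counts there, combined with the “Unrecovered read error - auto reallocate failed” messages above, point pretty squarely at the disk itself.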

Our md RAID1 (md1) consists of two 694 GB partitions, sda3 and sdb3.  On top of md1 lives an LVM PV, VG, and lots of LVs.  The PV is not 100% allocated, so we have some wiggle room to shrink everything.
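
A quick way to sanity-check that wiggle room (just a sketch; the exact output format varies with the LVM version):

# pvdisplay /dev/md1 | grep -i 'free pe'
# vgdisplay datavg | grep -i free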

Here’s the procedure we followed:

  1. Shrink the LVM PV
    • “pvdisplay” showed the PV was 693.98 GB, so we just trimmed it down to an even 693 GB.
    • Use “pvresize” to shrink the PV:
    • # pvresize -v --setphysicalvolumesize 693G /dev/md1
          Using physical volume(s) on command line
          Archiving volume group "datavg" metadata (seqno 42).
          /dev/md1: Pretending size is 1453326336 not 1455376384 sectors.
          Resizing physical volume /dev/md1 from 177658 to 177407 extents.
          Resizing volume "/dev/md1" to 1453325952 sectors.
          Updating physical volume "/dev/md1"
          Creating volume group backup "/etc/lvm/backup/datavg" (seqno 43).
          Physical volume "/dev/md1" changed
          1 physical volume(s) resized / 0 physical volume(s) not resized
  2. Shrink the md RAID1 device
    • If you do NOT do this and shrink the hard drive partitions beneath the md device … you lose your superblock!  Cool, eh?
    • Check the size of the existing md
    • # mdadm --verbose --detail /dev/md1
      /dev/md1:
              Version : 00.90.03
        Creation Time : Mon Sep 29 23:41:30 2008
           Raid Level : raid1
           Array Size : 727688192 (693.98 GiB 745.15 GB)
        Used Dev Size : 727688192 (693.98 GiB 745.15 GB)
         Raid Devices : 2
        Total Devices : 1
      Preferred Minor : 1
          Persistence : Superblock is persistent
      
          Update Time : Mon Dec 22 22:57:31 2008
                State : clean, degraded
       Active Devices : 1
      Working Devices : 1
       Failed Devices : 0
        Spare Devices : 0
      
                 UUID : d584a880:d77024a2:9ae023b4:27ec0db5
               Events : 0.5501090
      
          Number   Major   Minor   RaidDevice State
             0       8        3        0      active sync   /dev/sda3
             1       0        0        1      removed
    • Hey, what a coincidence … the Array Size is the same as the PV was! 🙂  Now we need to calculate the new size for the md.  The new size of the PV is 1453325952 sectors, which is 726662976 blocks of 1 KiB, the unit mdadm’s --size option expects (even *you* can divide by 2!!).
    • Shrink the md:
    • # mdadm --verbose --grow /dev/md1 --size=726662976
      # mdadm --verbose --detail /dev/md1
      /dev/md1:
         :
           Array Size : 726662976 (693.00 GiB 744.10 GB)
    • Woot! Getting closer!
  3. Now, technically, we should shrink the partition, too.  Here’s where I ran into a bit of trouble (ok, I’m too lazy to do the math).  “fdisk” on Linux doesn’t seem to want to let you specify a size in blocks or sectors, so you have to keep shrinking the ending cylinder number until you get into the range of the new block size.  I’ll leave this as an exercise for the reader (there’s a rough sketch of the arithmetic below); feel free to post a comment with the actual procedure.  🙂
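
    For what it’s worth, the arithmetic isn’t too bad once you know the geometry fdisk is working with.  A rough sketch (the 255-head/63-sector geometry here is just the common default, so substitute whatever fdisk reports for your disk; this gives the number of cylinders the partition needs, and the ending cylinder is that count added to the partition’s starting cylinder, minus one):

      NEW_SECTORS=1453325952              # new size in 512-byte sectors, from the pvresize output above
      SECTORS_PER_CYL=$((255 * 63))       # heads * sectors-per-track = 16065
      # round UP so the partition stays at least as large as the new md device
      echo $(( (NEW_SECTORS + SECTORS_PER_CYL - 1) / SECTORS_PER_CYL ))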

    After these steps were completed, we were able to perform the raid1 re-sync successfully.
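
    For the record, getting the mirror healthy again afterwards was just the usual drill of re-adding the second half and keeping an eye on the rebuild; something along these lines (device names as in our setup):

      # mdadm --manage /dev/md1 --add /dev/sdb3
      # watch cat /proc/mdstat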

      One Response to Shrinking a Linux LVM on top of an md RAID1

      1. > “fdisk” on Linux doesn’t seem to want to let you specify a size in blocks or sectors

        FYI: fdisk allows switching between cylinders and sectors using the “u” command 🙂
