Saturday, February 2, 2008

RAID Expansion and The Beauty of XFS

Existing Hardware Setup

My workstation has 4 SATA2 hard disk drives installed: 3x 320GB & 1x 80GB. The first 3 are members of RAID arrays, while the last holds a standalone partition with Windows to play games once in a while. The RAID arrays are managed by the Linux kernel, which is known as software RAID.

Existing Storage Capacity

The total storage capacity of the array was 592GB, of which 50GB was free. Since I haven't finished my work on the NAS box, I had nowhere to store my future stuff, so I urgently needed to add another disk to the existing array. As I hope to have the NAS box ready soon, I bought only one 320GB HDD.

Stage 0: HDD Installation & Partitioning

Since I have no hardware RAID controller (and thus no hot-swapping), adding the hard disk while the PC was running was out of the question. So, after turning off the box, I added the new hard disk and booted the box. Then, I partitioned the new hard disk into 2 partitions: boot & data, where boot occupied only 100MB.
This is different from the previous 3 HDDs, since each of them has a 2GB swap partition; I didn't add one to the new disk. (And yes, that leaves roughly 2GB of the new data partition unused, since the RAID member size is capped by the smallest existing data partition.)
This is how the new disk looked after being partitioned:
   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *           1          12       96358+  fd  Linux raid autodetect
/dev/sde2              13       38913   312472282+  fd  Linux raid autodetect
Another way to look at it would be:
Name    Flags    Part Type   FS Type                     Size (MB)
------------------------------------------------------------------
sde1    Boot     Primary     Linux raid autodetect           98.71
sde2             Primary     Linux raid autodetect        319971.62
* The first listing was produced using fdisk, the second using cfdisk.
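For completeness, here's a rough sketch of how such a layout can be produced with fdisk (the new disk showed up as /dev/sde on my box; the exact key sequence varies a little between fdisk versions):
fdisk /dev/sde
# n -> new primary partition 1, about 100MB (the future /boot member)
# n -> new primary partition 2, the rest of the disk (the data member)
# t -> set the type of both partitions to 'fd' (Linux raid autodetect)
# w -> write the partition table and quit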

Stage 0.5: Remove Shadowed Mounts

My /boot is a RAID1 array that is mounted over the existing /boot, and it's mounted read-only. So, since I'll be expanding the root filesystem (/), I thought of removing all unneeded filesystems, just in case!
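Nothing fancy was needed here; unmounting the read-only /boot overlay (and anything similar you may have shadow-mounted) is enough:
umount /boot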

Stage 1: Closer Than Veins

Now it's time to add the new partitions to their RAID arrays. If you don't wish to suffer like I did, raise the minimum rebuild speed limit as follows:
echo '150000' > /proc/sys/dev/raid/speed_limit_min
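If you're curious what the kernel is currently using, the limits can simply be read back; speed_limit_max is the matching upper cap:
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max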

Then, I added the boot partition to the array:
mdadm /dev/md0 -a /dev/sde1
mdadm --grow /dev/md0 --raid-disks=4

The first command adds the partition as a spare only, not as an active part of the array. The second grows the array to 4 devices, making the new partition an active member of the RAID1 array and mirroring the data to it.
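To double-check that the new member ended up active rather than stuck as a spare, the array details can be inspected like so:
mdadm --detail /dev/md0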

Then, I added the data partition to the RAID5 array, in a similar manner to the previous partition:
mdadm /dev/md1 -a /dev/sde2
mdadm --grow /dev/md1 --raid-disks=4

Because the RAID5 array is big, rebuilding it takes a lot of time. Mine took about 8.5 hours! The rebuilding process starts as soon as you execute the 2nd line. You can monitor the progress of the rebuild with:
cat /proc/mdstat

You'll get something that looks like this:
[=>...................]  reshape =  7.7% (24075264/310472064) finish=583.2min speed=8184K/sec
[===>.................] reshape = 19.8% (61607172/310472064) finish=582.5min speed=7116K/sec
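Rather than re-running cat by hand, watch can do the polling for you; the interval below is just an arbitrary choice:
watch -n 60 cat /proc/mdstat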

Neat, right? Indeed, but the work isn't done yet!

Stage 2: Command and Conquer

Occupying the newly available free space is the last thing to do. After roughly 8.5 hours of rebuilding the array, it's finally online and ready to be abused by yours truly.
As the title of this post suggests, I use the XFS filesystem, and one great feature it offers is the ability to grow the filesystem online: there's no need to unmount it.
Quoting the manual page:
The filesystem must be mounted to be grown (see mount(8)). The existing contents of the
filesystem are undisturbed, and the added space becomes available for additional file
storage.
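For the cautious, the -n flag makes xfs_growfs report the geometry without changing anything, which works as a dry run (on this box it would presumably need the same -t /etc/mtab workaround shown below):
xfs_growfs -n -t /etc/mtab /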

This is how things went:
root@adrenalin:~# xfs_growfs /
/dev/root: No such file or directory
Usage: xfs_growfs [options] mountpoint

Options:
    -d          grow data/metadata section
    -l          grow log section
    -r          grow realtime section
    -n          don't change anything, just show geometry
    -I          allow inode numbers to exceed 32 significant bits
    -i          convert log from external to internal format
    -t          alternate location for mount table (/etc/mtab)
    -x          convert log from internal to external format
    -D size     grow data/metadata section to size blks
    -L size     grow/shrink log section to size blks
    -R size     grow realtime section to size blks
    -e size     set realtime extent size to size blks
    -m imaxpct  set inode max percent to imaxpct
    -V          print version information
root@adrenalin:~# xfs_growfs -t /etc/mtab /
meta-data=/dev/md/1              isize=256    agcount=45, agsize=3449690 blks
         =                       sectsz=4096  attr=0
data     =                       bsize=4096   blocks=155236032, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=16384, version=2
         =                       sectsz=4096  sunit=1 blks
realtime =none                   extsz=262144 blocks=0, rtextents=0
data blocks changed from 155236032 to 232854048
root@adrenalin:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md/1             889G  543G  347G  62% /
/dev/sdd1              67G   22G   46G  33% /crap
/dev/md/0              92M   16M   71M  19% /boot
root@adrenalin:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid5 sde2[3] sdc3[2] sdb3[1] sda3[0]
931416192 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sde1[3] sdc1[2] sdb1[1] sda1[0]
96256 blocks [4/4] [UUUU]

unused devices: <none>
root@adrenalin:~# cat /etc/raidtab
raiddev /dev/md0
raid-level 1
nr-raid-disks 4
nr-spare-disks 0
persistent-superblock 1
chunk-size 32

device /dev/sda1
raid-disk 0
device /dev/sdb1
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sde1
raid-disk 3

raiddev /dev/md1
raid-level 5
nr-raid-disks 4
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 128

device /dev/sda3
raid-disk 0
device /dev/sdb3
raid-disk 1
device /dev/sdc3
raid-disk 2
device /dev/sde2
raid-disk 3
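One last housekeeping note: if your distribution relies on mdadm.conf rather than (or in addition to) raidtab, the ARRAY lines can be regenerated after a change like this (review the output before putting it anywhere):
mdadm --detail --scan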
