Friday, December 7, 2007

Home-brew Network Attached Storage Project

Revisions:
Dec. 19th, 2007: Corrected design approach & added cautions
Dec. 8th, 2007: Home-brew considerations, Parts & software, Market evaluation
Dec. 7th, 2007: Intro, Solution, Design


a work in progress



After some unfortunate accidents in the past, where data was *almost* lost, I decided to take certain precautions to make sure my data is safe.
I've been a Slackware Linux user (and abuser) for around 5 years now, and have recently moved to Slamd64, a port of Slackware to 64-bit hardware. I used Linux's software RAID capability to build a RAID5 array of 3x 320GB disks, yielding 590GB of storage space (the missing space goes to swap partitions, boot partitions, and filesystem overhead).
As of this writing, I have 96.1GB left, and I still haven't copied my old data!
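For the record, an array like this one is put together with mdadm. A minimal sketch, assuming the three disks show up as /dev/sda through /dev/sdc with one data partition each (device names are illustrative, not my actual layout):

    # Build a RAID5 array out of three partitions, one per disk.
    mdadm --create /dev/md0 --level=5 --raid-devices=3 \
          /dev/sda1 /dev/sdb1 /dev/sdc1

    # Watch the initial sync, then put an ext3 filesystem on it.
    cat /proc/mdstat
    mke2fs -j /dev/md0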


Why not an OTS?

To understand why I'm building one from scratch, rather than getting one of those off-the-shelf (OTS) Network Attached Storage (NAS) devices, one would have to understand the problems of the OTS devices:
  • OTS NAS devices come with a built-in hardware RAID controller; if that controller gets busted/burnt, you can no longer access your data unless you buy ANOTHER device of the same brand (hardware RAID controllers store data in proprietary formats).

  • If one of the disks fails, you have to buy a replacement disk of the same brand AND model number (a limitation of the hardware controller), which is most likely impossible, unless you buy another NAS device of the same brand and hope it ships with the same old disks.

  • Limited connectivity: usually just USB 2.0 or maybe 100Mbit Ethernet, rather than 1Gbit Ethernet, WiFi, 4x USB 2.0, FireWire, etc.

  • To extend your current storage, you have to buy another NAS, as there's no way to expand/stack an existing one.


What does a home-brew NAS offer?

It depends on your approach, the design of the NAS, and of course, your budget!
I'm planning an initial 4TB NAS to hold ALL my data, plus the data of the other family computers in the house (either stored directly on it, or backed up to it). So my budget is kind of flexible, but I'm all for cheap stuff, so I'll squeeze the money out of every item I buy!


Getting to the point, the NAS I have in mind will offer the following:
  • Easy and cheap disk expansion

  • Multiple ways of delivering data to workstations: Ethernet, WiFi, or USB on the transport side; HTTP, FTP, or Samba (Windows shares) on the protocol side (a Samba sketch follows below)

  • User-level data separation, by authentication

  • Scheduled backups

  • Uninterruptible Power Supply (UPS) Support

  • Ability to use disks from any vendor

  • Data openness: if the management system fails, a replacement can be built from any typical computer, and the data would still be accessible

  • Stream video to PS3 (using VLC) or to TV, using a laptop/set-top box.

  • Early detection of failures: software monitors the disks and sends me an email whenever a disk fails or errors start occurring (a monitoring sketch follows this list).
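That last item is exactly what smartmontools' smartd is for. A minimal /etc/smartd.conf sketch (the mail address is a placeholder):

    # Scan for disks, monitor SMART attributes and self-test results,
    # and send mail when errors or imminent failures show up.
    DEVICESCAN -a -m admin@example.com

Start smartd at boot (e.g. from rc.local on Slackware) and it will mail you the moment a disk starts misbehaving.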

One could argue that current OTS NAS devices offering 2TB are cheaper than building one from scratch, which might be true, but they would most likely offer none of the advantages above while suffering from all the disadvantages. And in the long run, a home-brew NAS proves cheaper as storage demand grows and you expand.
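As for the Samba and per-user authentication items above, they boil down to a short share definition in smb.conf. A minimal sketch (share name, path, and user are placeholders):

    [data]
        comment     = NAS storage
        path        = /mnt/raid/data
        valid users = myuser
        read only   = no
        browseable  = yes

Access is then granted per user with "smbpasswd -a myuser".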


Points of caution with a home-brew

If you aren't careful with your design and implementation, you could seriously jeopardize your data.

Before jumping into scenarios, I should mention this in case you didn't know it: Linux's RAID arrays are built on block devices, typically partitions; that is, you can create a RAID5 array out of 5 partitions on the same disk, or on different disks!

Now, consider the following:
  • Scenario 0 - Expansion: Let's say I have a RAID5 array of 5x 500GB disks, and have consumed 90% of it. Time to expand! Unfortunately, the market no longer carries 500GB disks, so let's say it offers 750GB disks. When adding the new 750GB disk, I'd have to partition it into 1x 500GB + 1x 250GB, because the smallest member of the array is 500GB. The leftover 250GB partition can be treated as a standalone partition outside the array (its contents can be backed up to the array, though).

  • Scenario 1 - Expansion: Instead of buying a new 750GB disk, I bought a 1TB disk. The WORST idea ever is to partition the disk into 2x 500GB partitions and add BOTH to the RAID5 array. Why is that so bad? Because if the 1TB disk failed, I'd lose 2 pieces of the array, not just one, and hence lose ALL the data residing on that array!!
    The workaround is to join one partition to the array, and keep the other as a standalone.

  • Scenario 2 - Replacement: Let's say I have a RAID5 array of 5x 500GB disks, and one disk made a boo-boo and died on me. The solutions from the Expansion scenarios apply here as well (an mdadm sketch of both operations follows this list).
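Both scenarios boil down to a couple of mdadm operations. A sketch, assuming the array is /dev/md0, the dead member is /dev/sde1, and the new partition is /dev/sdf1 (device names are illustrative):

    # Expansion: add the new 500GB partition, grow the array from
    # 5 members to 6, then enlarge the filesystem to match.
    mdadm --manage /dev/md0 --add /dev/sdf1
    mdadm --grow /dev/md0 --raid-devices=6
    resize2fs /dev/md0

    # Replacement: fail and remove the dead partition, then add the
    # new one; the array rebuilds onto it automatically.
    mdadm --manage /dev/md0 --fail /dev/sde1 --remove /dev/sde1
    mdadm --manage /dev/md0 --add /dev/sdf1

Note that growing a RAID5 array in place needs a reasonably recent kernel and mdadm; watch /proc/mdstat to follow the reshape/rebuild.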


Design

Fig. 0x0: the old design (mini/nano ITX boards with PCI-to-SATA adapters; since scrapped)

Fig. 0x1: the new approach (two ways to implement the NAS)


  • All disks will fit in a rack/case, with a built-in Power Supply Unit (PSU)

  • Next to the disk rack is a motherboard rack/stack; the boards act as management computers. Each motherboard should have 6-8 SATA2 ports handled by the chipset directly, not by an add-on controller*.

  • SATA data cables are extended from the disk rack to the management stack.

  • All racks are powered through a UPS. The management boxes monitor the UPS's battery status, put the disks in sleep mode, then shut down automatically when needed (or shut down the disk rack itself, if the UPS allows powering off a specific port). A polling sketch follows this list.

  • Management boards will boot the OS from a USB memory stick, because it's cheaper than a hard disk (and faster)

  • Old design (scrapped; see Fig. 0x0): next to the disk rack is a motherboard rack/stack holding multiple mini/nano ITX boards acting as management computers.

  • Old design: each management box has a couple of built-in SATA2 ports and one PCI slot holding a PCI-to-SATA adapter for extra ports, maxing out at around 6 ports (6 managed disks) per board. Splitting the disks across boards ensures that if one management box dies, some of the data is still accessible.
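For the UPS point above, the battery status can be polled with upsc from NUT (Network UPS Tools). A crude sketch, assuming a UPS configured in NUT under the name "myups" and a 20% battery threshold (both placeholders; NUT's own upsmon can do this more robustly):

    #!/bin/sh
    # Run from cron every few minutes: if the UPS battery drops
    # below 20%, put the data disks to sleep and power off.
    CHARGE=$(upsc myups@localhost battery.charge)
    CHARGE=${CHARGE%%.*}             # strip any decimal part
    if [ "$CHARGE" -lt 20 ]; then
        for disk in /dev/sd[a-e]; do
            hdparm -Y "$disk"        # spin the disk down into sleep mode
        done
        /sbin/shutdown -h now
    fi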


  • * Because an add-on controller communicates over the PCI bus, which is slower than the chipset's own links, and all ports on the controller share that single bus. Ports handled by the chipset are independent.
    * 32-bit PCI buses have a max throughput of 133 MB/s, while the common PCIe 1.1 offers 250 MB/s per lane, and PCIe 2.0 offers 500 MB/s per lane.
    This means a PCIe 1.1 x1 adapter/card with 4 SATA2 ports limits the combined throughput of those disks to 250 MB/s, instead of 300 MB/s per disk, because all the disks connected to those 4 ports share a single lane.
    One could use multiple PCIe cards and build each RAID array from one disk per card; this works for non-high-load configurations, but under high load the shared buses would still be the bottleneck.

Figure 0x0 is the old design, which I scrapped (mini/nano ITX motherboards & PCI).
Figure 0x1 is the new approach and shows 2 ways to implement a NAS:
  • The first is more suitable for environments with no dust accumulation, like server rooms, or if you like the occasional dust-blowing adventure... It also depends on the space available for future growth.
    Note that those disk-specific racks come with their own power supply, which differs from a typical one in that it has power connectors for disks only, no motherboard connector.
  • A good reason to go for the 2nd is if the case supports multiple power supplies, for redundancy. Useful for critical machines.
    Pay extra attention to the power supply & make sure it can support the max number of disks your motherboard allows.


Parts & software

  • 1x disk rack, or a CD/DVD duplicator rack fitted with 5.25"-to-3.5" bay converters.
    Note: a duplicator case's PSU provides 1 power cable per 5.25" bay, while each bay converter occupies 3x 5.25" bays and provides 4 disk bays. So a 9-bay rack yields 12 disk bays but only 9 power cables, leaving 3 disks without power.

  • 5x 1TB disks (NO Western Digital!!! Very bad history)

  • 2x CD/DVD duplicator PSUs rated for 12 disks (on a 9-bay rack); one will be used, the other kept as a backup

  • mini or nano ITX boards with 2x SATA ports & a PCI slot to house a PCI-to-SATA adapter (from the old design; the new approach calls for boards with 6-8 chipset SATA2 ports).

  • UPS: the power rating depends on the number of disks & boards

  • OpenFiler OS to manage the disks.


Market evaluation

The prices of computer hardware in Kuwait are very high, and depending on the total cost of the project, one must see whether it's worth buying from the Internet, or not.
As of this writing, a 1TB disk costs 100 K.D. and a 500GB disk 40 K.D. in Kuwait, while online a 500GB goes for 25 K.D. and a 1TB for 74 K.D. ($1 USD = 0.2743 K.D.). So if you decide to buy 3x 500GB disks, you'd save 45 K.D.; take out 30 K.D. for shipping (assuming the worst case), and you still save 15 K.D.!
Note that on 1TB disks you save 26 K.D. per disk, so for 5 disks that's 130 K.D.; minus 45 K.D. for shipping (an assumption), you still save 85 K.D.!!!! These savings can get you a management computer!
So, as far as I'm concerned, I'll be buying from the Internet, to avoid the absurdly high expenses of the local market.
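The same break-even arithmetic as a quick shell snippet (prices in K.D.; the shipping figure is my worst-case assumption):

    # Savings per 1TB disk bought online, times disk count, minus shipping.
    local_price=100; online_price=74; disks=5; shipping=45
    echo $(( (local_price - online_price) * disks - shipping ))   # prints 85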


To be continued
