Since I built my HTPC last year my movie collection has been growing at a steady rate, and with HD content weighing in at 4-8GB per movie I knew I would eventually run out of space. This time, though, I didn’t want to fall back on the “slap another USB drive on the back” option. One of the contractors I’d worked with told me that he religiously copies the entire contents of his media library to a second set of discs for safety. I’ve invested far too much of my time over the last year building up my collection to risk losing it all. I already had around 1.5TB of media and another 1TB of data I needed to hang onto, so I needed a better solution…
As an IT professional I’m used to dealing with high-end storage systems on a daily basis, so why not at home..? Well, first things first: the kind of storage system I’ve used, e.g. a NetApp FAS 270, cost somewhere in the region of £10k at the time of purchase with only half the bays populated, and that’s way too rich for most of our budgets!
In the past I had a server running Windows 2003 hosting an assortment of hard drives. Before it gave up the ghost it had something like five internal (IDE) hard drives and three external USB drives. And you know what the scary thing about all this was… there was no redundancy… simply because I couldn’t afford it!
In the course of my research I’d looked at conventional RAID systems, but there are drawbacks: expensive hardware RAID cards, having to buy identical hard drives and, biggest of all, the fact that most RAID systems will recover from one drive failure but if you lose two then you’ve lost everything! With ZFS’s RAID-Z2 on OpenSolaris you can survive up to two drive failures, but the arrays are not easily (and, more importantly, cheaply) expandable.
An alternative solution presented itself in the form of unRaid by Lime Technology. Its greatest strength over conventional RAID solutions (including OpenFiler and FreeNAS) is that it doesn’t require proprietary (expensive) hardware cards and can use a mix of drives (both IDE and SATA) from different manufacturers, while still offering complete protection against a single drive failure by means of a parity drive. Also, unlike conventional RAID arrays where all the drives are constantly spinning, unRaid can power down drives that are not in use, saving power. While unRaid is based on Slackware Linux, which may put Wintel peeps off, it is surprisingly flexible in what it can run on, especially given that it is installed on and runs from a USB flash drive! I found both Systm’s Ep 108 and Robbie Ferguson’s feature (hell, it was a whole program!) on Category5.tv invaluable, and they swayed me towards unRaid.
Ever since the main hard drive in my PC crashed I’ve become increasingly aware of how much stock we place in the humble hard drive as a repository for all our information. With life spans of typically only 3-4 years, it doesn’t take long before you end up with a collection of assorted drives, your ever-expanding downloads and no means of safeguarding your digital collection. It only takes one drive crash and those priceless photos of your family growing up are lost… sometimes forever… it’s just way too risky.
So with some funds and most importantly “Spousal Approval” I set about building myself a new media server that was easily expandable and still provided me with a high level of data protection based on unRaid.
I reused the blue Chieftec case that my old 2003 server was built into. For its size it can easily hold 6 x 3.5″ drives in removable bays as well as additional drives in the 5″ bays above, hence its nickname, the “Tardis”.
The remainder of the server was built using the following components:
Asus MA478L-M AM3 Mainboard
AMD Sempron 140 CPU
Kingston Value Ram 2GB PC6400
Lexar Firefly 2GB flash drive *
2 x Western Digital 2TB EARS Green Drives **
1 x 1TB Samsung Spinpoint HD103 (Recycled from the old server)
* Recommended by Lime Technology
** Has some issues with ‘intellipark’; read below for a workaround
Building the server is relatively straightforward. To start with I downloaded the “free” version of unRaid (supports 2 x data, 1 x parity) and installed it to the flash drive. When setting up the boot order in the Asus’s BIOS you need to disable all other boot devices and set the USB to “forced floppy” to ensure that the drive receives the same assignment every time.
With the Western Digital “Green Power” drives it is recommended that you first ensure that you have placed a jumper across pins 7-8, as these are advanced format drives and will otherwise suffer degraded performance due to the drive sectors not being aligned. Once you’ve done this you need to download the “preclear_disk.sh” script from the unRaid forums (you’ll be spending some time there!) and copy it to your flash drive. The script was developed to stress test the drive intensively and zero it out before making it available to unRaid as a “precleared” drive. Doing this at least 3-4 times will absolutely hammer the drive and weed out any failing hardware, bad sectors etc. before you commit your precious data. Just bear in mind that one pass on a 2TB drive can take around 28 hours!
It turns out that a rather nasty issue concerning Western Digital’s Green Power drives has come to light. As part of the power saving measures the drive’s firmware parks the heads, which is fine in principle until you realise that it does this every 8 seconds. What’s worse is that the drives have a maximum load count of 300,000 before the drive mechanism becomes prone to failure. Now you might be wondering “what is so bad about that..?” Well, heads being parked every 8 seconds is equivalent to 450 times each hour, so in the worst case your drive could be toast in under a month of continuous usage. Not good for a media server.
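For anyone who wants to check the numbers, here is a quick back-of-the-envelope sketch in Python. It assumes the absolute worst case, where the heads really do park once every 8 seconds around the clock; as far as I understand it the timer only kicks in when the drive is idle, so real-world figures should be kinder than this.

```python
# Worst-case estimate only: assumes one head park every 8 seconds, non-stop.
PARK_INTERVAL_S = 8          # park interval quoted in the text
RATED_LOAD_CYCLES = 300_000  # load count limit quoted in the text

parks_per_hour = 3600 // PARK_INTERVAL_S             # parks in one hour
hours_to_limit = RATED_LOAD_CYCLES / parks_per_hour  # hours until the limit
days_to_limit = hours_to_limit / 24                  # same figure in days

print(f"{parks_per_hour} parks per hour")
print(f"{days_to_limit:.1f} days to reach {RATED_LOAD_CYCLES:,} cycles")
```

Even if the drive only parks a fraction as often as this in practice, the load count still climbs far faster than you would want in a box that is meant to run for years.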
Thankfully this issue can be fixed by downloading and running the wdidle3 utility from a bootable CD. Once you’ve booted your server just type in “wdidle3.exe /d” and the utility will scan all your drives and disable intellipark on the Western Digital drives. It’s not a pretty fix, but it is reported to do the job.
With those pesky issues aside I booted up the server and configured the drive assignments. I firstly assigned 2 data drives leaving the parity drive out of the array to speed up the initial transfers. Once that was out of the way it was time to trick out the server with unMenu which offers a wider range of add-ons to enhance managing unRaid 🙂
Before I began transferring data across I wanted a means of verifying the integrity of the original files. I used Checksum from Corz.org as it was perfect for creating unified (MD5 and SHA1) hashes of files. After this was done I started moving all the media files from my HTPC to the server using Teracopy across my gigabit network. While doing this I took the chance to streamline my media library’s management by using a more suitable file structure. It doesn’t make much of a difference at the user’s level but it makes things easier for me to manage in the long term… and I’m all for making things easy!
The next task was verifying the data, which was a good thing as I had some issues with a few corrupt files copied using Midnight Commander. Thankfully, it was only a few files, which were quickly copied back and re-tested. Once all the tests came back without errors I added the parity drive to the array and left it alone to get on with it for some 10 hours. With the parity drive functional I have protection against any single drive failing. Now I know what you’re thinking… so what if two drives fail..? Well, all that you’ve lost is the data on those two drives; the remaining drives are unaffected. Now try that with conventional RAID 5!
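If you would rather script the same kind of check than use a GUI tool, the idea can be sketched in a few lines of Python. This is not how Checksum itself works internally, just a stand-in that hashes every file under a source folder with both MD5 and SHA1 and compares it against the copy. The function names and any paths you pass in are my own placeholders.

```python
import hashlib
from pathlib import Path

def file_hashes(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha1_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte movies don't fill up RAM.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def find_mismatches(source_root, copy_root):
    """List relative paths whose copy is missing or hashes differently."""
    source_root, copy_root = Path(source_root), Path(copy_root)
    bad = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = copy_root / rel
        if not dst.is_file() or file_hashes(src) != file_hashes(dst):
            bad.append(str(rel))
    return bad
```

Calling find_mismatches() with the original folder and the share on the server should return an empty list when everything copied cleanly; anything it does list is worth copying again and re-testing.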
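To give a feel for how a single parity drive can stand in for any one failed disk, here is a toy sketch in Python of the general XOR parity idea. It is only an illustration under my own assumptions (three tiny made-up “drives” of equal size), not unRaid’s actual code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three toy "data drives" holding four bytes each (made-up contents).
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\x0f\x0e\x0d\x0c"
disk3 = b"\xaa\xbb\xcc\xdd"

# The parity drive stores the XOR of every data drive.
parity = xor_blocks([disk1, disk2, disk3])

# Pretend disk2 has died: XOR the survivors with parity to rebuild it.
rebuilt_disk2 = xor_blocks([disk1, disk3, parity])
print(rebuilt_disk2 == disk2)  # True
```

Lose two drives at once and this sum no longer has enough information to rebuild either of them, which is why single parity only covers one failure at a time.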
At the moment the only issue I have left is with implementing WOL (Wake-On-Lan) as it is a bit hit and miss, so to speak. I’m using a script that monitors the drive activity and puts the server into S3 sleep mode. The strange thing is that if I put the server to sleep via the console I can wake it up again using a magic packet, but when I let the script do the same thing it looks like it takes the network interface down with it??? I’m still looking at why this might be but for now I can live without it.
I’m presently using the “free” version of unRaid, but I will be looking to upgrade to the “plus” version which supports up to 6 drives and a cache drive. So now I have a fault-tolerant media library that hosts almost 3TB of movies, TV shows (naturally the “Tardis” is hosting my Doctor Who collection, thanks Ed!) and my Anime collection.
Life is good!
Update: 10th September 2010
The Wake On Lan issue has been resolved after some research, lots of trial and mostly error and the French (Yeah I know…) but if it wasn’t for this handy little Magic Packet Generator I’d still have been wondering what had gone wrong.
To elaborate, I’ve been using the Depicus Wake On Lan command line utility as mentioned in Lime Technology’s forums. It uses a simple batch file to call the following:
C:\utils\wolcmd 485b9548ed03 255.255.255.255 255.255.255.255
Lots of people have claimed that they have had no issues using this command and that WOL functions perfectly. However, after I brought the issue to light several members quickly discovered that I might be onto something, as their systems exhibited the same symptom, i.e. failing to wake up from an S3 sleep state. In the course of troubleshooting I discovered that WOL works correctly if the server’s actual IP address and subnet mask are entered, rather than 255.255.255.255 for both, which just broadcasts to all networks!
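For the curious, a magic packet is simple enough to build by hand. Here is a rough Python sketch of the idea: six 0xFF bytes followed by the MAC address repeated 16 times, sent over UDP to the broadcast address of the server’s own subnet. The IP address and mask in the example are placeholders rather than my real network, port 9 is just a commonly used choice, and I’m assuming the packet should go to the subnet broadcast address worked out from the IP and mask; I haven’t checked that wolcmd does exactly this internally.

```python
import ipaddress
import socket

def build_magic_packet(mac):
    """Six 0xFF bytes followed by the 6-byte MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def subnet_broadcast(ip, netmask):
    """Work out the broadcast address for the server's own subnet."""
    network = ipaddress.IPv4Network(f"{ip}/{netmask}", strict=False)
    return str(network.broadcast_address)

def send_wol(mac, ip, netmask, port=9):
    """Send the magic packet to the subnet broadcast address over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcast sends have to be explicitly allowed on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        target = (subnet_broadcast(ip, netmask), port)
        sock.sendto(build_magic_packet(mac), target)

# Example (placeholder IP and mask, MAC taken from the batch file above):
# send_wol("485b9548ed03", "192.168.1.10", "255.255.255.0")
```

With those placeholder values the packet would be aimed at 192.168.1.255 instead of 255.255.255.255, which is the same change that fixed things for me with the command line utility.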
It just goes to show that even the most experienced of us can make mistakes and that you should never assume that just because everyone else says they use it and it works that it is actually right…
Update 1st July 2012
In the past week I’ve started seeing errors on my parity drive. The errors started to appear when I was copying some data to the drives, and I noticed a number of UNC Media Error messages in the syslog. Running parity checks came back fine. I decided to run a number of SMART tests against the parity drive and after several long tests the errors disappeared.
The specific SMART error was:
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 5
This is apparently related to Western Digital drives and, as I said, the errors cleared after several long SMART tests.
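If you want to keep an eye on an attribute like this from a script, the line is easy to pick apart. The sketch below assumes the row is in the ten-column layout that smartctl -A prints (ID, name, flag, value, worst, threshold, type, updated, when failed, raw value); the function and field names are my own.

```python
from collections import namedtuple

SmartAttribute = namedtuple(
    "SmartAttribute",
    "id name flag value worst thresh type updated when_failed raw_value",
)

def parse_smart_line(line):
    """Split one attribute row from `smartctl -A` style output into fields."""
    # Split on whitespace at most 9 times so the raw value stays in one piece.
    parts = line.split(None, 9)
    if len(parts) != 10:
        raise ValueError("expected 10 whitespace-separated columns")
    return SmartAttribute(*parts)

attr = parse_smart_line(
    "200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 5"
)
# The normalised value (200) is still well above the threshold (000);
# it is the raw count of 5 that is worth watching over time.
print(attr.name, int(attr.value), int(attr.thresh), attr.raw_value)
```

Logging the raw value after each long test is a cheap way to see whether the count is creeping up or has settled down.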
The only other thing of note is that I have replaced the “Hi Power” PSU with an OCZ ZS550 PSU which I got from PC World for £49.99 (I’m sure it may be slightly cheaper elsewhere but what the hell!). It’s a single +12V rail PSU, which is recommended on unRaid’s forums, and it has a whopping 8 SATA connectors. Perfect for when I add a new drive to the array, which will be very soon as I’ve run out of space!