Linux RAID-5 NAS – Setup Guide

CREDITS: This document was created by Joe Bishop and last revised on 10/07/2009. Each of the modules referenced in this guide is the property of its respective owner. I did not write, compile, or build any of these modules myself, and this guide is not intended to take ownership of them. My role was simply to assemble a comprehensive document, using proven methods, for building a file server, targeted primarily at those who are strong technical administrators but weak in Linux.

SUPPORT: I recommend requesting help at the Ubuntu forums. There are many people out there who know a lot more about these systems than I do. Linux is open-source, community-maintained software; use the power of the community!

ABOUT: This document gives step-by-step instructions on how to build a Linux software RAID-5 NAS. The operating system is Ubuntu Server Edition (9.04, or 8.04 LTS), a free, open-source distribution based upon Debian. For additional information about Ubuntu, visit http://www.ubuntu.com

TARGET AUDIENCE: This document is targeted at IT professionals and system administrators who work primarily in a Microsoft Windows environment and have little or no experience with Linux. The ideal reader wants to build and maintain a Linux file server that is interoperable with Windows PCs and transparent to Windows users, while providing a stable, low-resource, fault-tolerant, self-monitoring, self-healing environment. The software used is entirely open-source, and this instruction set should work with any distribution of Linux based on Debian.

FEATURES:
• Command-line only, saving resources
• Simple setup of shares for read-only and read/write access
• RAID-5 fault tolerance
• Hot-spare, for automatic rebuild/resync/recovery after a drive failure
• Ability to extend RAID volumes (add additional disks) while still online
• Ability to upgrade all physical disks, to expand storage size, while still online
• E-mail notification for critical alerts
• SMART disk monitoring and e-mail alerts
• Web administration and monitoring, via Webmin
• Real-time system monitoring (CPU, memory, processes, uptime, etc.) via htop

Using SMART monitoring, we are able to do some preventative monitoring of the physical disks. This gives us some advance notice when a disk is likely to fail. This is particularly useful since this server will be used mostly by home IT professionals in their personal testing labs, where we do not have unlimited funding. It is a great benefit to know when you have to order a new hard disk; most of us do not have spare high-capacity drives lying around! The smartmontools daemon will e-mail you when things start looking fishy.

Smartmontools does two other important things for our RAID. First, most people do not know that all SMART-enabled hard disks (every hard disk made in the last 5-10 years) have built-in temperature sensors. Smartmontools will poll this data and alert you when temperatures get out of spec. Second, it lets you view the hard disk make and serial number. In the event of a drive failure, if you have several drives of the same make/model, it can be hard to figure out which physical disk failed. When you install your hard disks, use a labeler and put the last 7 digits of each drive's serial number in a place that is visible when the disk is installed. When a disk fails, you will be able to positively identify it by its serial number.
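For example, once smartmontools is installed (covered below), you can read a disk's make, model, and serial number straight from the drive. A quick sketch, assuming /dev/sdb is one of your disks:
sudo smartctl -i /dev/sdb
The "Serial Number" line in the output is what you match against the label on the physical drive.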
HARDWARE REQUIREMENTS: This will work on any hardware platform that is compatible with Ubuntu Server. It is particularly targeted at those wishing to use SOFTWARE RAID, not hardware RAID; if you are using hardware RAID, you already have a third-party solution for managing your RAID and would not need this one. In my personal experience, I started testing this in a virtual machine (using the free VirtualBox virtualization software), then built a physical server in my lab using an Intel P4 2.66GHz, 1GB RAM, and three 120 GB IDE drives, plus a 4th to test the hot-spare and disk-extend functions. After over a month of testing, I made the jump to build my production server, starting with three 1TB drives, then expanding to five drives, and adding a sixth drive as a hot-spare. This does NOT require "server-class" hardware!

LINUX COMPONENTS USED:
• Ubuntu 9.04 Server Edition (or 8.04 for long-term support)
• mdadm v3.0.2
• Samba
• Webmin
• Smartmontools
• Sendmail
• OpenSSH
• htop

**Disclaimer: If you value your data, back it up! RAID-5 is excellent on-the-fly redundancy, but it is in NO WAY a replacement for backups! Don't EVER make any changes to your RAID without a FULL, VERIFIED backup of your data!!** (That's the articulated version of "Don't come crying to me if you mess up and lose data!!")

All commands are entered at the server's terminal and are shown on their own lines to signify exact syntax. Commands are CASE sensitive!

SETUP:

Download Ubuntu 9.04 Server from http://www.ubuntu.com/getubuntu/download
Note: v9.04 is supported through 2010. If you plan on leaving this server running untouched for many years, I recommend v8.04 LTS (long-term support), because it is supported through 2013. The caveat is that upgrading from 8.04 to higher versions has been painful in my experience, because the upgrade breaks some components. Personally, I would start fresh with v9.04, and keep an eye on this guide around April 2010, when I start testing the next LTS release, v10.04.

Burn the ISO image and boot your new server from the CD.
* I recommend you only have your OS drive in the server at this point. We'll add the storage drives later.
Run a default install, selecting the Samba and OpenSSH packages. No others apply.

Once the install is completed, log on and install mdadm:
sudo apt-get install mdadm

Update to the newer version (3.0.2):
sudo apt-get install build-essential
wget http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.0.2.tar.gz
tar zxvf mdadm-3.0.2.tar.gz
cd mdadm-3.0.2
sudo make
sudo make install

Verify the update was successful:
sudo mdadm --version
It should show version 3.0.2 now.

Install Sendmail (for e-mail alerts):
sudo apt-get install sendmail

Edit /etc/hosts:
sudo nano /etc/hosts
On the 127.0.0.1 localhost line, add "localhost.localdomain (hostname)". It should look like this:
127.0.0.1 localhost localhost.localdomain SERVER01
ctrl+o and enter to save changes, then ctrl+x to exit.

Install SmartMonTools:
sudo apt-get install smartmontools

Install Webmin:
sudo apt-get update
sudo apt-get install perl libnet-ssleay-perl openssl libauthen-pam-perl libpam-runtime libio-pty-perl libmd5-perl
sudo wget http://prdownloads.sourceforge.net/webadmin/webmin_1.420_all.deb
sudo dpkg -i webmin_1.420_all.deb
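Before shutting down to add the storage disks, it is worth a quick smoke test of sendmail, since all of the alerting below depends on working e-mail delivery. A minimal sketch; you@example.com is a placeholder for your real address, and whether the message arrives depends on your network's mail setup:
printf "Subject: Sendmail test from SERVER01\n\nIf you can read this, alerting should work.\n" | /usr/sbin/sendmail you@example.com
If nothing arrives within a few minutes (check your spam folder too), fix mail delivery now rather than after a disk fails.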
Shut down the server and install the physical disks you will use in the RAID-5. Keep in mind, all disks need to be the same size, and you must have a minimum of 3 disks for a RAID-5. As far as efficiency goes, with 3 disks you get use of 2 out of the 3; in other words, if you use 3 disks that are 1 TB each, your formatted RAID-5 volume will be 2 TB.

CONFIGURE:

Samba: Edit /etc/samba/smb.conf so we can add shares.
sudo nano /etc/samba/smb.conf
Pick a spot and add in the share, in this format. You can have multiple shares. Make a note of the path = line.
[VOL1]
comment=Test Samba Share
read only = no
locking = no
path = /var/RAID
guest ok = yes
This example gives you a share on the server (let's assume you gave your server the hostname "SERVER01") as \\server01\VOL1. It is not read-only, guest access is allowed, and the path on the server is /var/RAID (again, make a note of this for later).

Create the mount point:
sudo mkdir /var/RAID

Now it's time to make our RAID-5 volume. First, we need to figure out the identity of our 3 storage drives (for the purpose of this example; you may have more):
sudo cat /proc/partitions
This will display all your disks. The first thing we want to identify is the system drive. You will see the disks are named "sda, sdb, sdc..." and so on. One of the "sd*" drives will show partitions, which are signified by ending in a number. For example, if our system drive is sdc, then below sdc you will see sdc1, sdc2, and sdc5 (the root and swap partitions). Now that you understand this, you should be able to spot your other 3 storage drives; they will show the same "#blocks" value, which is the size of the disks. Make a note of those drives. For our example, we will call them sda, sdb, and sdd. Make sure you replace my example names with your drive names from here on out!!

Ok, enough chatter, let's make the RAID! First, we want to modify some parameters to make the resync go faster:
sudo nano /proc/sys/dev/raid/speed_limit_min
Edit the value in this file to say 50000
sudo nano /proc/sys/dev/raid/speed_limit_max
Edit the value in this file to say 200000 (it should already be correct)

Now, let's create the actual RAID:
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdd
(Don't forget to substitute YOUR device names for the sd* names above!!)
The initial build of the RAID-5 blocks will take some time. It runs in the background, and we'll check on it later. DON'T restart your server until instructed!

Now we make the file system on that RAID-5 volume:
sudo mke2fs -j /dev/md0

When that is done, we need to add a line to /etc/fstab so the server will auto-mount this RAID volume when it starts:
sudo nano /etc/fstab
Add the line:
/dev/md0 /var/RAID auto defaults 0 0
Save and exit: ctrl+o and enter, then ctrl+x

Now let's mount that new volume!
sudo mount /dev/md0 /var/RAID
No response/errors means it mounted.

Let's see how the initial RAID-5 build is going:
sudo cat /proc/mdstat
If it doesn't show a progress bar or percentage complete, then it is done. Let's check the RAID status to be sure:
sudo mdadm --detail /dev/md0
Write down the line that says UUID; we'll need it in a second. This shows the full details of the RAID, and you are looking to see whether it says "clean" or "clean, rebuilding". At the bottom of the table it will show the 3 drives, with the state "active sync".

We also need to add a few lines to mdadm.conf to be sure the RAID volume is auto-assembled when the server starts:
sudo nano /etc/mdadm/mdadm.conf
Add the line:
ARRAY /dev/md0 uuid=(insert the UUID from above here, no spaces)
If there is a line that says "DEVICE partitions", we need to change it to:
DEVICE /dev/sd*
(If there is no "DEVICE partitions" line, still add the line above.)
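If you would rather not type the UUID by hand, mdadm can generate the ARRAY line for you. A quick sketch: the first command prints the line, and the second appends it to the config (using tee because the file is root-owned). Review the file afterwards so you only end up with one ARRAY entry for /dev/md0:
sudo mdadm --detail --scan
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf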
Find the MAILADDR line and change "root" to the e-mail address where you want notifications sent. Save and exit the file with ctrl+o and enter, then ctrl+x.

Set the mount point to have read/write access:
sudo chmod -R a+rwx /var/RAID
(Note: I have found that occasionally I will get a folder or file in my share that tells me I do not have rights to read/write when accessing it from a Windows machine. To correct this, you can run the above command on the server, which recursively and explicitly sets the permissions on all files and folders.)

Set a write-intent bitmap for the array:
sudo mdadm -G /dev/md0 -b internal
Note: The bitmap keeps track of which regions of the array may be out of sync. This is important if you were to remove a disk from the array for testing, or if it were to drop off for another reason (accidental removal, physical link-layer problem, etc.). It lets you re-add the disk to the array without requiring a full recovery (which, on 1TB drives, takes about 6 hours, during which time your array is susceptible to a real disk failure!). With the bitmap in place, the array goes back to a clean state almost instantly.

Configure SmartMonTools:
First, identify the device names of your RAID hard disks:
cat /proc/partitions
Then edit the smartmontools defaults file:
sudo nano /etc/default/smartmontools
Uncomment the "start_smartd=yes" line and the "enable_smart=" line. You also need to change enable_smart= to list just your RAID disks, separated by spaces. Example:
enable_smart="/dev/sdb /dev/sdc /dev/sdd"

Next, edit your main smartd.conf file:
sudo nano /etc/smartd.conf
By default, there is an uncommented line called DEVICESCAN. We want to comment this out, because it can cause trouble. With DEVICESCAN enabled, smartd scans all possible fixed drives at startup, and if it encounters something that is not SMART-capable, such as a CD-ROM or a USB flash drive, it can take forever to start up while it errors out. It is better to explicitly define which devices should be monitored. Anywhere in the file, add the two following groups of lines, with your own /dev/sd* references:
/dev/sdb -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdc -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdd -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdb -a -I 194 -W 4,35,35 -R 5 -m email@example.com
/dev/sdc -a -I 194 -W 4,35,35 -R 5 -m email@example.com
/dev/sdd -a -I 194 -W 4,35,35 -R 5 -m email@example.com
The first group tells smartd to monitor the disk with all attributes, enable automatic online data collection, autosave the attributes, start a short self-test every day from 2-3am, and run a long self-test every Saturday from 3-4am. The second group tells it to monitor all attributes, track temperature changes of 4 degrees Celsius or more, log temperatures of 35 degrees C or greater, and e-mail an alert to your address for any temperature of 35 degrees C or greater. To my knowledge, there is no (easy) way to have it report temperatures in Fahrenheit; the SMART data from the hard disk's temperature sensor is raw Celsius. It is possible, but you would have to write a script that does the conversion before e-mailing. Don't forget to add additional lines to this file when you add more disks in the future!
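Between the scheduled self-tests, you can query a disk by hand at any time. A quick sketch, assuming /dev/sdb is one of your RAID disks:
sudo smartctl -H /dev/sdb
shows the overall health verdict;
sudo smartctl -A /dev/sdb
dumps the attribute table (ID 194 is the temperature); and
sudo smartctl -l selftest /dev/sdb
lists the results of the scheduled self-tests.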
Also, you may want to occasionally do a "cat /proc/partitions" to make sure your drive name assignments are still the same as what is in this file.

You can put a test line in smartd.conf so that on the next reboot you get a test e-mail, to show it is working. You should remove or comment this line out after you have tested, otherwise you will get this e-mail every time you reboot the server:
/dev/sdb -m email@example.com -M test

Webmin: Webmin is pretty self-explanatory. To access it, open your PC's browser, go to http://servername:10000, and you will be brought to a logon screen. You can monitor various server attributes from there, which is handy for at-a-glance review.

We should be done now; we're just waiting for the initial build of the RAID-5 to complete. We can watch the process in real time; when it's done you can press ctrl+c to exit the watch:
watch cat /proc/mdstat
When it's all done, go to a Windows PC, and you can access the new share, which again is \\SERVER01\VOL1. You should have a "lost+found" folder in there by default, which you can delete if you like. Test that you can read and write in the new folder. Reboot the server and make sure that it comes back online and that you are able to access the share from Windows without any intervention on the server. If not, log on to the server and double-check the /etc/fstab file and the /etc/mdadm/mdadm.conf file. This completes the RAID-5 setup.

Extending and Expanding the RAID-5:
For the purposes of this guide, I am defining Extending and Expanding as follows:
Extending: Your current RAID-5 consists of three 1 TB drives. You need more space, so you add another 1 TB drive, for a final total of four 1 TB drives.
Expanding: Your current RAID-5 consists of three 1 TB drives. You need more space, so you buy three new 2 TB drives. You replace the drives one by one, ending up with a larger volume built on three 2 TB drives.
A great benefit of this is that your RAID volume remains 100% online and available for reading and writing during the entire process!! (Assuming your physical disks are hot-swappable and you don't have to shut down or reboot in order to see the new disks.)

Steps for Extending your RAID-5:
Insert the new drive, and identify its name (as before) using:
sudo cat /proc/partitions
(For our purposes, I'm calling it /dev/sdf.)
Add the drive:
sudo mdadm --add /dev/md0 /dev/sdf
Then grow the RAID onto the additional drive:
sudo mdadm --grow --raid-devices=4 /dev/md0
This will take some time! Monitor its progress with:
sudo watch cat /proc/mdstat
Now we have to extend the file system to consume the newly available disk space:
sudo resize2fs /dev/md0
Once completed, you will notice, if you refresh your view of the share from a Windows PC, that the RAID has grown, and it never went offline during the process!
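You can also confirm the new size from the server itself. A quick check, assuming the array is mounted at /var/RAID as set up earlier:
df -h /var/RAID
The Size column should now reflect the grown file system.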
Adding a Hot-Spare: This is very similar to the Extending instructions above, so it seemed like a logical place to insert this bit of instruction. A hot-spare is a physical disk of the same size as the other disks in the RAID, but its purpose is to be available to instantly and automatically start rebuilding if one of the RAID-5 disks fails. You will get an e-mail notification that there has been a failure, but you will not have to take any action to make the RAID fault-tolerant again; that happens automatically thanks to the hot-spare.

Here's how: Add the physical disk to the server, identify its sd* name, then add it (for example, I'm calling it sdh):
sudo mdadm /dev/md0 -a /dev/sdh
Now when you type:
sudo mdadm --detail /dev/md0
it will show your drives in the RAID, as well as a spare.

Testing the hot-spare: It is best to simulate a failure, so you can see the hot-spare in action. First, reboot your server, then run:
sudo mdadm --detail /dev/md0
and make sure it still shows your spare listed. Now, let's manually fail a disk in the RAID and see what happens:
sudo mdadm /dev/md0 -f /dev/sda
(For "sda", pick one of the active RAID disks listed in your sudo mdadm --detail /dev/md0 output.) This simulates one of the RAID disks dropping offline.

Now, let's see how the RAID reacted. On your Windows PC, you should still be able to access and read/write the share, and it should still report the same available space; business as usual on the user end. You should also have gotten an e-mail indicating there was a failure, and at the bottom of the e-mail it should show that the recovery has begun. Let's see what is happening in real time:
watch cat /proc/mdstat
This will show you real-time status, and it should show the recovery already in progress, the percentage completed, and the time remaining. This means the hot-spare works just like it's supposed to!

Now remember, this is the same as a REAL disk failure, so you HAVE to let the recovery onto the hot-spare complete. Afterward, I would just re-add the "failed" disk to the array, and it will become the new hot-spare. Of course, if this were a real failure, you would replace the failed disk with a new one, then add the new one as the hot-spare. To re-add that "failed" drive, first we have to remove it (not physically); its current status is Failed, and we cannot add a Failed disk back. So, enter:
sudo mdadm /dev/md0 -r /dev/sda
This will remove it from the array. You can now follow the instructions above for adding the hot-spare, and you will be back in business.

Steps for Expanding your RAID-5:
**Notes: This requires you to follow and repeat the steps one drive at a time until all drives have been replaced. Make sure your physical hardware supports removing and adding drives on the fly; otherwise you will have to restart the server between each drive replacement, which is fine, but plan for the reboots if your server requires a high level of availability. You need to fail each drive one by one, add in the new disk, allow the RAID to rebuild, and verify it, before failing the next one.
**PLEASE be 100% sure at each step that you are doing this correctly!! Don't forget that a RAID-5 volume can only sustain ONE drive failure, so if you do not correctly rebuild at each step, you WILL lose data!!!**
**One last VERY important note: IF you are forced to reboot between drive swaps (meaning your hardware is not capable of a true hot-swap), then before you fail a drive, and before you add a new one, you MUST run:
cat /proc/partitions
and review the output so that you know which drive is which. Linux has a tendency to switch the "sd*" drive name designations at startup, so you don't want to write down a list of drives, then find out that one of the names changed and you removed or added the WRONG disk!!
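One way to double-check which sd* name belongs to which physical disk is to look at the persistent device links udev creates, which embed the drive model and serial number. A quick sketch (the exact link names will differ on your hardware):
ls -l /dev/disk/by-id/
Each entry points at the current /dev/sd* name, so you can match the serial number on your label to the right device before failing or removing anything.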
A side note on this: some of you may be scratching your heads and saying, "Well, if the sd* names aren't consistent, how come the RAID doesn't go all haywire when you reboot the server?" Two reasons: 1. the RAID is defined by the UUID stored in the superblock, the hex identifier written as a header on each disk in the RAID, not by the sd* name (except at the very beginning when we initially create the RAID; once the superblocks are written, this no longer applies), and 2. the line we added earlier in mdadm.conf that says "DEVICE /dev/sd*" tells it, at startup, to scan ALL sd disks (*) for matching UUIDs and assemble them into their RAID, which we defined in the ARRAY line of that .conf file. Just a little tidbit to help you further understand what is going on behind the scenes there :)

Last warning (I promise): during the process of upgrading the disks, the RAID is susceptible to failure. Because we are forcing one of the disks to "fail", then rebuilding onto a new disk, if you were to sustain a REAL disk failure during the rebuild, you could lose data! However, if you DID have a failure, you should be able to (I haven't tested this) insert the original disk that you "failed" and rebuild to a healthy RAID-5 again before proceeding. Don't rely on that; I don't KNOW that it will work, so make a FULL BACKUP first!

Ok, enough warnings, here we go. For the purposes of this guide, I am assuming the 3 existing disks are sda, sdb, and sdd, and the 3 new disks are sdf, sdg, and sdh.

Fail and remove the first disk:
sudo mdadm /dev/md0 -f /dev/sda
sudo mdadm /dev/md0 -r /dev/sda
Add in the first new disk:
sudo mdadm --add /dev/md0 /dev/sdf
It will rebuild, and the rebuild MUST complete fully before you repeat this with the remaining smaller disks. Monitor its progress:
sudo watch cat /proc/mdstat
When it's done, make SURE its status is clean:
sudo mdadm --detail /dev/md0
Repeat the steps above until all disks have been upgraded.

Now, we need to grow the RAID to consume the new disk sizes:
sudo mdadm --grow /dev/md0 --size=max
When that's done, we need to grow the actual file system to the new RAID size:
sudo resize2fs /dev/md0
Once that's done, the expansion is complete! In our example, we grew from three 1 TB drives (a 2 TB RAID-5 file system) to three 2 TB drives, resulting in a 4 TB RAID-5 file system. Keep in mind that with disks this large, the rebuild between disk upgrades can take several hours, even a full day, so it can take most of a WEEK to complete the full upgrade. But this is usually of little consequence, because the volume does not go offline at all during the process!

MONITORING TOOLS: You can add some simple tools to make administering the server easier. One of my favorites is htop, a text-based real-time system monitor displaying CPU load, memory and swap utilization, running processes, and so on. This is usually the first thing to turn to if you notice any performance-related problems, to determine whether the server is getting overloaded or something else is going on.
sudo apt-get install htop
When it's done installing, you launch it from the command line by typing htop. You can exit it at any time by pressing q (or ctrl+c).

Remote Connection: As with any server, most of the time you need the ability to work with it remotely. Usually the server is rack-mounted in your datacenter/computer room, or, in a home-user scenario, is a "headless" computer sitting in a closet; hooking up a monitor and keyboard becomes very inconvenient. You can easily monitor and manage any Linux server from a Windows PC using a program called PuTTY. It uses SSH to connect and gives you a text-based terminal view of your server. In our case, where we have no GUI at all, this is perfect.
PuTTY is open-source and runs as a single .exe; no install is required. It can be obtained from the link below. To connect, you just put in the IP or hostname of your Linux server, it will open a terminal window, and you authenticate with your username and password, as if you were physically sitting at the server.
http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
From other Linux desktops, you can also install PuTTY, which should be available through your flavor of Linux's repository.
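If you are connecting from another Linux (or Mac) machine, you don't even need PuTTY; the stock OpenSSH client does the same job. A minimal sketch, assuming the server's hostname is SERVER01 and substituting your own login name:
ssh yourusername@server01
Type exit (or press ctrl+d) to close the session when you're done.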