RAID is NOT a disaster recovery solution

SenseiSteve

HD Moderator
Staff member
I suspect way too many web hosting clients continue to view RAID configurations as disaster recovery solutions. While different RAID levels do address redundancy and performance factors, they are NOT disaster recovery solutions. Murphy’s Law says that if anything can go wrong, it will go wrong – and at the most inopportune time.

Please share your take on best-practice backup solutions.
 
Please share your take on best-practice backup solutions.
My old hosting company has suffered its share of disaster situations, and thankfully over the years, we had enough disaster recovery processes in place that when the next one came along, we were prepared.

Steve, I think you weathered through a few servers that were in recovery mode during your time with Hands-on. I know the HD forum went through a disaster recovery when an array failed.

At the time, we used r1soft for backups, and it was awesome. We could take a full image of the server, store it for days and weeks, and have a point to recover back to. It was perfect in every way - EXCEPT ;)

The exception came with the actual recovery process. It needed to complete the entire recovery of all files and databases before the system would reboot and come online. This meant that with 200GB of data, that data had to transfer from another server (backups were off-site), then unpack, then go through its self-checks, and then launch. That process could take anywhere from 4 hours to 24+ hours depending on the amount of data.
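To put rough numbers on that, here is a back-of-the-envelope sketch in Python of the transfer step alone. The link speeds are assumptions for illustration, not figures from our setup, and the estimate ignores the unpack and self-check phases, so treat it as a lower bound.

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed just to copy data_gb gigabytes over a link_mbps link.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits) and a link
    that is fully available to the restore, which is optimistic.
    """
    bits = data_gb * 1e9 * 8               # gigabytes to bits
    seconds = bits / (link_mbps * 1e6)     # bits divided by bits per second
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical link speeds; substitute whatever your off-site path really sustains.
    for mbps in (100, 1000):
        print(f"200 GB at {mbps} Mbit/s: {transfer_hours(200, mbps):.1f} h")
```

At an assumed sustained 100 Mbit/s, 200GB works out to about 4.4 hours of copying before anything else can start, which is in line with the low end of the range above.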

BUT, we had a backup plan :) Not only did we have full server recovery with r1soft, but we also made individual cPanel user account backups and stored those on different remote servers (we ran rsync to move the files every night for all users on all servers). Again, we stored up to 4 weeks of recovery points. This allowed us to offer a quick solution to our clients who were affected by an outage. They could wait for the existing system to restore, OR, if they opened a ticket, we could restore their backup from the previous day to an alternate server, update DNS, and they'd be back online within minutes rather than hours.
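For anyone wanting to set up something similar, here is a minimal Python sketch of the two pieces described above: a nightly rsync push to a remote server and a 4-week retention check. It is not our actual script. The paths, the host name, and both function names are hypothetical, and it assumes the account backups already exist in a local directory.

```python
from datetime import date, timedelta


def rsync_command(source_dir, remote_host, remote_dir):
    """Build an rsync command that pushes a local backup directory off-site.

    -a preserves permissions and timestamps, -z compresses in transit.
    The trailing slash on the source copies the directory's contents.
    """
    source = source_dir.rstrip("/") + "/"
    return ["rsync", "-az", source, f"{remote_host}:{remote_dir}"]


def expired_backups(backup_dates, today, keep_days=28):
    """Return the backup dates older than the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d < cutoff)


if __name__ == "__main__":
    # Hypothetical paths and host; replace with your own before running anything.
    cmd = rsync_command("/backup/cpanel", "backup1.example.com", "/srv/backups/web01")
    print("would run:", " ".join(cmd))  # pass cmd to subprocess.run(cmd, check=True) from cron

    today = date.today()
    on_disk = [today - timedelta(days=n) for n in (0, 7, 27, 29, 40)]
    for old in expired_backups(on_disk, today):
        print("would delete recovery point from", old.isoformat())
```

The sketch only prints what it would do. Keeping the command builder and the retention check as plain functions makes it easy to dry-run the logic before letting a cron job delete anything.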

Disasters are GOING to happen. Do you have a plan in place? Do you have an email ready to send to users, and follow-up templates explaining the process? When we had our disasters, and we had a few over the years, it was always all hands on deck. Heck, my first 2 employees came about because of a server failure. They offered to jump into Live Chat and communicate with other customers on what was going on, while I continued the restoration process. THAT was the dedication of our staff and the type of clients we hosted!
 
The moral is always to take regular off-site backups. We are using JetBackup and it's working like a charm. We also take full server snapshots on a regular basis.
 
I know the HD forum went through a disaster recovery when an array failed.

At the time, we used r1soft for backups, and it was awesome. We could take a full image of the server, store it for days and weeks, and have a point to recover back to.
I remember that day very well when HD crashed. It was an incredible feeling to have been restored very quickly and without losing a single thing. It made me a real fan of R1Soft.
 