CatN unscheduled downtime

By Joe Gardiner Wednesday, 1st September 2010

service-announcements

CatN experienced unscheduled downtime for the majority of yesterday due to a major hardware failure.

Yesterday we suffered a major RAID card failure on a database server. Unfortunately we didn’t have spares in place, so the fix took much longer than it should have and the downtime was prolonged. The server has been rebuilding for most of the evening and is finishing now. The time it takes to repair an array of the size we host is measured in 24-hour periods, not single hours, so waiting for a rebuild is not a viable restoration method.

When a RAID failure occurs we need a plan for immediate recovery. Unfortunately, our current graceful failure plan did not work this time, because the catastrophic failure of the RAID controller left the entire array in an inconsistent state.
