Sotirov-BG.Net - Downtime for disk replace

Downtime for disk replace

Published by Georgi Sotirov at 2017-12-09 04:01:15 UTC

Last night, while all the students were celebrating, I was busy replacing a failing Samsung 830 SSD in the server. The disk was in such bad shape that copying it with dd took about 5 hours, which is why the server was offline roughly between 2017-12-08 22:00 EET and 2017-12-09 05:00 EET. The disk started failing at the beginning of September, but recently the number of reallocated sectors became extremely high and I started detecting bad sectors in some system files. Read performance had also dropped, and during the copy it fell to 5 MB/s (!), which explains the aforementioned slow copy of just 64 GB between the old and the new SSD. The disk failed after only about 24 000 power-on hours (i.e. about 2 years and 9 months), which is rather strange, but maybe this is the normal life span of consumer SSDs?
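As a rough sanity check of the figures above, the arithmetic can be sketched in a few lines of Python. The function names are made up for illustration, and the sketch assumes decimal units (1 GB = 1000 MB) and an average year of 365.25 days; neither assumption comes from the drive or from dd itself.

```python
# Rough arithmetic only; names and unit conventions are assumptions.
HOURS_PER_YEAR = 24 * 365.25  # average year, including leap days


def power_on_hours_to_years_months(hours):
    """Convert power-on hours to (years, months), rounded to the nearest month."""
    total_months = round(hours / HOURS_PER_YEAR * 12)
    return divmod(total_months, 12)


def copy_hours(size_gb, rate_mb_s):
    """Hours needed to copy size_gb at a steady rate_mb_s (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_s / 3600


def average_rate_mb_s(size_gb, hours):
    """Average throughput in MB/s for copying size_gb in the given hours."""
    return size_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    print(power_on_hours_to_years_months(24000))  # (2, 9)
    print(f"{copy_hours(64, 5):.1f} h at a steady 5 MB/s")
    print(f"{average_rate_mb_s(64, 5):.1f} MB/s average over 5 h")
```

Under these assumptions, 24 000 hours does come out at 2 years and 9 months. A steady 5 MB/s would need about 3.6 hours for 64 GB, while a 5-hour copy works out to an average of about 3.6 MB/s, so the quoted durations and rates are best read as approximate.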

Anyway, the drive has now been replaced with a brand new ADATA SU800 128 GB, which unfortunately is not yet in the smartctl drive database (see ticket 954). The server is back online and fully operational.