in reply to Re^2: [OT] Reminder: SSDs die silently
in thread [OT] Reminder: SSDs die silently

Time to really finish this story:

the SSD will be subject to a nice 4 kV burn-in test

That was spectacularly unspectacular. A few sparks from the 4 kV probe, but no burn marks, no fire, no exploding parts. Our 4 kV supply is just way too limited; it can deliver only a few mA. The next misbehaving SSD will see plain mains voltage: 230 V behind a slow-blow 16 A fuse.

I decided to order another fake RAID controller, this one built around a relatively cheap SATA controller chip, but from a manufacturer with a good reputation and a lot of RAID experience.

That fake RAID controller is really a nice piece of hard- and software, but it is not completely free of problems. In the factory default configuration it still had trouble when running more than one VirtualBox VM at the same time, both on my work machine and on my home machine. So I finally called tech support. The manufacturer insists on phone calls, which is a little bit odd, but it took just one phone call to get rid of my problem. The supporter told me that no, this should not happen, not with my machines and not with any other. I was already using the newest firmware and drivers available, so I was told to disable Native Command Queuing for all SSDs right in the controller's BIOS; the drivers will respect that setting. I also disabled sleep mode, just to be sure.

Disabling NCQ costs a little bit of performance, but both machines now work fine. I don't care if disk performance goes down by a few percent, the SSDs are sufficiently fast even without NCQ. If the onboard SATA fake RAID had a way to disable NCQ, I would try to go back to the onboard RAID. It is there, it has power, it has a sufficient number of SATA ports, and it does not need a PCIe slot.
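The controller's BIOS is the right place for this setting in my case, but as a side note: if a controller or driver offers no such switch, a plain Linux host can get roughly the same effect by forcing the command queue depth of a device down to 1 via sysfs. Just a rough sketch, the device names are placeholders and it needs root:

  #!/usr/bin/perl
  # Rough sketch: force the queue depth of the given devices to 1,
  # which effectively disables NCQ on a plain Linux SATA setup.
  # Device names (e.g. sda sdb) are placeholders; run as root.
  use strict;
  use warnings;

  for my $dev (@ARGV) {
      my $path = "/sys/block/$dev/device/queue_depth";
      open my $fh, '>', $path or die "cannot open $path: $!";
      print {$fh} "1\n";
      close $fh or die "cannot write $path: $!";
  }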

A little detail: The RAID software does write a log file to aid debugging. But that does not help if the log file is written to the very RAID volume that has problems and needs to be debugged. The supporter proposed the obvious solution: add a USB flash drive and have the RAID software log to that drive instead. I don't do that; my problem is solved.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^4: [OT] Reminder: SSDs die silently
by NERDVANA (Priest) on Jan 12, 2024 at 18:36 UTC

    For local development work, I'd recommend skipping RAID altogether. RAID is all about uptime, and development work doesn't really benefit much from that. Just pop in a 2TB NVMe drive and make daily backups. As for speed, the other week I had a new amazing experience: 1.2 GB/s data transfer between NVMe drives, and that was over 700 GB, so not just into cache. That's almost double the maximum theoretical SATA III speed, and my NVMe drive was able to *write* that fast.

    On my backup server I started using ZFS, which can do RAID on its own. I'm doing the equivalent of RAID 5 (raidz) across three 10 TB drives for 20 TB of usable storage.

      For local development work, I'd recommend skipping RAID altogether. RAID is all about uptime, and development work doesn't really benefit much from that. Just pop in a 2TB NVMe drive and make daily backups.

      Handling a failed SSD:

                                    | RAID                              | Backup
      ------------------------------+-----------------------------------+-----------
      Urgency of getting a new SSD  | low                               | high
      System downtime               | half an hour for swapping the SSD | hours
      Downtime plannable            | yes                               | no
      Annoyance                     | low                               | very high

      Yes, I do have backups of all SSDs, at home and at work. But the RAID buys me time to fix the hardware problem. At home, it's mostly annoying to have to restore a backup. At work, the delay is simply not acceptable on some days, when projects are on fire.


      So, let's assume a major brand NVMe SSD. c't has recently (issue 1/2024) tested some 2 TB SSDs. I will pick the one with the best sequential write performance over the entire SSD, because I would need to write my backup to a fresh SSD. The write performance for a 5 min run is way higher for all SSDs, but caches and other tricks won't help when writing a large part of the capacity. The fastest one is a "Gigabyte Aorus Gen5 12000 SSD" at 2350 MByte/s. That is a big hunk of metal, two heatpipes, and a tiny PCB, with a street price of about 300 €, more than double that of the cheapest SSD tested (138 €). The three slowest SSDs tested can write only 134 MByte/s. The cheapest SSD tested can write 1140 MByte/s.

      I will also assume that the SSD was filled up to 75% before it died (that's how full my SSDs at home are). So we'll need to write 1.5 TByte = 1_500 GByte = 1_500_000 MByte. Assuming a sufficiently fast backup source (i.e. an equivalent SSD in a PCIe slot), the slowest SSD will finish after 11195 sec = 3 hours, the cheapest one after 1316 sec = 22 min, the fastest one after 638 sec = 11 min. Impressive.

      But unfortunately, my backup is not on another expensive SSD. It's on a cheap hard disk RAID on the network. Both at home and at work, it's on a small server on a switched Gigabit network, so we can't get faster than 1 GBit/s without changing hardware. Completely ignoring any protocol overhead and network usage by other users, and assuming sufficiently fast disks in the server, we'll max out at 125 MByte/s. That's even slower than the slowest SSDs in the test, and needs 12000 sec = 3 hours 20 min. With protocol overhead and real hard disks, that's more like four or five hours, perhaps even more.

      Yes, I could upgrade to 10 GBit/s at home, but we won't get 10 GBit/s Ethernet any time soon at work. But let's pretend we would upgrade cabling, switches, servers, and workstations to 10 GBit/s. Again ignoring protocol overhead and network usage by other users, we'll max out at 1250 MByte/s, 1200 sec = 20 min. I'm quite sure the server hard disks won't be able to deliver 1000 MByte/s, so these numbers are just nonsense.

      I could connect the NVMe SSD using USB 3.0 (that's the limit of the server). USB 3.0 runs at 5 GBit/s. Again, I'm completely ignoring any protocol overhead, so that's 625 MByte/s, 2400 sec = 40 min. This is not supported by the backup software, but let's pretend it could restore that way. Again, the server hard disks probably won't be able to deliver 500 MByte/s.
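      For anyone who wants to play with these numbers, here is a little back-of-the-envelope sketch that recomputes the times above: just capacity divided by raw throughput, ignoring protocol overhead and slow server disks, as before.

          #!/usr/bin/perl
          # Back-of-the-envelope restore times: 1.5 TByte divided by raw throughput.
          # Protocol overhead and slow source/server disks are ignored, as above.
          use strict;
          use warnings;

          my $data_mbyte = 1_500_000;    # 75 % of a 2 TB SSD

          my %mbyte_per_s = (
              'slowest SSD in the test'  =>    134,
              'cheapest SSD in the test' =>  1_140,
              'fastest SSD in the test'  =>  2_350,
              'Gigabit Ethernet'         =>  1_000 / 8,
              '10 Gigabit Ethernet'      => 10_000 / 8,
              'USB 3.0 (5 GBit/s)'       =>  5_000 / 8,
          );

          for my $name (sort { $mbyte_per_s{$a} <=> $mbyte_per_s{$b} } keys %mbyte_per_s) {
              my $seconds = $data_mbyte / $mbyte_per_s{$name};
              printf "%-26s %6.0f MByte/s  %6.0f s  = %4.0f min\n",
                  $name, $mbyte_per_s{$name}, $seconds, $seconds / 60;
          }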

      Getting a new SSD by driving to the nearest computer store might take an hour, and I'll have to take whatever SSD is available.

      So, to sum up: restoring a 2 TB NVMe SSD that was filled to 75 % will take more than half a work day, and it needs to be done ASAP, even if other things are burning.

      The RAID solution needs a few clicks in my favorite web shop, the replacement SSD is my favorite model, and it's on my desk within two work days. I can delay that while things are burning. Some time later, I'll have a planned downtime of half an hour for swapping the SSD, and can continue working right after that. Reconstructing the RAID-1 can run at almost maximum write performance, if I allow it. During that time, disk performance will suffer. With the crazy fast SSD, that would be done entirely within the lunch break. With my existing SATA SSDs, it will take two or three hours, with acceptable remaining disk performance. It does not really matter at all. What matters is that I can continue working even when an SSD fails.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        That's a thorough and accurate analysis, but I guess workstation uptime matters less to me, and my recovery process is a bit simpler. I would just hop over to any other working system (like my laptop), check out the git repo, and pick up where I left off, in the worst case waiting for a database snapshot to restore into a Docker container. If I did need to rebuild a workstation in a crunch, I would just use a bootable Linux USB drive and a spare slower HDD or SSD, and only copy back the files I needed right then from my backups.

        As far as probability and cost go, I've only had 2 SSDs die on me in 10 years (other than an entire batch of Samsung 870s with the firmware bug that caused them all to hit end-of-life 10x faster than normal, in which case RAID didn't help and two servers went down). I like to buy the semi-expensive ones for the improved performance at routine daily tasks like running the test suite, and buying those in pairs feels a little too expensive for RAID. Also, there are only a limited number of NVMe slots on a motherboard, though of course you can buy PCIe riser cards for as many PCIe x4 slots as you have.

        In case you hadn't seen them, there are actually USB 3.2 10 Gbit/s NVMe enclosures, which are pretty awesome. Just stick your old NVMe drive in one, and you've got a pretty amazing portable drive.