Yesterday, one of the SSDs in my main computer suddenly died, from one second to the next. It simply disappeared from the system, leaving behind two very confused virtual machines that had lost access to their virtual disks stored on that SSD. I lost about one hour of work this way. That would have been annoying, but could have been fixed easily: Shut down, rip out the SSD, replace it with a fresh SSD or a hard disk, and restore the backup.

But: That SSD was added at the beginning of the Covid-19 pandemic, as a quick hack to make room for the VMs needed for working from home. It was never intended to be in use for more than a few weeks, and so I simply forgot to include that disk in the configuration of the backup software.

I spent about an hour trying to read the dead SSD using two other computers, but it is dead. It identifies correctly, but reports junk when reading SMART data, and does not return a single bit of user data. I reassembled my computer, added a temporary HDD, ordered a replacement SSD, and started a 17-hour copy job to get the required VMs as huge ZIP files from work to home. It will take another hour or two to unpack and reconfigure the VMs for the new environment, and one or two more hours to resync some work data from a cloud service.

This is totally my fault. Having no backup for that disk was stupid, period.

So, take this as a warning if you are - like me - used to getting an audible warning from a failing disk. SSDs die silently and suddenly. You won't get those nasty metal workshop sounds you know from failing hard disks.

Check your backups, and check your backup configuration.
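One way to catch a forgotten disk is to compare the mounted filesystems against the paths the backup software is configured to save. The following is only a minimal sketch of that idea in Python, not how any particular backup tool works. It assumes Unix-style paths and a backup tool that does not cross filesystem boundaries, so a filesystem counts as covered only if at least one backup root lives on it. The function names and the example paths (`/vm`, `/home/me`) are made up for illustration.

```python
from pathlib import PurePosixPath


def owning_mount(path, mount_points):
    """Return the mount point holding `path`: the longest mount point
    that is either `path` itself or one of its ancestors."""
    p = PurePosixPath(path)
    candidates = [
        m for m in mount_points
        if PurePosixPath(m) == p or PurePosixPath(m) in p.parents
    ]
    # The deepest matching mount point is the filesystem the path lives on.
    return max(candidates, key=lambda m: len(PurePosixPath(m).parts), default=None)


def uncovered_mounts(mount_points, backup_roots):
    """Return, sorted, the mount points that hold none of the backup roots.

    Assumption: the backup tool stays within one filesystem per root.
    """
    covered = {owning_mount(root, mount_points) for root in backup_roots}
    return sorted(m for m in mount_points if m not in covered)


if __name__ == "__main__":
    # Hypothetical example: /vm was added later and never put into the config.
    mounts = ["/", "/home", "/vm"]
    roots = ["/etc", "/home/me"]
    for mount in uncovered_mounts(mounts, roots):
        print(f"WARNING: nothing on {mount} is backed up")
```

In practice the list of mount points would come from the operating system and the list of backup roots from the backup software's configuration. Both are passed in as plain lists here because the way to obtain them differs from setup to setup.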

Updates:

Changed some wording.

https://www.backblaze.com/blog/ssd-edition-2022-drive-stats-review/ does not make SMART monitoring of SSDs look very promising. SSD SMART data is messy at best:

[L]et’s talk about SSD SMART stats. [...] we’ve been wrestling with SSD SMART stats for several months now, and one thing we have found is there is not much consistency on the attributes, or even the naming, SSD manufacturers use to record their various SMART data. For example, terms like wear leveling, endurance, lifetime used, life used, LBAs written, LBAs read, and so on are used inconsistently between manufacturers, often using different SMART attributes, and sometimes they are not recorded at all.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

In reply to [OT] Reminder: SSDs die silently by afoken
