in reply to Re: (OT) Redundant Backup
in thread (OT) Redundant Backup

I know that DVDs don't have the longevity of magnetic media, but I should have mentioned a few things --

Never trust your backups. You should regularly refresh any important backups (ie, archival storage that you don't have a current copy of) ... I'd normally do a refresh at about 25-40% of the expected media lifetime, if I was only maintaining a single copy. (I'd move to 50% if I had two verified copies, written with different mechanisms, to different brands of media). There are no current standards for 'archival quality' DVDs. Based on one vendor, I wouldn't worry about media for 25+ years, depending on the media and proper storage.
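To make the arithmetic concrete, here is a minimal sketch of that rule of thumb in Python. The function name and parameters are made up for illustration, and the percentages are just my own habits from the paragraph above, not any standard.

```python
def refresh_window(expected_lifetime_years, verified_copies=1):
    """Return (earliest, latest) media age in years at which to refresh.

    Assumption (a personal rule of thumb, not a standard): a single copy
    is refreshed at 25-40% of the expected media lifetime; two verified
    copies on different media are refreshed at 50%.
    """
    if expected_lifetime_years <= 0:
        raise ValueError("expected lifetime must be positive")
    if verified_copies >= 2:
        point = 0.50 * expected_lifetime_years
        return (point, point)
    return (0.25 * expected_lifetime_years, 0.40 * expected_lifetime_years)
```

With a 25 year vendor estimate and a single copy, that works out to a refresh somewhere between roughly 6 and 10 years in.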

For sake of argument -- magtape's recommended archival lifetime is 10-20 years. And they make a good point -- the bigger problem is change in technology. (Ever had someone come to you with a 5 year old tape that no one bothered to keep a drive around to read? I've had it happen more than once: DDS1, 8MM video, DLT, reel-to-reel.) Sometimes the issue isn't the physical media; the data can't be recovered because they didn't have the right software to restore it.

For the situation described (40GB backup) ... the media for mag tape is going to cost him more than a DVD burner ... and if he does DLT, he's going to need a SCSI or FC card ... I just can't see the justification for it given what was asked for. Hell, we're looking at backing up terabytes for long term archival storage (ie, speed to recover is not a major factor)

Re^3: (OT) Redundant Backup
by fundflow (Chaplain) on Mar 27, 2006 at 20:18 UTC
    Thanks again for the long reply, but I feel that my question wasn't answered. What is it that you are refreshing? Could it be that you are making more copies of corrupt data? This can happen if your backup DVDs or hard-disks had some problem. For the hard-disk, this could also be something caused by a virus or mistaken overwrite.

    I want to make absolutely sure that the data is fine. Also, in case it is not, I want to be able to repair it. Making 2 or more copies does not solve that problem, but luckily there is ECC.

    I will post a script when it's ready, as it will surely be useful for many people.

      In the situation I described, you refresh the archival copy (ie, you read it in, and write it back out). That wasn't directly about your problem, though; it was about the suggestion to use DLT because it supposedly has a longer lifetime.

      ECC does not help you in the case of complete media failure, physical loss, etc. Offsite backups do ... yes, there is a risk of the backups becoming lost or corrupted.

      Normally when archiving, you maintain some sort of a checksum in an attempt to verify that the archive hasn't become corrupted. (Yes, checksums can't be used to verify data integrity against malicious behaviour, but the odds of a checksum collision happening purely by chance as the media degrades are very slight.)
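      As a rough illustration of the checksum idea, here is a minimal Python sketch that records a checksum per file and later reports which files no longer match. The function names are hypothetical, SHA-256 is just one reasonable choice, and the manifest itself would have to be stored somewhere other than the media being checked.

```python
import hashlib
import os

def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest

def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

      You would run build_manifest once when the archive is written, and verify_manifest at each refresh; an empty list means nothing has visibly degraded.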

      For what you're asking, with ECC, I wouldn't want to maintain that information on the same disk -- For the type of scenario that I'm describing, it's much more efficient overall to just make multiple copies, and store them separately. If the main archive is found to be corrupt, then you go to the backup, then the second backup, etc. After a while, there's a limited return -- but you have to weigh the cost of the backups against the risk and cost of a given loss.

      From what you've described, the data is kept online -- so you could use something like tripwire to tell you when something's gone wrong, and in that case, restore from the backup. I was trying to provide what I believe to be a better alternative to what you were attempting to do -- of course, I don't know how often your data changes (if you're only keeping 40GB, but it changes daily, my recommendations aren't useful), but if it's just a matter of keeping some pictures in an online repository (ie, they get added to, but not modified), this approach should work.

      Anyway, as I've gotten off on a tangent again -- ECC is for bit level corruption, not catastrophic failure, or even accidental file deletion. Full backups will protect against more types of potential loss. If you're really paranoid about your data, you could combine the two, but you'd have to see if the overall cost is justified for your particular situation.

        Thanks again. I really appreciate your feedback.

        My plan is indeed to have several backups, but also to enhance them with ECC.
        This way, each backup has a better standalone value in the sense that it may be possible to recover small errors. From my experience, disks tend to break random sectors, corrupting one image (1-4 MB) somewhere in the middle. If a whole image fails, then there will hopefully be another backup which has it. This can be automated via some merging script, much like using a checksum. Same goes for a complete disk failure.
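        The merging step could look something like the following Python sketch. It is not the script I mentioned, only an illustration: the names are hypothetical, and it assumes a manifest of known-good SHA-256 checksums (relative path to hex digest) was saved when the backups were made.

```python
import hashlib
import os
import shutil

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def merge_copies(manifest, copy_roots, dest_root):
    """Rebuild dest_root from the first backup holding a good copy of each file.

    manifest maps relative paths to expected checksums; copy_roots lists the
    backup directories in order of preference. Returns the sorted relative
    paths that no backup could supply intact.
    """
    unrecovered = []
    for rel, expected in manifest.items():
        for root in copy_roots:
            candidate = os.path.join(root, rel)
            if os.path.isfile(candidate) and sha256_of(candidate) == expected:
                target = os.path.join(dest_root, rel)
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copyfile(candidate, target)
                break
        else:
            # No backup had an intact copy of this file.
            unrecovered.append(rel)
    return sorted(unrecovered)
```

        Anything left in the returned list is where the ECC data would have to earn its keep.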

        The cost of this approach is a 50% increase in the data size, which is reasonable here. To my understanding, while disks (both magnetic and optical) do use ECC already, they will not use a 50% redundancy. For me (and surely many other photographers) the value of the data is worth this price, and even more. Especially with the low cost of storage nowadays.
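        To show where a 50% figure can come from, here is a small Python sketch using plain XOR parity: one parity block for every two data blocks. This is a stand-in for illustration only, not the ECC scheme I will actually use. It can rebuild one block out of each group of three, and only if something else (a checksum, say) has already told you which block is bad.

```python
def xor_bytes(left, right):
    """XOR two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(left, right))

def make_parity(block_a, block_b):
    """Build the parity block for two equal-sized data blocks (50% overhead)."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must be the same length")
    return xor_bytes(block_a, block_b)

def recover_block(surviving_block, parity):
    """Rebuild the lost data block from the surviving one plus the parity."""
    if len(surviving_block) != len(parity):
        raise ValueError("block and parity must be the same length")
    return xor_bytes(surviving_block, parity)
```

        A real tool would more likely use something like Reed-Solomon coding, which tolerates more damage for the same overhead, but the storage arithmetic is the same idea.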

        Thanks as well for the link.

Re^3: (OT) Redundant Backup
by aquarium (Curate) on Mar 29, 2006 at 15:01 UTC
    which specific DLT tape format were you unable to read later on a newer tape drive?
    the hardest line to type correctly is: stty erase ^H

      There were two incidents, at two different companies:

      The first problem was with a DLT 7000 mechanism -- we upgraded to DLT 8000 mechanisms in the tape library, and everything went fine 'till we needed a restore -- some of the tapes refused to be read. We tried them in another library that had DLT 7000s -- still no luck. Luckily, the drives hadn't been excessed, and they still had the originals -- and they had to read _every_last_ tape that had been written on the problem mechanism, and write it back out using a different drive. (we didn't have anyone verifying the backups had been written cleanly ... hell, when we had a big disaster years later (complete loss of the mail system for 30k students), we found that the 'full' backups politely stopped at 2GB without throwing warnings, rather than exceeding maximum file size for Solaris 2.6 -- too bad we were running Solaris 7, and backing up 36GB partitions.)

      The second problem with the DLTs wasn't that we couldn't read them on a newer tape drive -- it's that we didn't have a tape drive to read them from. The data was migrated to a new system, and the older system was decommissioned and excessed. The DLT drive was given away to another department ... before I ever started working there. Years later, someone went to look through the data repository, and realized some data was missing -- so we called up the offsite storage folks, and had them deliver the backups -- which were DLT-IV. So, we managed to get the DLT mechanism back from the folks who had it ... but we didn't have the necessary software to read the tapes back in. (I wasn't dealing with the restore ... it was either Alpha/VMS or Alpha/Tru64.) I know Amy spent a few months on it, but I don't know if she ever got the data restored, or just gave up.

      which specific DLT tape format were you unable to read later on a newer tape drive?

      I had something similar happen with a DLT drive. To be honest I can't remember the version/type but tapes on one drive would not restore on another. Since then I've always been sure to have at least two instances of the backup/restore hardware and to check backups/restores on both regularly.