in reply to Re: Adding cols to 3d arrays - syntax
in thread Adding cols to 3d arrays - syntax

Another step, now that we know how big this image is, would be to preserve the index arrays (@PPN, @LPN) on disk using Storable after loading them. Then you can reload them quickly instead of scanning the entire 273GB image just to build up indexes before being able to actually look at anything.
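
A minimal sketch of that caching step, assuming @PPN and @LPN are ordinary Perl arrays and picking 'ppn.index' and 'lpn.index' as arbitrary cache file names (those names and the scan routine are placeholders, not anything taken from the image):

    use strict;
    use warnings;
    use Storable qw(nstore retrieve);

    my (@PPN, @LPN);    # index arrays normally built by scanning the whole image

    if (-e 'ppn.index' && -e 'lpn.index') {
        # Fast path: reload the indexes saved by an earlier run.
        @PPN = @{ retrieve('ppn.index') };
        @LPN = @{ retrieve('lpn.index') };
    }
    else {
        # Slow path: scan the 273GB image to build @PPN/@LPN (scan not shown),
        # then save them so the next run can skip straight to the analysis.
        nstore(\@PPN, 'ppn.index');
        nstore(\@LPN, 'lpn.index');
    }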

The patterns in the "bank 32" map pages look suspiciously like flags of some type, likely for the preceding 4096 (=128*32) pages, with 8-bit or 16-bit values being possible fits. Is there a correlation between those values and duplicated LPNs? Perhaps only one of the LPNs in a duplicate set pairs with a particular value, suggesting that it is the valid copy? If "bank 32" holds a flag array, it is possible that the same space in other pages does in fact hold LPNs — whatever fragments of the LPN tables happened to be in the controller's memory when those maps were last written. We already know about the quality of the controller firmware, since the drive is dead.

And to everyone else reading this: since that other thread mentions why we are after this data, I do have to hold this up as an example to others of why you should put backups of important things like baby pictures on write-once optical media from a reputable manufacturer; do not use the cheapest bargain-basement garbage you can find. (The last detail is from a different case of data loss: many of Barack Obama's earliest speeches were recorded only by random members of the audience using direct-to-DVD-R camcorders — on cheap media that was found to be unreadable only a few years later after he had been elected President of the United States.) At the very minimum, store at least some backups on "spinning rust" hard disks; the technology is mature and very reliable. Avoid putting all your data in flash.

Learn from peterrowse's misfortune. The standard backup rule is "3-2-1": at least 3 copies, using at least 2 different storage technologies, with 1 off-site. ("Cloud" can be "off-site", but does not count as a storage technology, since you do not know how the data is actually stored, nor does it count as one of the 3 copies, since it can also disappear without warning.)

Re: Adding cols to 3d arrays - syntax
by peterrowse (Acolyte) on Sep 20, 2019 at 10:49 UTC

    First re the backup, and the woeful lack of it in my case - this is so true. So much effort could have been avoided if I had one. The sad thing is that this is not the first time I have experienced significant data loss due to hard drive failure. In this case, although I had previously used backups, I mistakenly thought SSDs were super robust, and had moved a load of new data (these photos mainly, a few thousand of them) from camera SD cards to the SSD while I organised it ready for moving to the backup machine. I then reused the SD cards :-( The 'organising' project took longer than expected due to family illness and I forgot about the vulnerable data sitting on the SSD, thinking it 'safe as houses' anyway. Meanwhile the power supply developed a fault. Only then did I discover the pitfalls of SSDs.

    Even so, my old backup solution was, I can see now, not good enough. A fire or lightning strike, for instance, would have destroyed all my data, and many other scenarios could as well. I now have a system which keeps 5 copies distributed over 2 locations several miles apart, using different OSes and formatting systems. One is usually offline. I think I still have holes in the system though and am looking to change a few aspects of it. I use spinning rust now, because it can usually be recovered from, certainly a lot more easily than SSD. SSD tech, I now realise, is a complete nightmare, since a power failure at the wrong time can cause exactly what I have - a drive that is extremely difficult, if not impossible, to recover.

    As for optical media, I am wary of it, having had issues in the past with copies becoming unreadable several years later. Maybe that was poor quality media, as you say. The other problem with it is that I have around 3TB of (to me) very important data once you figure in the video taken over the last few years, and optical media, with its small capacity, takes time to write and to make several copies of. I like HDDs now because they are very large and, as you say, extremely mature. Data recovery companies usually have excellent success with them if required; the only severe failure mode is physical damage to the disk, and with multiple copies in more than one location the chances of all of them suffering physical damage are very low. And they can be tasked to do their job reliably even when I am in periods of life when I am not being reliable!

    A second storage technology would be nice, but I wonder about the reliability of the higher density stuff. I must admit I didn't know some discs can hold up to 128GB before looking it up just now, and it's certainly something to look into. A few optical backups on some media which can be trusted would certainly be nice to have.

    I'll get back to the technical side now in another post.

      There is one more advantage to optical media — it is the only commonly available format that is (or should be) entirely waterproof. Optical discs should retain data even after a flood — clean the mildew off and the disc reads fine. (Again, quality is important here, since poor quality discs might not be properly sealed, leading to "laser rot" even if not exposed to water.) This may or may not be relevant to your risk model, and 3TB is a very large amount of data.

      I have so far avoided the "unreadable several years later" problem by using good quality media from reputable manufacturers that I buy when the stores put it on sale (usually almost half-off if I am patient). Since I buy the blanks when they are on sale, I have a significant personal stock that I slowly rotate, and I suspect (and hope) that the blanks that will go bad will go bad before I get around to putting data on them. So far, this strategy has worked and I have yet to retrieve a disc from storage and find it unreadable, although I have had many discs fail verification immediately after writing them. Always read back an optical disc immediately after writing it — do not expect the drive to notice that the blank is bad while it is busy writing data.

      It is probably best to rank by importance (favoring more copies on lower-density media) and bulk (requiring fewer copies on higher-density media). This means the data with higher bulk-to-importance ratios (like high-def video) is exposed to greater risk of loss, but one partial mitigation is to store lower-resolution more-compressed copies of those videos in lower-density "bands" in your archive. I still use CD-Rs for some backups, even though I mostly use DVDs now. (But I do not have a significant collection of video.) So you might have full high-def video stored only on spinning hard disks, but lower-resolution "better than nothing" transcoded copies on DVDs or BDs.

      By now, you have probably learned better than to consider SSDs as valid backup storage. :-) (But they could still be a 3rd technology holding a 4th copy.)

Re: Adding cols to 3d arrays - syntax
by peterrowse (Acolyte) on Sep 20, 2019 at 11:28 UTC

    So re the use of Storable: what I currently do, since I have mainly been using C up until now for the lower level stuff like accessing the disk itself, is to use a cut down disk image for holding the data. I'm probably still thinking more in C terms than Perl though. So I took all the 16kb LBA blocks (131072 of them), chopped off the last 14kb or so, which always contained just 0xFFFFFFFF, and wrote that to a file around 250MB in size. Then I just read it back in when I need it - not always wholly though; I might just scan a few fields using seek and pop them into an array, and then seek to disk locations to get the data I need as I process it. Since there are perhaps 40 million values there, whether this approach or Storable would be faster I don't know; your opinion would be appreciated. I am running the analysis on an SSD (will I never learn?! It's all backed up :-)) so seek times are short. I've assumed Storable stores data as text rather than raw binary and that the overhead of this would be large, but maybe that's a false assumption.
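
    For reference, a rough Perl version of that trimming step might look like the following; the 16kb block size, the 2kb kept per block, and the file names are assumptions taken from the description above rather than known values.

        use strict;
        use warnings;

        my $block_size = 16 * 1024;   # each LBA block in the raw dump
        my $keep       = 2 * 1024;    # leading bytes kept; the rest was all 0xFF padding
                                      # (131072 blocks x 2kb = 256MB, close to the ~250MB file)

        open my $in,  '<:raw', 'lba_blocks.raw'     or die "read: $!";
        open my $out, '>:raw', 'lba_blocks.trimmed' or die "write: $!";

        while (read($in, my $block, $block_size)) {
            print {$out} substr($block, 0, $keep);
        }
        close $out or die "close: $!";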

    Now as for the bank_32 business. As you say, it does look like it does something significant. It doesn't look random enough for things like block write count, and the pattern seems to suggest it's mapping something. I should extract those bank_32 fields and take a closer look, and I'll post some of it up here. The fact that they stored the LBAs in such a logical place, with a bitfield showing validity, a physical block start address, etc. is all very sensible and simple. If this was the style of the coders doing this, it seems they would have put that last little bit of data somewhere accessible too. That's not to say I am crediting them - I think the firmware choking when it sees the bad block area corrupted is poor, but they do seem to have designed the LBA id area fairly well.

    Perhaps, as you say, the bank_32 lpn area stores data relevant to the other banks in this superblock. I can't remember the exact layout of the NANDs and the rules for writing them, but the rules are restrictive regarding write order. Maybe bank_32 needs to be the last one written out of the superblock. If so, it would make sense to write up-to-date validity data for the superblock in this zone (which is what you are saying, I think).

    There are 2 other areas which are worthy of thought too. One is the space for LBA 128 in the lba area, i.e. the last LBA 'slot'. It corresponds to the lba area itself, which is obviously redundant. It is not empty and not FFFFFFFF, so it must do something. IIRC there are also another 4 bytes after this which frequently (always?) are a copy of the LBA 128 value. I should poke around with this a bit more to see what its characteristics are.

    And then there is of course this second LBA area: since it does not correlate with the LBAs in the way I hoped, what the heck is it? I wonder if I am looking at it wrong - perhaps there is a fixed offset between its numbering and the one I am using for blocks, or something. It seems worth stripping down the data for some duplicate LBAs again at this point and seeing if I can spot any patterns manually.

      Storable is an XS module that (quickly) serializes and unserializes Perl data structures to and from its own binary format. The idea is to build the @PPN and @LPN indexes once and then save those as (presumably much smaller) files alongside the image. Actual usage is to read the index arrays back in full, then open the image file and seek/read/unpack only the data that you need for each analysis from the full image.
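
      A sketch of that usage pattern; the page size, the field offsets, and the little-endian 32-bit unpack format are placeholders, since the real layout is only partly known:

          use strict;
          use warnings;
          use Storable qw(retrieve);

          # Reload the index array saved earlier (the file name is a placeholder).
          my @PPN = @{ retrieve('ppn.index') };

          my $page_size = 16 * 1024;   # assumed page size within the full image
          open my $img, '<:raw', 'drive.img' or die "open: $!";

          # Fetch a single page from the full image on demand.
          # (A 273GB image needs a Perl built with large-file support; any modern build has it.)
          sub read_page {
              my ($ppn) = @_;
              seek($img, $ppn * $page_size, 0) or die "seek: $!";
              read($img, my $page, $page_size) == $page_size or die "short read";
              return $page;
          }

          # Example: unpack the first four 32-bit little-endian fields of one page.
          my $page   = read_page($PPN[0]);
          my @fields = unpack 'V4', substr($page, 0, 16);
          print "@fields\n";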

      For efficiency, the controller is likely to batch writes until it has a full erase block and only then "flush the buffers" out to the NAND array, and there may even be structures larger than an erase block that are significant to the FTL. The odd "bank 32" data hints at such a structure. How long is that apparent field?

      If the FTL uses a log structure, the "LBA 128" field might be the write sequence number you have been looking for. The nonsensical "LBA" list may simply be garbage, "unused" space that gets written with whatever happened to be in the controller's memory when writing the block. In other words, it may be a list of LPNs, but not LPNs that are relevant to the current state of the NAND array. Or, in C terms, the contents of an uninitialized buffer.

      Also, a small note about this site: there is a "reply" link for each post, and your post appears as a child of that post if you use it, instead of appearing at the top-level in the thread. PerlMonks also notifies the author of the post you replied to when a reply is made in this way. Please use it. I will request to have this subthread reparented, but please try to maintain the threaded nature of the discussion. The "reply" link for this post should appear to the right of this paragraph. --->

        Sorry about replying with the wrong link - I was originally hitting 'reply', but then my posts seemed to be hidden, so I reverted to the other; I see the reasons to use reply now.

        Anyway, re Storable, I might as well try it tomorrow - it's quick to try, and if it speeds things up it will be helpful. Since you mentioned log structured file systems I've been reading up and trying to get my head around how they would appear on disk - there are references to this in the OpenSSD source, and it does seem likely (perhaps even inevitable) that this disk uses one. But my understanding of them is thin currently, although a bit better after reading on the topic. Certainly what I see so far seems to fit with a log structure (to me at least); it's just missing a sequence number, but I think it's likely I just haven't found that yet.

        What info is available on SSD log structure design seems to point to a more complex design than the 'classroom' one - to be expected, I suppose, with data on wear levelling, write count etc. also needing to be stored somewhere. But it also seems to me that an SSD would likely not want to scan too much on startup, for speed reasons, preferring to cache a single map file of around 60MB somewhere instead. Then the LBA area that I am reading is there as a backup in case the cached map is damaged, although the stale page data should also be there in that case. That map file would need to move around the disk a lot, I imagine, otherwise the physical block it's assigned to would wear quickly. I have binary grepped the disk for some segments of the map file I created but found no match; perhaps I need to conduct a more sophisticated search than a simple hex grep. It would certainly be convenient to find such a file.
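
        One way to do that search in Perl without pulling the whole image into memory: the chunk size is arbitrary, the needle here is just the first 16 bytes of the reconstructed map file (any known slice would do), and the overlap handling is only there so a match straddling a chunk boundary is not missed.

            use strict;
            use warnings;

            # Needle: the first 16 bytes of the reconstructed map file (placeholder choice).
            open my $map, '<:raw', 'lba_blocks.trimmed' or die "open map: $!";
            read($map, my $needle, 16) == 16 or die "short read";

            my $chunk_size = 64 * 1024 * 1024;     # scan the image 64MB at a time
            my $overlap    = length($needle) - 1;  # keep enough tail to catch boundary matches

            open my $img, '<:raw', 'drive.img' or die "open image: $!";

            my ($tail, $buf_start) = ('', 0);      # $buf_start = file offset of $buf's first byte
            while (read($img, my $chunk, $chunk_size)) {
                my $buf = $tail . $chunk;
                my $pos = 0;
                while (($pos = index($buf, $needle, $pos)) >= 0) {
                    printf "match at byte offset %d\n", $buf_start + $pos;
                    $pos++;
                }
                $buf_start += length($buf) - $overlap;
                $tail = substr($buf, -$overlap);
            }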

        The field in bank 32 is short - IIRC 128 bytes of 'active' data, i.e. data which varies across rows. The next couple of hundred bytes contain a pattern, but it's the same across all instances of bank 32 so it can't be significant. Today I spent a little time looking at the data in the field, but not much; I'll hopefully do a little more tomorrow and can say a bit more. What I did see today, though, when I looked at it in binary, is that in most rows there are just a few bits set to 1, or pairs of 1s, with the vast majority being 0s. The numbers range from 0 to very high (IIRC 500M or more), so they are not addresses. And there's far too little variation for it to store much data, except perhaps taken as a whole (the entire disk's worth of bank 32 data). I'll post a bit of it, and if I can find somewhere to host it I'll put a couple of megs' worth up for anyone to look at if they have the time.
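
        A quick way to tally those set bits per row, assuming the bank 32 fields have been extracted into a file of fixed 128-byte rows (the file name is a placeholder):

            use strict;
            use warnings;

            my $row_len = 128;   # bytes of 'active' bank 32 data per row, as described above

            open my $fh, '<:raw', 'bank32_rows.bin' or die "open: $!";

            my %hist;            # set-bit count => number of rows with that count
            while (read($fh, my $row, $row_len) == $row_len) {
                my $ones = unpack '%32b*', $row;   # the '%' checksum form counts the 1-bits
                $hist{$ones}++;
            }

            printf "%3d one-bits: %d rows\n", $_, $hist{$_} for sort { $a <=> $b } keys %hist;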

        Re the LBA 128 field: although I have not analysed it properly, I remember that simple analysis showed a lot of repetition. That put me off the idea of it being a sequence number, my initial thought. But having read up on the log structure, there simply must be a sequence number somewhere, and it makes no sense not to put it in the block being written (perhaps elsewhere too), so I think I need to hunt for that some more. Perhaps repetition is permissible if multiple blocks are written in a 'transaction', for instance.

        I'm in the UK so it's 1am now and I have to knock off, but thanks for the assistance and ideas; I'll hopefully come back with some more info tomorrow.

        Thanks, Pete