
All Storable will do is provide a means for you to recover the @LPN and @PPN arrays quickly, after building them once. If they are as small as I expect, that will give you a large improvement in start-up time, and may allow you to examine less of the main image, therefore keeping more of your working data in the OS caches.

For a simple log structure, the scan on start-up could be a binary search to find the greatest distance between sequence numbers, but that would require that existing data be "moved out of the way" to allow the writes to always proceed sequentially. A cached map could also be in some other storage with better endurance characteristics separate from the main NAND array. Or the drive could depend on the host system's POST latency to hide the delays for the start-up scan, or store a few pointers sufficient to locate the first data that the host will want in only a few locations while storing the "rest" of the log index with more flexibility and continuing the scan while servicing the first few host requests.
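A minimal sketch of that start-up scan, in Python for brevity. It assumes the per-block sequence numbers have already been read into a list, that they are distinct, and that they increase with physical position except at a single wrap point; `find_log_break` is a hypothetical name, not anything the drive firmware is known to do.

```python
def find_log_break(seq):
    """Return the index of the oldest entry in a circular log.

    Assumes seq holds distinct sequence numbers that increase with
    physical position except for one wrap point. The newest entry
    (the head of the log) sits immediately before the returned index.
    """
    if not seq:
        raise ValueError("empty sequence list")
    lo, hi = 0, len(seq) - 1
    if seq[lo] <= seq[hi]:
        return 0  # no wrap: the log starts at position 0
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[mid] > seq[hi]:
            lo = mid + 1  # the break lies to the right of mid
        else:
            hi = mid      # mid is already past the break
    return lo
```

The search reads O(log n) sequence numbers instead of all of them, which is the point of the idea; it gives wrong answers if the numbers repeat or if there is more than one break.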

Only a few bits set, with the vast majority clear, strongly suggests some kind of flags field. 128 bytes is 1024 bits, or (if I understand the structures you have found correctly) one bit for each 64KiB region in a group containing a "bank 32" field. If I recall correctly, NAND flash erases to all bits set, and most flash permits some number of write cycles, each only clearing additional bits, between erases. Hypothesis: only one of each set of duplicate LPN will have a corresponding bit set in "bank 32" if this is a validity field.
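The arithmetic and the hypothesis can both be checked mechanically. The sketch below is in Python and makes two assumptions that are not established in this thread: bit 0 is the least significant bit of byte 0, and each physical page has already been reduced to an `(lpn, is_valid)` pair. Both function names are made up for illustration.

```python
from collections import Counter

REGION_SIZE = 64 * 1024  # 64 KiB per bit, per the reading of the structure above


def set_bit_indices(field):
    """Return the indices of the set bits in a bytes-like flags field.

    Bit 0 is taken to be the least significant bit of byte 0; the real
    bit order on the drive is an assumption.
    """
    return [
        byte_no * 8 + bit
        for byte_no, byte in enumerate(field)
        for bit in range(8)
        if (byte >> bit) & 1
    ]


def lpns_without_single_valid_copy(pages):
    """pages is an iterable of (lpn, is_valid) pairs, one per physical page.

    Returns the sorted LPNs whose number of valid copies is not exactly
    one, i.e. the cases that would contradict the validity hypothesis.
    """
    valid_copies = Counter()
    for lpn, is_valid in pages:
        valid_copies[lpn] += 1 if is_valid else 0
    return sorted(lpn for lpn, n in valid_copies.items() if n != 1)


# A 128-byte field covers 1024 regions of 64 KiB, i.e. 64 MiB in total.
example_field = bytearray(128)
example_field[0] = 0x01
example_field[2] = 0x80
covered_bytes = len(example_field) * 8 * REGION_SIZE
```

If `lpns_without_single_valid_copy` returns an empty list for the whole image, the validity-field reading is at least consistent with the data; a long list would argue against it.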

Finding the sequence number depends on guessing the drive's write sequence. For a simple log-structured filesystem that moves data to allow writes to always be sequential, this should be easy, since the sequence numbers will monotonically increase with one break somewhere on the media. That fails, however, if the sequence numbers are not a write sequence but something derived from the power-cycle count: the drive could keep its actual write count in the controller's RAM and reconstruct it from the NAND array on start-up. That would explain the repetition, if it is some kind of session ID.

How full was the filesystem and did the host use the TRIM command to release blocks? If the NAND array was mostly (or entirely) allocated, I would expect very little variation in an "in-use" or "valid data" field across the disk.

Re^5: Adding cols to 3d arrays - syntax
by peterrowse (Acolyte) on Sep 21, 2019 at 13:33 UTC

    Hmm, not sure if I did the right thing replying here; I had to click to see your post, and the page I am seeing does not offer me a reply link, only a comment one. If someone is going to clean up this thread that's great, but let me know if my settings are wrong or something, with your reply being hidden on the original page.

    Spoiler alert: I successfully mounted the drive today; although it's quite 'damaged', mount accepted it.

    Anyway, re your post: I haven't had time yet to explore the suggestions re flags; I was working this morning (while looking after one of my little ones, so a bit disjointedly) on examining the field we are talking about in a more basic fashion. The repetition is considerable, but there are another 4 bytes after the LBA 128 field we discussed, and I wondered if they might be part of the sequence number, but it seems not. I'll have to check what you were saying about the bit fields marking only a single LBA valid - it's an interesting thought.

    Re the write repeatedly point, I think I remember seeing in the datasheet that this is forbidden - I'll have to check to be sure, but I remember thinking it's very restrictive. If it was permitted it could allow a validity field with 0 being invalid, which would be very good, but I don't think so; will check later though.

    I don't quite understand your point re the write count being reconstructed. Sounds interesting though and I will read this throughout the afternoon trying to understand it but might need to come back to you on it later.

    As for the filesystem it was quite full IIRC, about 90% probably (good because there are fewer unused and hence stale pages). I don't think TRIM was used on this drive, it was running OSX (on PC hardware - a hackintosh, stupid experiment I made).

    Now as for the mounting I mentioned earlier. Sleeping on the log structure details and a bit more reading made me wonder about something I saw in the drive some time ago. I made a bitmap image of sector average values to visualise the areas that might contain addresses (since they are 24 bit values occupying 32 bit space). I saw an interesting horizontal darker band about 10% of the drive's capacity in size, occupying the space between about 80% and 90% of the drive space, i.e. towards the bottom. Looking in it did not show anything very significant and I put it aside for now. But the simple log structure file system, I think, can simply treat the whole drive as a log - at least that was my interpretation - and just keep writing to the head of the log, which would of course move along and wrap around to the bottom of the drive as it was written. I wondered if that dark band in any way marked this, since I don't see why it's there; the OS would not be aware of physical locations so could not be responsible. Anyway, I decided to try writing out the map file starting from offset around 85% and wrapping via offset 0% back to 85%. Loading that map file into the kernel module allowed me to mount the image once I had provided the offset of the partition's starting block (found with HFS rescue).

    Now although this creates almost as many questions as it answers, it's an interesting turn. Why is the area darker, for instance? I originally thought it should be light (i.e. all FF), but of course if it was not erased yet because the drive did not know it was unused, it should not be. Perhaps OSX writes 0x00 to the whole drive when it formats, although this install was several months old so I imagine I had turned over the whole 256 gig by this time.

    If it's possible to upload images here I can upload the bitmap file. The dark band has fuzzy edges and is ill defined but certainly there.

    Many of the folders in the root directory are now accessible, although how many files are readable I haven't checked yet. Some folders yield an 'Input/output error', and as you drill down through working folders you hit further such inaccessible ones. Still, this is far further than I have got before, so it's significant. It might, I guess, be by chance that one or two significant blocks holding top-level directory information have now been correctly selected, rather than a large proportion of blocks; I don't know (my understanding of the FS structure is too hazy).
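A minimal sketch of that rotated replay, in Python. It assumes the LPN stored in each physical page has already been extracted into a list (`None` where a page holds no LPN), and it assumes that a copy met later in the scan is newer and should override an earlier one; `build_map` and `page_lpns` are hypothetical names, not the actual tool used here.

```python
def build_map(page_lpns, start):
    """Build an LPN -> PPN map by scanning the pages as a circular log.

    page_lpns[ppn] is the LPN found in physical page ppn, or None.
    The scan begins at physical page `start`, runs to the end of the
    array, wraps through page 0 and stops just before `start`.
    Later pages in that order overwrite earlier ones (assumed to be
    the "last writer wins" rule of the log).
    """
    n = len(page_lpns)
    mapping = {}
    for step in range(n):
        ppn = (start + step) % n
        lpn = page_lpns[ppn]
        if lpn is not None:
            mapping[lpn] = ppn
    return mapping


def start_for_fraction(n_pages, fraction):
    """Physical page index at the given fraction (e.g. 0.85) of the array."""
    return int(n_pages * fraction) % n_pages
```

With duplicates present, the chosen `start` decides which copy wins, which is why moving it from 0% to about 85% can change the result from unmountable to mountable.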

    I think next I should investigate your idea about the bitfields and duplicate LBA correlation now but if anything I have mentioned gives you any further ideas they would be much appreciated.

    Thanks, Pete

      There are options in User Settings for how far into threads replies should appear on a single page in the Note Configuration box. The default values are rather low.

      The idea about reconstructing the write count is simple: the drive does not need actual "transaction sequence numbers" like a multi-user database because there is only one controller accessing the array. The controller only needs to distinguish blocks from "all previous sessions" and blocks being "written now" in the current session, since it need only find the tail of the log at start-up. So there is no need to actually store an incrementing "write sequence number" in the array, only some way to find the tail of the log, which could be a "session number" instead, incremented only on power-up.
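As a rough illustration of the session-number idea, in Python: given one candidate session number per block (or `None` where none could be parsed), the blocks of the newest session are simply those carrying the largest value. This assumes the counter never wraps and that the field really is a session number, neither of which is confirmed; `current_session_blocks` is a hypothetical name.

```python
def current_session_blocks(sessions):
    """Return (latest_session, indices of blocks written in that session).

    sessions[i] is the candidate session number of block i, or None.
    Counter wrap-around is ignored in this sketch.
    """
    known = [s for s in sessions if s is not None]
    if not known:
        raise ValueError("no session numbers found")
    latest = max(known)
    return latest, [i for i, s in enumerate(sessions) if s == latest]
```

Many blocks sharing one value is exactly what this scheme predicts, so heavy repetition in the field does not by itself rule it out as an ordering key.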

      A band about 10% of the drive's capacity in size? 256 (drive size) / 273 (image size) is about .9377, so a ~7% pad area is expected to exist in the NAND array. That dark band may be the tail of the log, possibly with the log pointers and other very useful information in it somewhere. Are there any valid LBA pages in that area?
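The pad estimate as plain arithmetic, in Python, assuming both sizes are in the same units. Depending on whether the spare area is expressed relative to the raw image or to the advertised capacity, it comes out at roughly 6 to 7%.

```python
drive_size = 256  # advertised capacity
image_size = 273  # size of the raw NAND image, same units assumed

ratio = drive_size / image_size              # fraction of the image visible to the host
pad_of_image = 1 - ratio                     # spare area as a fraction of the image
pad_over_drive = image_size / drive_size - 1 # spare area relative to the capacity
```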

        Interesting re the 'session number' - in this scenario since there are duplicates, I assume that implies the dupes of the suspected session number can be considered a single transaction so can be restored in any order. Something to try tomorrow or perhaps next week due to weekend time constraints.

        A week or so ago I tried a different chips bank sequence, which has been successful and brought me to this point, and the bitmap, although still interesting, did not 100% reflect that. I just rearranged it to properly reflect it and have uploaded it to some random file sharing site if you would like a peek.

        http://www.filedropper.com/ppmfile2rearr_1

        The lines which represent the LBA blocks are only 1 pixel wide so you need to zoom in to 100% to see them. They are the 32 vertical black lines. The thick horizontal wavy black band towards the bottom is all 0xFF, in contrast to the black LBA lines which, although close to 0xFF, are not (they range from 0 to 16M or so). Being 8 bit, the shades don't show this, so I might see if I can write a 16 bit PPM tomorrow. Each pixel is a 16k page (or block depending on your terminology). The 8 bands of more even grey before each pair of black bands are interesting and will need more examination, but I need to call it a night today so that will be tomorrow or Monday.

        Now that I can see it more clearly, there is a definite pattern going on there. It very much does look like 32 'heads of logs', with such order in it shared by all 32 vertical bands. Although I don't 100% know the relation between addressing and chips in the SSD, there are 32 NANDs, so with 32 of these bands it certainly looks like each chip is a vertical band. Towards the right of the image there are some other facets which look like partly written erase blocks at first glance.

        One thing that confuses me is why the black area exists. Why would the drive bother to write 0x00 to these areas if erase resets to 0xFF - unless the OS's view of the drive is inverted, perhaps? If so that would make perfect sense, with erased blocks at the head of each chip / log.

        As for whether there are any valid LBA pages, it will take me some time to check. I might be able to do this tomorrow but it depends on what the family want to do; Monday otherwise.
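A minimal sketch of a 16 bit PPM writer, in Python. In the binary PPM format (P6) a maxval above 255 means each sample is two bytes, most significant byte first. The scaling is an assumption: each per-page value is treated as a 24 bit number and its low 8 bits are dropped, with anything larger clamped to white; `ppm16_bytes` is a hypothetical helper.

```python
import struct


def ppm16_bytes(values, width, height):
    """Encode per-page values as a grey 16-bit-per-channel binary PPM.

    values is a flat, row-major list of non-negative integers, one per
    pixel. Each is shifted right by 8 bits (24 bit -> 16 bit) and
    clamped to 65535, then written as equal R, G and B samples.
    """
    if len(values) != width * height:
        raise ValueError("values does not match width * height")
    header = f"P6\n{width} {height}\n65535\n".encode("ascii")
    body = bytearray()
    for v in values:
        sample = min(v >> 8, 65535)
        body += struct.pack(">HHH", sample, sample, sample)  # big-endian samples
    return header + bytes(body)
```

The result can be written with `open(path, "wb").write(ppm16_bytes(...))`. A 16 bit PGM (P5) would carry the same information in a third of the space, if the viewer accepts it.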

        Very pleased with getting this far so thanks for the ideas and assistance!

        Pete