I think your bitmap idea still has merit, but employed slightly diffrently. It's not an algorithmic reduction, but it does reduce overhead by 50% (one buffer scan instead of two). Not going to win any awards, but 50% overhead reduction isn't devoid of merit.
If you build the bitmap during the mandatory pre-merge scans (this would require a parallel, non-lock-step scan as you have already proposed) and use it to merely indicate which source buffer the resulting buffer data will come from (essentially doing the merge decision during the pre-scan), then your merge pass doesn't have to think -- it just has to do.
Buffers:
[AABEFFHHKMOOQRRSSTUV]
[ABCDEHIJKLLNPPRSVVWY]
Bitmap:
[0011011100000111101101001100011000001111]
If you can cheaply convert the length of the alternating runs into an integer
|
<lightbulb> |
|
|
or just store them as (2- or 4-bit?) integers to begin with |
|
</lightbulb> |
Then you can just feed that into your block data move routine and achieve maximum blocking/minimum data move, all from metadata collected on that first mandatory linear pass.
I would presume you'd use the rest of that 10% OOL buffer to perform the actual data migration.
|
<lightbulb> |
|
|
If you performed the actual data migration in reverse, you could shrink the bitmap/runlength data as you use it, and thus increase the amount of linear usable freespace in the buffer for the actual data migration, which you would naturally build from the back of the OOL buffer and stop before you step on your shrinking metadata. By the last pass you could be using almost the entire OOL buffer for data. |
|
</lightbulb> |
On each pass, the OOL buffer looks something like below. I say "something like" because, like a bad car mechanic, I wound up with spare parts at the end -- I clearly messed something up doing it by hand in a text editor LOL. (This is why I write programs to do the work for me. They don't miss. They don't get bored. They just do.)
OOL Buffer:
The front of the buffer contains the runlength index
The back of the buffer contains the data to be migrated
The dots represent freespace in the buffer
[2213541211223254...........................................]
[221354121122325........................................VVWY]
[22135412112232....................................SSTUVVVWY]
[2213541211223...................................RSSSTUVVVWY]
[221354121122.................................QRRRSSSTUVVVWY]
[22135412112................................PPQRRRSSSTUVVVWY]
[2213541211...............................OOPPQRRRSSSTUVVVWY]
[221354121...............................NOOPPQRRRSSSTUVVVWY]
[22135412..............................LMNOOPPQRRRSSSTUVVVWY]
[2213541.............................KLLMNOOPPQRRRSSSTUVVVWY]
[221354.............................KKLLMNOOPPQRRRSSSTUVVVWY]
[22135..........................HHIJKKLLMNOOPPQRRRSSSTUVVVWY]
[2213......................EEFFHHHIJKKLLMNOOPPQRRRSSSTUVVVWY]
[221....................BCDEEFFHHHIJKKLLMNOOPPQRRRSSSTUVVVWY]
[22....................BBCDEEFFHHHIJKKLLMNOOPPQRRRSSSTUVVVWY]
[2...................AABBCDEEFFHHHIJKKLLMNOOPPQRRRSSSTUVVVWY]
Of course you still have all the same other problems to solve: How do you minimize data motion and where do you put data which is going to be displaced by the eventual write from OOL to Main buffer, etc.
I'd have to model it to get a feel for what the real savings would be, if any.
It is dependent on either an inexpensive conversion of bitrun-to-integer, or the ability to use and store reasonably-sized tinyint incrementors to be of use. Would not expect Perl to be your tool of choice. :-)
Update: Removed department of redundancy department.
|