Not sure -- it's been a long time since I've had anything detailed enough to warrant computational analysis of the execution cost (I miss the old days when I worked on fun stuff).

A side trip, though, if a full run length index might have had value (does not look like it here), here's a Perlized representation. Some of the code is sloppy, but I didn't go home like I said I would so I optimized Wife-Deprivation Units at the cost of Code-I-Could-Be-Proud-Of Units.

Okay, so here's a proof-of-concept. The assumption is that a Run-Length table will consume notably less space than the original data, such as might be true in a disk I/O operation. This is not a universally-applicable assumption but may yield thoughts toward optimization in your specific case.

#!/usr/bin/perl use strict; # Constants (sort of) my $MAIN_BUFFER_SIZE = 20; my $POSSIBLE_VALUES = 'ABCDE'; my $NUMBER_OF_VALUES = length $POSSIBLE_VALUES; # Silly bu +t deal with it # Globals my %RunLength_Info = (); my @MainBuffer = (); # Generate random buffers with probable data overlaps, then scan, then + merge { foreach my $bufferPass (0..1) { my $workingBuffer = &generateMainBuffer(); push @MainBuffer, $workingBuffer; &scanBuffer($bufferPass); } &dumpMainBuffer(); &mergeBuffers(); } exit; sub generateMainBuffer { my @genBuffer = (); for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++ +) { my $bufCharValue = int(rand($NUMBER_OF_VALUES)); my $bufChar = substr $POSSIBLE_VALUES, $bufCharValue, 1; push @genBuffer, $bufChar; } my @sortedBuffer = sort @genBuffer; my $generatedBuffer = join '', @sortedBuffer; return $generatedBuffer; } sub scanBuffer { my ($bufferNumber, @garbage) = @_; if (!defined $bufferNumber) { die "Found a short-circuit between +the ear buds"; } for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++ +) { my $currChar = substr $MainBuffer[$bufferNumber], $bufIndex, + 1; if (!defined $RunLength_Info{$currChar}) { $RunLength_Info{$currChar} = 1; } else { $RunLength_Info{$currChar}++; } } } sub mergeBuffers { my $outBuffer = 0; my $outLength = 0; # Stupid but effective foreach my $bufferClean (0..1) { $MainBuffer[$bufferClean] = ''; } # And the magic foreach my $finalChar (sort keys %RunLength_Info) { my $remCount = $RunLength_Info{$finalChar}; while ($remCount) { if ($outLength < $MAIN_BUFFER_SIZE) { $MainBuffer[$outBuffer] .= $finalChar; $outLength++; $remCount--; } else { $outBuffer++; $outLength = 0; } } } &dumpMainBuffer(); } sub dumpMainBuffer { print "Buffers:\n"; foreach my $aBuffer (@MainBuffer) { print " [$aBuffer]\n"; } } __END__

Results on my test run:

D:\PerlMonks>perl merge1.pl Buffers: [AAAAAABBCCCDDDDEEEEE] [AAAABBBBBBCCCCCDDDDD] Buffers: [AAAAAAAAAABBBBBBBBCC] [CCCCCCDDDDDDDDDEEEEE] D:\PerlMonks>


In reply to Re^5: [OT] A measure of 'sortedness'? by marinersk
in thread [OT] A measure of 'sortedness'? by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.