comment on

Not sure -- it's been a long time since I've had anything detailed enough to warrant computational analysis of the execution cost (I miss the old days when I worked on fun stuff).

A side trip, though, if a full run length index might have had value (does not look like it here), here's a Perlized representation. Some of the code is sloppy, but I didn't go home like I said I would so I optimized Wife-Deprivation Units at the cost of Code-I-Could-Be-Proud-Of Units.

Okay, so here's a proof-of-concept. The assumption is that a Run-Length table will consume notably less space than the original data, such as might be true in a disk I/O operation. This is not a universally-applicable assumption but may yield thoughts toward optimization in your specific case.

#!/usr/bin/perl

use strict;


# Constants (sort of)
my   $MAIN_BUFFER_SIZE   = 20;
my   $POSSIBLE_VALUES    = 'ABCDE';
my   $NUMBER_OF_VALUES   = length $POSSIBLE_VALUES;         # Silly bu
+t deal with it


# Globals
my   %RunLength_Info          = ();
my   @MainBuffer = ();


# Generate random buffers with probable data overlaps, then scan, then
+ merge
{
     foreach my $bufferPass (0..1)
     {
          my $workingBuffer = &generateMainBuffer();
          push @MainBuffer, $workingBuffer;

          &scanBuffer($bufferPass);
     }

     &dumpMainBuffer();

     &mergeBuffers();
}

exit;


sub generateMainBuffer
{
     my @genBuffer = ();

     for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++
+)
     {
          my $bufCharValue = int(rand($NUMBER_OF_VALUES));
          my $bufChar = substr $POSSIBLE_VALUES, $bufCharValue, 1;
          push @genBuffer, $bufChar;
     }
     my @sortedBuffer = sort @genBuffer;
     my $generatedBuffer = join '', @sortedBuffer;

     return $generatedBuffer;
}


sub scanBuffer
{
     my ($bufferNumber, @garbage) = @_;
     if (!defined $bufferNumber) { die "Found a short-circuit between 
+the ear buds"; }

     for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++
+)
     {
          my $currChar = substr $MainBuffer[$bufferNumber], $bufIndex,
+ 1;
          if (!defined $RunLength_Info{$currChar})
          {
               $RunLength_Info{$currChar} = 1;
          }
          else
          {
               $RunLength_Info{$currChar}++;
          }
     }
}


sub mergeBuffers
{
     my $outBuffer = 0;
     my $outLength = 0;

     # Stupid but effective
     foreach my $bufferClean (0..1)
     {
          $MainBuffer[$bufferClean] = '';
     }

     # And the magic
     foreach my $finalChar (sort keys %RunLength_Info)
     {
          my $remCount = $RunLength_Info{$finalChar};

          while ($remCount)
          {
               if ($outLength < $MAIN_BUFFER_SIZE)
               {
                    $MainBuffer[$outBuffer] .= $finalChar;
                    $outLength++;
                    $remCount--;
               }
               else
               {
                    $outBuffer++;
                    $outLength = 0;
               }
          }
     }
     &dumpMainBuffer();
}


sub dumpMainBuffer
{
     print "Buffers:\n";
     foreach my $aBuffer (@MainBuffer)
     {
          print "     [$aBuffer]\n";
     }
}


__END__
[download]

Results on my test run:

D:\PerlMonks>perl merge1.pl
Buffers:
     [AAAAAABBCCCDDDDEEEEE]
     [AAAABBBBBBCCCCCDDDDD]
Buffers:
     [AAAAAAAAAABBBBBBBBCC]
     [CCCCCCDDDDDDDDDEEEEE]

D:\PerlMonks>
[download]

In reply to Re^5: [OT] A measure of 'sortedness'? by marinersk
in thread [OT] A measure of 'sortedness'? by BrowserUk

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.