in reply to Re^4: [OT] A measure of 'sortedness'?
in thread [OT] A measure of 'sortedness'?
Not sure -- it's been a long time since I've had anything detailed enough to warrant computational analysis of the execution cost (I miss the old days when I worked on fun stuff).
A side trip, though, if a full run length index might have had value (does not look like it here), here's a Perlized representation. Some of the code is sloppy, but I didn't go home like I said I would so I optimized Wife-Deprivation Units at the cost of Code-I-Could-Be-Proud-Of Units.
Okay, so here's a proof-of-concept. The assumption is that a Run-Length table will consume notably less space than the original data, such as might be true in a disk I/O operation. This is not a universally-applicable assumption but may yield thoughts toward optimization in your specific case.
#!/usr/bin/perl use strict; # Constants (sort of) my $MAIN_BUFFER_SIZE = 20; my $POSSIBLE_VALUES = 'ABCDE'; my $NUMBER_OF_VALUES = length $POSSIBLE_VALUES; # Silly bu +t deal with it # Globals my %RunLength_Info = (); my @MainBuffer = (); # Generate random buffers with probable data overlaps, then scan, then + merge { foreach my $bufferPass (0..1) { my $workingBuffer = &generateMainBuffer(); push @MainBuffer, $workingBuffer; &scanBuffer($bufferPass); } &dumpMainBuffer(); &mergeBuffers(); } exit; sub generateMainBuffer { my @genBuffer = (); for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++ +) { my $bufCharValue = int(rand($NUMBER_OF_VALUES)); my $bufChar = substr $POSSIBLE_VALUES, $bufCharValue, 1; push @genBuffer, $bufChar; } my @sortedBuffer = sort @genBuffer; my $generatedBuffer = join '', @sortedBuffer; return $generatedBuffer; } sub scanBuffer { my ($bufferNumber, @garbage) = @_; if (!defined $bufferNumber) { die "Found a short-circuit between +the ear buds"; } for (my $bufIndex = 0; $bufIndex < $MAIN_BUFFER_SIZE; $bufIndex++ +) { my $currChar = substr $MainBuffer[$bufferNumber], $bufIndex, + 1; if (!defined $RunLength_Info{$currChar}) { $RunLength_Info{$currChar} = 1; } else { $RunLength_Info{$currChar}++; } } } sub mergeBuffers { my $outBuffer = 0; my $outLength = 0; # Stupid but effective foreach my $bufferClean (0..1) { $MainBuffer[$bufferClean] = ''; } # And the magic foreach my $finalChar (sort keys %RunLength_Info) { my $remCount = $RunLength_Info{$finalChar}; while ($remCount) { if ($outLength < $MAIN_BUFFER_SIZE) { $MainBuffer[$outBuffer] .= $finalChar; $outLength++; $remCount--; } else { $outBuffer++; $outLength = 0; } } } &dumpMainBuffer(); } sub dumpMainBuffer { print "Buffers:\n"; foreach my $aBuffer (@MainBuffer) { print " [$aBuffer]\n"; } } __END__
Results on my test run:
D:\PerlMonks>perl merge1.pl Buffers: [AAAAAABBCCCDDDDEEEEE] [AAAABBBBBBCCCCCDDDDD] Buffers: [AAAAAAAAAABBBBBBBBCC] [CCCCCCDDDDDDDDDEEEEE] D:\PerlMonks>
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^6: [OT] A measure of 'sortedness'?
by BrowserUk (Patriarch) on Mar 19, 2015 at 03:31 UTC | |
by marinersk (Priest) on Mar 19, 2015 at 15:18 UTC | |
by marinersk (Priest) on Mar 19, 2015 at 20:33 UTC | |
by BrowserUk (Patriarch) on Mar 19, 2015 at 21:46 UTC | |
by marinersk (Priest) on Mar 20, 2015 at 14:01 UTC |