comment on

I've attempted to structure this so that what gets displayed when you use the d/l code link at the bottom, will be a working program, slightly modified from the original. I won't know if it worked until I submit it.

Hopefully, this is a slightly clearer version than the original.

#! perl -slw
use strict;
[download]

OK, the basic mechanism is this:

Given the input ranges

my @ranges = (
    [17, 19], [34, 39], [26, 29], [53, 57], [43, 47], [58, 59], [40, 4
+5],
    [30, 33], [20, 24], [10, 15], [ 6,  9], [ 1,  4], [35, 45], [ 7, 1
+5],
);
[download]

If we sort these numerically and lay them out graphically, we can easily project the 1's down vertically and see the resultant ranges on the bottom line. That's exactly what my code is doing.

=pod comment
                  1         2         3         4         5         6
Range     123456789012345678901234567890123456789012345678901234567890
[ 1, 4]  1111
[ 6, 9]       1111
[ 7,15]        111111111
[10,15]           111111
[17,19]                  111
[20,24]                     11111
[26,29]                           1111
[30,33]                               1111
[34,39]                                   111111
[35,45]                                    11111111111
[40,45]                                         111111
[43,47]                                            11111
[53,57]                                                      11111
[58,59]                                                           11

         1111 1111111111 11111111 1111111111111111111111     1111111
Coaleced
ranges   [1-4] [6-15]     [17-24]  [26-47]                    [53-59]

#! initialise a string  of null as long as the string the ranges perta
+in to.
my $coal = "\0" x 100; #! Only need be 60 in ths example but the exces
+s bytes don't effect the results.

Using substr as an lvalue, assign a string of range[high]-range[low] x
+ '1' over the null bytes if the result string.

#! fill all the ranges with 1's
substr($coal, $_->[0], $_->[1] - $_->[0] + 1) = 1 x ($_->[1] - $_->[0]
+ + 1) for @ranges;
=cut
[download]

That might have been clearer written as


#! Only need be 60 in this example but excess bytes don't effect the r
+esults.
my $result = "\0" x 100;

use constant LOW    =>0;
use constant HIGH    =>1;
for my $range (@ranges) {
    substr( $result, $range->[LOW], $range->[HIGH] - $range->[LOW] +1 
+)
    =       '1' x                  ($range->[HIGH] - $range->[LOW] +1 
+)
}
[download]

Then, it's just a matter of scanning allong the string linearly and recording the starts and stops of the contguous runs of '1''s.


#! Build the new AoA of ranges by scanning the string records starts a
+nd length of string of 1's
my (@results, $start);
for (0 .. length $result) { #! Looking at each character in the result
+ string in turn
    my $c = substr($result, $_, 1); #! Grab the char

    #! skip to the next char until we find a 1.
    next if not $start and $c ne '1';

    #! Record the start position if we found a 1 and we haven't alread
+y got a start.
    $start = $_ if $c eq '1' and not $start;

    #! Skip to the next char if we have start pos until we find the en
+d (ne '1')
    next if $start and $c eq '1';

    #! We have a start and we found a non-'1', so we have an end...
    #! so push the start end pair onto the results array
    push @results, [$start, $_-1];

    #! and undef $start to continue the search.
    $start = undef;
}

#! Print out the inputs sorted for human consumption, though that isn'
+t necessary of the algorithm
print map{local $"='-'; "[@$_] "} sort{ $a->[0] <=> $b->[0] } @ranges;
+  #!"
print '';

#! And the results
print map{local $"='-'; "[@$_] "}@results; #!"
print '';

__END__
c:\test>227155-3
[1-4] [6-9] [7-15] [10-15] [17-19] [20-24] [26-29] [30-33] [34-39] [35
+-45] [40-45] [43-47] [53-57] [58-59]

[1-4] [6-15] [17-24] [26-47] [53-59]


c:\test>
[download]

Note: If your string are of really extreme length and space at a premium, you could use a bit-string instead of bytes ("\0" and "1"), by using vec inplace of substr but I don't get on with vec, and my test string is short.

BTW. If you don't use one yet, the code is a lot clearer to read in a syntax highlighting editor than b&w. Though that's a matter of taste and configuration.

Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

In reply to Re: Re: Re: Overlapping portions of sub strings by BrowserUk
in thread Overlapping portions of sub strings by Anonymous Monk

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.