in reply to Matching data against non-consecutive range

The memory consumption of this is fixed to the total size of the domain regardless of the number of subranges it contains. Construction and addition of ranges is trivial. Lookup is O(1).

If the total range is somewhere in the bounds of sanity, then this is quick and easy and will often use less memory than the related hash or tree structure.

If the range is large, then you can use vec and reduce the memory requirement to 1/8 th.

It doesn't work well for real numbers--though it can approximate them.

#! perl -slw use strict; use List::Util qw[ reduce ]; $a = $b; my $zones = reduce { my( $first, $last ) = split '-', $b; $last ||= $first; ## 1; corrected per [bmann]'s post below. substr( $a, $first, $last - $first + 1 ) = 'x' x ( $last - $first ++ 1 ); $a } chr(0) x 1000, split ',', '10-374,376-379,382-385,388-499,530-534,541-543,618,619,700-704,707-70 +9'; print $zones =~ m[^.{$_}x] ? "Found $_!" : "$_ isn't there!" for 9, 374, 375, 376, 999; __END__ [12:03:42.74] P:\test>425464 9 isn't there! Found 374! 375 isn't there! Found 376! 999 isn't there!

Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

Replies are listed 'Best First'.
Re^2: Matching data against non-consecutive range
by bmann (Priest) on Jan 27, 2005 at 22:29 UTC
    Slick! I wouldn't have thought of using reduce for this. I'll ++ you tomorrow when I get more votes ;)

    There is a small bug in the implementation, though: single number ranges fail. Add 618 or 619 to the test data, you get "618 isn't there!".

    Here's a patch to fix it:

    # $last ||= 1; # needs to be $last ||= $first;
    And here's another patch that disposes of pre-allocating the 1000 character string for $zones. I'm sure it's slower, but it'll work for an arbitrary range (ie, > 1000):
    my $zones = reduce { my( $first, $last ) = split '-', $b; $last ||= $first; #allocate $a here $a .= chr(0) x ( $last ); substr( $a, $first, $last - $first + 1 ) = 'x' x ( $last - $first ++ 1 ); $a } '', split ',', '10-374,376-379,382-385,388-499,530-534,541-543,618,619,700-704,707-70 +9';

    Update: added second patch

      Patch applied. Many thanks++


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
        That was fast! Too fast. Or maybe I'm too slow ;)

        I was still finishing composing my node - maybe I should have just posted a second node. I added a second bit to get around preallocating the string for $zones, if you're interested.