comment on

Other monks have suggested a binary search solution, but did not provide an example. I took writing an optimized binary search for this as a challenge and wrote this:

#!/usr/bin/perl
# -*- CPerl -*-

use strict;
use warnings;

# sparse range structure:
#  array of arrayref/number
#   single integers represented as number
#   contigous ranges as arrayref [LOW,HIGH] inclusive

# given: string "A,B,C,X-Y,Z"
# return: sorted array [A, B, C, [X, Y], Z]
sub parse ($) {
  my @elements = split /,/, shift;

  foreach (@elements) {
    s/\s+//g;                # prune whitespace
    next unless m/\d+-\d+/;        # skip conversion if single integer
    $_ = [split /-/, $_];        # convert ranges to arrayrefs
  }

  # sort range set
  @elements = sort {(ref($a)?$a->[0]:$a) <=> (ref($b)?$b->[0]:$b)} @el
+ements;

  # merge overlapping loose elements into preceding ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next unless ref $elements[$i];    # skip single integers
    while ($i+1 <= $#elements and $elements[$i+1] <= $elements[$i][1])
      { splice @elements, $i+1, 1 }    # remove elements included in r
+ange
  }

  # coalesce contiguous integers into ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next if ref $elements[$i];        # skip ranges
    if ($elements[$i]+1 == $elements[$i+1]) {
      my $j = 1+$i;
      $j++ while !ref($elements[$j]) && $elements[$j]+1 == $elements[$
+j+1];
      splice @elements, $i, 1+$j-$i, [$elements[$i], $elements[$j]];
    }
  }

  # merge adjacent loose elements into succeeding ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next if ref $elements[$i];        # skip ranges
    next unless ref $elements[$i+1];    # but next element is a range
    # There can be at most one such element, since contiguous integers
+ were
    # coalesced into ranges above.
    if ($elements[$i]+1 == $elements[$i+1][0])
      { splice @elements, $i, 2, [$elements[$i], $elements[$i+1][1]] }
  }

  # merge adjacent loose elements into preceding ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next unless ref $elements[$i];    # skip single integers
    next if ref $elements[$i+1];    # but next element is a single int
+eger
    # There can be at most one such element, since contiguous integers
+ were
    # coalesced into ranges above.
    if ($elements[$i][1]+1 == $elements[$i+1])
      { splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1]] }
  }

  # merge overlapping ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next unless ref $elements[$i] and ref $elements[$i+1];
    splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1][1]]
      if $elements[$i][1] >= $elements[$i+1][0];
  }

  # merge adjacent ranges
  for (my $i = 0; $i < $#elements; $i++) {
    next unless ref $elements[$i] and ref $elements[$i+1];
    splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1][1]]
      if $elements[$i][1]+1 == $elements[$i+1][0];
  }

  return \@elements;
}

# given: sorted array from sub parse and integer
# return true if the integer is in the sorted array
sub search ($$) {
  my $set = shift;
  my $num = shift;

  my $left = 0; my $right = $#$set;
  my $i = $#$set >> 1;            # bitshift for integer /2
  while ($left < $right) {
    if (ref($set->[$i])) {        # evaluate a range
      return 1        # number within this range
    if $num >= $set->[$i][0] && $num <= $set->[$i][1];
      if ($num > $set->[$i][0])    { $left  = $i+1 }
      else            { $right = $i-1 }
    } else {                # evaluate a single integer
      return 1        # number matched
    if $num == $set->[$i];
      if ($num > $set->[$i])    { $left  = $i+1 }
      else            { $right = $i-1 }
    }
    $i = ($left + $right) >> 1;        # bitshift for integer /2
  }


  # last check
  if (ref($set->[$i])) {
    return $num >= $set->[$i][0] && $num <= $set->[$i][1]
  } else {
    return $num == $set->[$i]
  }
}

my $Set = parse shift;

use Data::Dumper;
print Data::Dumper->new([$Set],[qw(Set)])->Indent(0)->Dump,"\n";

foreach (@ARGV) {
  print $_, search($Set, $_) ? ' is' : ' is not', " in the set\n";
}
[download]

The script expects a set of numbers as its first command line argument, and then various numbers to test against the set as additional arguments. Examples:

$ ./bsearch.pl 1,2,5,6,9,10,41-56 1 4 42 17
$Set = [[1,2],[5,6],[9,10],[41,'56']];
1 is in the set
4 is not in the set
42 is in the set
17 is not in the set
[download]

$ ./bsearch.pl 1,2,11-16,6,7,19,9,5-8,13,14,15,4 1 2 3 4 5 8 9 10 11 1
+2 16 17 18 19 20
$Set = [[1,2],[4,9],[11,16],19];
1 is in the set
2 is in the set
3 is not in the set
4 is in the set
5 is in the set
8 is in the set
9 is in the set
10 is not in the set
11 is in the set
12 is in the set
16 is in the set
17 is not in the set
18 is not in the set
19 is in the set
20 is not in the set
[download]

This latter example was used for developing the set-optimization logic.

In reply to Re: Comparing a value to a list of numbers by jcb
in thread Comparing a value to a list of numbers by g_speran

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.