in reply to Need Set-like Regex for Perl and Java (golf OK)

I'm not much in favor of regexes to solve problems like this in one swell foop. So long as the number of elements is sane, I'd build a hash and check for existence of a key. A regex is used to recognise ranged data and parse the boundary elements:

#!/usr/bin/perl
my $set = [qw/A000 A123-A456 A999-B000 B789-B888/];
my %elements;
for (@$set) {
    /-/ or $elements{$_} = undef or next;
    if ( /([[:alpha:]]\d{3})-([[:alpha:]]\d{3})/ ) {
        my $start = $1;
        do { $elements{$start} = undef } while $start++ ne $2;
    }
}
while (<DATA>) {
    chomp;
    print "$_ is ",
        (exists $elements{$_} ? '' : 'not '),
        'in the set.', $/;
}
__DATA__
A000
A001
A122
A123
A124
A154
A320
A455
A456
A457
A530
A779
A932
A998
A999
B000
B001
B123
B666
B788
B789
B790
B820
B887
B888
B889
B900
That's a typical trade of memory for speed. The loop populating the %elements hash makes use of magical string increment to roll A over to B.
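For anyone who hasn't met magical increment before, here is a minimal sketch of the rollover on its own. The variable names $id and @seen are mine, not part of the script above, and the magic only applies while the scalar has been used purely as a string.

```perl
use strict;
use warnings;

# Magical increment works on strings matching /^[a-zA-Z]*[0-9]*\z/
# that have only ever been used in string context.
my $id = 'A998';    # hypothetical starting value
my @seen;
for (1 .. 3) {
    push @seen, $id;
    $id++;          # string increment with carry: A999 becomes B000
}
print "@seen\n";    # prints: A998 A999 B000
```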

I should call attention to the difference between

do { $elements{$start} = undef } while $start++ ne $2;
and
$elements{$start} = undef while $start++ ne $2;
The latter will test and increment $start before any processing is done, while the do {} while cond; construction does the first pass unconditionally.
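To make the difference visible, here is a small sketch that runs both forms over the same short range. The sub names expand_do_while and expand_modifier_while are hypothetical helpers made up for this comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: do {} while runs the body once before the first test.
sub expand_do_while {
    my ($start, $end) = @_;
    my %h;
    do { $h{$start} = undef } while $start++ ne $end;
    return sort keys %h;
}

# Hypothetical helper: the plain statement modifier tests (and increments) first.
sub expand_modifier_while {
    my ($start, $end) = @_;
    my %h;
    $h{$start} = undef while $start++ ne $end;
    return sort keys %h;
}

my @with_do  = expand_do_while('A123', 'A126');
my @modifier = expand_modifier_while('A123', 'A126');

print "do/while: @with_do\n";     # A123 A124 A125 A126
print "modifier: @modifier\n";    # A124 A125 A126 (first element lost)
```

In both forms the upper bound is included, since the comparison sees the value before the post-increment takes effect. Only the lower bound is at risk.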

There is also a tricky bit of logic in the single-element handling. The trailing ... or next; is used to go to the next iteration unconditionally, because the preceding expression (the assignment of undef) is always false.
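If the chained or is too terse for your taste, here is a sketch of the same control flow spelled out with unless, run side by side with the original idiom. The names @spec, %terse, %explicit and the two range arrays are illustrative only.

```perl
use strict;
use warnings;

my @spec = qw/A000 A123-A125 B001/;    # illustrative input
my (%terse, %explicit, @terse_ranges, @explicit_ranges);

for (@spec) {
    # Original idiom: without a dash, the assignment yields undef (false),
    # so "or next" always fires for single elements.
    /-/ or $terse{$_} = undef or next;
    push @terse_ranges, $_;            # reached only for ranged entries
}

for (@spec) {
    # Explicit equivalent of the line above.
    unless (/-/) {
        $explicit{$_} = undef;
        next;
    }
    push @explicit_ranges, $_;
}

print "singles: @{[ sort keys %terse ]}\n";    # A000 B001
print "ranges:  @terse_ranges\n";              # A123-A125
```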

After Compline,
Zaxo

Re^2: Need Set-like Regex for Perl and Java (golf OK)
by Roy Johnson (Monsignor) on Jun 09, 2004 at 15:02 UTC
    You could initialize your %elements hash quite a bit more easily:
    my @set = ('A000','A123'..'A456','A999'..'B000','B789'..'B888');
    my %elements;
    @elements{@set} = (undef)x@set;
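As a quick sanity check on the range approach, here is a sketch that builds the same hash and counts the keys. It assigns an empty list to the slice, which I believe is equivalent to (undef)x@set since every key still gets created with an undef value. $count is just a name picked for the example.

```perl
use strict;
use warnings;

# The range operator uses magical string increment on these endpoints.
my @set = ('A000', 'A123'..'A456', 'A999'..'B000', 'B789'..'B888');

my %elements;
@elements{@set} = ();    # empty list: each key is created with an undef value

my $count = keys %elements;
print "$count keys\n";   # 437 keys (1 + 334 + 2 + 100)
print "A999 ", (exists $elements{A999} ? 'is' : 'is not'), " in the set.\n";
```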

    The PerlMonk tr/// Advocate