in reply to Need Set-like Regex for Perl and Java (golf OK)
I would not do this with a regex, because a hash is much easier. The regex is not overly complex, but it takes patience and attention to detail to maintain. Assuming we want to match A000, A123-A456, A999-B000, and B789-B888 (all ranges inclusive), I would run the following code with:
prompt> matchrange.pl | more
    #!/usr/bin/perl
    use warnings;
    use strict;

    foreach ('A000' .. 'B999') {
        print "$_ ";
        # /x lets the pattern be laid out with whitespace for readability
        if (/^(?:
                A(?:000|12[3-9]|1[3-9]\d|[2-3]\d\d|4[0-4]\d|45[0-6]|999)
              | B(?:000|789|79\d|8[0-7]\d|88[0-8])
            )$/x) {
            print "match\n";
        }
        else {
            print "does not match\n";
        }
    }

The code could be optimized further with more non-capturing groups: for A, one group around the 100s alternatives with the leading 1 "factored" out and one around the 400s with the leading 4 factored out; for B, the same around the 700s and 800s. I decided not to do this because it would hamper maintainability and comprehensibility.
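For illustration, here is one way the "factored" pattern described in this post might look. This is only a sketch of my reading of that description, not code from the original script; the variable names `$plain`, `$factored`, `$matches`, and `$mismatches` are made up for the example. It checks the factored pattern against the unfactored one over the whole 'A000' .. 'B999' range.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from the post, alternatives in numerical order.
my $plain = qr/^(?:
    A(?:000|12[3-9]|1[3-9]\d|[2-3]\d\d|4[0-4]\d|45[0-6]|999)
  | B(?:000|789|79\d|8[0-7]\d|88[0-8])
)$/x;

# Hypothetical factored variant: the leading 1 and 4 (for A) and the
# leading 7 and 8 (for B) are pulled out in front of nested groups.
my $factored = qr/^(?:
    A(?:000|1(?:2[3-9]|[3-9]\d)|[23]\d\d|4(?:[0-4]\d|5[0-6])|999)
  | B(?:000|7(?:89|9\d)|8(?:[0-7]\d|8[0-8]))
)$/x;

# Count keys the original pattern accepts, and keys where the two
# patterns give different answers.
my ($matches, $mismatches) = (0, 0);
foreach my $key ('A000' .. 'B999') {
    my $p = $key =~ $plain    ? 1 : 0;
    my $f = $key =~ $factored ? 1 : 0;
    $matches++    if $p;
    $mismatches++ if $p != $f;
}
print "matches: $matches, mismatches: $mismatches\n";
```

A check like this is cheap insurance when hand-editing such a pattern, since the nesting makes an off-by-one in a character class easy to miss.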
As a further optimization, the alternatives should be ordered left to right from the largest range covered to the smallest. Hence, for A, 000 and 999 should come last, whereas the 200-399 range should come first. However, this makes the regex harder to read, so I decided to keep the alternatives in numerical order.
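As a sketch of that ordering, the alternatives below are arranged from widest range to narrowest and compared against a hash of the wanted keys. The name `$reordered` and the counter `$disagreements` are mine, and I have not measured whether this ordering is actually faster; that would depend on the regex engine and on the input data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reference set: every key that should match.
my %match;
@match{ 'A000', 'A123' .. 'A456', 'A999', 'B000', 'B789' .. 'B888' } = ();

# Hypothetical reordering, widest range first:
#   A: 200-399, 130-199, 400-449, 123-129, 450-456, 000, 999
#   B: 800-879, 790-799, 880-888, 789, 000
my $reordered = qr/^(?:
    A(?:[23]\d\d|1[3-9]\d|4[0-4]\d|12[3-9]|45[0-6]|000|999)
  | B(?:8[0-7]\d|79\d|88[0-8]|789|000)
)$/x;

# Because the pattern is anchored at both ends and every alternative
# consumes exactly three digits, reordering cannot change what matches.
my $disagreements = 0;
foreach my $key ('A000' .. 'B999') {
    my $by_regex = $key =~ $reordered     ? 1 : 0;
    my $by_hash  = exists($match{$key})   ? 1 : 0;
    $disagreements++ if $by_regex != $by_hash;
}
print "disagreements: $disagreements\n";
```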
I would use a hash if the dataset were not too big (large being relative to the amount of RAM you have):
    my %match;
    my @num = ('A000', 'A123' .. 'A456', 'A999', 'B000', 'B789' .. 'B888');
    @match{ @num } = ();

    foreach my $key ('A000' .. 'B999') {
        print "$key ";
        if ( exists $match{$key} ) {
            print "match\n";
        }
        else {
            print "does not match\n";
        }
    }
Replies are listed 'Best First'.
Re: Regex is possible but I prefer a hash.
by QM (Parson) on Jun 09, 2004 at 23:50 UTC
Re: Regex is possible but I prefer a hash.
by BrowserUk (Patriarch) on Jun 10, 2004 at 01:10 UTC