Re: Matching date range with pure regex
by Abigail-II (Bishop) on Feb 17, 2004 at 11:21 UTC
|
local $" = "|";
/^(?:@{[1950 .. 2050]})$/;
Abigail
|
By way of explanation: $" is the list separator Perl uses when an array is interpolated into a string, so setting it to "|" builds a long regexp through basic string interpolation, like the following:
/^(?:1950|1951|1952|1953|...|2049|2050)$/;
After generating a regex like that, if you will be using it often, you might want to optimize it for common prefixes or suffixes.
use Regex::PreSuf;
my $re = presuf(1950..2050);
-- [ e d @ h a l l e y . c c ]
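To make the interpolation step concrete, here is a minimal sketch of the same idea with the pattern built as a plain string first, so it can be inspected. The helper name build_year_pattern is made up for illustration and is not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: join a range of years with "|" by way of the
# list separator $", then anchor the alternation at both ends.
sub build_year_pattern {
    my ($first, $last) = @_;
    local $" = "|";                 # used when an array is interpolated into a string
    my @years = ($first .. $last);
    return "^(?:@years)\$";
}

my $pattern = build_year_pattern(1950, 2050);
my $re      = qr/$pattern/;

# Show the start of the generated pattern, then try a few inputs.
print substr($pattern, 0, 24), "...\n";
for my $candidate (qw(1949 1950 2050 2051 19534)) {
    printf "%-5s %s\n", $candidate, ($candidate =~ $re ? 'ok' : 'nok');
}
```

Compiling the pattern once with qr// avoids redoing the interpolation on every match, which matters if the check runs in a loop.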
|
If the OP wanted something optimized, he wouldn't have
used a regex to begin with.
Abigail
Re: Matching date range with pure regex
by grinder (Bishop) on Feb 17, 2004 at 11:08 UTC
|
I might be missing part of your question, but this appears (applying the KISS principle) to do what you ask:
#! /usr/bin/perl -wl
use strict;
while( <DATA> ) {
chomp;
print "$_ ", /^(19[5-9]\d|20([0-4]\d|50))$/ ? 'ok' : 'nok';
}
__DATA__
1949
1950
1951
1999
2000
2001
2010
2049
2050
2051
2102
22102
19534
19080
20010
Re: Matching date range with pure regex
by Roger (Parson) on Feb 17, 2004 at 11:10 UTC
|
use strict;
while (<DATA>) {
chomp;
print "$_ is ", /^(?:19|20)
(?:
(?:(?<=19)[5-9]|(?<=20)[0-4])[0-9] |
50)$
/x ? "Ok" : "not ok", "\n"
}
__DATA__
1050
1950
2050
2054
1980
2004
3100
Updated: Added the trailing '$' in the regex to limit the length to 4 characters. Thanks MCS for pointing that out. :-)
Re: Matching date range with pure regex
by posix_guy (Novice) on Feb 17, 2004 at 11:21 UTC
|
Try /^(19[5-9]\d)|(20([0-4]\d)|50)$/
|
You need to add an outermost set of parentheses. /^((19[5-9]\d)|(20(([0-4]\d)|50)))$/
I don't know why, but I'm always paranoid when using | in a regex, primarily because I don't know exactly how it works. (I need to read Mastering Regular Expressions, I know ...)
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
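For anyone else unsure how | binds: it has the lowest precedence in a regex, so without an enclosing group the ^ applies only to the first alternative and the $ only to the last. The sketch below compares the two patterns from this sub-thread on a few inputs; the variable names and the extra test values 22050 and 2051 are mine, chosen for illustration.

```perl
use strict;
use warnings;

# As first posted: parsed as  ^(19[5-9]\d)  OR  (20[0-4]\d | 50)$
my $ungrouped = qr/^(19[5-9]\d)|(20([0-4]\d)|50)$/;

# With the outer parentheses: both anchors apply to every alternative.
my $grouped = qr/^((19[5-9]\d)|(20(([0-4]\d)|50)))$/;

for my $candidate (qw(1950 2050 19534 22050 2051)) {
    printf "%-5s ungrouped=%-3s grouped=%s\n",
        $candidate,
        ($candidate =~ $ungrouped ? 'ok' : 'nok'),
        ($candidate =~ $grouped   ? 'ok' : 'nok');
}
```

The ungrouped pattern accepts 19534 because nothing anchors the end of its first alternative, and 22050 because nothing anchors the start of its last one.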
Re: Matching date range with pure regex
by MCS (Monk) on Feb 17, 2004 at 15:04 UTC
|
Kudos to the voters... awarding Abigail (with the slowest regex) the most votes (at least as I write this). There is a difference between wanting to do it with a regex and wanting to do it as slowly as possible. I took the above regexes from grinder, Roger, posix_guy, and Abigail and ran them through a benchmark doing 10,000 iterations each.
First, Roger: you might want to add a $ on the end there, because yours matches 19534, 20010, etc. And posix_guy: yours matched 19534 for some reason. Anyway, I took grinder's data set and put it in an array (with multiple subroutines all reading __DATA__, it wasn't displaying things properly). Then I separated each regex into its own subroutine, named after its author. I've attached the code if anyone wants to run it themselves. All results are from my PowerBook G4 667 MHz.
Benchmark: timing 10000 iterations of abigail, grinder, posixguy, roger...
grinder: 1 wallclock secs ( 1.06 usr + 0.03 sys = 1.09 CPU) @ 9174.31/s (n=10000)
roger: 2 wallclock secs ( 1.02 usr + 0.07 sys = 1.09 CPU) @ 9174.31/s (n=10000)
posixguy: 1 wallclock secs ( 1.37 usr + 0.03 sys = 1.40 CPU) @ 7142.86/s (n=10000)
abigail: 57 wallclock secs (51.45 usr + 0.38 sys = 51.83 CPU) @ 192.94/s (n=10000)
|
Benchmark: timing 10000 iterations of abigail, grinder, posixguy, roger...
abigail: 55 wallclock secs (51.42 usr + 0.12 sys = 51.54 CPU) @ 194.02/s (n=10000)
grinder: 1 wallclock secs ( 0.60 usr + 0.00 sys = 0.60 CPU) @ 16666.67/s (n=10000)
posixguy: 1 wallclock secs ( 0.96 usr + 0.00 sys = 0.96 CPU) @ 10416.67/s (n=10000)
roger: 1 wallclock secs ( 0.61 usr + 0.00 sys = 0.61 CPU) @ 16393.44/s (n=10000)
Updated code is in a readmore block (I removed the prints and added a $)
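The benchmark code itself is not reproduced in this thread, so what follows is only a rough sketch of how such a comparison could be set up with the core Benchmark module. It is not the code that produced the numbers above. The sample list is grinder's data set; the %matchers hash, the iteration count, and the choice to precompile the alternation once with qr// are my assumptions, so timings from it may look quite different.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# grinder's data set, held in an array instead of __DATA__
my @samples = qw(1949 1950 1951 1999 2000 2001 2010 2049 2050
                 2051 2102 22102 19534 19080 20010);

# Assumption: build the 1950..2050 alternation once, up front.
my $alternation = do {
    local $" = "|";
    my @range = (1950 .. 2050);
    qr/^(?:@range)$/;
};

# One subroutine per approach; each returns how many samples matched.
my %matchers = (
    grinder => sub { scalar grep { /^(19[5-9]\d|20([0-4]\d|50))$/ } @samples },
    abigail => sub { scalar grep { $_ =~ $alternation } @samples },
);

# Run each subroutine 20,000 times and print a comparison table.
cmpthese(20_000, \%matchers);
```

Having each subroutine return a match count makes it easy to check that both approaches agree before trusting the timings.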
Re: Matching date range with pure regex
by podian (Scribe) on Feb 17, 2004 at 15:52 UTC
|
I like the following one. If that does not work please let me know. I am not an expert in regex but want to become one!
while (<DATA>)
{
chomp;
if (/^(19|20)([5-9]|[0-5])([0-9])$/)
{
print "it matches for $_\n";
}
}
__DATA__
1950
2050
2001
2009
2000
It says: first two digits should be 19 or 20, third digit can go from 5 to 9 or 0 to 5 and the fourth digit is 0-9.
|
([5-9]|[0-5]) doesn't really make sense: together the two classes cover every digit, so it's the same as [0-9]. Your regex can be simplified to /^(19|20)\d\d$/, which matches every year from 1900 to 2099 and so doesn't meet the requirements.
If you really want to become a regex master, the first step to enlightenment is to read "Mastering Regular Expressions" by Jeffrey Friedl. The link is to the publisher's site (O'Reilly) but you can get it just about anywhere. That book taught me everything I know about regular expressions.
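A quick way to check a simplification like this is to run both patterns over every four-digit string and compare the results. This is only a sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $posted     = qr/^(19|20)([5-9]|[0-5])([0-9])$/;   # as posted above
my $simplified = qr/^(19|20)\d\d$/;                   # the claimed equivalent

# Every four-digit string from 0000 to 9999
my @candidates = map { sprintf "%04d", $_ } 0 .. 9999;

# Inputs on which the two patterns give different answers
my @disagree = grep {
    ($_ =~ $posted ? 1 : 0) != ($_ =~ $simplified ? 1 : 0)
} @candidates;

# Inputs the posted pattern accepts although they fall outside 1950..2050
my @out_of_range = grep {
    $_ =~ $posted && ($_ < 1950 || $_ > 2050)
} @candidates;

printf "disagreements: %d\n", scalar @disagree;
printf "accepted outside 1950..2050: %d\n", scalar @out_of_range;
```

If the two patterns really are equivalent, the first count is zero, and the second count shows how many unwanted years get through.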
Re: Matching date range with pure regex
by Anonymous Monk on Feb 18, 2004 at 11:46 UTC
|
#!/usr/bin/perl
use strict;
use warnings;
while(<DATA>) {
chomp(my $is_valid = $_);
print $is_valid, "\n" if grep {/^$is_valid$/} (1950..2050);
}
__DATA__
1949
1950
1951
1999
2000
2001
2010
2049
2050
2051
2102
22102
19534
19080
20010
best regards,
Ronnie