Random_Walk has asked for the wisdom of the Perl Monks concerning the following question:

I have a file containing 'regexes' in which number ranges are specified like this: [21-43] (match any number from 21 to 43). I have a bit of code that translates this to a Perl regex. It works, but the generated regex is not as efficient as it could be. It would be much nicer if, for the example above, it generated /(2[123456789])|(3\d)|(4[0123])/. Can anyone think of a non-clunky way to do this? BTW, these numbers are IP address ranges, with all that implies.

# Expand ranges of numbers so [21-25] becomes (21|22|23|24|25)
# Ugh!
# if ( $ip =~ /-/ ) {
#     foreach (split(/\./, $ip) {
my @i = split(/\./, $ip);
foreach (@i) {
    if ( /\[(\d+)-(\d+)\]/ ) {
        $_ = "(" . join("|", ($1..$2)) . ")";
    }
}
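A minimal sketch of the kind of translation asked for above, so that [21-43] becomes 2[1-9]|3\d|4[0-3] rather than a 23-way alternation. The names range_to_regex, same_length_alts and digit_class are made up for illustration and are not from the original code. It assumes non-negative decimal numbers without leading zeros, splits the range by digit count, and then works through each piece one digit at a time.

```perl
use strict;
use warnings;

# Shortest regex fragment for a single digit in $from..$to.
sub digit_class {
    my ($from, $to) = @_;
    return $from if $from == $to;
    return '\d'  if $from == 0 && $to == 9;
    return "[$from-$to]";
}

# Alternatives covering $lo..$hi, both digit strings of the same length.
sub same_length_alts {
    my ($lo, $hi) = @_;
    return ('') if $lo eq '';
    my ($lo_d, $lo_rest) = (substr($lo, 0, 1), substr($lo, 1));
    my ($hi_d, $hi_rest) = (substr($hi, 0, 1), substr($hi, 1));

    # Same leading digit: keep it and recurse on the remaining digits.
    return map { $lo_d . $_ } same_length_alts($lo_rest, $hi_rest)
        if $lo_d == $hi_d;

    my $n = length $lo_rest;
    my ($start, $end) = ($lo_d, $hi_d);
    my (@alts, @tail);

    # Partial block at the bottom, e.g. 21..29 out of 21..43.
    if ($lo_rest =~ /[^0]/) {
        push @alts, map { $lo_d . $_ } same_length_alts($lo_rest, '9' x $n);
        $start++;
    }
    # Partial block at the top, e.g. 40..43 out of 21..43.
    if ($hi_rest =~ /[^9]/) {
        @tail = map { $hi_d . $_ } same_length_alts('0' x $n, $hi_rest);
        $end--;
    }
    # Full blocks in between, e.g. 30..39 becomes 3\d.
    push @alts, digit_class($start, $end) . ('\d' x $n) if $start <= $end;
    return (@alts, @tail);
}

# Build a non-capturing alternation matching exactly the integers $lo..$hi.
sub range_to_regex {
    my ($lo, $hi) = map { 0 + $_ } @_;
    die "bad range $lo-$hi\n" if $lo > $hi;
    my @alts;
    for my $len (length($lo) .. length($hi)) {
        my $seg_lo = $len == length($lo) ? $lo : '1' . '0' x ($len - 1);
        my $seg_hi = $len == length($hi) ? $hi : '9' x $len;
        push @alts, same_length_alts($seg_lo, $seg_hi);
    }
    return '(?:' . join('|', @alts) . ')';
}

print range_to_regex(21, 43), "\n";   # prints (?:2[1-9]|3\d|4[0-3])
```

The result is only a fragment; it still needs anchoring or surrounding \. separators before it can be used to match one byte of an IP address.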

Janitored by Arunbear - balanced code tag

Replies are listed 'Best First'.
Re: converting user friendly regex to perl friendly
by ikegami (Patriarch) on Oct 15, 2004 at 18:17 UTC

    Generally speaking, it's better to match any number, then make sure it's in range.

    # Supports ranges in any number of bytes.
    sub parse_ip_mask {
        my ($ip_mask) = @_;
        my $checks = '';
        my $check_num = 0;
        my $ip_re = join('\\.',
            map {
                if (/\[(\d+)-(\d+)\]/) {
                    $check_num++;
                    $checks .= " && \$$check_num >= $1";
                    $checks .= " && \$$check_num <= $2";
                    '(\\d+)'
                } else {
                    $_
                }
            } split(/\./, $ip_mask)
        );
        return eval "sub { (\$_[0] || \$_) =~ /^$ip_re\$/$checks }";
    }

    my $ip_check = parse_ip_mask('131.202.1.[3-4]');

    foreach (qw(
        131.202.1.2
        131.202.1.3
        131.202.1.4
        132.202.1.4
    )) {
        print(&$ip_check()   # Uses $_ if no arguments are specified.
            ? "$_ matches.$/"
            : "$_ doesn't match.$/"
        );
    }

    __END__
    output
    ======
    131.202.1.2 doesn't match.
    131.202.1.3 matches.
    131.202.1.4 matches.
    132.202.1.4 doesn't match.

    It's possible to do this without eval. Give me a few minutes.

      A version without eval:

      # Supports ranges in any number of bytes.
      sub parse_ip_mask {
          my ($ip_mask) = @_;
          my @checks;
          foreach (split(/\./, $ip_mask)) {
              if (/\[(\d+)-(\d+)\]/) {
                  push(@checks, [ 0+$1, 0+$2 ]);
              } else {
                  push(@checks, [ 0+$_, 0+$_ ]);
              }
          }
          return \@checks;
      }

      sub ip_check {
          my ($ip, $checks) = @_;
          my @ip = split(/\./, $ip);
          my $i;
          for ($i=0; $i<4; $i++) {
              return undef if ($ip[$i] < $checks->[$i][0]);
              return undef if ($ip[$i] > $checks->[$i][1]);
          }
          return 1;
      }

      Example usage #1, check if an IP is in range:

      my $ip_mask = parse_ip_mask('131.202.1.[3-4]');

      foreach (qw(
          131.202.1.2
          131.202.1.3
          131.202.1.4
          132.202.1.4
      )) {
          print(ip_check($_, $ip_mask)
              ? "$_ matches.$/"
              : "$_ doesn't match.$/"
          );
      }

      __END__
      output
      ======
      131.202.1.2 doesn't match.
      131.202.1.3 matches.
      131.202.1.4 matches.
      132.202.1.4 doesn't match.

      Example usage #2, finding matching IPs in a line:

      my $ip_mask = parse_ip_mask('131.202.1.[3-4]');

      foreach (
          'Connection established to 131.202.1.2',
          'Connection established to 131.202.1.3',
          'Connection established to 131.202.1.4',
          'Connection established to 132.202.1.4',
      ) {
          my ($ip) = /(\d+\.\d+\.\d+\.\d+)/;
          print("Found matching ip $ip.$/")
              if (ip_check($ip, $ip_mask));
      }

      __END__
      output
      ======
      Found matching ip 131.202.1.3.
      Found matching ip 131.202.1.4.
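The second usage example assumes every line contains exactly one IP address. A rough sketch of a more defensive variant follows, for lines that may hold zero or several addresses. The helper names parse_mask, ip_in_mask and matching_ips are hypothetical, and the two range helpers are compact restatements of the same idea rather than the code posted above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn '131.202.1.[3-4]' into a list of [min, max] pairs.
sub parse_mask {
    my ($mask) = @_;
    my @ranges;
    for my $part (split /\./, $mask) {
        if ($part =~ /^\[(\d+)-(\d+)\]$/) {
            push @ranges, [ $1 + 0, $2 + 0 ];
        } else {
            push @ranges, [ $part + 0, $part + 0 ];
        }
    }
    return \@ranges;
}

# Hypothetical helper: true if every byte of $ip lies within its range.
sub ip_in_mask {
    my ($ip, $ranges) = @_;
    return 0 unless defined $ip;
    my @bytes = split /\./, $ip;
    return 0 unless @bytes == @$ranges;
    for my $i (0 .. $#bytes) {
        return 0 unless $bytes[$i] =~ /^\d+$/;
        return 0 if $bytes[$i] < $ranges->[$i][0];
        return 0 if $bytes[$i] > $ranges->[$i][1];
    }
    return 1;
}

# Return every dotted quad in $line that falls inside the mask.
sub matching_ips {
    my ($line, $ranges) = @_;
    my @candidates = $line =~ /\b(\d+\.\d+\.\d+\.\d+)\b/g;
    return grep { ip_in_mask($_, $ranges) } @candidates;
}

my $ranges = parse_mask('131.202.1.[3-4]');
print "Found matching ip $_.\n"
    for matching_ips('Relay 131.202.1.2 -> 131.202.1.3 -> 131.202.1.4', $ranges);
```

A line with no address simply yields an empty list, so nothing undefined is ever passed to the range check.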
Re: converting user friendly regex to perl friendly
by abitkin (Monk) on Oct 15, 2004 at 17:57 UTC
    Why convert it, when you can easily just change it to a set of if statements and eval that?

    Note, completely untested code, written from memory

    if ($_ =~ /\[(\d+)-(\d+)\]/) {
        s/\[(\d+)-(\d+)\]/command if(\$x >= $1 && \$x <= $2);/;
        eval $_;
    }
    Most likely this code won't compile, but it's a starting point.

    ==
    Kwyjibo. A big, dumb, balding North American ape. With no chin.
Re: converting user friendly regex to perl friendly
by Random_Walk (Prior) on Oct 15, 2004 at 19:29 UTC

    Thanks abitkin and ikegami, those look like cunning ideas. I have been at work for 12 hours now, though, and I am flying early tomorrow morning, so I have printed them out and am going to the pub to sit with a quiet beer and read through them in more detail. Shame I can't get Perl on my PalmPilot.

    Cheers,
    R.

      You can, although I can't remember where. You might want to ask people in the CB.