
Re^4: Regex AND

by tmoertel (Chaplain)
on Dec 03, 2004 at 18:14 UTC (#412216)

in reply to Re^3: Regex AND
in thread Regex AND

(Update: See "Re: Ways to implement a closure" for more on using closures for this kind of thing.)

ady wrote:

Well, i'd have to open the perl program and change the !~ op to the =~ op each time i want filtering on a "negated domain".

I could do that, but i prefer a way to express the regex complement directly as a new regex (to be fed to the program). -- And the way to do that was shown by Corion above.

Another option would be to use "regex matchers" instead of hand-coded regex operations. A matcher is a closure that wraps a regex, and it can be inverted, so you can change the matching logic of your worker code by passing in either a normal or an inverted matcher.

One possible implementation:

# The following small library lets us create regex-matchers
# and inverted regex-matchers.

sub make_regex_matcher {
    my $regex = shift;
    return sub {
        local $_ = $_[0];
        /$regex/g;
    }
}

sub invert_regex_matcher {
    my $matcher = shift;
    sub {
        wantarray
            ? die "inverted matchers are only for scalar context"
            : ! $matcher->(@_)
    }
}
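As a quick check of how the two helpers behave when called directly, here is a minimal sketch. The helpers are repeated so the snippet runs on its own; the pattern `^JA3\d\d` and the two test strings are made up for illustration and are not taken from the thread.

```perl
use strict;
use warnings;

# Same helpers as above, repeated so this snippet is self-contained.
sub make_regex_matcher {
    my $regex = shift;
    return sub { local $_ = $_[0]; /$regex/g; }
}

sub invert_regex_matcher {
    my $matcher = shift;
    sub {
        wantarray
            ? die "inverted matchers are only for scalar context"
            : ! $matcher->(@_)
    }
}

# Hypothetical pattern, chosen only for this demonstration.
my $is_ja3  = make_regex_matcher('^JA3\d\d');
my $not_ja3 = invert_regex_matcher($is_ja3);

# The ?: condition puts each call in scalar (boolean) context,
# which is what the inverted matcher requires.
for my $code ('JA300', 'CX365') {
    printf "%s: match=%s inverted=%s\n", $code,
        ($is_ja3->($code)  ? 'yes' : 'no'),
        ($not_ja3->($code) ? 'yes' : 'no');
}

# Prints:
#   JA300: match=yes inverted=no
#   CX365: match=no inverted=yes
```

Note that the inverted matcher dies if it is called in list context, so it should only be used in tests such as `if`, `unless`, or `?:`.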

Then we can parameterize our code's matching behavior by using matchers instead of regex operators:

# With the above library, we can write our worker code without
# having to specify whether we are interested in matching (=~)
# or non-matching (!~).  Instead, we can parameterize this
# behavior by allowing our worker to accept a matcher as an
# argument:

my @candidates = map {chomp;$_} <DATA>;

sub do_work {
    my $matcher = shift;
    foreach (@candidates) {
        if ($matcher->($_)) {  # instead of regex op
            # do something with candidate in $_
            print "$_$/";
        }
    }
}

Here is a sample run:

# To demonstrate this approach, let us create a matcher for
# your example pattern:

my $matcher = make_regex_matcher('(CX36(5|6))|(JA30[0-2])|(JA3(([2-8]\d)|(9[0-4])))|(JA5.*)|(JA6((0\d)|(1[0-3])))|(JA64[7-9])|(JA687.*)|(JA74[0-3])|(JB5.*)|(JY(((1|2)\d\d)|(3[0-3]\d)))|(JY[3-9][5-9]\d)|(JZ51(3|4)00.*)');

# Now we can process matching candidates:

print "Matches:$/";
do_work($matcher);

# And we can process non-matching candidates without
# having to change a line of worker code:

print "$/Non-matches:$/";
do_work(invert_regex_matcher($matcher));

### OUTPUT:
###
### Matches:
### CX365-CX366
### JA300-JA302
### JA320-JA394
###
### Non-matches:
### I do not match!
### Nor do I match, my non-matching brother!

__DATA__
CX365-CX366
I do not match!
JA300-JA302
Nor do I match, my non-matching brother!
JA320-JA394
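Because matchers are ordinary code references, they can also be combined. The sketch below is one possible extension, not part of the library above: `and_matchers` is a hypothetical helper that accepts a candidate only when every matcher it was given accepts it. The patterns and candidates are invented for the example.

```perl
use strict;
use warnings;

# Helpers from the library above, repeated so this runs standalone.
sub make_regex_matcher {
    my $regex = shift;
    return sub { local $_ = $_[0]; /$regex/g; }
}

sub invert_regex_matcher {
    my $matcher = shift;
    sub {
        wantarray
            ? die "inverted matchers are only for scalar context"
            : ! $matcher->(@_)
    }
}

# Hypothetical helper (an assumption, not in the original post):
# the returned matcher is true only if all given matchers are true.
sub and_matchers {
    my @matchers = @_;
    return sub {
        my $candidate = shift;
        for my $m (@matchers) {
            # "unless" calls each matcher in scalar context.
            return '' unless $m->($candidate);
        }
        return 1;
    };
}

# Illustrative patterns: starts with "JA" AND does not start with "JA5".
my $starts_ja      = make_regex_matcher('^JA');
my $not_ja5        = invert_regex_matcher(make_regex_matcher('^JA5'));
my $ja_but_not_ja5 = and_matchers($starts_ja, $not_ja5);

for my $code (qw(JA300 JA512 CX365)) {
    print "$code\n" if $ja_but_not_ja5->($code);
}

# Prints:
#   JA300
```

The combined matcher can be handed to a worker such as do_work in exactly the same way as a plain or inverted matcher, since the worker only ever calls it on a candidate and tests the result.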

I hope that this helps.

