in reply to Re: strange behavior of grep with global match
in thread strange behavior of grep with global match [resolved]

Thanks Marshall.

Sorry for not providing more context. I am trying to match 'exception:' not followed by 'tex' without using a negative lookahead assertion. See "Match with line without a word" for the motivation/context. I have alternatives that produce the desired result. At the moment I am trying to understand the behavior of this particular construct.

I am also interested in learning better/alternative ways of achieving the objective. I would be interested in solutions that exclude strings with 'tex' immediately after 'exception:' and those that exclude 'tex' anywhere after 'exceptions:'. In either case, strings with 'tex' before but not after 'exceptions:' should not be excluded. This is why /exception:/ && !/tex/, though simple, isn't a solution.
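To make the last point concrete, here is a minimal sketch of how the simple test behaves. The sample strings are hypothetical and chosen only for illustration; they are not from the original data.

```perl
use strict;
use warnings;

# Hypothetical sample strings, for illustration only.
my @samples = ('exception:mex', 'exception:tex', 'tex:exception:mex');

# The simple test rejects every string that contains 'tex' anywhere,
# including one where 'tex' appears only before 'exception:'.
my @simple = grep { /exception:/ && !/tex/ } @samples;

print "$_\n" for @simple;
```

With these samples only 'exception:mex' survives; 'tex:exception:mex' is dropped even though its 'tex' comes before 'exception:'.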

Re^3: strange behavior of grep with global match
by moritz (Cardinal) on Aug 07, 2009 at 09:31 UTC
    and those that exclude 'tex' anywhere after 'exceptions:'.

    You can achieve that easily by adding .* (or maybe (?s:.)*) before the tex, so either (?!.*tex) or /exception:/g && !/\G.*tex/s (after which you have to reset pos, as explained in my reply below).
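Both variants might look roughly like this in a grep. This is only a sketch: the sample strings and variable names are made up for illustration, and the lookahead is attached directly after 'exception:' as an assumption about how it would be used.

```perl
use strict;
use warnings;

# Hypothetical sample strings, for illustration only.
my @samples = ('exception:mex', 'exception:tex',
               'tex:exception:mex', 'tex:exception:mex:tex');

# Variant 1: negative lookahead, no 'tex' anywhere after 'exception:'.
my @lookahead = grep { /exception:(?!.*tex)/s } @samples;

# Variant 2: the /g match leaves pos() just after 'exception:',
# and \G anchors the second match at that position.
my @with_g = grep {
    my $keep = /exception:/g && !/\G.*tex/s;
    pos($_) = undef;    # reset so a later /g match starts at the beginning
    $keep;
} @samples;

print "lookahead: @lookahead\n";
print "with /g:   @with_g\n";
```

For these samples both variants keep 'exception:mex' and 'tex:exception:mex'. The explicit pos reset matters because grep aliases $_ to the array elements, so a leftover pos would otherwise affect the next /g match on the same strings.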

Re^3: strange behavior of grep with global match
by alexm (Chaplain) on Aug 07, 2009 at 10:49 UTC
    I would be interested in solutions that exclude strings with 'tex' immediately after 'exception:'
    /exception:(?:[^t]|t[^e]|te[^x])/
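One thing to be aware of: as written, each alternative needs at least one more character after 'exception:', so a string that ends right after 'exception:', 'exception:t' or 'exception:te' is not matched. The sketch below compares the pattern as posted with a variant that also accepts end of string; the \z alternatives and the sample strings are my own additions for illustration, not part of the original suggestion.

```perl
use strict;
use warnings;

# Hypothetical sample strings, for illustration only.
my @samples = ('exception:mex', 'exception:tex', 'exception:toy',
               'tex:exception:', 'exception:te');

# The pattern as posted.
my @as_posted = grep { /exception:(?:[^t]|t[^e]|te[^x])/ } @samples;

# Assumed extension: also accept end of string at each step.
my @with_eos = grep {
    /exception:(?:\z|[^t]|t(?:\z|[^e])|te(?:\z|[^x]))/
} @samples;

print "as posted: @as_posted\n";
print "with \\z:   @with_eos\n";
```

With these samples the posted pattern keeps 'exception:mex' and 'exception:toy'; the extended one additionally keeps 'tex:exception:' and 'exception:te'. Both reject 'exception:tex'.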
Re^3: strange behavior of grep with global match
by Marshall (Canon) on Aug 07, 2009 at 11:05 UTC
    The more detail we know about the problem, the greater the likelihood of success! I tried again with a couple of simple approaches, shown below.

    Approach #1: All solutions must have the sequence of "tex" followed by "exception" or "exceptions", so the first grep (read these grep "stacks" from the bottom up) takes care of that situation. The next grep then says that if "tex" occurs more than once, this is a bad line. I don't know whether "tex:tex:exception" can occur; if it does, this approach would filter it out. But if 'tex' can occur only once before the exception part, this works fine.

    Approach #2: Starts the same as Approach #1, but the second grep says that if "tex" follows exception(s) then this is a "bad line". Remember that the first grep{} assured that we are looking at a line that has "tex..blah..exceptions"; here we are looking to see whether some 'tex' follows that exception part, and if so we filter the line out.

    I don't see the need for any fancy look ahead/behind voodoo. Yes, there is a place and a situation for that, but I would go with something simple to understand. If this doesn't do what you want, then modify @data and the "#desired is:" comment section to more accurately describe what you need.

    #!/usr/bin/perl -w
    use strict;

    my @data = ('exception:mex', 'qwerty', 'tex:exception:mex',
                'exception:mex', 'exception:tex', 'tex:exception',
                'tex:exception:tex', 'tex : exceptions:mex',
                'tex:exception:mex:tex', 'asdf');

    #desired is: tex:exception:mex
    #            tex:exception
    #            tex : exceptions:mex

    my @matches = grep{ #approach #1
                        my @texes = m/tex/g;
                        @texes <=1
                      }
                  grep{/tex.*?exception(s)?/} @data;

    print join ("\n",@matches),"\n\n";

    @matches = grep{ !/exception(s)?.*?tex/} #approach #2
               grep{/tex.*?exception(s)?/} @data;

    print join ("\n",@matches),"\n";

    __END__
    Prints:
    tex:exception:mex
    tex:exception
    tex : exceptions:mex

    tex:exception:mex
    tex:exception
    tex : exceptions:mex