Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello very smart monks. I'm trying to match the following text in a text file: "/newsearch/detail/96603". I actually have a bunch of these, and I can fish most things out and get them into another text file, but this string is stumping me. I think it's the '/' that's messing me up. The text always starts the same way, but the trailing numbers are all different. Here is my code:

 #push @matches, $_ if /^.*?(?:\b|_)$parse1(?:\b|_).*?(?:\b|_)$parse2(?:\b|_).*?$/m;

I tried "/newsearch/detail/" for $parse1 and a blank $parse2 and several other things but no luck. Thanks in advance for any help you can provide this monk in training!

Replies are listed 'Best First'.
Re: seeking expression to match "/mysearch/detail/966031"
by AnomalousMonk (Archbishop) on Nov 18, 2016 at 18:50 UTC

    The basic problem stems from a too-liberal use of the  (?:\b|_) assertion.

    The  $parse1 string begins with a  / (forward slash) character. This is not a \w character. The  (?:\b|_) assertions in  (?:\b|_)$parse1(?:\b|_) require that  $parse1 be preceded by a \w character or a  _ (underscore) (note that  _ is a member of the \w class), and followed by a non-\w character or a  _ (underscore).

    What the OPed regex will match depends on the string it is matching against. I would advise giving us examples of typical strings. Below are some examples of mine that show various matches/non-matches:

    c:\@Work\Perl>perl -wMstrict -le
    "print qq{perl version: $] \n};
     ;;
     my $parse1 = '/newsearch/detail/96603';
     my $parse2 = '';
     ;;
     for (
       qq{xxx \n/newsearch/detail/96603\n xxx},
       qq{xxx \n/newsearch/detail/96603x\n xxx},
       qq{xxx \nx/newsearch/detail/96603x\n xxx},
       qq{xxx \nx/newsearch/detail/96603\n xxx},
       qq{bla vla/newsearch/detail/96603 x},
       ) {
       print qq{<<$_>>};
       print qq{MATCH '$&'} if /^.*?(?:\b|_)$parse1(?:\b|_).*?(?:\b|_)$parse2(?:\b|_).*?$/m;
       print '';
       }
    "
    perl version: 5.008009

    <<xxx
    /newsearch/detail/96603
     xxx>>

    <<xxx
    /newsearch/detail/96603x
     xxx>>

    <<xxx
    x/newsearch/detail/96603x
     xxx>>

    <<xxx
    x/newsearch/detail/96603
     xxx>>
    MATCH 'x/newsearch/detail/96603'

    <<bla vla/newsearch/detail/96603 x>>
    MATCH 'bla vla/newsearch/detail/96603 x'

    Update: Because you give no examples of the typical strings you're matching against, I cannot understand the implications of the  ^.*? and  .*?$ and  .*?(?:\b|_)$parse2(?:\b|_) portions of the OPed regex (which I suspect may be superfluous), or why you are matching in  /m "multi-line" mode.
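
    If the goal is simply to collect every "/newsearch/detail/NNNN" string, a much simpler pattern may be enough. What follows is only a sketch under assumptions: the sample lines and the names $prefix, @lines and @matches are made up for illustration, since the original post shows no real input, and it assumes the trailing part is always a run of digits.

```perl
use strict;
use warnings;

# Hypothetical prefix taken from the question; the trailing digits vary.
my $prefix = '/newsearch/detail/';

# Made-up sample input, since the original post gives no real data.
my @lines = (
    'see /newsearch/detail/96603 for more',
    'href="/newsearch/detail/12345"',
    'nothing to see here',
);

my @matches;
for my $line (@lines) {
    # m{...} delimiters mean the slashes need no escaping.
    # \Q...\E treats the interpolated prefix as literal text.
    # No \b is placed before the prefix, because it starts with a non-\w '/'.
    push @matches, $1 if $line =~ m{(\Q$prefix\E\d+)};
}

print "$_\n" for @matches;
```

    With the sample lines above this prints the two matching strings and skips the third line. If whole lines are wanted rather than just the matched string, pushing $line instead of $1 would do that.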


    Give a man a fish:  <%-{-{-{-<

Re: seeking expression to match "/mysearch/detail/966031"
by Discipulus (Canon) on Nov 18, 2016 at 15:56 UTC
    You must escape / in regexes:
    perl -e "print $1 if $ARGV[0]=~/(\/newsearch\/detail\/96603)/" "bla vla/newsearch/detail/96603 x"
    /newsearch/detail/96603

    You can also choose a different delimiter instead of / (see perlrequick for details); this way you avoid leaning toothpick syndrome.
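
    As a small illustration of the delimiter point, the same match can be written both ways. This is only a sketch; the variable names $text, $escaped and $braced are hypothetical, and \d+ is used in place of the fixed number on the assumption that the trailing digits vary.

```perl
use strict;
use warnings;

my $text = 'bla vla/newsearch/detail/96603 x';

# Default / delimiters: every literal slash in the pattern needs a backslash.
my ($escaped) = $text =~ /(\/newsearch\/detail\/\d+)/;

# m{...} delimiters: slashes are ordinary characters, no escaping needed.
my ($braced) = $text =~ m{(/newsearch/detail/\d+)};

print "escaped: $escaped\n";
print "braced:  $braced\n";
```

    Both forms capture the same string; the second is just easier to read.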

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      You must escape / in regexes ...

      A literal delimiter character in a regex like  /...\/.../ must be escaped, but if the delimiter is within an interpolated string (as in the OPed regex), the regex compiler is quite happy with it:

      c:\@Work\Perl>perl -wMstrict -le
      "my $re = '.../...';
       ;;
       print 'match' if 'xxx/xxx' =~ /$re/;
      "
      match


      Give a man a fish:  <%-{-{-{-<