Hello:

For a long time I was using grep with the option "-P" to turn on Perl pattern matching to detect if a file contained a given text. The pattern could be very simple or very complicated so I needed regular expressions.

Then I ran into trouble: grep would complain that "PCRE's backtracking limit is exceeded". I could never find a working solution, so I decided that instead of using grep to emulate Perl, I could just use a Perl one-liner and cut out the middleman.

Now I'm running into problems with Perl: for some reason it refuses to match some files even though everything seems correct, and if I take the same expression and use it with 'grep -P', it has no trouble finding the match.

Here is a portion of my Perl one-liner (on Linux). It should exit with code 0 if there is a match and 1 otherwise (I'm trying to emulate grep's behavior):

perl -ne "/(?s)<\?php.+?[\\$]{1}/ and exit 0; exit 1" filename.txt ; echo $?
Here is a partial content of the file "filename.txt":
<?php $t60="
Here is the same partial content in hex (because there are some not visible chars there):
0000000 3c 3f 70 68 70 20 20 0d 0a 24 74 36 30 3d 22 0a
0000020
For the life of me, I cannot find a way to make Perl match that. If I use the same expression with grep, it has no problem matching it:
grep -Pzo "(?s)<\?php.+?[\\$]{1}([0-9a-zA-Z]+)=['\"]+" filename.txt

What am I missing? I would really appreciate your help as every Regex program and editor and debugger says the regex is correct and like I said, grep has no problem at all matching it.


In reply to Unable to match newline by robinson
