Hi Monks,

I've been banging my head against this for a week, and I'm pretty stuck. I have some regexes that run completely fine on my local machine, taking mere milliseconds. They work for both ASCII and Unicode and cause me no problems whatsoever (assuming correct conversion to Unicode when needed, etc.). This all happens locally on a Windows box using Cygwin and Perl 5.10.1.


Now, as soon as I take that exact same code and put it on a Linux server running Perl 5.8.8, everything still runs in about the same time, except for seven regular expressions (out of the hundreds I'm using) that, for one reason or another, hang and take seconds rather than milliseconds on Japanese UTF-8 text. I don't see anything special in these seven regexes, and I don't understand why the problem shows up only with Japanese (other high Unicode, like Chinese or Thai, works fine), or why it occurs only remotely on Perl 5.8.8 and not locally on Perl 5.10.1.


Upgrading the server to 5.10.1 is out of the question, so I need to find a workaround for these. A few of the problematic regexes are:


1. ^\s*SPECIFIC\sDECISIONS\s*$
2. ^\s*Important\s+\S+\s+Decisions\s*$
3. ^General\s+Decisions$
4. ^E-mail:\s*replies@x.com$

Has anyone ever experienced anything similar?


In reply to Weird Perl 5.8.8 Regex Problems for Japanese UTF8 by ruski86
