Dear Monks,

I have (what I thought was) a really simple RegExp that works under Perl 5.8.8 but breaks when tested under 5.10.0. It is part of Lingua::Stem::Es, and it is responsible for many of the failures reported for the current version on CPAN.

The offending code is:

if ( ($suffix) = $R2 =~ /(uciones|ución)$/ ) {
    # ución uciones
    # replace with u if in R2
    $word =~ s/$suffix$/u/;
    print "Step 1 case 4: $word\n" if $DEBUG;
}

I expect it to match when $R2 ends in either "uciones" or "ución", but it fails to match when $R2='ución'. There are 15 such failures in the test suite, related to these words:

and ten other words, all ending in "ución".

When $R2 contains "uciones" the RegExp works OK; there are 10 such examples in the test suite.
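To compare the two Perl versions outside the module, the match can be isolated in a short standalone script. This is only a sketch: `ucion_suffix` is a hypothetical helper, not part of Lingua::Stem::Es, and the script assumes it is saved as UTF-8 (hence `use utf8`). It also reports whether each test string carries the UTF-8 flag, since that may differ from what the module sees at run time.

```perl
use strict;
use warnings;
use utf8;    # the pattern and test strings below contain a literal "ó"

# Hypothetical helper (not part of Lingua::Stem::Es): returns the suffix
# matched at the end of $r2, or undef when there is no match.
sub ucion_suffix {
    my ($r2) = @_;
    my ($suffix) = $r2 =~ /(uciones|ución)$/;
    return $suffix;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $r2 ( 'ución', 'uciones' ) {
    my $suffix = ucion_suffix($r2);
    printf "perl %s  R2=%-8s utf8=%-3s suffix=%s\n",
        $], $r2,
        ( utf8::is_utf8($r2) ? 'yes' : 'no' ),
        ( defined $suffix ? $suffix : '(no match)' );
}
```

Running the same file under 5.8.8 and 5.10.0 should show whether the difference lies in the pattern itself or in how $R2 is built inside the module.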

I would appreciate it if someone could offer some insight into why this is happening. If you'd like to try the module, there is an undocumented $DEBUG global var that, if set, will display the different steps where the word is being stemmed.
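While the cause is unclear, one way to narrow it down might be to perform the same step without any regular expression and see whether the tests still fail. The sketch below is an assumption-laden alternative, not the module's code: `replace_ucion` is a hypothetical name, and the example values in the usage note are illustrative only.

```perl
use strict;
use warnings;
use utf8;    # the suffix list below contains a literal "ó"

# Hypothetical helper: if $r2 ends in "uciones" or "ución", replace that
# suffix at the end of $word with "u"; otherwise return $word unchanged.
sub replace_ucion {
    my ( $word, $r2 ) = @_;
    for my $suffix ( 'uciones', 'ución' ) {
        my $len = length $suffix;
        next if length($r2) < $len || length($word) < $len;
        next unless substr( $r2,   -$len ) eq $suffix;
        next unless substr( $word, -$len ) eq $suffix;
        return substr( $word, 0, length($word) - $len ) . 'u';
    }
    return $word;
}
```

For example, `replace_ucion('revolución', 'ución')` would strip the five-character suffix and append "u". If this version behaves identically under both Perl versions, the regex engine is the more likely culprit; if it also fails, the contents of $R2 deserve a closer look.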

(The other reason some tests failed is that I forgot to declare Test::Exception as a requirement.)

Thanks in advance,

Julio


In reply to RegExp breaks in Perl 5.10 by jfraire
