jfraire has asked for the wisdom of the Perl Monks concerning the following question:
Dear Monks,
I have (what I thought was) a really simple RegExp that works under Perl 5.8.8 but breaks when tested under 5.10.0. It is part of Lingua::Stem::Es, and it is responsible for many of the failures reported for the current version on CPAN.
The offending code is:

    if ( ($suffix) = $R2 =~ /(uciones|ución)$/ ) {
        # ución uciones
        # replace with u if in R2
        $word =~ s/$suffix$/u/;
        print "Step 1 case 4: $word\n" if $DEBUG;
    }

I expect it to match when $R2 ends in either "uciones" or "ución", but it fails to match when $R2='ución'. There are 15 such failures in the test suite, related to these words:
and ten other words, all ending in "ución".
When $R2 contains "uciones" the RegExp works OK; there are 10 such examples in the test suite.
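
For anyone who wants to check this outside the module, here is a minimal sketch that isolates the same match-then-substitute logic. The helper name `replace_ucion_suffix` is hypothetical (it is not part of Lingua::Stem::Es), and the word/R2 pairs are illustrative values passed in by hand rather than computed by the stemmer. It also assumes the script is saved as UTF-8, hence `use utf8`.

```perl
use strict;
use warnings;
use utf8;    # assumption: this script file is saved as UTF-8

# Hypothetical helper, not part of Lingua::Stem::Es. It applies the same
# logic as the snippet in the question: if R2 ends in "uciones" or
# "ución", replace that suffix in the word with "u".
sub replace_ucion_suffix {
    my ($word, $r2) = @_;
    if ( my ($suffix) = $r2 =~ /(uciones|ución)$/ ) {
        # \Q...\E quotes the captured suffix so it is matched literally
        $word =~ s/\Q$suffix\E$/u/;
    }
    return $word;
}

binmode STDOUT, ':encoding(UTF-8)';

# Illustrative (word, R2) pairs, supplied by hand for the test.
my @cases = (
    [ 'revolución',   'ución'   ],
    [ 'revoluciones', 'uciones' ],
);

for my $case (@cases) {
    my ($word, $r2) = @$case;
    printf "%s (R2=%s) -> %s\n", $word, $r2,
        replace_ucion_suffix($word, $r2);
}
```

If the match behaves the way I expect, both words should be reduced to the same stem. Running the script under each Perl version and comparing the two output lines should show whether the "ución" alternative is the one that fails to match.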
I would appreciate it if someone could offer some insight into why this is happening. If you'd like to try the module, there is an undocumented $DEBUG global var that, if set, will display the different steps where the word is being stemmed.
(The other reason some tests failed is that I forgot to declare Test::Exception as a requirement.)
Thanks in advance,
Julio
Re: RegExp breaks in Perl 5.10
by almut (Canon) on Mar 06, 2008 at 19:42 UTC
by eserte (Deacon) on Mar 06, 2008 at 19:44 UTC
by almut (Canon) on Mar 06, 2008 at 20:05 UTC

Re: RegExp breaks in Perl 5.10
by grinder (Bishop) on Mar 06, 2008 at 20:24 UTC
by almut (Canon) on Mar 06, 2008 at 21:13 UTC
by eserte (Deacon) on Mar 06, 2008 at 20:52 UTC

Re: RegExp breaks in Perl 5.10
by eserte (Deacon) on Mar 06, 2008 at 19:38 UTC

Re: RegExp breaks in Perl 5.10
by jfraire (Beadle) on Mar 06, 2008 at 22:54 UTC
by almut (Canon) on Mar 07, 2008 at 02:29 UTC
by jfraire (Beadle) on Mar 07, 2008 at 07:06 UTC
by almut (Canon) on Mar 07, 2008 at 18:21 UTC
by jfraire (Beadle) on Mar 07, 2008 at 20:02 UTC