Hmm, I was sufficiently surprised by this behaviour (that I've not heard of before) that I went looking. First off, your code fragment is not much use, as it does not define what $R2 contains. So I went and looked at the source, and ripped the following out of its guts:

use strict;
use warnings;

my @word = qw(
    constituci\xf3n contribuci\xf3n destituci\xf3n devoluci\xf3n disminuci\xf3n
    constituciones contribuciones destituciones devoluciones disminuciones
    foo
);

my $vowels     = 'aeiou\xe1\xe9\xed\xf3\xfa\xfc';
my $consonants = 'bcdfghjklmn\xf1pqrstvwxyz';

my $revowel      = qr/[$vowels]/;
my $reconsonants = qr/[$consonants]/;
my $R2;
my $suffix;

for my $word (@word) {
    ($R2) = $word =~ /^.*?$revowel$reconsonants.*?$revowel$reconsonants(.*)$/;
    $R2 ||= '';
    if ( ($suffix) = $R2 =~ /(uciones|uci\xf3n)$/ ) {
        # uci\xf3n uciones
        # replace with u if in R2
        $word =~ s/$suffix$/u/;
        print "Step 1 case 4: $word\n";
    }
}

(Those \xnn sequences really are Latin-1 characters in the source; the escapes are just an artifact of cutting and pasting from my shell.)

And that runs just fine here, all the way up to "perl, v5.11.0 DEVEL33323 built for i386-freebsd-64int". So there's something else going on. Both "ución" and "uciones" match just fine. Perhaps the tester platforms are running in a different locale. To play it safe, I suggest you encode your program in UTF-8 and slap a use utf8 at the top and be done with it. At least I think that's the correct best practice. Thinking about encoding makes my head explode.
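For what it's worth, here is a minimal sketch of what the UTF-8 route might look like. It assumes the file itself is saved as UTF-8 and that output should be UTF-8 as well. The sub name strip_ucion is made up for illustration and is not from the module's source; the R2 regex is the same one as in the snippet above.

```perl
use strict;
use warnings;
use utf8;                              # the literals in this file are UTF-8
binmode STDOUT, ':encoding(UTF-8)';    # encode characters on the way out

my $vowel     = qr/[aeiouáéíóúü]/;
my $consonant = qr/[bcdfghjklmnñpqrstvwxyz]/;

# Hypothetical helper (name invented for this sketch): replace a trailing
# "ución" or "uciones" with "u" when the suffix lies inside R2; otherwise
# return the word unchanged.
sub strip_ucion {
    my ($word) = @_;

    # R2 is whatever follows the second vowel+consonant pair.
    my ($R2) = $word =~ /^.*?$vowel$consonant.*?$vowel$consonant(.*)$/;
    $R2 = '' unless defined $R2;

    if ( my ($suffix) = $R2 =~ /(uciones|ución)$/ ) {
        $word =~ s/\Q$suffix\E$/u/;
    }
    return $word;
}

print "$_ -> ", strip_ucion($_), "\n"
    for qw(constitución constituciones foo);
```

With use utf8 in place the accented letters are decoded into characters at compile time, so the match should no longer depend on whatever locale the tester's box happens to be running.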

• another intruder with the mooring in the heart of the Perl

In reply to Re: RegExp breaks in Perl 5.10 by grinder
in thread RegExp breaks in Perl 5.10 by jfraire
