Which, fortunately, is why we've got the marvellous \X sequence: s/\X$// will do what you want.
--
Tommy
Too stupid to live.
Too stubborn to die.
\X fixes the utf8 problem, but there's still a problem with this regex...
#!/usr/bin/perl -w
use strict;
use utf8;
my($a,$b,$c);
$a=$b=$c="123\n456\n";
print chop $a; # prints "\n"
print $b =~ s/\X$//; # prints "1"
print $a; # prints "123\n456"
print $b; # prints "123\n45\n" ## OOPS!!!
i believe s/\X\z// will do what you want, although it still won't return the character removed. instead, use the four-argument form, substr EXPR,OFFSET,LENGTH,REPLACEMENT (e.g. substr $_,-1,1,''), which returns the text it replaced.
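to make the two suggestions above concrete, here is a minimal sketch. the sub names chop_char and chop_grapheme are made up for illustration and are not from any module. note that the substr version removes one code point, while the \X version removes a whole combining sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: remove and return the last code point, like chop.
# Four-argument substr returns the text it replaced.
sub chop_char {
    return '' unless length $_[0];    # substr outside the string would die
    return substr($_[0], -1, 1, '');
}

# Hypothetical helper: remove and return the last grapheme cluster.
# \z anchors at the very end, so a trailing "\n" is the thing removed.
sub chop_grapheme {
    return $_[0] =~ s/(\X)\z// ? $1 : '';
}

my $lines = "123\n456\n";
my $gone  = chop_grapheme($lines);    # $gone is "\n", $lines is "123\n456"

my $word = "cafe\x{301}";             # "e" plus U+0301 COMBINING ACUTE ACCENT
my $tail = chop_grapheme($word);      # $tail is "e\x{301}", $word is "caf"
```

both subs modify their argument in place through the @_ alias, so they need to be called on a variable rather than a literal.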
~Particle *accelerates*
Doesn't the dot handle multi-byte UTF-8 characters when the string is of the character persuasion and "use utf8" is in scope?
Update: Perhaps you meant multiple codepoints used to "compose" one glyph, rather than multiple bytes to form one codepoint. The former is what \X does. Perl5 regex only does the latter; Perl6 is said to do the former too (u0, u1, and u2 levels if memory serves).
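A small sketch of that distinction, assuming a Perl 5.8 or later where Encode is in core. The sample strings are my own, chosen only for illustration: one code point that takes two bytes in UTF-8, and one glyph composed from two code points.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E9 is one code point but two bytes once encoded as UTF-8.
my $one_codepoint = "\x{e9}";
my $byte_count    = length encode('UTF-8', $one_codepoint);    # 2

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one glyph.
my $composed = "e\x{301}";

my ($dot)     = $composed =~ /^(.)/;     # the dot takes a single code point
my ($cluster) = $composed =~ /^(\X)/;    # \X takes the whole combining sequence

printf "UTF-8 bytes in U+00E9: %d\n", $byte_count;          # 2
printf "dot matched %d code point(s)\n", length $dot;       # 1
printf "\\X matched %d code point(s)\n", length $cluster;   # 2
```

So on a character string the dot already copes with multi-byte code points; it is only the composed-glyph case where \X behaves differently.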
that depends on your version of perl5 (i wish i could give a specific example, but i don't have all those installs in front of me.)
~Particle *accelerates*
...but will not handle multi-byte characters.
It will in Perl 6, and already does under the utf8 pragma in Perl 5.6+. Besides, as a Perl 5 regex, it doesn't make sense, because $ also matches before a trailing \n.
- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.