Re^2: a farewell to chop

Re: Re^2: a farewell to chop by tommyw (Hermit) on Sep 11, 2002 at 15:56 UTC
Which, fortunately, is why we've got the marvellous \X sequence: `s/\X$//` will do what you want. -- Tommy Too stupid to live. Too stubborn to die.
Re^4: a farewell to chop by particle (Vicar) on Sep 11, 2002 at 18:29 UTC
\X fixes the utf8 problem, but there's still a problem with this regex...

```perl
#!/usr/bin/perl -w
use strict;
use utf8;

my($a,$b,$c);
$a=$b=$c="123\n456\n";

print chop $a;          # prints "\n"
print $b =~ s/\X$//;    # prints "1"
print $a;               # prints "123\n456"
print $b;               # prints "123\n45\n" ## OOPS!!!
```

i believe `s/\X\z//` will do what you want, although it still won't return the character removed. instead, use `substr EXPR,OFFSET,LEN,REPLACEMENT` (e.g. `substr $_,-1,1,''`), which returns the part it replaced.

~Particle accelerates
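The difference between the two anchors, and the 4-argument `substr` alternative, can be sketched roughly as follows. This is only an illustration: `chop_last` is a hypothetical helper name, not something from the thread, and the empty-string guard is an added assumption about how one would want it to behave.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove and return the last
# character of the referenced string, much as chop does, via 4-arg substr.
sub chop_last {
    my ($ref) = @_;
    return '' unless length $$ref;       # nothing to remove from an empty string
    return substr($$ref, -1, 1, '');     # returns the character it replaced
}

my $s = "123\n456\n";

# $ can match just before a trailing newline, so the "6" is removed.
(my $dollar = $s) =~ s/\X$//;            # "123\n45\n"

# \z matches only at the very end, so the final "\n" is removed.
(my $zed = $s) =~ s/\X\z//;              # "123\n456"

# 4-argument substr removes the last character and hands it back.
my $rest    = $s;
my $removed = chop_last(\$rest);         # $removed is "\n", $rest is "123\n456"
```

Note that `substr` counts characters (codepoints), so unlike `\X` it would strip only the last codepoint of a composed glyph.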
Re^3: a farewell to chop by John M. Dlugosz (Monsignor) on Sep 11, 2002 at 16:06 UTC
Doesn't the dot handle multi-byte UTF-8 characters when the string is of the character persuasion and "use utf8" is in scope? Update: Perhaps you meant multiple codepoints used to "compose" one glyph, rather than multiple bytes to form one codepoint. The former is what \X does. The dot in a Perl5 regex only does the latter; Perl6 is said to do the former too (u0, u1, and u2 levels if memory serves).
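A small sketch of that distinction, assuming a string that holds "e" followed by U+0301 COMBINING ACUTE ACCENT (two codepoints drawn as one glyph); the variable names are illustrative only:

```perl
use strict;
use warnings;

# "caf" plus "e" and a combining acute accent: 5 codepoints, 4 glyphs.
my $str = "caf" . "e\x{301}";

# The dot matches a single codepoint, so only the accent is removed.
(my $by_dot = $str) =~ s/.\z//s;         # "cafe", length 4

# \X matches the whole composed glyph, so "e" and its accent both go.
(my $by_X = $str) =~ s/\X\z//;           # "caf", length 3
```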
Re^4: a farewell to chop by particle (Vicar) on Sep 11, 2002 at 17:41 UTC
that depends on your version of perl5 (i wish i could give a specific example, but i don't have all those installs in front of me.) ~Particle accelerates
Re: Re^2: a farewell to chop by Juerd (Abbot) on Sep 11, 2002 at 22:48 UTC
...but will not handle multi-byte characters. It will in Perl 6, and already does under the utf8 pragma in Perl 5.6+. Besides, as a Perl 5 regex, it doesn't make sense, since `$` matches before a trailing \n. - Yes, I reinvent wheels. - Spam: Visit eurotraQ.