Re: a farewell to chop

I can well understand them getting rid of chop, although I do think the main problem is having it named so similarly to chomp.

When I was beginning Perl I came across chop and chomp, and whilst I could remember that one wasn't fussy about what it removed and the other only removed end-of-line characters, it took me a remarkable amount of time to learn which was which. Caused some nasty bugs, too.
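
A minimal sketch of the difference, assuming the default input record separator ($/ set to "\n"); the variable names are only for illustration:

```perl
use strict;
use warnings;

my $line = "hello\n";

# chomp removes a trailing record separator ($/, "\n" by default)
# and returns the number of characters it removed.
my $chomped = $line;
my $count   = chomp($chomped);    # $chomped is "hello", $count is 1

# chop removes the last character, whatever it is, and returns it.
my $chopped = $line;
my $last    = chop($chopped);     # $chopped is "hello", $last is "\n"

# With no trailing newline the two part ways.
my $chomp_again = "hello";
my $count_again = chomp($chomp_again);   # unchanged, 0 removed

my $chop_again = "hello";
my $last_again = chop($chop_again);      # "hell", returns "o"
```

The nasty bugs come from the last pair: calling chop where chomp was meant silently eats a real character whenever the newline is already gone.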

Since I learnt the names properly I don't think I've ever touched chop for anything. So far you've said you've only been able to find one example, and people can do that with substr (substr($foo,-1) = '', admittedly messy), or a simple regexp $foo =~ s/.$//; which to me is perfectly readable. I really don't see the advantage of keeping chop outweighing the risk of having the confusing (and easily mis-typed) chomp/chop pair.
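
A rough sketch of those two replacements next to chop, assuming a string with no trailing newline (with one, the regex form no longer agrees with chop, because . does not match "\n" and $ can match just before it):

```perl
use strict;
use warnings;

my $foo = "abcdef";

# substr as an lvalue: overwrite the last character with the empty string.
my $via_substr = $foo;
substr($via_substr, -1) = '';

# Regex: delete one character anchored at the end of the string.
my $via_regex = $foo;
$via_regex =~ s/.$//;

# chop itself, for comparison.
my $via_chop = $foo;
chop($via_chop);

# All three now hold "abcde".
```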

I also can't see many people are going to go and write their own version of chop, to be honest. It's a simple enough thing to 'just do' and the function call imposes a much higher overhead than the operation itself.


Re^2: a farewell to chop by particle (Vicar) on Sep 11, 2002 at 15:50 UTC
"or a simple regexp $foo =~ s/.$//; which to me is perfectly readable"
...but will not handle multi-byte characters. chop will.
~Particle accelerates
Re: Re^2: a farewell to chop by tommyw (Hermit) on Sep 11, 2002 at 15:56 UTC
Which, fortunately, is why we've got the marvellous \X sequence: `s/\X$//` will do what you want.
-- Tommy
Too stupid to live. Too stubborn to die.
Re^4: a farewell to chop by particle (Vicar) on Sep 11, 2002 at 18:29 UTC
\X fixes the utf8 problem, but there's still a problem with this regex...

    #!/usr/bin/perl -w
    use strict;
    use utf8;
    my($a,$b,$c);
    $a=$b=$c="123\n456\n";
    print chop $a;          # prints "\n"
    print $b =~ s/\X$//;    # prints "1"
    print $a;               # prints "123\n456"
    print $b;               # prints "123\n45\n" ## OOPS!!!

i believe `s/\X\z//` will do what you want, although it still won't return the character removed. instead, use `substr EXPR,OFFSET,LEN,REPLACEMENT` (i.e. `substr $_,-1,1,''`).
~Particle accelerates
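
A small sketch of the two suggested fixes on the same test string; the variable names here are made up for the example:

```perl
use strict;
use warnings;

my $orig = "123\n456\n";

# \z matches only at the very end of the string, so the final "\n"
# is removed, as chop would do. s/// returns the substitution count,
# not the removed text.
my $with_regex = $orig;
my $subs = ($with_regex =~ s/\X\z//);

# Four-argument substr replaces the last character with '' and
# returns what was there before, which is what chop returns.
my $with_substr = $orig;
my $removed = substr($with_substr, -1, 1, '');

# Both strings are now "123\n456"; $subs is 1 and $removed is "\n".
```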
Re^3: a farewell to chop by John M. Dlugosz (Monsignor) on Sep 11, 2002 at 16:06 UTC
Doesn't the dot handle multi-byte UTF-8 characters when the string is of the character persuasion and "use utf8" is in scope?
Update: Perhaps you meant multiple codepoints used to "compose" one glyph, rather than multiple bytes to form one codepoint. The former is what \X does. Perl5 regex only does the latter; Perl6 is said to do the former too (u0, u1, and u2 levels if memory serves).
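
To make the codepoint-versus-glyph distinction concrete, here is a hedged sketch using "e" followed by U+0301 (COMBINING ACUTE ACCENT) as an assumed example of two codepoints forming one glyph. It reflects the behaviour of a reasonably modern Perl 5; older releases may differ, as noted elsewhere in the thread.

```perl
use strict;
use warnings;

# Five codepoints, four visible glyphs: c, a, f, and e + combining accent.
my $decomposed = "cafe\x{301}";

# chop works on single characters (codepoints): only the accent goes.
my $by_chop = $decomposed;
chop($by_chop);                  # "cafe"

# The dot also matches one codepoint (the /s flag lets it match "\n" too).
my $by_dot = $decomposed;
$by_dot =~ s/.\z//s;             # "cafe"

# \X matches the whole cluster: base letter plus combining mark.
my $by_cluster = $decomposed;
$by_cluster =~ s/\X\z//;         # "caf"
```

So chop and the dot agree with each other here, and \X is the one that behaves differently.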
Re^4: a farewell to chop by particle (Vicar) on Sep 11, 2002 at 17:41 UTC
that depends on your version of perl5 (i wish i could give a specific example, but i don't have all those installs in front of me.)
~Particle accelerates
Re: Re^2: a farewell to chop by Juerd (Abbot) on Sep 11, 2002 at 22:48 UTC
"...but will not handle multi-byte characters."
It will in Perl 6, and already does under the utf8 pragma in Perl 5.6+. Besides, as a Perl 5 regex it doesn't make sense, for $ matches before a trailing \n.
- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.