Redefining chomp()

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Redefining chomp() by Limbic~Region (Chancellor) on Mar 24, 2004 at 14:09 UTC
Anonymous Monk, As has already been pointed out, chomp probably does not work the way you think. It removes a trailing $/ from the end of the line; it returns the number of chars removed; it can work on a list; and in the case of a list, it returns the total number removed. I do not believe just modifying $/ will work for you. For one, it will likely mess up reading in new files. Secondly, I am under the impression you want it to auto-detect if $/ should be "\n" or "\r\n" depending on what it is working on. Here is a start: `BEGIN { *CORE::GLOBAL::chomp = sub { my $count; for ( @_ ) { $count += $_ =~ s/[\r\n]$//g; } return $count; } }` This breaks in a lot of ways: it will remove all trailing newlines instead of just one in the case of `my $string = "foo\n\n\n"`; it doesn't see where it is supposed to stop working (without helping parens) in the case of `print chomp $foo, "\n";`; and there are probably a lot of others I didn't find. Once fixed appropriately, this could be stuck in a module and then you could just `use Chomp;` Cheers - L~R
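For reference, the built-in behaviour described in the post above can be checked with a few lines. This is only a small sketch using the default `$/`; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# chomp removes one trailing copy of $/ (default "\n") and returns
# the number of characters it removed.
my $line    = "foo\n\n\n";
my $removed = chomp $line;      # $line is now "foo\n\n"

# On a list it returns the total removed across all elements.
my @lines = ("a\n", "b\n", "c");
my $total = chomp @lines;       # "c" has nothing to remove

# With the default $/, a Windows line ending keeps its "\r".
my $dos = "bar\r\n";
chomp $dos;

printf "%d %d %s\n", $removed, $total,
    $dos eq "bar\r" ? "CR kept" : "CR gone";
```

The leftover "\r" in the last case is the problem the rest of the thread tries to solve.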
Re: Re: Redefining chomp() by ryantate (Friar) on Mar 24, 2004 at 18:28 UTC
As there is no quantifier on your character class, I have trouble understanding how this will remove multiple newlines as you say. Am I missing something? Also, if the Windows newline is in fact "\r\n", this will not work, again because there is no quantifier. It seems to me if you instead do `$count += $_ =~ s/\r?\n$//g;` ... then you alleviate the problem of killing all trailing newlines and, assuming "\r\n" is what all windows newlines are, you are matching both windows and unix newlines.
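The regex approach discussed here can also be tried as an ordinary subroutine, without touching the built-in at all. The following is a minimal sketch, not code from the thread: `chomp_any` is a hypothetical name, and it uses `\z` (true end of string) instead of `$` and drops `/g`, so that exactly one line ending is removed per argument.

```perl
use strict;
use warnings;

# Hypothetical helper: strips a single trailing "\r\n" or "\n" from each
# argument in place and returns the total number of characters removed.
sub chomp_any {
    my $count = 0;
    for (@_) {                      # $_ aliases the caller's variable
        next unless defined $_;
        my $before = length $_;
        s/\r?\n\z//;                # one line ending, anchored at the very end
        $count += $before - length $_;
    }
    return $count;
}

my $unix = "foo\n\n\n";
my $dos  = "bar\r\n";
my $n    = chomp_any($unix, $dos);  # 1 char from $unix, 2 from $dos
print "$n\n";
```

Counting by length difference keeps the return value in line with chomp, which reports characters removed and not the number of substitutions.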
Re: Re: Re: Redefining chomp() by Limbic~Region (Chancellor) on Mar 24, 2004 at 18:42 UTC
ryantate, My regex fu is nonexistent, as you can see. That does not really matter much, as I said it would need to be fixed. I was on my way to a meeting so I didn't get to spend a lot of time on it. After thinking about it some more, I think the following would work a lot better. `package Chomp; use Scalar::Readonly ':all'; BEGIN { *CORE::GLOBAL::chomp = sub { readonly_on( $/ ); my ($count, $fix) = (0, ''); local $/ = "\r\n"; for ( @_ ) { my ($first, $second) = (0, 0); eval { $first = chomp }; if ( $@ ) { die $@ if $_ !~ /^\r?\n$/; $fix = $_; last; } if ( ! $first ) { local $/ = "\n"; $second = chomp; } $count += $first + $second; } readonly_off( $/ ); return $fix ? ($count , $fix) : $count; }; } 42;` Then a script that uses it: `#!/usr/bin/perl use strict; use warnings; use Chomp; my $foo = "foo\n\r\n"; my $bar = "bar\n\n\n"; print chomp $foo, $/; # prints 2 print chomp $bar, "\n"; # prints 1 print chomp ($foo, $bar), "\n"; # prints 2` This does have the unfortunate side effect of not allowing someone to do: `chomp($/); # $/ = undef;` I know this is ugly and there are probably a few more gotchas in there, but it was kind of fun to work on. Note: This can be done without a module, but it is much uglier. Anyone wanting to see that should say so. Cheers - L~R
Re^3: Redefining chomp() by Roy Johnson (Monsignor) on Mar 24, 2004 at 18:48 UTC
"I have trouble understanding how this will remove multiple newlines as you say. Am I missing something?" If the trailing /g on an anchored s///g actually had some effect, maybe it would strip off all trailing newlines. But as it is, it doesn't. The PerlMonk `tr///` Advocate
Re: Re: Redefining chomp() by Anonymous Monk on Mar 24, 2004 at 14:19 UTC
This was exactly what I was looking for. I don't anticipate that it will break anything that I am doing, though testing will be in order. Modifying the global $/ would, indeed, be a very bad thing to do. I knew this was a possible solution, but I would rather have done it locally for each chomp().
Re: Re: Re: Redefining chomp() by dragonchild (Archbishop) on Mar 24, 2004 at 14:58 UTC
If that's all you wanted to do, a simple `perl -pi -e 's{chomp([^;]*);}{{ local \$/ = "\\r\\n"; chomp$1; }}g' <your files here>` would have sufficed ... wouldn't it? ------ We are the carpenters and bricklayers of the Information Age. Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose
Re: Re: Re: Redefining chomp() by amw1 (Friar) on Mar 24, 2004 at 15:18 UTC
Wouldn't this be a good place to use local? Something like `{ local $/ = "\r\n"; chomp; }` would protect the global state of $/ and let you change its behavior right before the chomp. The scope could be increased until you've captured all of your chomp calls. Any new chomps you write, unless they're in the same scope as the local call, would use the default value of $/. Code is untested.
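A short sketch of the `local` approach suggested above, with made-up variable names. It also shows the catch that motivated the auto-detecting version earlier in the thread: while `$/` is "\r\n", a plain "\n" ending is left alone.

```perl
use strict;
use warnings;

my $dos  = "bar\r\n";
my $unix = "foo\n";
my ($dos_removed, $unix_removed_inside);

{
    local $/ = "\r\n";                    # in effect only until the block ends
    $dos_removed = chomp $dos;            # strips both "\r" and "\n"
    my $copy = $unix;
    $unix_removed_inside = chomp $copy;   # "\n" alone is not "\r\n"
}

# Outside the block $/ is "\n" again, so ordinary chomp calls are unaffected.
my $unix_removed_outside = chomp $unix;

print "$dos_removed $unix_removed_inside $unix_removed_outside\n";
```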
Re: Re: Re: Re: Redefining chomp() by biosysadmin (Deacon) on Mar 24, 2004 at 17:49 UTC
Re: Redefining chomp() by dragonchild (Archbishop) on Mar 24, 2004 at 13:45 UTC
If you read the description of chomp, you will notice that "... it deletes the terminating string corresponding to the current value of $/ ...." Further down, it says "With version 5.6, the meaning of chomp changes slightly in that input disciplines are allowed to override the value of the $/ variable and mark strings as to how they should be chomped. This has the advantage that an input discipline can recognize more than one variety of line terminator ..." If you can, I'd look at that. If you can't, look at CORE::chomp().
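In later Perls the "input disciplines" mentioned in that quote are known as PerlIO layers. One way to act on the suggestion, offered here only as a sketch and not as what the poster had in mind, is to read the file through the `:crlf` layer, which turns "\r\n" into "\n" on input so that a plain chomp is enough. The temporary file and its contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample file with mixed line endings, byte for byte.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "one\r\ntwo\n";
close $out or die "close: $!";

# Read it back through the :crlf layer; "\r\n" arrives as "\n".
open my $in, '<:crlf', $path or die "open $path: $!";
my @lines = <$in>;
close $in;

chomp @lines;                   # default $/ is sufficient now
print join('|', @lines), "\n";
```

This leaves `$/` and chomp untouched, at the cost of having to control how each file is opened.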
Re: Redefining chomp() by jaa (Friar) on Mar 24, 2004 at 14:08 UTC
Why not pass your data files through dos2unix before letting them near your script? Or transfer them into the *nix world with ASCII mode ftp?
Re: Redefining chomp() by matija (Priest) on Mar 24, 2004 at 13:46 UTC
Chomp removes any trailing string that corresponds to `$/` (or, with `use English;`, `$INPUT_RECORD_SEPARATOR`), so my guess would be that you just need to modify that...