chomp not working

djbryson has asked for the wisdom of the Perl Monks concerning the following question:

Re: chomp not working by toolic (Bishop) on Dec 06, 2007 at 21:56 UTC
The chomp documentation specifies that the function "removes any trailing string that corresponds to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR)", which defaults to the newline character, `\n`. I'm not sure what those characters are, but when I am trying to identify unusual characters, I use the ord function: `use warnings; use strict; my $str = 'abcde'; for (split //, $str) { print "$_:", ord $_, ":\n"; }` Once you identify the characters, it will be easier to figure out a way of eliminating them.
Re^2: chomp not working by djbryson (Beadle) on Dec 07, 2007 at 14:55 UTC
Good idea. I shortened the file to 2 lines to keep the output short. Here's the output: `t:116 e:101 s:115 t:116 :13 :13 :10 t:116 e:101 s:115 t:116 :13 :13 :10` So there are 3 chars at the end of each line: 13, 13, 10? If you're curious, here's where I got the file: link. It's XML. I use a "get" to throw the content into a var, then write the var to a text file.
Re: chomp not working by johngg (Canon) on Dec 06, 2007 at 22:05 UTC
If you are on a *nix-like environment you could look at the file using `od` to get an idea of what line terminator is actually being used. Failing that, write a quick script to read the first couple of hundred characters from your file into a buffer and then do something like `print qq{@{ [ ord $_ ] } => $_\n} for split m{}, $buffer;` This will give you the ordinal value for each character and the character itself, one per line. You should be able to spot the line terminator from that. Then you could set `$/` (see perlvar) to what you have found so that chomp will work. I hope this is useful. Cheers, JohnGG
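The "quick script" described above could look something like the following. This is only a sketch: the helper name `dump_bytes`, the 200-byte limit, and taking the file name from the command line are assumptions made for illustration, not part of the original suggestion. It opens the file with the `:raw` layer so that no line-ending translation hides the bytes that are actually on disk, and it prints control characters as hex escapes so they do not disturb the output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return one "ordinal => character" string per character of $buffer.
# Control characters are shown as \xNN escapes so they stay visible.
sub dump_bytes {
    my ($buffer) = @_;
    return map {
        my $ord   = ord $_;
        my $shown = $ord < 32 || $ord == 127 ? sprintf('\\x%02X', $ord) : $_;
        "$ord => $shown";
    } split //, $buffer;
}

# The file name is taken from the command line (hypothetical usage).
my $file = shift @ARGV;
if (defined $file) {
    # :raw disables CRLF translation, so we see the bytes as stored.
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    defined(read $fh, my $buffer, 200) or die "Cannot read $file: $!";
    close $fh;
    print "$_\n" for dump_bytes($buffer);
}
```

A carriage return then shows up as `13 => \x0D` and a line feed as `10 => \x0A`, which makes the terminator easy to spot.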
Re^2: chomp not working by Fletch (Bishop) on Dec 07, 2007 at 00:13 UTC
. . . and if you're not on a *NIX-y environment you can always look for `od` from something like Cygwin, or use the Perl version from ppt. The cake is a lie. The cake is a lie. The cake is a lie.
Re^3: chomp not working by djbryson (Beadle) on Dec 07, 2007 at 15:10 UTC
Good idea. I shortened the file to 2 lines to keep the output short. Here's the output: `t:116 e:101 s:115 t:116 :13 :13 :10 t:116 e:101 s:115 t:116 :13 :13 :10` So there are 3 chars at the end of each line: 13, 13, 10? If you're curious, here's where I got the file: link. It's XML. I use a "get" to throw the content into a var, then write the var to a text file.
Re: chomp not working by Joost (Canon) on Dec 06, 2007 at 21:51 UTC
You're probably dealing with a Unix-formatted file, which has "\012" as the newline sequence, while Windows uses "\015\012". By default Perl only recognizes the standard OS newline sequence. A quick fix would be to set $/ to "\012" prior to reading the file. See also perlport, chomp and perlio. "What should it profit a man, if he should win a flame war, yet lose his cool?"
Re^2: chomp not working by ikegami (Patriarch) on Dec 06, 2007 at 22:09 UTC
That's not going to help. `$/` already is `"\012"` on Windows. When reading, `"\015\012"` is converted to `"\012"` in one of the PerlIO layers, before `readline` gets a hold of the data, and therefore before `$/` is applied. When writing, `"\012"` is converted to `"\015\012"` in one of the PerlIO layers, after `print` sends off the data, and therefore after `$\` is applied. That's why `"\n"` (`"\012"`) is used instead of `"\r\n"` (`"\015\012"`) when ending a line. Maybe `binmode` is on, in which case the `"\015\012"` wouldn't be changed to `"\012"`. Maybe the file is in the old Mac format (`"\015"`). Maybe the file is in some corrupted format (`"\015\015\012"`, which would look like `"\015\012"` when `$/` is applied).
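To illustrate how `chomp` depends on `$/`, here is a small sketch that works on an in-memory string rather than a file, so no PerlIO layer is involved. The helper name `chomp_with` and the sample string are assumptions made for the example. It shows that with the default separator only the trailing `"\012"` is removed from a line ending in `"\015\015\012"`, while setting `$/` to the full three-character terminator removes all of it.

```perl
use strict;
use warnings;

# chomp a copy of $str using $sep as the record separator.
# "local" restores the previous value of $/ when the sub returns.
sub chomp_with {
    my ($sep, $str) = @_;
    local $/ = $sep;
    chomp $str;
    return $str;
}

# Assumed sample: a line ending in CR CR LF, the "corrupted" case.
my $line = "test\015\015\012";

my $default = chomp_with("\012", $line);          # only the LF is removed
my $full    = chomp_with("\015\015\012", $line);  # whole terminator removed

printf "default separator leaves %d character(s)\n", length $default;
printf "full separator leaves %d character(s)\n",    length $full;
```

With the default separator the two carriage returns remain on the end of the string, which is why the line still looks wrong after `chomp`.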
Re: chomp not working by HeatSeekerCannibal (Beadle) on Dec 07, 2007 at 17:59 UTC
Hi. If you're passing files back and forth between *nix and Windows, it may be that the end-of-lines are messed up. Have you tried dos2unix? Best of luck! Heatseeker Cannibal
Re^2: chomp not working (resolved) by djbryson (Beadle) on Dec 07, 2007 at 20:05 UTC
I think I know why the chomp wasn't working. chomp removes the newline character from the end of the string. My file was all 1 string, so it was only removing the last newline char, not the newline chars throughout. Solution: `$content =~ s/\r\n//g;` Thanks guys
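One caveat, sketched below under the assumption that the content really ends each line with 13, 13, 10 as the ord output earlier in the thread shows: `s/\r\n//g` removes only the last carriage return together with the line feed, so one carriage return per line is left behind. Matching any run of carriage returns before the line feed removes the whole terminator. The sample string and variable names here are made up for the example.

```perl
use strict;
use warnings;

# Assumed content, modelled on the two-line sample with CR CR LF endings.
my $content = "test\r\r\ntest\r\r\n";

# The substitution from the post: strips each CR LF pair,
# but the first CR of every CR CR LF survives.
(my $partial = $content) =~ s/\r\n//g;

# Variant: strip any run of CRs followed by LF.
(my $clean = $content) =~ s/\r*\n//g;

# tr/\r// with an empty replacement list just counts the CRs.
printf "s/\\r\\n//g  leaves %d CR(s)\n", ($partial =~ tr/\r//);
printf "s/\\r*\\n//g leaves %d CR(s)\n", ($clean   =~ tr/\r//);
```

Note that both substitutions also join all lines into one; if the lines need to stay separate, replacing the terminator with `"\n"` instead of the empty string would be the corresponding change.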