Re: chomp not working
by toolic (Bishop) on Dec 06, 2007 at 21:56 UTC
|
The chomp documentation specifies that the function
"removes any trailing string that corresponds to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR)", which
defaults to the newline character, \n.
I'm not sure what those characters are, but when I am trying to identify unusual characters, I use the ord function:
#!/usr/bin/env perl
use warnings;
use strict;
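# Replace this sample with the string (or line) you want to inspect.
my $str = 'abcde';
# Print each character followed by its ordinal value, one per line.
for (split //, $str) {
print "$_:", ord $_, ":\n";
}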
Once you identify the characters, it will be easier to figure out a way of eliminating them. | [reply] [d/l] [select] |
|
|
Good idea.
I cut the file down to 2 lines to keep the output short. Here's the output:
t:116
e:101
s:115
t:116
:13
:13
:10
t:116
e:101
s:115
t:116
:13
:13
:10
So there are 3 chars at the end of each line? 13, 13, 10
If you're curious, here's where I got the file:
link
It's xml. I use a "get" to throw the content into a var, then write the var to a text file. | [reply] [d/l] |
Re: chomp not working
by johngg (Canon) on Dec 06, 2007 at 22:05 UTC
|
If you are on a *nix-like environment you could look at the file using od to get an idea of what line terminator is actually being used. Failing that, write a quick script to read the first couple of hundred characters from your file into a buffer and then do something like
print qq{@{ [ ord $_ ] } => $_\n}
for split m{}, $buffer;
This will give you the ordinal value for each character and the character itself, one per line. You should be able to spot the line terminator from that. Then you could set $/ (see perlvar) to what you have found so that chomp will work.
I hope this is useful. Cheers, JohnGG | [reply] [d/l] [select] |
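The suggestion to set $/ can be sketched as follows. This is a minimal example, assuming the terminator turns out to be "\r\r\n" (the 13, 13, 10 sequence reported elsewhere in this thread); chomp_with is a hypothetical helper name, not something from the posts above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: chomp a copy of $line with $/ set to $terminator.
sub chomp_with {
    my ($line, $terminator) = @_;
    local $/ = $terminator;    # restored automatically when the sub returns
    chomp $line;
    return $line;
}

# Assumed sample data: a line ending in CR CR LF.
my $line = "test\r\r\n";

# The default-style terminator only strips the final "\n".
printf "chomp with \\n:      %d chars left\n", length(chomp_with($line, "\n"));
# The full terminator strips all three trailing characters.
printf "chomp with \\r\\r\\n:  %d chars left\n", length(chomp_with($line, "\r\r\n"));
```

Using local keeps the change to $/ confined to the helper, so other reads in the same program are not affected.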
Re: chomp not working
by Joost (Canon) on Dec 06, 2007 at 21:51 UTC
|
You're probably dealing with a Unix-formatted file, which has "\012" as the newline sequence, while Windows uses "\015\012". By default Perl only recognizes the standard OS newline sequence.
A quick fix would be to set $/ to "\012" prior to reading the file.
See also perlport, chomp, and perlio.
| [reply] |
|
|
That's not going to help. $/ already is "\012" on Windows.
When reading, "\015\012" is converted to "\012" in one of the PerlIO layers, before readline gets a hold of the data, and therefore before $/ is applied.
When writing, "\012" is converted to "\015\012" in one of the PerlIO layers, after print sends off the data, and therefore after $\ is applied. That's why "\n" ("\012") is used instead of "\r\n" ("\015\012") when ending a line.
Maybe binmode is on, in which case the "\015\012" wouldn't be changed to "\012".
Maybe the file is in the old Mac format ("\015").
Maybe the file is in some corrupted format ("\015\015\012", which would look like "\015\012" when $/ is applied).
| [reply] [d/l] [select] |
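The layer behaviour described above can be tried out with a small experiment. This is only a sketch, assuming a file whose lines end in "\015\015\012"; ords_after_chomp is a hypothetical helper, and the temporary file stands in for the real data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the bytes "test" CR CR LF to a temporary file, with no translation.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "test\015\015\012";
close $out or die "close failed: $!";

# Hypothetical helper: read the first line through the given PerlIO layers,
# chomp it, and return the ordinals of whatever is left.
sub ords_after_chomp {
    my ($file, $layers) = @_;
    open my $in, "<$layers", $file or die "Cannot open $file: $!";
    my $line = <$in>;
    close $in;
    chomp $line;
    return join ',', map { ord($_) } split //, $line;
}

# :raw   delivers the bytes as stored, so chomp leaves both CRs behind.
# :crlf  turns the final CR LF pair into LF first, so one CR is left.
print "raw:  ", ords_after_chomp($path, ':raw'),  "\n";
print "crlf: ", ords_after_chomp($path, ':crlf'), "\n";
```

In both cases at least one stray "\015" survives chomp, which matches the symptom of chomp appearing not to work.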
Re: chomp not working
by HeatSeekerCannibal (Beadle) on Dec 07, 2007 at 17:59 UTC
|
Hi
If you're passing files back and forth between *nix and Windows, it may be that the end-of-lines are messed up.
Have you tried dos2unix?
Best of luck!
| [reply] |
|
|
I think I know why the chomp wasn't working.
chomp only removes the newline character from the end of the string. My whole file was in one string, so it was only removing the last newline char, not the newline chars throughout.
solution: $content =~ s/\r\n//g;
Thanks guys | [reply] [d/l] |
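One caveat, given the 13, 13, 10 dump earlier in the thread: s/\r\n//g removes one carriage return per line, so a "\r\r\n" ending would leave a stray "\r" behind. A hedged variant that tolerates any number of carriage returns before the newline might look like this (strip_line_endings and split_lines are hypothetical names, and the sample string is assumed data):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: remove every line ending, however many CRs precede the LF.
sub strip_line_endings {
    my ($content) = @_;
    $content =~ s/\r*\n//g;
    return $content;
}

# Hypothetical helper: split into lines instead, if the line structure matters.
sub split_lines {
    my ($content) = @_;
    return split /\r*\n/, $content;
}

# Assumed sample: two lines, each ending in CR CR LF, held in one string.
my $content = "test\r\r\ntest\r\r\n";

print strip_line_endings($content), "\n";        # prints "testtest"
print join('|', split_lines($content)), "\n";    # prints "test|test"
```

Splitting is usually the better fit when each line is processed separately, since it keeps the line boundaries that the substitution throws away.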