in reply to Re: Re: char count windows vs linux
in thread char count windows vs linux

Hi, thanks for your help. I've tried both of these methods and the results are still not the same! Linux appears to be counting an extra character per line. Stripping out \n or using chomp makes no difference.

Re: Re: Re: Re: char count windows vs linux
by derby (Abbot) on Dec 17, 2002 at 13:41 UTC
    It sounds like the carriage return is still in the file on the linux side. If you ftp'ed the file from windows to linux in binary mode, that would be the case. Do a little test case such as this:

    #!/usr/bin/perl -wd
    while (<>) {
        chomp;
        print $_, "\n";
    }

    While in the debugger, display $_ (x $_) after the chomp. Do you see something like this: "blah blah blah\cM"? That control-M is the carriage return. Some editors, such as vi, can also show the carriage return if configured to do so.

    chomp removes a trailing string only if it matches the current value of $/. On Linux that is the Unix newline ("\n"), so the carriage return is left behind. You could be more destructive and remove all whitespace at the end of a line with a regex such as s/\s+$//. That works on both platforms, so you wouldn't have to worry about where the file came from. Or you could ensure your transfer process does the correct translation for you.
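    As a rough illustration of the difference between chomp and the regex, here is a minimal sketch. The sample string is made up, and it assumes $/ is left at its default of "\n", as it would be on Linux.

```perl
use strict;
use warnings;

# A line as it would be read on Linux from a file with Windows (CRLF) endings.
my $line = "blah blah blah\r\n";

# chomp removes only the current value of $/ ("\n" by default here),
# so the carriage return stays on the end of the string.
my $chomped = $line;
chomp $chomped;

# The regex strips all trailing whitespace, including "\r" and "\n".
(my $stripped = $line) =~ s/\s+$//;

printf "raw:      %d chars\n", length $line;
printf "chomped:  %d chars\n", length $chomped;
printf "stripped: %d chars\n", length $stripped;
```

    The chomped copy should come out one character longer than the stripped one, which matches the "extra character per line" symptom.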

    -derby

Re: Re: Re: Re: char count windows vs linux
by jaa (Friar) on Dec 17, 2002 at 15:10 UTC
    Sounds like you are doing a binary transfer to get your Win text file onto Linux.

    Win uses two chars for end-of-line (\r\n), and Linux uses one (\n). If Linux returns an extra char per line, it is probably counting the ^M (\r) that Windows treats as part of the line delimiter.

    Option 1) use ASCII or TEXT transfer

    Option 2) don't count the "\r" characters at the end of each line e.g.
    # assumes $line has already been chomped, so a leftover "\r" is the last char
    $count = length($line);
    $count-- if substr($line, -1) eq "\r";
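    A variation on Option 2, sketched under the assumption that you want the same count whether a line ends in "\n", "\r\n", or nothing at all (a last line without a terminator). The helper name line_length and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: length of a line without its terminator,
# whether that terminator is "\n", "\r\n", or missing (last line).
sub line_length {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;   # drop an optional "\r" followed by the final "\n"
    return length $line;
}

# Sum the lengths of a few sample lines with mixed line endings.
my $total = 0;
$total += line_length($_) for ("one\r\n", "two\n", "three");
print "$total\n";
```

    Unlike s/\s+$//, this leaves trailing spaces and tabs in the count and removes only the line terminator.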
Re: Re: Re: Re: char count windows vs linux
by gjb (Vicar) on Dec 17, 2002 at 13:21 UTC

    Are you sure the files are the same? That is, if you transferred them by FTP, did you use ASCII mode? Something might have gone wrong at that stage.
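    One way to check whether the copy on the Linux side still has Windows line endings is to count the lines that end in "\r\n". This is only a sketch: the helper name count_crlf_lines is hypothetical, and an in-memory string stands in for the transferred file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (total lines, lines ending in "\r\n")
# for an already opened filehandle.
sub count_crlf_lines {
    my ($fh) = @_;
    my ($total, $crlf) = (0, 0);
    while (my $line = <$fh>) {
        $total++;
        $crlf++ if $line =~ /\r\n\z/;
    }
    return ($total, $crlf);
}

# Sample data standing in for the transferred file: two CRLF lines, one LF line.
my $data = "a\r\nb\r\nc\n";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";
binmode $fh;   # make sure no layer translates the line endings

my ($total, $crlf) = count_crlf_lines($fh);
close $fh;
print "$crlf of $total lines end in CRLF\n";
```

    If every line ends in CRLF, the file most likely came across in binary mode.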

    Just my 2 cents, -gjb-