in reply to sprintf is printing unexepected output

What's going on here?

Just a guess here... your data file was created on a Windows box, so its lines end with CRLF. But you are working on Unix/Linux and so your chomp removes the LF but not the CR. You are printing the carriage return, which brings the cursor back to the beginning of the line.
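
A minimal sketch of that effect, assuming a line read from a CRLF file on a Unix-like system. The sample string and variable names are made up for illustration and are not from the original script:

```perl
use strict;
use warnings;

# Hypothetical line as it would arrive from a CRLF file on Unix/Linux.
my $line = "abc\r\n";
chomp $line;                      # removes only the trailing "\n"

# The CR is still there, so the string is 4 characters long, not 3.
my $len          = length $line;
my $ends_with_cr = $line =~ /\r\z/ ? 1 : 0;
print "length=$len ends_with_cr=$ends_with_cr\n";

# sprintf pads after the CR, so a terminal jumps back to column 0
# before drawing the padding and whatever follows it.
my $formatted = sprintf "%-6s|", $line;
(my $visible = $formatted) =~ s/\r/\\r/g;   # make the CR visible
print "$visible\n";
```

Printing the CR as a literal `\r` makes the stray character easy to spot in output that otherwise looks overwritten.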

-sauoq
"My two cents aren't worth a dime.";

Re^2: sprintf is printing unexepected output (chomp)
by tye (Sage) on Dec 29, 2006 at 19:07 UTC
    But you are working on Unix/Linux and so your chomp removes the LF but not the CR.

    By default, on all platforms, chomp only ever removes "\n" (linefeed) never "\r" (carriage return). In particular, even on Windows, chomp (by default) does not remove "\r". The reason this doesn't cause a problem there is that, unless you have used binmode, reading from a file on Windows transforms "\r\n" into "\n".

    I find that it is always best to ignore trailing whitespace [1] (too many things can add it, and most things don't let you know that it is there). So I almost always use s/\s*$// instead of chomp. This practice prevents the above type of problem as well.

    [1] Which is one reason why I no longer ever use <<HERE_DOCs, since they can break in the face of trailing whitespace.
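
A small sketch of the s/\s*$// approach, assuming a handful of made-up input strings with different line endings (the array names are illustrative only):

```perl
use strict;
use warnings;

# Hypothetical inputs with different line endings and stray blanks.
my @raw = ("unix\n", "dos\r\n", "oldmac\r", "padded  \t\n", "bare");

my @clean;
for my $line (@raw) {
    (my $copy = $line) =~ s/\s*$//;   # strips CR, LF, spaces and tabs at the end
    push @clean, $copy;
}
print join(",", @clean), "\n";
```

Note that this also discards trailing spaces and tabs, so it only fits data where trailing whitespace carries no meaning.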

    - tye        

      By default, on all platforms, chomp only ever removes "\n" (linefeed) never "\r" (carriage return).

      My understanding is a bit different. I agree that chomp removes "\n" by default as $/ defaults to "\n". However, "\n" is not a linefeed, but a "logical newline". On MacPerl, for instance, "\n" means "\015".

      The rest of what you said is true. On Windows "\n" is equal to "\012" just as it is on Unix/Linux but standard IO does the conversion from CRLF to LF if the file is opened in text mode. I didn't mean to imply otherwise; I was just taking a guess at how the situation arose for the OP. Thanks for making it clear though.
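
One way to check what "\n" and $/ hold on a given platform is to print their character codes instead of the characters themselves. This is only a sketch; the variable names are made up:

```perl
use strict;
use warnings;

# %vd prints the ordinal of each character, joined by dots.
my $newline_codes   = sprintf "%vd", "\n";
my $separator_codes = sprintf "%vd", $/;

print "\\n is: $newline_codes\n";
print "\$/ is: $separator_codes\n";
```

On Unix/Linux and Windows both lines should report 10 (octal 012); per the discussion above, MacPerl would report 13 (octal 015).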

      -sauoq
      "My two cents aren't worth a dime.";

        Don't let the confusion of perlport and old Macs addle your brain too much. "A" isn't an uppercase A but is merely a logical uppercase A that stands in for the actual bit pattern that applies on that particular platform. When you talk to an SMTP server, it doesn't want to hear "HELO" but to hear "HELO" in ASCII, so by the logic of perlport you should never send "HELO" but should instead hard-code the ASCII bit patterns for it. Sounds ridiculous, doesn't it? The reasons that people don't see it as ridiculous when applied to "\n" are several.

        "\n" is quite simply the newline character on the current platform.

        The fact that old Macs claimed to be ASCII systems, and so provided no translation layer to real ASCII, yet defined "\n" as something other than the ASCII newline (though "\n" is still the newline character for old Macs) has caused a lot of broken thinking in the Perl world. I have a few nodes where I go into this in more detail.

        Following the advice in perlport means that you write code that isn't portable unless you are on an ASCII system, while just writing "\n" means that your code is portable to every system in the world except for old Macs. The popularity of old Macs vs. the popularity of Perl scripts on non-ASCII systems makes the balance here switch toward favoring portability to one single (somewhat broken) system over just coding portably. But the mental hoops that people have jumped through to try to rationalize the practice of making things that work on old Macs (while ignoring the existence of non-ASCII systems) are impressive and have caused tons of confusion.

        Sorry, I don't have time to go into this further at this time. I'll try to throw in links to my prior discussions of this as I find time.

        - tye        

      That's the reason I was looking for -- thanx.

      Where do you want *them* to go today?
Re^2: sprintf is printing unexepected output
by thezip (Vicar) on Dec 29, 2006 at 18:54 UTC

    That was it -- when I did a substitution to remove both CR and LF (i.e., s/[\012\015]//g), it printed correctly.

    Thanks for saving my sanity!
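
For illustration, here is that substitution next to an anchored variant. The sample record and variable names are assumptions, not the original data:

```perl
use strict;
use warnings;

my $line = "12.50\r\n";            # hypothetical CRLF-terminated record

# Global form from the post: deletes every CR and LF in the string.
(my $global = $line) =~ s/[\012\015]//g;

# Anchored variant: only touches a run of CR/LF at the very end.
(my $anchored = $line) =~ s/[\012\015]+\z//;

printf "[%8.2f] [%8.2f]\n", $global, $anchored;
```

Both give the same result for a single line. They differ only when the string holds embedded line breaks, which the global form would also delete.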

    Where do you want *them* to go today?
Re^2: sprintf is printing unexepected output
by thezip (Vicar) on Dec 29, 2006 at 19:14 UTC

    BTW,

    Why doesn't chomp remove both CR/LF characters? To me, it would be a reasonable and natural thing to do.

    What's the rationale for *not* removing '\r' in this world of mixed Unix, Windows, and Macs?
    Where do you want *them* to go today?
      Why doesn't chomp remove both CR/LF characters?

      It removes $/. By default, $/ is "\n" which is a logical character that may differ from one platform to another. In MacPerl, "\n" equals "\015". On both Windows and Unix platforms "\n" equals "\012". But, on Windows, standard IO translates between "\n" and "\015\012" when the file is opened in text mode. (Which is why you have to use binmode for binary files... where you don't want that translation to take place.)
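
A short sketch of how chomp follows $/, assuming a made-up CRLF-terminated string:

```perl
use strict;
use warnings;

my $default = "abc\r\n";
chomp $default;                    # $/ is "\n": leaves "abc\r"

my $crlf = "abc\r\n";
{
    local $/ = "\r\n";             # temporarily treat CRLF as the separator
    chomp $crlf;                   # now removes both characters
}

printf "default: %vd\n", $default;
printf "crlf:    %vd\n", $crlf;
```

The catch is that with $/ set to "\r\n", chomp leaves a bare "\n" ending untouched, so this only helps when the input is known to be CRLF throughout.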

      I don't really disagree with you. It seems that there'd be a lot fewer questions about it if chomp just removed all control characters from the end of the string. But changing its behavior now wouldn't really be doable, of course. And it's easy enough to do what you want with an s///, so it's not really a big issue.

      -sauoq
      "My two cents aren't worth a dime.";
      In the Unix world, the end-of-line is simply "\012" (one character). So unless you ask otherwise, the default behavior is to assume you have a well-behaved Unix file and do the most efficient operation. I think there is support in PerlIO layers for doing what you want, but you have to ask for it and bear its cost.
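
As a sketch of the PerlIO route, the :crlf layer can be requested explicitly when opening a file. The temporary file and its contents below are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small CRLF file byte-for-byte (file and contents are illustrative).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                      # no translation while writing
print {$out} "one\r\ntwo\r\n";
close $out or die "close failed: $!";

# Read it back through the :crlf layer, which maps CRLF to "\n" on input.
open my $in, '<:crlf', $path or die "cannot open $path: $!";
my @lines = <$in>;
close $in;

chomp @lines;                      # plain chomp is now enough
print join("|", @lines), "\n";
```

The layer has to be named on every open (or set through the open pragma), which is the "you have to ask for it" part.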