Lukin4Love has asked for the wisdom of the Perl Monks concerning the following question:

I am using Perl v5.10.0 on a 64-bit Vista computer, trying to create a few lines in a file that I will later copy and paste into an XML file that uses UTF-8.

The XML file, when viewed with Notepad on a Windows XP system, shows what looks like text and arrow signs, with a square between opposing arrow signs, like this (with [] representing the square):

___________________________________________________

lots of text>[]< lots of text>[]<lots of text

___________________________________________________

When viewed with Notepad on a Vista machine, the [] character does not appear; a line break is displayed instead, like this:

_________________________________________________

lots of text>

< lots of text>

<lots of text

__________________________________________________

I think the [] represents char(13) followed by char(10), i.e. \x0D \x0A (U+000D U+000A). But maybe it needs to be written differently because it is for a UTF-8 XML file.

I tried to copy just the [] character into a text file and then used

my @odaoutput= <ODOA>; print OUTFILE " some text>@odaoutput";
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Everything I try ends up looking like a line feed carriage return, even on the XP system. Never does it look like the original [] when viewed on the XP system. Can you help me please?

Re: adding 0D 0A to a UTF-8 file.
by cdarke (Prior) on Feb 23, 2011 at 07:58 UTC
    Simple, don't use Notepad.

    Line endings on Windows are generally "\r\n", and this is what Notepad on XP expects. Other platforms just use "\n", and that is what Notepad cannot cope with and displays your rectangle. Try Wordpad, or one of the myriad of other, better text editors.
Re: adding 0D 0A to a UTF-8 file.
by ikegami (Patriarch) on Feb 23, 2011 at 15:31 UTC
    Write has come with Windows for as long as I can remember, and it has always handled 0A line endings. Write is now known as WordPad, but I still call it Write because there's an alias to the editor named write in the path.
Re: adding 0D 0A to a UTF-8 file.
by elef (Friar) on Feb 23, 2011 at 16:53 UTC
    I'm not sure what you really want to do here (why generate some output with a perl script and then copy-paste it into a file instead of just generating the final file with perl?), but here's some info:

    The squares you see are definitely not the Windows line ending, which all Notepad versions display correctly as a line break (0D 0A, or, as perl coders tend to misleadingly call it, \r\n, or, as I like to call it, CRLF). It might be the Unix newline, i.e. LF, which Notepad can't handle and displays as a rectangle. Or it might be the old mac newline, CR.
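    One way to settle which of these the file really contains is to read it as raw bytes and count each kind of line ending. The following is only a minimal sketch: the sub names count_line_endings and report_line_endings are made up for illustration, and the file name in the usage note is a placeholder.

```perl
use strict;
use warnings;

# Count each kind of line ending in a string of raw bytes.
sub count_line_endings {
    my ($bytes) = @_;
    my %count = (CRLF => 0, LF => 0, CR => 0);
    # CRLF is tried first so its two bytes are not counted as a bare CR plus a bare LF.
    while ($bytes =~ /(\x0D\x0A|\x0A|\x0D)/g) {
        my $key = $1 eq "\x0D\x0A" ? 'CRLF'
                : $1 eq "\x0A"     ? 'LF'
                :                    'CR';
        $count{$key}++;
    }
    return %count;
}

# Read a file with the :raw layer (no newline translation) and summarise its line endings.
sub report_line_endings {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Can't open $path: $!\n";
    my $bytes = do { local $/; <$fh> };   # slurp the whole file
    close($fh);
    my %count = count_line_endings($bytes);
    return "CRLF=$count{CRLF} LF=$count{LF} CR=$count{CR}";
}
```

    Calling print report_line_endings('snippet.xml'), "\n"; on the file in question (the name here is hypothetical) would show whether the squares come from bare LF or bare CR bytes rather than from CRLF pairs.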
    BTW I'm sure the "arrows" you are seeing are just the opening and closing angle brackets of tags of some sort.
    Perl always uses the platform's line ending as default, so, as you're running perl on Windows, the line ending generated by the line break character \n will be CRLF (this is why calling CR \r and calling LF \n is confusing: on Windows, \n generates CRLF, not just LF as it does on *nix).
    Basically, if you use defaults on a file with Unix newlines on a Windows computer, you'll either generate a file with mixed LF/CRLF endings, or inadvertently convert the whole file to CRLF. The latter might actually be preferable to keeping it with LF, esp. if you'll open it with Notepad, but if you want to keep it with LF, you'll have to resort to trickery. Either redefine the newline character in your script or print LF "manually". Not sure how to do that, as this code seems to print CRLF on my computer:
    open(LINE, ">:encoding(UTF-8)", "line.txt") or die "\nCan't create file: $!\n";
    print LINE "text" . chr(10) . "more text";   # chr(10) is LF; chr(13) would be CR

    ...as do \012, \x0A and
    $/ = "\x0A";   # $/ only affects reading, not what print writes
    open(LINE, ">:encoding(UTF-8)", "line.txt") or die "\nCan't create file: $!\n";
    print LINE "text\nmore text";

    As far as I understand, these should all produce a file with a Unix newline, but all tests show that the file actually has CRLF. I'm lost here, but I'm sure somebody will shed light on this.
    Further reading: http://perldoc.perl.org/perlport.html#Newlines
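    On the CRLF puzzle above: Perl on Windows normally opens files with a :crlf layer that turns every LF written into CR LF, whether it is spelled \n, \012, \x0A or chr(10). Putting :raw in front of the encoding layer (or calling binmode on the handle) removes that translation. The sketch below writes the same text through two layer stacks and dumps the bytes that land on disk. The helper names write_and_read_back and hex_dump are hypothetical, and a temporary file stands in for line.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write $text to $path through the given PerlIO layers, then return the raw bytes on disk.
sub write_and_read_back {
    my ($layers, $path, $text) = @_;
    open(my $out, ">$layers", $path) or die "Can't create $path: $!\n";
    print {$out} $text;
    close($out) or die "Can't close $path: $!\n";
    open(my $in, '<:raw', $path) or die "Can't open $path: $!\n";
    my $bytes = do { local $/; <$in> };   # slurp without newline translation
    close($in);
    return $bytes;
}

# Render a byte string as space-separated hex, e.g. "61 0A 62".
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# :raw drops the newline translation, so \n stays a single LF byte.
print hex_dump(write_and_read_back(':raw:encoding(UTF-8)', $path, "a\nb")), "\n";   # 61 0A 62

# An explicit :crlf layer reproduces the Windows text-mode behaviour on any platform.
print hex_dump(write_and_read_back(':crlf', $path, "a\nb")), "\n";                  # 61 0D 0A 62
```

    The layer order matters here: :raw comes first so that the encoding layer is pushed onto a handle that no longer translates newlines.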