This explains a few things that were not clear in your original question.

You never mentioned what OS this is running on, nor what sort of tool you were using when you saw "boxes". As to the first point, I would expect you were using unix or linux; the data, having come from the web, presumably has "CRLF" ("\r\n", aka "\x0d\x0a") line termination. But in the script you posted, the "CR" character does not get removed on input (that would only happen automatically if the Perl script were running on a Windows machine). Then, your use of "." (period) in the various regexes causes the CR to be included in the strings that are captured and assigned to variables (by default, period matches any character except "LF" = "\n" = "\x0a", so it matches CR).
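To make that concrete, here is a minimal sketch. The sample line and the "Title:" pattern are made up for illustration and are not taken from your script. It shows the CR surviving chomp and being captured by ".", and one way to strip it along with the LF:

```perl
use strict;
use warnings;

# Hypothetical sample line with CRLF termination, as web data often has.
my $line = "Title: some value\r\n";

# chomp removes only the trailing "\n" (the default $/), so the CR stays.
my $raw = $line;
chomp $raw;
my ($with_cr) = $raw =~ /^Title: (.*)$/;       # "." matches the CR too

# Removing an optional CR together with the LF avoids the problem.
(my $clean = $line) =~ s/\r?\n\z//;
my ($without_cr) = $clean =~ /^Title: (.*)$/;

# The first capture is one character longer because of the trailing CR.
printf "%d %d\n", length($with_cr), length($without_cr);
```

The same substitution can stand in for chomp inside a read loop, so the CR never reaches the later regexes.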

It was actually those residual CR characters that were showing up as boxes in your display. Some unix tools for viewing text data will do this, because if CR is rendered "literally", the cursor goes back to the start of the line and the resulting display can be misleading -- especially if there are additional characters "on the same line" following the CR (i.e. between the CR and the next LF), since those overwrite whatever was printed before it.

Try running this one-liner in a normal terminal window, and see what the output looks like. Then run it again and redirect the output to a file, and view that file using whatever tool was displaying boxes in your other data. That should help you understand.

perl -e 'print " passed the test\r failed \n"'
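If you would rather see the control characters spelled out than rely on how a particular viewer draws them, something along these lines would do it. This is only a sketch, and the variable names are mine:

```perl
use strict;
use warnings;

# Same string as in the one-liner above.
my $text = " passed the test\r failed \n";

# Replace each CR with the two visible characters "\r", and show each LF
# as "\n" followed by a real newline so the line structure is kept.
(my $visible = $text) =~ s/\r/\\r/g;
$visible =~ s/\n/\\n\n/g;

print $visible;
```

Applied to your captured strings, this makes any leftover CR obvious regardless of which tool you use to look at the output.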

In reply to Re^3: Remove new line characters by graff
in thread Remove new line characters by simatics
