You never mentioned what OS this is running on, or what sort of tool you were using when you saw "boxes". On the first point, I would guess you were using unix or linux. The data, having come from the web, presumably has "CRLF" ("\r\n", aka "\x0d\x0a") line termination. But in the script you posted, the "CR" character does not get removed on input (that would only happen automatically if the perl script were running on a windows machine). Then your use of "." (period) in the various regexes causes the CR to be included in the strings that are captured and assigned to variables: by default, period matches every character except "LF" ("\n", "\x0a"), so it matches CR.
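To make that concrete, here is a minimal sketch, assuming a CRLF-terminated line and a made-up pattern /^name=(.*)/ (your actual regexes are not reproduced here). The helper name strip_eol is hypothetical, just one way to drop either kind of line ending before matching.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing LF or CRLF, whichever is present.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $raw = "name=value\r\n";    # a line as it might arrive from the web

# chomp (with the default $/) removes only the LF, so the CR stays
# and "." happily captures it.
my $chomped = $raw;
chomp $chomped;
my ($with_cr) = $chomped =~ /^name=(.*)/;

# After stripping CRLF explicitly, the capture is clean.
my ($clean) = strip_eol($raw) =~ /^name=(.*)/;

printf "after chomp:     %d chars\n", length $with_cr;
printf "after strip_eol: %d chars\n", length $clean;
```

The first capture comes out one character longer than the second; that extra character is the CR.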
It was actually those residual CR characters that were showing up as boxes in your display. Some unix tools for viewing text data will do this, because if CR is rendered "literally" (the cursor goes back to the start of the line), the resulting display can be misleading, especially if there are additional characters "on the same line" following the CR (i.e. between the CR and the next LF), since those overwrite whatever was printed before it.
Try running this one-liner in a normal terminal window, and see what the output looks like. Then run it again and redirect the output to a file, and view that file using whatever tool was displaying boxes in your other data. That should help you understand.
perl -e 'print " passed the test\r failed \n"'
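If you would rather see where the CRs are than guess from the display, something like the following sketch could help. The helper name show_cr is made up for illustration; it just rewrites each CR as the visible two-character text \r before printing.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each CR with a visible backslash-r.
sub show_cr {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;
    return $text;
}

# Same string as the one-liner above, but with the CR made visible.
print show_cr(" passed the test\r failed \n");
```

Run in a terminal, this prints the whole string on one line with a literal \r in the middle, instead of letting " failed " overwrite the start of the line.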
In reply to Re^3: Remove new line characters
by graff
in thread Remove new line characters
by simatics