So the question is, what are these extra characters in the html data, which are not newlines and are not displayable characters? Here's a way to find out:
Assuming that your $array_value has not been flagged as containing utf8 character data, the substitution above will replace all "invisible" byte values (including those between 128 and 255) with their hexadecimal numerics (e.g. linefeed will show up as "\x0a", carriage-return as "\x0d", "delete" as "\x7f", non-breaking space as "\xa0" and so on).$line = $array_value; # but where does $array_value come from? $line =~ s/([^\x20-\x7e])/sprintf( "\\x%02x", ord( $1 ))/eg; print $line;
If the string does contain utf8 characters (and perl has flagged it as such), it should still work, but some of the hexadecimal values may be 3- or 4-digit numbers.
Once you know what sorts of characters you're dealing with, you'll have a better idea of how to handle them.
In reply to Re: Remove new line characters
by graff
in thread Remove new line characters
by simatics
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |