in reply to Remove new line characters
So the question is: what are these extra characters in the HTML data, which are neither newlines nor displayable characters? Here's a way to find out:
    $line = $array_value;  # but where does $array_value come from?
    $line =~ s/([^\x20-\x7e])/sprintf( "\\x%02x", ord( $1 ))/eg;
    print $line;

Assuming that your $array_value has not been flagged as containing utf8 character data, the substitution above will replace all "invisible" byte values (including those between 128 and 255) with their hexadecimal numerics (e.g. linefeed will show up as "\x0a", carriage-return as "\x0d", "delete" as "\x7f", non-breaking space as "\xa0" and so on).
If the string does contain utf8 characters (and perl has flagged it as such), it should still work, but some of the hexadecimal values will have three or more digits (a right single quotation mark, U+2019, would show up as "\x2019").
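To make the difference between raw bytes and flagged utf8 characters concrete, here is a minimal sketch. The sub name show_invisibles and the sample strings are made up for illustration. It assumes the data is UTF-8 and uses the core Encode module to decode it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# show_invisibles is a hypothetical helper that wraps the same substitution.
sub show_invisibles {
    my ($line) = @_;
    $line =~ s/([^\x20-\x7e])/sprintf( "\\x%02x", ord( $1 ))/eg;
    return $line;
}

# Raw bytes: a UTF-8 encoded non-breaking space is two bytes, 0xC2 0xA0.
my $bytes = "a\xc2\xa0b\r\n";
print show_invisibles($bytes), "\n";    # a\xc2\xa0b\x0d\x0a

# Decoded characters: the same non-breaking space, followed by U+2019.
my $chars = decode( 'UTF-8', "a\xc2\xa0b\xe2\x80\x99" );
print show_invisibles($chars), "\n";    # a\xa0b\x2019
```

If you see pairs like "\xc2\xa0" in your own output, that is a hint (not proof) that you are looking at undecoded UTF-8 bytes rather than characters.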
Once you know what sorts of characters you're dealing with, you'll have a better idea of how to handle them.
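As one possible follow-up, suppose the dump showed that the culprits were carriage returns ("\x0d") and single-byte non-breaking spaces ("\xa0"). That is only an assumption for the sake of the example, and clean_line is a hypothetical name; your data may well contain something else.

```perl
use strict;
use warnings;

# Hypothetical cleanup, assuming the dump revealed "\x0d" and "\xa0".
sub clean_line {
    my ($line) = @_;
    $line =~ tr/\r//d;       # delete carriage returns
    $line =~ s/\xa0/ /g;     # turn non-breaking spaces into plain spaces
    return $line;
}

print clean_line("one\xa0two\r\n");    # prints "one two" and a newline
```

If the dump showed "\xc2\xa0" instead, the data is probably undecoded UTF-8, and this substitution would leave a stray "\xc2" byte behind; decode the string first, or match both bytes.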