in reply to Re^3: Parsing a text file
in thread Parsing a text file
[\r\n] does appear to finesse the problem nicely.
Unfortunately, when this last came up, I looked at all the relevant documentation I could find, but I did not see any guarantee that "\r" will be "\x0A" if "\n" is "\x0D" (or vice versa) in not-EBCDIC land. Or even that "\r" and "\n" are in general guaranteed to be duals of each other.
As brother ikegami says, you know and I know that these days, with the exception of EBCDIC systems, "\r\n" is exactly "\x0D\x0A". If an authoritative position were taken that as of (say) 5.8.0:
"\r\n" eq "\x0D\x0A" except for EBCDIC.
any system using line endings other than "\n" will support, and will by default use, a PerlIO layer that maps those line endings to/from "\n"
then we could consign worrying about this piece of magic to the bin. I don't know what the position is with MacPerl, but perlmacos suggests that the above could be back-dated to 5.8.0 including MacPerl.
FWIW, socket handling can (of course) be simplified by applying binmode $sock, ':crlf', which is nice. Nevertheless, chomp is a snare and a delusion if you think it's handling Internet CRLF line endings (unless you're futzing about with $/ at the same time). Wouldn't it be nice to have a chompnl equivalent to s/\x0D?\x0A$// ? And, perhaps, chomps equivalent to s/\s+$// ?
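For what it's worth, here is a minimal sketch of what those two wished-for functions might look like as ordinary subs. The names chompnl and chomps are hypothetical (they are not Perl built-ins), and the in-place, chomp-like calling convention and return value are my assumptions. I have also used \z rather than $ in the patterns, because a bare $ can match just before a final "\n", so s/\x0D?\x0A$// applied to "abc\n\n" would remove the first newline rather than the last.

```perl
use strict;
use warnings;

# Hypothetical helpers: chompnl and chomps are the names wished for in the
# post above, not Perl built-ins.

# Remove one trailing LF or CRLF from each argument, in place.
# Returns the total number of characters removed (an assumed convention,
# loosely modelled on chomp).
sub chompnl {
    my $removed = 0;
    for (@_) {
        # \z anchors at the very end of the string; a bare $ would also
        # match just before a final "\n".
        $removed += length $1 if s/(\x0D?\x0A)\z//;
    }
    return $removed;
}

# Remove all trailing whitespace from each argument, in place.
# Returns the total number of characters removed.
sub chomps {
    my $removed = 0;
    for (@_) {
        $removed += length $1 if s/(\s+)\z//;
    }
    return $removed;
}

my $line = "HELO example.org\x0D\x0A";
chompnl($line);
print "[$line]\n";    # prints [HELO example.org]
```

Because the subs modify their arguments through @_, they must be called on variables, not on literal strings.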
BTW, I note that \R is defined in perlreref as (?>\v|\x0D\x0A). Shouldn't that be (?>\x0D\x0A|\v) ? And I wonder what the EBCDIC folk make of this !
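To illustrate why the order of the alternatives matters, here is a small sketch (assuming Perl 5.10 or later on a non-EBCDIC platform) that matches a bare CRLF against both spellings and against the built-in \R. Inside an atomic group the first alternative that matches is kept, so with \v first the group swallows only the CR and cannot back off to try the CRLF alternative.

```perl
use strict;
use warnings;
use 5.010;    # \v and \R need Perl 5.10 or later

my $crlf = "\x0D\x0A";

# \v listed first: the atomic group commits to matching just the CR,
# so \z (true end of string) then fails with the LF still unmatched.
my $v_first = $crlf =~ /\A(?>\v|\x0D\x0A)\z/ ? 1 : 0;

# CRLF listed first: the pair is consumed as a single unit.
my $crlf_first = $crlf =~ /\A(?>\x0D\x0A|\v)\z/ ? 1 : 0;

# The built-in \R, for comparison.
my $builtin = $crlf =~ /\A\R\z/ ? 1 : 0;

print "v first: $v_first, CRLF first: $crlf_first, \\R: $builtin\n";
# prints: v first: 0, CRLF first: 1, \R: 1
```

So the built-in behaves like the CRLF-first form, which suggests the ordering shown in perlreref is a slip in the documentation, not in the implementation.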