Thanks for pointing out the error.
The distinctions among read, sysread, and getc are useful, but any of them serves to point out the issue in the original node. I used read specifically because it is more flexible than getc and does interact properly with buffering if there is any. That may be preferable in some situations, but sysread may be preferable if bypassing buffers is desired as a rule.
As an aside, although there is for perl 5.10.1 and 5.12.0 a note at the bottom of the entry for sysread in the perldocs about UTF and characters, the documentation for read is clearer at a glance about handling characters and not just bytes. That's not anything against the function, and it will probably be fixed in the docs at some point. It's worth noting if you're pointing the function out to people in the meantime, though, because you might want to mention that note stuck at the bottom of the entry.
I used read specifically because it is more flexible than getc
I don't see how read is more flexible than getc at reading available characters in the pipe. sysread is more flexible since it can read multiple characters in addition to reading a single character. This also makes it much faster IIRC.
I used read specifically because it [...] does interact properly with buffering if there is any.
That means you specifically didn't choose sysread because it doesn't interact properly with buffered IO, yet with pipes one is much more likely to encounter unbuffered IO than buffered IO (the former solves the OP's problem, and select requires it).
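A minimal sketch of why buffered reads and select don't mix, assuming a Unix-like system and Perl's default buffered IO layer. The pipe and the variable names are made up for illustration. One getc can pull every pending byte into Perl's own buffer, after which select, which only asks the OS, reports nothing to read even though more characters are waiting in that buffer.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical pipe standing in for the child's output.
pipe(my $reader, my $writer) or die "pipe failed: $!";

# Put three bytes in the pipe without going through a write buffer.
defined(syswrite($writer, "abc")) or die "syswrite failed: $!";

# Buffered read: asks for one character, but the IO layer may
# fetch all pending bytes from the pipe into its own buffer.
my $first = getc($reader);

# select only sees the OS-level pipe, not Perl's buffer.
my $sel   = IO::Select->new($reader);
my @ready = $sel->can_read(0);

# The remaining data is still reachable through buffered reads.
my $second = getc($reader);

printf "first=%s ready=%d second=%s\n", $first, scalar(@ready), $second;
```

If select says the handle is idle while getc still returns data, any loop built on select will stall with unread input sitting in the buffer.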
but sysread may be preferable if bypassing buffers is desired as a rule.
That's backwards. I didn't use sysread to bypass buffers; buffers aren't used because I used sysread. In the rare circumstance that other code reads from the pipe, non-buffered IO will have to be used there too, but that's not likely to be a problem. In the much more likely circumstance that select is used on the pipe, you'd need sysread anyway.
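For comparison, a rough sketch of the select plus sysread combination, again assuming a Unix-like system (select on pipes is not portable to every platform). The pipe, the 4096-byte chunk size, and the variable names are illustrative choices, not anything from the OP's code.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical pipe standing in for the child's output.
pipe(my $reader, my $writer) or die "pipe failed: $!";
defined(syswrite($writer, "hello")) or die "syswrite failed: $!";

my $sel  = IO::Select->new($reader);
my $data = '';

# Drain whatever is available right now, without blocking.
while ($sel->can_read(0)) {
    # sysread returns as soon as some data is available; it does not
    # wait for the full 4096 bytes, and it keeps no buffer of its own.
    my $n = sysread($reader, my $chunk, 4096);
    die "sysread failed: $!" unless defined $n;
    last if $n == 0;    # EOF: the writer closed its end
    $data .= $chunk;
}

print "got: $data\n";
```

Because nothing is held back in a Perl-side buffer, what select reports and what sysread can deliver always agree.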
As an aside, although there is for perl 5.10.1 and 5.12.0 a note at the bottom of the entry for sysread in the perldocs about UTF and characters,
Basically, it says sysread will work with character streams as well as byte streams. I don't see why I should be pointing that out.
Well, it needs to be made clearer in the docs, as I already said. Nearly the whole man entry for sysread says it deals with bytes. That's often not the case. Only a note at the very end of the entry even mentions multi-byte characters. If someone expects, from the rest of the entry, to read only bytes and misses that point, they'll be surprised to get multi-byte characters. If they want characters and just skim the first part of the entry, it could seem the function won't do what they need. Compare it to the docs for read, which mention characters much earlier and more clearly.
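A small sketch of the bytes-versus-characters difference, using read on in-memory filehandles so it runs anywhere. The sample string and variable names are invented for illustration. It uses read rather than sysread because, as far as I know, newer perls (5.30 and later) refuse sysread on handles that carry a :utf8 layer, so the mixed behaviour described above applies to older versions such as 5.10.1 and 5.12.0.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+263A encodes to three bytes in UTF-8, followed by three ASCII bytes.
my $bytes = encode('UTF-8', "\x{263A}abc");

# No layer: a LENGTH of 1 means one byte.
open(my $raw, '<', \$bytes) or die "open failed: $!";
read($raw, my $one_byte, 1);

# With an encoding layer: a LENGTH of 1 means one character,
# which here consumes three bytes of the underlying data.
open(my $utf8, '<:encoding(UTF-8)', \$bytes) or die "open failed: $!";
read($utf8, my $one_char, 1);

printf "byte handle: ord=0x%X\n", ord($one_byte);
printf "char handle: ord=0x%X\n", ord($one_char);
```

The same LENGTH argument yields a fragment of a character on one handle and a whole character on the other, which is exactly the detail a reader can miss if the docs mention it only at the end.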