in reply to Re^6: How do I display only matches
in thread (SOLVED) How do I display only matches

You may not know this (many folks don't), but no matter what the OS platform, when writing to a network socket, "\n" means <CR><LF>. ... So, yes, even on Unix, a write to network socket will be <CR><LF>, while a write to a disk file will be just <LF>.

The way you worded this actually annoyed me enough that I double-checked my knowledge on this.

serv.pl:

use warnings;
use strict;
use IO::Socket::INET;
use Devel::Peek;

my $sock = IO::Socket::INET->new(
    Listen    => 5,
    LocalPort => 9000,
    LocalAddr => 'localhost',
    Proto     => 'tcp',
) or die $!;
my $cli = $sock->accept();
$cli->read(my $in, 5);
Dump($in);

cli.pl:

use warnings;
use strict;
use IO::Socket::INET;

my $sock = IO::Socket::INET->new("localhost:9000") or die $!;
print $sock "foo\n\0\0";

Output of serv.pl:

SV = PV(0x573aeb89cfe0) at 0x573aeb8e2b50
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x573aeb8ed890 "foo\n\0"\0
  CUR = 5
  LEN = 10

And, a Wireshark capture of the raw packet on the wire shows:

66:6f:6f:0a:00:00

The same bytes go over the wire on both Linux and Windows, so you're wrong on both of those counts as well. (If you add binmode $sock, ':crlf'; to the client, the line ending becomes 0d 0a on both platforms, as expected.)
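For anyone who wants to check this without Wireshark or two separate scripts, here is a minimal sketch that pushes a string through a local socket pair and prints the bytes that arrive, in hex. The helper name bytes_on_socket is made up for this example, and it assumes a Unix-like system where socketpair with AF_UNIX is available; it is not the test setup used above.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Socket;

# Hypothetical helper: send $data through a fresh socket pair and return
# the bytes that arrive, hex-encoded. $layer is an optional PerlIO layer
# applied to the writing end only.
sub bytes_on_socket {
    my ($data, $layer) = @_;
    my ($w, $r) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    binmode $w, $layer if defined $layer;
    binmode $r;                 # read raw bytes, no translation
    print {$w} $data;
    close $w;                   # flush and signal EOF to the reader
    local $/;                   # slurp everything up to EOF
    my $got = <$r>;
    return unpack 'H*', $got;
}

print bytes_on_socket("foo\n"), "\n";            # 666f6f0a
print bytes_on_socket("foo\n", ':crlf'), "\n";   # 666f6f0d0a
```

The first call sends "\n" as a bare 0a; only the explicit :crlf layer turns it into 0d 0a.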

I hope this post adds more clarity to the issue.

No, it spreads false and confusing information.

Re^8: How do I display only matches
by jcb (Parson) on Sep 26, 2019 at 01:13 UTC

    To hopefully add a bit more clarity (or at least improve my own understanding), POSIX has a binary model of files and "\n" is always <LF> in user programs. But sometimes the kernel can translate <LF> to <CR><LF> and this usually means that a terminal driver is involved somewhere — this is frequently seen when using Expect, which uses the pty facilities, which emulate a terminal and therefore involve the kernel terminal driver.

    As far as I know, sockets are always binary and no such translation ever occurs, so I am unsure where this misinformation about network line endings originated. Perhaps STREAMS had such a translation module?

      I don't know enough about the details of the kernel's behavior to say for sure, but what you write sounds correct to me. I have my guesses about where the misinformation originated, but of course I can't say what another person is thinking.

Re^8: How do I display only matches
by Marshall (Canon) on Sep 26, 2019 at 04:05 UTC
    I can see that you are annoyed. I don't want to annoy you or anybody!
    Let's relax and get to the facts where we agree...

    We both agree that "\n" prints something different under Unix than on Windows.

    I further claim that all network communication uses <CR><LF> as the transmitted line ending.
    You do not believe that.

    I need to find a Unix machine to test upon.
    I am curious what these nulls mean? "foo\n\0"\0

      Accurate wording is important.

      We both agree that "\n" prints something different under Unix than on Windows.

      No. What gets written to a handle by a print "\n" can be the same on both platforms, or different, depending on which I/O layers are in effect.

      I further claim that all network communication uses <CR><LF> as the transmitted line ending. You do not believe that.

      No, I didn't say that either. You claimed that "no matter what the OS platform, when writing to a network socket, "\n" means <CR><LF>", which I proved to be incorrect. This is completely separate from the fact that many network protocols do indeed use CRLF as their standard line ending (Update: and certainly not "all" network protocols use CRLF).

      I am curious what these nulls mean? "foo\n\0"\0

      I added the \0s for two reasons. First, the server is hard-coded to read five bytes, and I wanted to make sure that the client always sends at least that many. Second, I wanted it to be clear that the server isn't reading too few bytes and cutting off something relevant. The \0 after the closing quote is AFAIK Perl's way of saying the string is null-terminated (ASCIIZ).

        No. What gets written to a handle by a print "\n" can be the same on both platforms, or different, depending on which I/O layers are in effect.

        I don't have any quibble about that. I was just talking about what Perl will do by default on various platforms. Perl is one of the most amazingly configurable languages that I've ever worked with. If it normally "barks" and you want it to "meow", it can do that!

        As far as network sockets go, I need to gain access to a Unix machine for testing. Again we are talking about default print statements of strings without any special I/O layer being specified or writing binary to the socket.

        I think that it is fair to say that this whole line ending subject is complicated. There are a lot of "yeah but's".

      I believe that the confusion here is that the standards specify use of <CR><LF> (as far back as RFC139, page 3, on a quick check) but *nix has a far simpler I/O model than was contemplated for the systems on the early ARPAnet. (ASCII itself is RFC20, by the way.)

      So for correct and portable network usage, you are supposed to use "\015\012" rather than "\n" anyway, although Perl's ":crlf" PerlIO layer should cause "\n" to be emitted as "\015\012". Confused enough yet?
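      A minimal sketch of the explicit approach, assuming a line-oriented protocol that expects CRLF. The helper protocol_line and the sample command are made up for illustration; the $CRLF variable comes from the core Socket module's :crlf export tag and holds "\015\012".

      ```perl
      use strict;
      use warnings;
      use Socket qw(:crlf);   # exports $CR, $LF and $CRLF

      # Hypothetical helper: terminate a protocol line with an explicit
      # CRLF, independent of platform and of any PerlIO layers.
      sub protocol_line {
          my ($text) = @_;
          return $text . $CRLF;
      }

      my $line = protocol_line('HELO example.org');
      print unpack('H*', substr($line, -2)), "\n";   # 0d0a
      ```

      Written this way, the bytes sent do not depend on what "\n" happens to mean for the handle in question.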

        Thank you for the RFC reference.
        I believe that you are correct and that Perl's default text I/O layer will do what you say.
        I currently don't have a UNIX system to test with, but yes, this is "the way it is supposed to work".

        If someone is writing both ends of a client/server application, of course they can do whatever they want. You can even just send a binary packet.