in reply to Re^10: How do I display only matches
in thread (SOLVED) How do I display only matches

On Windows, when you print "\n", there will be 2 characters, 0x0D 0x0A. ... I had thought that "\n" and "\n" had the same meaning whether write or read. It turns out that is NOT true.

As I've said several times now, to be completely accurate (which is important here), on Windows and *NIX, the Perl string "\n" means "\x0A" (I think Perl could be compiled differently, but I'm not aware of any current builds that actually do this). What gets written or read depends on which PerlIO layers are in effect. This can even be changed dynamically while a handle is open using binmode, and it works the same on *NIX and Windows, except that :crlf is one of the defaults on Windows. To check which layers are in fact in effect, use: my @layers = PerlIO::get_layers($handle);
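To make the layer point concrete, here is a minimal sketch that writes the same Perl string through two different layer stacks into an in-memory buffer and dumps the resulting bytes. The helper name hex_written is made up for this illustration, and the in-memory handle is only an assumption to keep the example self-contained; a real file behaves the same way with the same layers.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): write $string through the
# given PerlIO layers into an in-memory buffer and return the bytes as hex.
sub hex_written {
    my ($layers, $string) = @_;
    my $buf = '';
    open my $fh, ">$layers", \$buf or die "open failed: $!";
    print {$fh} $string;
    close $fh or die "close failed: $!";    # flush buffered layers
    return join ' ', unpack '(H2)*', $buf;
}

# The Perl string is identical; only the layers differ.
print "raw:  ", hex_written(':raw',      "foo\n"), "\n";
print "crlf: ", hex_written(':raw:crlf', "foo\n"), "\n";

# Layers can be inspected and changed while a handle is open.
print "STDOUT before: @{[ PerlIO::get_layers(\*STDOUT) ]}\n";
binmode STDOUT, ':raw';
print "STDOUT after:  @{[ PerlIO::get_layers(\*STDOUT) ]}\n";
```

The layer lists printed for STDOUT depend on the platform and build, which is exactly why checking them with PerlIO::get_layers is more reliable than guessing.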

On Windows this will print <CR><CR><LF>. When read back via text mode, only one <CR> will be deleted. The regex fails because there is still another <CR> there and "$" is looking for a 0x0A. Correct?

Correct, yes - I would nitpick that in the examples I showed, there is no reading/writing going on, so there is no need to think about what translations might be happening. Once a string has been read into Perl (and its contents verified with a dumper module), regexes behave the same on both platforms, which is where this subthread started.
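A small sketch of that point, with no I/O involved at all: the strings below are plain Perl strings, so the results should be the same on any platform. The variable names are just for this illustration.

```perl
use strict;
use warnings;

my $lf   = "Foo\n";      # line ending in LF only
my $crlf = "Foo\r\n";    # line that still carries a CR

# "$" matches at the end of the string or just before a final "\n",
# so a leftover "\r" sits between "Foo" and the point where "$" can match.
my $lf_plain   = $lf   =~ /Foo$/    ? 'match' : 'no match';
my $crlf_plain = $crlf =~ /Foo$/    ? 'match' : 'no match';
my $crlf_opt   = $crlf =~ /Foo\r?$/ ? 'match' : 'no match';

print "LF,   /Foo\$/:     $lf_plain\n";
print "CRLF, /Foo\$/:     $crlf_plain\n";
print "CRLF, /Foo\\r?\$/:  $crlf_opt\n";
```

Allowing an optional \r before the anchor is one way to cope with input whose line endings are not under your control; stripping the CR when reading is another.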

Update: Also, note that "text mode" is somewhat misleading: technically, there is just the :crlf layer, which can either be active or not. Again, see PerlIO (and binmode).
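The reading direction can be sketched the same way. The helper read_first_line is hypothetical, and the in-memory buffer stands in for a file containing the bytes 46 6f 6f 0d 0a; the only thing that varies is whether the :crlf layer is active on the handle.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read the first line of a
# buffer holding "Foo" followed by CR LF, using the given PerlIO layers.
sub read_first_line {
    my ($layers) = @_;
    my $bytes = "Foo\x0D\x0A";
    open my $fh, "<$layers", \$bytes or die "open failed: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

for my $layers (':raw', ':raw:crlf') {
    my $line = read_first_line($layers);
    # Show the string with its control characters made visible.
    (my $shown = $line) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    printf "%-10s \"%s\"\n", $layers, $shown;
}
```

Without :crlf the CR arrives in the string untouched; with :crlf active it is translated away, regardless of the operating system.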

Re^12: How do I display only matches (updated)
by Marshall (Canon) on Sep 27, 2019 at 23:17 UTC
    As I've said several times now, to be completely accurate (which is important here), on Windows and *NIX, the Perl string "\n" means "\x0A"

    That is not completely correct.
    When using Perl's default I/O layer, print "\n" will emit <LF> on Unix and <CR><LF> on Windows.

    When reading a text line (Unix or Windows), the <CR> will be deleted, if it exists.
    Update: When using "standard default I/O methods"

      When using Perl's default I/O layer, print "\n" will emit <LF> on Unix and <CR><LF> on Windows.

      Yes, you're just repeating back to me what I said several times now.

      When reading a text line (Unix or Windows), the <CR> will be deleted, if it exists.

      For *NIX that is once again wrong.

      $ hexdump -C test.txt
      00000000  46 6f 6f 0d 0a  |Foo..|
      $ perl -wMstrict -MData::Dump -e 'dd <>' test.txt
      "Foo\r\n"
      Update: When using "standard default I/O methods"

      But that's not the only thing this discussion was about. It's also about you insisting that (essentially) $ matches before \r, insisting that (essentially) "\n" eq "\r\n" and that somehow all network communication magically uses CRLF, or stuff like the above. Sorry, but all the evidence appears to point to a fundamental misunderstanding of the topic.

      That is not completely correct.

      Are you just trolling now? You clearly still haven't studied the subject matter enough, I'm not sure if you're even reading my posts, and I'm tired of repeating myself and running tests that you could be running, so for now, I'm out.

        At this point, I need to get a Unix account and run some tests and compare Windows vs Unix. This will take some weeks.

        I did look at Re^7: How do I display only matches and I am perplexed.

        Somehow, print $sock "foo\n\0\0"; produces 66:6f:6f:0a:00:00. On both Windows and Unix? That does indeed surprise me! But, if true, that does indeed indicate that on Windows, printing to a socket is different than printing to a file.

        Let's stop this subthread. I will investigate further.