in reply to Re^5: How do I display only matches
in thread (SOLVED) How do I display only matches
I copied the text for "$" verbatim from the Perl regex docs. Normally that is good enough. However in this case I see that some further "yeah but" explanation is required!
Problem 1: What "\n" turns into can be both platform and sometimes context dependent! Inside a Perl string "\n" is a single character, but when I print a "\n" to a text-mode filehandle on my Windows machine, 2 characters get written: <CR><LF> (Carriage Return, Line Feed). So when you print "foo\r\n" on Windows, what lands in the file is <CR><CR><LF>. Either way, the extra <CR> means that the "o" is not directly followed by the line ending. There is indeed something else between the "o" and the line ending and your regex doesn't match - this is correct behavior.
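To make the point about the extra <CR> concrete, here is a minimal sketch that only looks at in-memory strings (no file I/O, so it behaves the same on every platform). The helper names show() and ends_in_o() are made up for this illustration, and /o$/ is a stand-in for whatever pattern you anchor with "$".

```perl
use strict;
use warnings;

# Hypothetical helper: make the line-ending characters visible.
sub show {
    my ($s) = @_;
    (my $v = $s) =~ s/\r/<CR>/g;
    $v =~ s/\n/<LF>/g;
    return $v;
}

# Hypothetical helper: does the string end in "o", optionally
# followed by one trailing "\n" (which is all that "$" allows)?
sub ends_in_o {
    my ($s) = @_;
    return $s =~ /o$/ ? 1 : 0;
}

for my $str ( "foo", "foo\n", "foo\r\n" ) {
    print show($str), ": ", ( ends_in_o($str) ? "match" : "no match" ), "\n";
}
```

The first two strings match and the third does not, because "$" tolerates one trailing "\n" but not the "\r" sitting in front of it.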
You may not know this (many folks don't), but no matter what the OS platform, <CR><LF> is the standard line ending for most text-based network protocols (HTTP, SMTP and so on). Don't count on Perl to translate "\n" for you on a socket: perlport recommends sending "\015\012" explicitly when the protocol calls for <CR><LF>. So, yes, even on Unix, a line sent over such a protocol ends in <CR><LF>, while a "\n" written to a disk file will be just <LF>. Windows uses the network convention for text files too, so a "\n" written through a text-mode filehandle on Windows ends up as the 2 characters <CR><LF>.
Problem 2: Not every cross-platform case and every platform direction is handled automatically by Perl. If you are on a single platform, then "$" will work as the docs describe, as will chomp(). I have one program that needs to work on: a) old Mac, where "\n" means <CR> in files, b) Windows, where "\n" means <CR><LF> in files, c) Unix, where "\n" means <LF> in files. When I write code that has to work with all 3 platforms, I use a regex instead of chomp to delete the line endings. s/\s*$//; deletes all whitespace at the end of the line (including line endings like <CR><LF>, which are considered "whitespace").
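A minimal sketch of the difference, again on in-memory strings only. The sub names strip_eol() and chomp_only() are hypothetical, and chomp_only() pins $/ to "\n" so the comparison does not depend on the caller's settings.

```perl
use strict;
use warnings;

# Hypothetical helper: remove any trailing whitespace, which covers
# <LF>, <CR><LF> and a bare <CR> (and also trailing spaces and tabs).
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\s*$//;
    return $line;
}

# Hypothetical helper: plain chomp with $/ set to "\n", for comparison.
sub chomp_only {
    my ($line) = @_;
    local $/ = "\n";
    chomp $line;
    return $line;
}

for my $raw ( "foo\n", "foo\r\n", "foo\r" ) {
    my $via_regex = strip_eol($raw);
    my $via_chomp = chomp_only($raw);
    printf "regex leaves %d chars, chomp leaves %d chars\n",
        length $via_regex, length $via_chomp;
}
```

The trade-off is that the regex also eats trailing spaces and tabs, so it is only a good fit when trailing whitespace carries no meaning in your data.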
Another thought: I told the OP that there was no need to "chomp" if you are just going to add the line ending back in. That is true as long as you are processing a file and writing a file for the same platform. There are some cases where you'd "chomp" and then print "$_\n" to change the line endings.
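As a rough sketch of changing line endings on purpose: instead of relying on chomp plus the platform's output translation, the version below spells out the target ending explicitly, which is a different technique from the chomp-and-print idiom above. The sub name convert_eol() and its arguments are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every line ending in $text to $eol.
# "\r\n" is listed first so that <CR><LF> counts as ONE line ending.
sub convert_eol {
    my ( $text, $eol ) = @_;
    $text =~ s/\r\n|\r|\n/$eol/g;
    return $text;
}

my $mixed   = "one\r\ntwo\rthree\n";
my $unix    = convert_eol( $mixed, "\n" );
my $network = convert_eol( $mixed, "\015\012" );

printf "unix form: %d chars, CRLF form: %d chars\n",
    length $unix, length $network;
```

When using something like this on real files, binmode both filehandles so that no I/O layer translates the endings a second time. Note that a <CR><CR><LF> sequence is seen as two line endings here (a bare <CR>, then <CR><LF>).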
I hope this post adds more clarity to the issue. But it probably raises more "yeah, but what if..." questions than it answers. This is all more complex than our OP asked about. I suggest starting a new thread if there is interest in discussing the "dirty details".
Update: To allow for this <CR><CR><LF> situation:
use warnings;
use strict;
use Data::Dump;

for my $str ( "foo", "foo\n", "foo\r\n" ) {
    dd $str, scalar $str=~/o\s*$/;
}

__END__
("foo", 1)
("foo\n", 1)
("foo\r\n", 1)