How to match last character of string, even if it happens to be a newline?

Allasso has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 14:56 UTC
By default, the `.` (dot) regex metacharacter matches everything except a newline. Use the `/s` modifier to make dot match everything.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use Data::Dump qw(pp);
     ;;
     for my $s (qq{yz}, qq{yz\n}) {
        $s =~ m{ (.) \z }xms;
        printf qq{in %s matched %s \n}, pp($s), pp($1);
        }
    "
    in "yz" matched "z"
    in "yz\n" matched "\n"

See also `\z` for "absolute end of string" anchor.

Update: You're also running into an interaction with `$`, which by default matches at "the end of the line (or before newline at the end)". So even with the `/s` modifier, the first place at which `.$` can match (scanning from left to right) is the character just before a trailing newline, if one is present; remember that the regex engine settles for the leftmost match it can find.

Give a man a fish: `<%-{-{-{-<`
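To make the difference concrete, here is a minimal sketch (not from the original posts) that runs the four combinations of `$` or `\z`, with and without `/s`, against the same two test strings. The helper names `last_char` and `show` are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return what the pattern's first capture group
# grabbed from $s, or undef when the pattern does not match at all.
sub last_char {
    my ($s, $re) = @_;
    return $s =~ $re ? $1 : undef;
}

# Hypothetical helper: render a value with newlines made visible.
sub show {
    my ($value) = @_;
    return 'no match' unless defined $value;
    (my $visible = $value) =~ s/\n/\\n/g;
    return qq{"$visible"};
}

# The anchor/modifier combinations discussed above.
my @cases = (
    [ '(.)$'         => qr/(.)$/   ],
    [ '(.)$ with /s' => qr/(.)$/s  ],
    [ '(.)\z'        => qr/(.)\z/  ],
    [ '(.)\z with /s' => qr/(.)\z/s ],
);

for my $s ("yz", "yz\n") {
    for my $case (@cases) {
        my ($label, $re) = @$case;
        printf "%-14s on %-6s captured %s\n",
            $label, show($s), show(last_char($s, $re));
    }
}
```

On `"yz\n"`, only `(.)\z` with `/s` captures the newline; both `$` forms capture `z`, and `(.)\z` without `/s` does not match at all.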
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 15:11 UTC
Yes, the \z anchor does the trick, thanks!
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 15:17 UTC
Using \z, I don't seem to need multiline, simply `m@(.)\z@s`
Re^3: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 15:40 UTC
> Using \z, I don't seem to need multiline ...

That's because `\z` is always the absolute end-of-string anchor; no modifiers apply. I always use `\A \z \Z` because they have invariant behavior. For the same reason, I nail down the `^ $` operators by always using the `/m` modifier. (I then use the `^ $` operators only with newlines embedded within a string.)

Give a man a fish: `<%-{-{-{-<`
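As a rough illustration of that "invariant behavior" point, the sketch below lists every offset at which each anchor matches in one sample string. The helper `anchor_positions` and the sample string are assumptions made for this example, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: list every offset at which a zero-width
# anchor pattern matches in $s.
sub anchor_positions {
    my ($s, $re) = @_;
    my @offsets;
    # With /g, Perl never reports the same empty match twice at one
    # position, so this loop terminates.
    push @offsets, $-[0] while $s =~ /$re/g;
    return @offsets;
}

my $s = "a\nb\n";    # offsets: a=0, \n=1, b=2, \n=3, end of string=4

my @anchors = (
    [ '\A'         => qr/\A/ ],
    [ '^'          => qr/^/  ],
    [ '^ with /m'  => qr/^/m ],
    [ '\Z'         => qr/\Z/ ],
    [ '$'          => qr/$/  ],
    [ '$ with /m'  => qr/$/m ],
    [ '\z'         => qr/\z/ ],
);

for my $anchor (@anchors) {
    my ($label, $re) = @$anchor;
    printf "%-10s matches at offsets: %s\n",
        $label, join(', ', anchor_positions($s, $re));
}
```

`\A` and `\z` each match at exactly one offset (0 and 4), `\Z` and plain `$` match both before the trailing newline and at the very end (3 and 4), and `/m` is what lets `^` and `$` match around the embedded newline as well.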
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 15:04 UTC
er... I _did_ use s modifier :-/
Re^3: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 15:12 UTC
Please see my update to Re: How to match last character of string, even if it happens to be a newline?. Also consider using `\z` to assert absolute end of string.

Give a man a fish: `<%-{-{-{-<`
Re: How to match last character of string, even if it happens to be a newline? by jwkrahn (Abbot) on May 12, 2019 at 18:48 UTC
You don't really need a regular expression just to get the last character in a string:

    $ perl -e'use Data::Dumper; $Data::Dumper::Useqq = 1; my $text_1 = "a\nb\nc\n"; print Dumper $text_1, "Last character: " . substr $text_1, -1'
    $VAR1 = "a\nb\nc\n";
    $VAR2 = "Last character: \n";
Re^2: How to match last character of string, even if it happens to be a newline? by LanX (Saint) on May 12, 2019 at 19:03 UTC
On a side note: `chop` does the same job but is destructive and requires an lvalue.

Cheers Rolf
(addicted to the Perl Programming Language :)
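A small sketch of the difference between the two approaches, assuming a made-up helper name `last_char_substr` and a guard for the empty string that the posts above do not discuss:

```perl
use strict;
use warnings;

# Hypothetical helper: read the last character without touching the string.
# The length check is an added assumption so that an empty string yields
# undef instead of asking substr for an out-of-range offset.
sub last_char_substr {
    my ($s) = @_;
    return length($s) ? substr($s, -1) : undef;
}

my $text = "a\nb\nc\n";

# substr only reads: $text keeps all six characters.
my $peeked = last_char_substr($text);
printf "substr: got newline? %s, length afterwards: %d\n",
    ($peeked eq "\n" ? 'yes' : 'no'), length $text;

# chop needs an lvalue and removes what it returns, so work on a copy.
my $copy    = $text;
my $chopped = chop $copy;
printf "chop:   got newline? %s, length afterwards: %d\n",
    ($chopped eq "\n" ? 'yes' : 'no'), length $copy;
```

Working on `$copy` keeps the original intact; calling `chop $text` directly would shorten `$text` itself.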
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 20:58 UTC
indeed!
Re: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 15:27 UTC
My regex Best Practices (lifted whole from TheDamian's Perl Best Practices, which is highly recommended in general) include using an `/xms` modifier tail on every `qr// m// s///` I write. This reduces the degrees of freedom of the `^ $ .` operators and clarifies their function, at least for me. Coupled with the use of `\A \z \Z` as string start/end anchors, I find I can think a bit more clearly about the highly counterintuitive operation of regular expressions.

Give a man a fish: `<%-{-{-{-<`
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 15:43 UTC
> This reduces the degrees of freedom of the ^ $ . operators

Maybe that hints at the following inconsistency I see with the "the end of the line (or before newline at the end)" rule? Example:

    print('String: a\nb\nc\n' . "\n");
    $text_1 = "a\nb\nc\n";
    $text_1 =~ s@(\n)$@@s;
    print("----------\n>" . $1 . "<\n");
    print("----------\n>" . $text_1 . "<\n");

Gives:

    String: a\nb\nc\n
    ----------
    >
    <
    ----------
    >a
    b
    c<

In this case, the $ behaved like \z. Or, to put it another way, in this case an explicit \n matches where dot with the s modifier doesn't.
Re^3: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 16:56 UTC
> ... in this case explicit \n matches where dot with s mod doesn't.

That's because `\n$` requires a match with newline, but `.$` allows the leftmost position of a match with dot (with `/s`) to be before the newline. Dot will match newline in the presence of `$` if it is the only match available:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use Data::Dump qw(pp);
     ;;
     for my $s (qq{yz}, qq{yz\n}, qq{\n}) {
        $s =~ m{ (.) $ }xms;
        printf qq{in %s matched %s \n}, pp($s), pp($1);
        }
    "
    in "yz" matched "z"
    in "yz\n" matched "z"
    in "\n" matched "\n"

(The `/m` modifier makes no difference in these example strings.)

The thing to remember about regular expressions is that there are a lot of things to remember about regular expressions. If you have a chance to reduce the amount of stuff to remember, even if only by a little, take it. That's why I advise (per TheDamian's regex PBPs) using `\A \z \Z` for all your start- and end-of-string anchoring needs, and using `^ $` only for embedded newline matching.

> ... inconsistency ...

For me, it's not so much inconsistency as mind-boggling complexity. And again, I come back to the point that if you can reduce the complexity of what you're dealing with even a little, you're ahead of the game.

Give a man a fish: `<%-{-{-{-<`
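One way to see where each pattern actually lands is to look at the offset of the capture, which Perl exposes in the `@-` array. The following is only a sketch; `capture_with_offset` is a hypothetical helper name, and the sample string is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture and the offset where it
# starts, or an empty list when the pattern does not match.
sub capture_with_offset {
    my ($s, $re) = @_;
    return unless $s =~ $re;
    return ($1, $-[1]);
}

my $s = "a\nb\nc\n";    # the final newline sits at offset 5

my @cases = (
    [ '(\n)$ with /s' => qr/(\n)$/s ],
    [ '(.)$ with /s'  => qr/(.)$/s  ],
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    my ($text, $offset) = capture_with_offset($s, $re);
    $text =~ s/\n/\\n/g;    # make a captured newline visible
    print "$label captured \"$text\" at offset $offset\n";
}
```

Here `(\n)$` can only succeed on the final newline at offset 5, while `(.)$` already succeeds one character earlier, on the `c` at offset 4, because `$` is satisfied just before the trailing newline.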
Re: How to match last character of string, even if it happens to be a newline? by LanX (Saint) on May 12, 2019 at 15:27 UTC
That's what you want? :)

    $ perl -e' m/.*(.|\n)/,print "<$1>" for "123","ab\ c\n"'
    <3><
    >$

Please note that \n is often not one but two characters, like `CR LF` on Windows.

Cheers Rolf
Re^2: How to match last character of string, even if it happens to be a newline? by BillKSmith (Monsignor) on May 12, 2019 at 21:17 UTC
Mac or Windows newlines seldom cause a problem. I think of \n as a Perl newline. Perl strings always use it. Translation between it and your OS's representation is done by an I/O "layer". (In Unix, the "translation" does not actually change anything.) The only exception is when we change I/O behavior by specifying non-standard layers or binmode on input.

Bill
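The layer translation can be sketched without touching a real file by reading from an in-memory filehandle. This is an illustration only: `read_lines` is a made-up helper, the sample bytes are invented, and the `:crlf` layer is applied explicitly here rather than relying on any platform default.

```perl
use strict;
use warnings;

# Hypothetical helper: read all lines from an in-memory "file" holding
# $bytes, opened with the given PerlIO layer string (may be empty).
sub read_lines {
    my ($bytes, $layers) = @_;
    open(my $fh, "<$layers", \$bytes)
        or die "cannot open in-memory file: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

my $dos_bytes = "one\r\ntwo\r\n";    # CR LF line endings, as stored on disk

# No translating layer: each line still ends in the two bytes CR LF.
my @untranslated = read_lines($dos_bytes, '');

# With :crlf, the layer turns each CR LF pair into a single "\n".
my @translated = read_lines($dos_bytes, ':crlf');

printf "without :crlf: first line has %d characters\n", length $untranslated[0];
printf "with :crlf:    first line has %d characters\n", length $translated[0];
```

Once the layer has done its work, the string seen by a regex ends in a single `\n`, which is why the patterns in this thread do not need to care about the on-disk representation.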
Re^3: How to match last character of string, even if it happens to be a newline? by LanX (Saint) on May 12, 2019 at 21:26 UTC
I didn't say it's a problem in general, I said it's not always just one character like the OP suggested.

Cheers Rolf
Re^4: How to match last character of string, even if it happens to be a newline? by haj (Vicar) on May 13, 2019 at 05:46 UTC
Re^5: How to match last character of string, even if it happens to be a newline? by LanX (Saint) on May 13, 2019 at 16:49 UTC
Some notes below your chosen depth have not been shown here
Re^2: How to match last character of string, even if it happens to be a newline? by Allasso (Monk) on May 12, 2019 at 15:58 UTC
Not quite:

    $text_1 = "abc\nd";
    $text_1 =~ m/.*(.|\n)/;
    print("----------\n>" . $1 . "<\n");

Prints:

    ----------
    >
    <

Should print d
Re^3: How to match last character of string, even if it happens to be a newline? by AnomalousMonk (Archbishop) on May 12, 2019 at 17:58 UTC
> `$text_1 = "abc\nd";` `$text_1 =~ m/.*(.|\n)/;` ... Should print d

A narration of `m/.*(.|\n)/` might be:

`.*` From the start of the string, grab as much as possible of anything that's not a newline (no `/s` modifier for dot);
`(.|\n)` Then match and capture the first thing that's either not-a-newline or a newline.

Looked at this way, the only thing that could possibly be captured in the given string would be a newline. Indeed, if your regex has no operator introduced after Perl version 5.6, this kind of narration is what YAPE::Regex::Explain will give you:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use YAPE::Regex::Explain;
     print YAPE::Regex::Explain->new(qr/.*(.|\n)/)->explain();
    "
    The regular expression:

    (?-imsx:.*(.|\n))

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      .*                       any character except \n (0 or more times
                               (matching the most amount possible))
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        .                        any character except \n
    ----------------------------------------------------------------------
       |                        OR
    ----------------------------------------------------------------------
        \n                       '\n' (newline)
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

(There are newer and better regex parser/explainers around, but I like this one, limited as it is, for its explanatory style.)

Give a man a fish: `<%-{-{-{-<`
Re^3: How to match last character of string, even if it happens to be a newline? by LanX (Saint) on May 12, 2019 at 16:20 UTC
> Not quite: ... `"abc\nd"`

In this case there are two options with the /s modifier:

    DB<11> m/.*(.|\n)/s,print "<$1>" for "123","abc\n","abc\nd"
    <3><
    ><d>
    DB<12> m/.*(.)/s,print "<$1>" for "123","abc\n","abc\nd"
    <3><
    ><d>

HTH!

Cheers Rolf
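For comparison, here is a small sketch that runs the greedy pattern with and without `/s` over the three strings used in this sub-thread. The helper name `capture` is hypothetical; the patterns and strings are the ones quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group, or undef if the
# pattern does not match.
sub capture {
    my ($s, $re) = @_;
    return $s =~ $re ? $1 : undef;
}

my @patterns = (
    [ '.*(.|\n) without /s' => qr/.*(.|\n)/  ],
    [ '.*(.|\n) with /s'    => qr/.*(.|\n)/s ],
    [ '.*(.) with /s'       => qr/.*(.)/s    ],
);

for my $s ("123", "abc\n", "abc\nd") {
    for my $pattern (@patterns) {
        my ($label, $re) = @$pattern;
        my $got = capture($s, $re);
        # Make newlines visible in both the input and the capture.
        (my $shown_s   = $s)   =~ s/\n/\\n/g;
        (my $shown_got = $got) =~ s/\n/\\n/g;
        printf "%-20s on %-8s captured \"%s\"\n",
            $label, qq{"$shown_s"}, $shown_got;
    }
}
```

Without `/s`, the greedy `.*` stops at the first newline, so on `"abc\nd"` the capture is that newline rather than `d`. With `/s`, `.*` runs to the end of the string and backtracks by one character, so the capture is always the true last character.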