Re: process multiline text and print in desired format

It is very encouraging to see the overwhelming response to this question. I will go through all the responses, try writing the code, and come back if I have any questions.

BTW, I have seen an example on the web where a user removes all leading and trailing whitespace right after reading each line from the input data file. I tend to believe that, with this approach, one can avoid repeating "^\s*" in every such match. Is that a good approach in general, or does anyone see any hidden problems?

i.e., after reading a line from the filehandle, something like this:

s/^\s+|\s+$//g

to avoid the "^\s*" in matches such as:

m/^\s*color
m/^\s*overseas
m/^\s*shipping
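A minimal sketch of the trim-once idea, assuming the records arrive one per line. Only the substitution `s/^\s+|\s+$//g` and the three keywords come from the post; the helper names `trim` and `classify` and the sample data in `@raw` are hypothetical, made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing whitespace from a copy.
sub trim {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;    # \s+$ also takes the trailing newline
    return $line;
}

# Hypothetical helper: once trimmed, the patterns need no leading \s*.
sub classify {
    my ($line) = @_;
    return 'color'    if $line =~ /^color/;
    return 'overseas' if $line =~ /^overseas/;
    return 'shipping' if $line =~ /^shipping/;
    return 'other';
}

# Made-up sample input; real code would read lines from the data file.
my @raw = ("  color: red  \n", "\toverseas: yes\n", "shipping: free \n");

for my $raw (@raw) {
    my $line = trim($raw);
    printf "%-8s |%s|\n", classify($line), $line;
}
```

One thing to check before adopting this: trimming on input is only safe when leading whitespace carries no meaning in the data (for example, indentation that marks nesting would be lost).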


Replies are listed 'Best First'.

Re^2: process multiline text and print in desired format
by kcott (Archbishop) on Mar 18, 2021 at 02:16 UTC

"... removes all the leading and trailing space ... avoid the "^\s*" repetition ..."

Your posted data had no trailing spaces. In my example code, I did remove all leading spaces for this very reason: "... map /^\s*(.+)$/ ...". Subsequent regexes looked like "/^color ..."; not "/^\s*color ...".

Removal of potential trailing spaces may well be a good idea. I don't know the source of your input, but trailing spaces are often impossible to spot by inspection; e.g.

$ cat > fred
abc
def 

$ cat fred
abc
def 

$ cat -vet fred
abc$
def $

If you want to do this, you can modify my regex; however, be aware of a subtle gotcha. You cannot simply tag another \s* on the end; you'll also need to change .+ to .+?. Compare these examples:

$ perl -E 'my $x = "  xyz  "; say "|$_|" for map /^\s*(.+)$/, $x'
|xyz  |

$ perl -E 'my $x = "  xyz  "; say "|$_|" for map /^\s*(.+)\s*$/, $x'
|xyz  |

$ perl -E 'my $x = "  xyz  "; say "|$_|" for map /^\s*(.+?)\s*$/, $x'
|xyz|
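The same comparison can be run over a list of lines. This is a minimal sketch, assuming made-up input in `@lines`; the variable names are hypothetical. With greedy `.+`, the capture swallows the trailing whitespace first and the final `\s*` is left matching nothing; with non-greedy `.+?`, the capture stops as early as it can and the final `\s*` consumes the trailing whitespace.

```perl
use strict;
use warnings;

# Made-up sample lines with a mix of leading and trailing whitespace.
my @lines = ("  color: red  ", "overseas: yes", "\tshipping: free\t");

# Greedy capture: trailing whitespace stays inside $1.
my @greedy = map { /^\s*(.+)\s*$/ ? $1 : () } @lines;

# Non-greedy capture: trailing whitespace is left for the final \s*.
my @lazy = map { /^\s*(.+?)\s*$/ ? $1 : () } @lines;

print "greedy: |$_|\n" for @greedy;
print "lazy:   |$_|\n" for @lazy;
```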

— Ken
