How to split, join and trim leading / trailing white space

thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to split, join and trim leading / leading white space by kcott (Archbishop) on Sep 06, 2017 at 04:47 UTC
G'day thanos1983,

"Is it possible to be done in one step?"

If you're using Perl `5.14`, or later, you can chain those operations using the '`r`' modifier. See "perl5140delta: Non-destructive substitution".

It's somewhat unclear what you're actually trying to achieve here. The use of Chinese characters seems superfluous to the actual question asked. The use of the '`g`' modifier on the substitution, together with the '`^`' and '`$`' assertions, makes me wonder if you're perhaps dealing with multiline strings; however, the absence of the '`m`' modifier suggests otherwise.

Here are some guesses as to the type of thing you might want:

    $ perl -Mutf8 -C -E 'say join " ", split //, "北亰"'
    北 亰
    $ perl -Mutf8 -C -E 'say join " ", split //, " 北亰 " =~ s/^\s+|\s+$//r'
    北 亰

    $ perl -E 'say join(" ", split /(..)/, "e58c97e4bab0")'
     e5  8c  97  e4  ba  b0
    $ perl -E 'say join(" ", split /(..)/, "e58c97e4bab0") =~ s/^\s+|\s+$//r'
    e5  8c  97  e4  ba  b0

If you're simply unfamiliar with what's going on with split, that's explained at the end of its documentation: "If the PATTERN contains capturing groups, ...".

    $ perl -E 'my @x = split /(..)/, "1234"; say "|$_|" for @x'
    ||
    |12|
    ||
    |34|
    $ perl -E 'my $x = join "_", split /(..)/, "1234"; say $x'
    _12__34

Update (additional information): As an additional example, to extend that last chaining example, you could do this to reduce multiple embedded spaces to a single space:

    $ perl -E 'say join(" ", split /(..)/, "e58c97e4bab0") =~ s/^\s+|\s+$//r =~ y/ / /rs'
    e5 8c 97 e4 ba b0

See "perlop: y/SEARCHLIST/REPLACEMENTLIST/cdsr" for more about that.

Update (further discussion): See my subsequent response (below) for further discussion and "some clarifications and corrections".

-- Ken
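As a script rather than a one-liner, the chaining idea might look like the minimal sketch below. The input string and variable names are hypothetical and not taken from the thread; it assumes Perl 5.14 or later for the '`r`' modifier on both `s///` and `y///`.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample input, not taken from the thread.
my $raw = "  alpha   beta  ";

# s///r returns the modified copy; $raw itself is left unchanged.
my $trimmed = $raw =~ s/^\s+|\s+$//gr;

# Each /r operation returns a string, so operations can be chained:
# trim both ends, then squeeze runs of spaces with y///s.
my $squeezed = $raw =~ s/^\s+|\s+$//gr =~ y/ / /rs;

say "[$raw]";
say "[$trimmed]";
say "[$squeezed]";
```

Because `=~` is left-associative, the `y///rs` in the chained form operates on the result of the substitution and not on `$raw`.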
Re^2: How to split, join and trim leading / leading white space by thanos1983 (Parson) on Sep 06, 2017 at 08:28 UTC
Hello kcott,

That is perfect, thanks a lot for your time and effort. :)

Seeking for Perl wisdom...on the process of learning...not there...yet!
Re^3: How to split, join and trim leading / leading white space by kcott (Archbishop) on Sep 07, 2017 at 05:38 UTC
"That is perfect, ..."

Well, not quite! :-)

I saw your meditation after I read, and responded to, your OP in this thread. I now see where the Chinese characters come from; although, I still think they're superfluous in the context of this specific question.

The focus of my answer was the '`r`' modifier (in response to your "possible ... in one step?"). I probably should have paid more attention to your regex (`/^\s+|\s+$/`), rather than just copying it verbatim. With the Chinese issue out of the way, and having spent some time looking more closely at what I wrote, here are some clarifications and corrections.

The substitution example with the Chinese characters should have included a '`g`' modifier. I'm now reasonably certain that wasn't what you wanted; however, it should have been written like this:

    $ perl -Mutf8 -C -E 'say join " ", split //, " 北亰 " =~ s/^\s+|\s+$//gr'
    北 亰

I was correct in not using the '`g`' modifier in the other two substitution examples; however, I should have also removed the alternation. As the two examples splitting `"1234"` clearly demonstrate, there's no trailing whitespace: you only need to remove the leading whitespace. For those examples, these would have been better:

    $ perl -E 'say join(" ", split /(..)/, "e58c97e4bab0") =~ s/^\s+//r'
    e5  8c  97  e4  ba  b0
    $ perl -E 'say join(" ", split /(..)/, "e58c97e4bab0") =~ s/^\s+//r =~ y/ / /rs'
    e5 8c 97 e4 ba b0

Now, hopefully, it's "perfect". :-)

-- Ken
Re^4: How to split, join and trim leading / leading white space by Anonymous Monk on Sep 07, 2017 at 05:48 UTC
Re^5: How to split, join and trim leading / leading white space by kcott (Archbishop) on Sep 07, 2017 at 06:15 UTC
Re: How to split, join and trim leading / leading white space by Your Mother (Archbishop) on Sep 06, 2017 at 00:41 UTC
Your posts on the topic make it so much harder and weirder than it should be:

    moo@cow~>perl -CSD -le '$_ = " \N{U+5317}\N{U+4EB0} "; print "<$_>"; s/\A\s+|\s+\z//g; print "<$_>"; print join " ", split //;'
    < 北亰 >
    <北亰>
    北 亰
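The trim-first approach can also be wrapped in a small helper. This is only a sketch: `spaced_chars` is a hypothetical name, and the contrast case is included to show what happens when the '`g`' modifier is left off a two-sided trim.

```perl
use strict;
use warnings;

# Hypothetical helper: trim both ends, then put a space between characters.
sub spaced_chars {
    my ($text) = @_;
    $text =~ s/\A\s+|\s+\z//g;   # /g lets both alternatives match
    return join ' ', split //, $text;
}

my $sample = " \x{5317}\x{4EB0} ";

# For contrast: without /g only the first match (the leading run) is removed.
(my $leading_only = $sample) =~ s/\A\s+|\s+\z//;

binmode STDOUT, ':encoding(UTF-8)';
print '<', spaced_chars($sample), ">\n";
print '<', $leading_only, ">\n";
```

The second line printed still carries the trailing space, because a substitution without '`g`' stops after its first match.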
Re^2: How to split, join and trim leading / leading white space by haukex (Archbishop) on Sep 06, 2017 at 08:37 UTC
On Perl v5.22 and higher, splitting on `\b{gcb}` (extended grapheme cluster boundary) might be better:

    $ perl -CSD -le 'print map "-$_- ", split //, "u\x{0308}ber"'
    -u- -̈- -b- -e- -r-
    $ perl -CSD -le 'print map "-$_- ", split /\b{gcb}/, "u\x{0308}ber"'
    -ü- -b- -e- -r-
    $ perl -CSD -le 'print map "-$_- ", split //, "k\x{0301}u\x{032D}o\x{0304}\x{0301}n"'
    -k- -́- -u- -̭- -o- -̄- -́- -n-
    $ perl -CSD -le 'print map "-$_- ", split /\b{gcb}/, "k\x{0301}u\x{032D}o\x{0304}\x{0301}n"'
    -ḱ- -ṷ- -ṓ- -n-

(If the 2nd and 4th outputs above aren't displaying correctly, like in my browser, they should be "`-ü- -b- -e- -r-`" and "`-ḱ- -ṷ- -ṓ- -n-`".)

As an alternative in Perl v5.12 and above, `\X` can be used.

Update 2: E.g. `split /\X\K(?=\X)/, ...`

Update: Made last sentence more clear.
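Another way to use `\X`, sketched here as an assumption and not as part of the post above, is to collect the clusters with a global match in list context instead of splitting. The helper name `graphemes` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the extended grapheme clusters of a string.
# Must be called in list context; in scalar context a /g match returns
# only true or false.
sub graphemes {
    my ($text) = @_;
    return $text =~ /\X/g;
}

# "u" followed by U+0308 COMBINING DIAERESIS, then "ber".
my @clusters = graphemes("u\x{0308}ber");

binmode STDOUT, ':encoding(UTF-8)';
print scalar(@clusters), " clusters: ",
      join(' ', map { "-$_-" } @clusters), "\n";
```

Here the base letter and its combining mark stay together in the first element, so the list has four elements where a plain `split //` would give five.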
Re^2: How to split, join and trim leading / leading white space by thanos1983 (Parson) on Sep 06, 2017 at 08:12 UTC
Hello Your Mother,

Thanks a lot for the time and effort. I was under the impression that the trim should be done after the split. It looks like I was wrong :).

Thanks again, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: How to split, join and trim leading / leading white space by Anonymous Monk on Sep 06, 2017 at 01:25 UTC
Next you'll be asking why there are two spaces between bytes instead of one. Here, maybe this will help.

    use Data::Dump;
    my $banana = 'banana';
    my @a = split /(..)/, $banana;
    dd(\@a);
    my @b = $banana =~ /(..)/g;
    dd(\@b);
    __END__
    ["", "ba", "", "na", "", "na"]
    ["ba", "na", "na"]
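For the hex string used earlier in the thread, the same comparison can be made with core Perl only. This is a minimal sketch with hypothetical variable names, showing three ways to get the pairs without the empty fields; the `unpack` variant is an alternative nobody in the thread proposed.

```perl
use strict;
use warnings;
use feature 'say';

my $hex = 'e58c97e4bab0';   # the hex string used earlier in the thread

# 1. Keep split with a capturing group, then drop the empty fields.
my @via_split = grep { length } split /(..)/, $hex;

# 2. Match in list context, as in the post above.
my @via_match = $hex =~ /(..)/g;

# 3. unpack with a repeated two-character group.
my @via_unpack = unpack '(A2)*', $hex;

say join ' ', @$_ for \@via_split, \@via_match, \@via_unpack;
```

All three give single-spaced pairs for an even-length string, so no trimming or squeezing is needed afterwards. They differ on odd-length input: the match silently drops a final lone character, while the other two keep it.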