in reply to Deleting intermediate whitespaces, but leaving one behind each word

A regexp search/replace is a better match for this kind of job, but I would also like to show how to do it via split/join. That can come in handy if you want to skip certain fields or need an array representation later on.
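For comparison, the search/replace route could look roughly like the sketch below. The variable names and the exact number of spaces in the sample string are my own assumptions, not taken from the original question.

```perl
use strict;
use warnings;

# Assumed sample: the run lengths (10 and 2 spaces) are only illustrative.
my $cpu = 'Intel(R) Xeon(R) CPU' . (' ' x 10) . 'X5660' . (' ' x 2) . '2.80GHz ';

# Copy the string, then fold every run of spaces into a single space.
(my $folded = $cpu) =~ s/ +/ /g;

# Brackets make the surviving trailing space visible.
print "[$folded]\n";    # prints "[Intel(R) Xeon(R) CPU X5660 2.80GHz ]"
```

Note that the single trailing space is kept; it would need a separate s/ +$// if it is unwanted.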

The split pattern

I see you're using the pattern 'a single space' / /. Since you want to fold whitespace anyway, a better choice is 'one or more spaces' / +/ or 'any run of whitespace' /\s+/. The effect is this:

my $CPU = 'Intel(R) Xeon(R) CPU          X5660  2.80GHz ';
# field:   0        1       2   3..11    12  13 14          # using / /
# field:   0        1       2            3      4           # using / +/
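If you want to check the field numbering yourself, something like the following should do. It is only a sketch: the sample string is built with 10 and 2 spaces, which is an assumption chosen to match the field numbers above.

```perl
use strict;
use warnings;

# Assumed sample: 10 spaces after 'CPU' and 2 after 'X5660'.
my $cpu = 'Intel(R) Xeon(R) CPU' . (' ' x 10) . 'X5660' . (' ' x 2) . '2.80GHz ';

my @single = split / /,   $cpu;   # every single space is a separator
my @multi  = split / +/,  $cpu;   # a run of spaces is one separator
my @any    = split /\s+/, $cpu;   # a run of any whitespace is one separator

# split drops trailing empty fields, so the final space adds nothing.
printf "%-6s %2d fields\n", '/ /',    scalar @single;   # 15 fields (0..14)
printf "%-6s %2d fields\n", '/ +/',   scalar @multi;    #  5 fields (0..4)
printf "%-6s %2d fields\n", '/\s+/',  scalar @any;      #  5 fields (0..4)
```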

Usage of join

There are multiple ways to feed the generated array into join. In order to get the expected result you can either feed each field separately, use an array slice or just feed the whole array:

my @CPU_SPLIT = split / +/, 'Intel(R) Xeon(R) CPU          X5660  2.80GHz ';

# 1 - feed each field explicitly
print join ' ', $CPU_SPLIT[0], $CPU_SPLIT[1], $CPU_SPLIT[2], $CPU_SPLIT[3], $CPU_SPLIT[4];
print "\n";

# 2a - use an array slice - explicitly
print join ' ', @CPU_SPLIT[0,1,2,3,4];
print "\n";

# 2b - use an array slice - via a range
print join ' ', @CPU_SPLIT[0..4];
print "\n";

# 3 - feed the whole array
print join ' ', @CPU_SPLIT;
print "\n";
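The slice form is also where skipping fields becomes easy. As a small sketch (the choice of which field to drop is purely hypothetical, and the spacing of the sample string is assumed):

```perl
use strict;
use warnings;

# Assumed sample: run lengths of spaces are illustrative.
my $cpu = 'Intel(R) Xeon(R) CPU' . (' ' x 10) . 'X5660' . (' ' x 2) . '2.80GHz ';

my @fields = split / +/, $cpu;

# Leave out field 2 ('CPU') by listing only the wanted indices.
my $without_cpu = join ' ', @fields[0, 1, 3, 4];

print "$without_cpu\n";    # prints "Intel(R) Xeon(R) X5660 2.80GHz"
```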

Re^2: Deleting intermediate whitespaces, but leaving one behind each word
by AnomalousMonk (Archbishop) on Dec 05, 2017 at 13:30 UTC

    Note that split / +/, $string doesn't handle leading whitespace gracefully; it returns an empty leading field, which join then turns back into a leading space:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $CPU = ' Intel(R) Xeon(R) CPU X5660 2.80GHz ';
     dd $CPU;
     ;;
     my @CPU_SPLIT = split / +/, $CPU;
     dd \@CPU_SPLIT;
     ;;
     my $t = join ' ', @CPU_SPLIT;
     dd $t;
     "
    " Intel(R) Xeon(R) CPU X5660 2.80GHz "
    ["", "Intel(R)", "Xeon(R)", "CPU", "X5660", "2.80GHz"]
    " Intel(R) Xeon(R) CPU X5660 2.80GHz"

    Here, the special-case ' ' split pattern is better (if you're going to split/join):

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $CPU = ' Intel(R) Xeon(R) CPU X5660 2.80GHz ';
     dd $CPU;
     ;;
     my @CPU_SPLIT = split ' ', $CPU;
     dd \@CPU_SPLIT;
     ;;
     my $t = join ' ', @CPU_SPLIT;
     dd $t;
     "
    " Intel(R) Xeon(R) CPU X5660 2.80GHz "
    ["Intel(R)", "Xeon(R)", "CPU", "X5660", "2.80GHz"]
    "Intel(R) Xeon(R) CPU X5660 2.80GHz"

    (Dealing with general whitespace was exactly why ' ' was special-cased.)
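The same difference shows up against /\s+/ once tabs are involved. A minimal sketch, using a made-up string with leading spaces and an embedded tab:

```perl
use strict;
use warnings;

# Hypothetical input: leading spaces, a tab, and trailing spaces.
my $cpu = "  Intel(R)\tXeon(R)   CPU  ";

my @by_regex = split /\s+/, $cpu;   # leading whitespace yields an empty first field
my @by_space = split ' ',   $cpu;   # special case: leading whitespace is skipped

printf "regex pattern:  %d fields, first is '%s'\n", scalar @by_regex, $by_regex[0];
printf "single space:   %d fields, first is '%s'\n", scalar @by_space, $by_space[0];

# Only the ' ' form round-trips to a clean, single-spaced string.
print join(' ', @by_space), "\n";   # prints "Intel(R) Xeon(R) CPU"
```

Both forms drop the trailing empty fields; only the leading one differs.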


    Give a man a fish:  <%-{-{-{-<