jockel has asked for the wisdom of the Perl Monks concerning the following question:

Hello all

I've been trying to figure this one out for a while now.
I have a file I want to import; each row looks something like this:
column1    |column two       |might be column3
I want to delete the spaces between the actual values of the columns and the pipes '|'. See this example:
while ($string =~ /(\s+\|)/) { $string =~ s/$1/\|/; }
it loops endlessly..... =(

On the other hand, if I replace the '|' with ';' and change the code a little, it works?!
while ($string =~ /(\s+\;)/) { $string =~ s/$1/\;/; }
Have I missed some basic regexp rule?

Re: regexp, substitution and the "pipe symbol"
by Abigail-II (Bishop) on Jan 29, 2004 at 16:14 UTC
    while ($string =~ /(\s+\|)/) { $string =~ s/$1/\|/; }
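    There's no point in first doing a match and then a substitution. If it doesn't match, the substitution won't happen, and if it matches, the substitution happens as well. Furthermore, because your match includes a |, that pipe symbol will be in $1, and it will be unescaped. But you interpolate $1 into the pattern of your substitution, so there the pipe symbol is treated as alternation. The right-hand side of that alternation is empty, so the pattern can match the empty string at the start of $string; a | gets inserted there and the whitespace in front of the real pipe stays put, which is why the loop never ends.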

    What you want is:

    $string =~ s/\s+\|/|/g;
    No while necessary.

    Abigail
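
To see the diagnosis above in action, here is a minimal sketch. The three-pass cap and the names $original, $looping, $fixed and $captured are assumptions added for illustration, not part of the original post; the cap exists only so the demonstration terminates. The second loop suggests that quoting the capture with \Q...\E would also have made the original approach work, although the single s///g remains the simpler answer.

```perl
use strict;
use warnings;

my $original = 'column1    |column two       |might be column3';

# The original loop, capped at 3 passes (an assumption for this demo).
my $looping = $original;
my $passes  = 0;
while ($passes < 3 and $looping =~ /(\s+\|)/) {
    my $captured = $1;            # whitespace plus a raw, unescaped "|"
    $looping =~ s/$captured/|/;   # pattern now reads: those spaces OR nothing
    $passes++;
}
print "$looping\n";   # |||column1    |column two       |might be column3

# Same loop with the capture quoted, so "|" is matched literally.
my $fixed = $original;
while ($fixed =~ /(\s+\|)/) {
    my $captured = $1;
    $fixed =~ s/\Q$captured\E/|/;
}
print "$fixed\n";     # column1|column two|might be column3
```

The ';' version works because a semicolon has no special meaning inside a pattern, so the interpolated capture simply matches itself.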

       that pipe symbol will be in $1, and it will be unescaped

      Of course!! Why didn't I think of that =)

      Thanx everyone for your help!
Re: regexp, substitution and the "pipe symbol"
by Limbic~Region (Chancellor) on Jan 29, 2004 at 16:16 UTC
    jockel,
    Sometimes it is easier to break the problem up into smaller pieces, and that's why it is so nice that there is always more than one way to do it.
    #!/usr/bin/perl -w
    use strict;

    my $string = 'column1 |column two |might be column3';
    my @columns = split /\|/ , $string , -1;
    $_ =~ s/\s+$// for @columns;
    $string = join '|' , @columns;
    Cheers - L~R
    Updated: Added -1 as 3rd argument to split to account for possible trailing |||| suggested by Abigail
      Well, if you want to do it the long way around, do it correctly. Use a third argument to split, or lose your trailing vertical bars.

      Abigail
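
A small sketch of what the third argument to split changes. The sample row and the variable names ($row, @default, @keep) are made up for illustration. Without a limit, split drops trailing empty fields; a negative limit keeps them, so the row can be rebuilt with its trailing vertical bars intact.

```perl
use strict;
use warnings;

my $row = 'a |b ||';                     # hypothetical row ending in empty columns

my @default = split /\|/, $row;          # trailing empty fields are dropped
my @keep    = split /\|/, $row, -1;      # negative limit keeps them

print scalar(@default), "\n";            # 2
print scalar(@keep), "\n";               # 4

print join('|', @default), "\n";         # a |b   (trailing bars lost)
print join('|', @keep), "\n";            # a |b ||
```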

Re: regexp, substitution and the "pipe symbol"
by blue_cowdawg (Monsignor) on Jan 29, 2004 at 16:20 UTC

    jockel: Peruse the following:

    while (my $line=<DATA>) {
        chomp $line;
        $line =~ s/\s+\|/\|/g;  # note the "g" on the end.
        printf "\"%s\"\n",$line;
    }
    __END__
    B-52 |P-51 | P-61
    P-48 |B-1B | JU-88
    B-29 |ME-109

    When run it yields:

    "B-52|P-51| P-61"
    "P-48|B-1B| JU-88"
    "B-29|ME-109"

    The secret is the "g" at the end of the substitution, which says "do this for every match".

    Hope this helps.
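
For comparison, a minimal sketch of the same substitution with and without the "g" modifier. The sample line and the names $line, $once and $all are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $line = 'B-52   |P-51   | P-61';       # hypothetical padded row

(my $once = $line) =~ s/\s+\|/|/;         # no "g": only the first match is replaced
(my $all  = $line) =~ s/\s+\|/|/g;        # "g": every match is replaced

print "$once\n";                          # B-52|P-51   | P-61
print "$all\n";                           # B-52|P-51| P-61
```

Note that the pattern only removes whitespace in front of a pipe, so the space after the second pipe is left alone in both results.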


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.