Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Sorry for posting such a simple problem but I've read through the Camel section on pattern matching, tried several things yet I can't seem to figure this out.

I have pipe-delimited text data and I want to ensure that there is at least one space between each pair of adjacent pipes (I don't want to add any spaces to fields containing data). E.g., what I want is:

| | | | | | | | |red| | |blue| | |green| |
My current code looks like this:
use strict;
while (<DATA>) {
    s/\|\|/\| \|/g;
    print;
}
__DATA__
||||| ||||red|| |blue|||green||
and produces the following output:
| || || | || |red| | |blue| ||green| |
I realize that I can just execute the regex twice to get the desired results but that seems ugly. There must be something simple that I'm overlooking or don't understand. Any hints or pointers welcome. Thanks.

Replies are listed 'Best First'.
Re: Regex Backtracking
by Joost (Canon) on Jun 12, 2002 at 16:31 UTC
    I think you want look-ahead matching here:
    s/\|(?=\|)/\| /g;
    See also perldoc perlre though I'll admit it's not the easiest of topics :-)
    -- Joost downtime n. The period during which a system is error-free and immune from user input.
      Thanks! That did the trick. BTW, I did try various combinations of ?=, ?: and ?! but their usage wasn't very detailed in the Camel (at least not enough for regex illiterates like me), so my attempts weren't even close. I'll check out perldoc perlre.
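For anyone reading along, here is a minimal sketch of the lookahead substitution applied to the sample line from the question. The helper name pad_pipes is made up for illustration; only the substitution itself comes from the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper; wraps the lookahead substitution from the reply.
sub pad_pipes {
    my ($line) = @_;
    # Match a pipe only when another pipe follows it. The lookahead
    # (?=\|) inspects the next character without consuming it, so that
    # second pipe is still available to start the next match.
    $line =~ s/\|(?=\|)/| /g;
    return $line;
}

print pad_pipes('||||| ||||red|| |blue|||green||'), "\n";
# prints: | | | | | | | | |red| | |blue| | |green| |
```

Because only one character is consumed per match, a single pass with /g is enough, and pipes that already have something between them are left alone.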
Re: Regex Backtracking
by frankus (Priest) on Jun 12, 2002 at 17:00 UTC
    What is happening is that, as it stands, the regular expression consumes two characters per match, so the second pipe of each pair cannot also serve as the first pipe of the next pair. The frame for each match should be one character. A sloppy fix would be:
    while (<DATA>) {
       1 while s/\|\|/\| \|/g; # repeats as many times as is necessary
       print;  
    }
    
    As Joost rightly says, the lookahead assertion is probably the best way to go.
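The difference between one pass and the repeated substitution is easiest to see on a string of three pipes. This is only a sketch; the sub names single_pass and repeated are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
sub single_pass {
    my ($s) = @_;
    $s =~ s/\|\|/| |/g;            # each match consumes both pipes
    return $s;
}

sub repeated {
    my ($s) = @_;
    1 while $s =~ s/\|\|/| |/g;    # rerun until no "||" remains
    return $s;
}

print single_pass('|||'), "\n";    # prints: | ||
print repeated('|||'), "\n";       # prints: | | |
```

In the single pass, the first two pipes are matched and replaced, and the search resumes after them, where only one pipe is left. The loop works because the second pass sees the new string from the beginning.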
    To make regexes easier to follow, try using the re pragma's debug mode:

    use re 'debugcolor';

    It gives some insight into how the regexes do what they do ;)

    I'd love to find a solution using the 'c' modifier .. but then I'm not Abigail

    --

    Brother Frankus.

    ¤

Re: Regex Backtracking
by codine (Initiate) on Jun 12, 2002 at 16:58 UTC
    I'm not a guru or anything close to super knowledgeable when it comes to Perl. But wouldn't changing s/\|\|/\| \|/g; to s/\|/\| /g; achieve the desired results? Well, sort of, except for the extra space in front of blue, red, and green.
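A quick sketch of what that simpler substitution does, on a shortened input. The sub name pad_every_pipe is hypothetical, and the brackets in the output are there only to make the trailing space visible.

```perl
use strict;
use warnings;

# Hypothetical helper; adds a space after every pipe, with no check
# on what follows it.
sub pad_every_pipe {
    my ($s) = @_;
    $s =~ s/\|/| /g;
    return $s;
}

print '[', pad_every_pipe('||red|'), "]\n";
# prints: [| | red| ]
```

So empty fields do get their space, but data fields gain a leading space and the line gains a trailing one, which is what the lookahead version avoids.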