nneul has asked for the wisdom of the Perl Monks concerning the following question:

    my $code = "fruit:red:apple";
    my $parent = $code;
    $parent =~ s/:.*?$//o;
The result is "fruit" instead of "fruit:red". What am I missing here about how this regex is being handled?

Replies are listed 'Best First'.
Re: Why doesn't this s/regex// work?
by japhy (Canon) on Dec 19, 2005 at 18:55 UTC
    AGH!

    The problem is that, even though you're using non-greedy matching, Perl still works from left to right, returning the leftmost match that satisfies your regex. If you'd done s/:.*?// you would have been left with "fruitred:apple", but since you have s/:.*?$//, Perl needs to match "end of string". The match still starts at the first colon, so the .*? has to stretch all the way from after "fruit:" to the end of the string. Using non-greedy matching and the $ anchor won't get you the result you expected; you'll need to use a more specific regex like s/:[^:]*$//.

    Also, the /o modifier is utterly useless here.
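A small sketch of the three substitutions discussed above, run against the string from the question. The helper name `strip` is made up for this illustration and is not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: apply s/$regex// to a copy and return the result.
sub strip {
    my ($string, $regex) = @_;
    (my $copy = $string) =~ s/$regex//;
    return $copy;
}

my $code = "fruit:red:apple";

# Non-greedy with no anchor: only the first colon is removed.
print strip($code, qr/:.*?/), "\n";      # fruitred:apple

# Non-greedy with $: the match starts at the first colon and must reach the end.
print strip($code, qr/:.*?$/), "\n";     # fruit

# Negated character class: only the last colon can be followed by [^:]* and $.
print strip($code, qr/:[^:]*$/), "\n";   # fruit:red
```

The second case shows that the non-greedy quantifier changes how far a match extends, not where it begins.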


    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Why doesn't this s/regex// work?
by philcrow (Priest) on Dec 19, 2005 at 18:37 UTC
    Your regex says look for the first colon, then take anything and everything to the end of the string. Perhaps you need to use split or [^:]* instead of .*.

    Phil

Re: Why doesn't this s/regex// work?
by Fletch (Bishop) on Dec 19, 2005 at 18:37 UTC

    Even though you've told .* not to be greedy, you've still told the RE engine it needs to match at the end of the string so it happily gobbles up everything until then (one char at a time since it's being non-greedy). Perhaps you meant something more like s/:[^:]+$//?
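The `+` here and the `*` suggested elsewhere in the thread behave the same on the original string, but they differ when the last segment is empty. A rough comparison follows; the input with a trailing colon is an assumed edge case and the sub names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: the last segment must contain at least one character.
sub strip_plus {
    my ($s) = @_;
    $s =~ s/:[^:]+$//;
    return $s;
}

# Hypothetical helper: an empty last segment is also accepted.
sub strip_star {
    my ($s) = @_;
    $s =~ s/:[^:]*$//;
    return $s;
}

# Both give "fruit:red" for the string from the question.
print strip_plus("fruit:red:apple"), "\n";   # fruit:red
print strip_star("fruit:red:apple"), "\n";   # fruit:red

# With a trailing colon, + finds no match and leaves the string alone,
# while * removes the trailing colon.
print strip_plus("fruit:red:"), "\n";        # fruit:red:
print strip_star("fruit:red:"), "\n";        # fruit:red
```

Which one is right depends on whether a trailing colon should count as an empty child.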

Re: Why doesn't this s/regex// work?
by Roy Johnson (Monsignor) on Dec 19, 2005 at 19:04 UTC
    The other way to match "the last" something in a string is to skip as much as possible. To use it in a deleting expression, you have to substitute what you skipped back in:
    s/(.*):.*$/$1/;
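A short sketch of the greedy-skip approach applied to the string from the question. The second form uses `\K`, which is only available from Perl 5.10 onward (newer than this thread), so treat it as an optional variant rather than what the post proposes.

```perl
use strict;
use warnings;

my $code = "fruit:red:apple";

# Greedy (.*) grabs as much as it can, then backs off to the last colon;
# the captured prefix is substituted back in.
(my $parent = $code) =~ s/(.*):.*$/$1/;
print "$parent\n";    # fruit:red

# Variant, assuming Perl 5.10+: \K keeps everything matched so far
# out of the replaced text, so no capture is needed.
(my $parent_k = $code) =~ s/.*\K:.*$//;
print "$parent_k\n";  # fruit:red
```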

    Caution: Contents may have been coded under pressure.
Re: Why doesn't this s/regex// work?
by Fang (Pilgrim) on Dec 19, 2005 at 22:04 UTC

    A few Monks have already explained why your regex was not working as you expected, so let me add my two cents concerning other methods you can (and probably should) use with simple strings like the one you have there.

    # With split
    my $code = q(fruit:red:apple);
    my @parts = split /:/, $code;
    pop @parts;
    my $parent = join ':', @parts;

    # With substr and rindex (my favourite, and probably the fastest)
    my $code = q(fruit:red:apple);
    my $parent = substr($code, 0, rindex($code, ':'));
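One thing to watch with the substr/rindex version: `rindex` returns -1 when the string has no colon, and `substr($code, 0, -1)` then drops the last character. A guarded sketch follows; the name `parent_of` and the choice to return the string unchanged when there is no colon are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: everything before the last colon,
# or the whole string when there is no colon at all.
sub parent_of {
    my ($code) = @_;
    my $last_colon = rindex($code, ':');
    return $code if $last_colon < 0;     # no colon: nothing to strip
    return substr($code, 0, $last_colon);
}

print parent_of("fruit:red:apple"), "\n";   # fruit:red
print parent_of("fruit"), "\n";             # fruit

# Without the guard, a negative length chops the final character.
print substr("fruit", 0, rindex("fruit", ':')), "\n";   # frui
```

Returning the input unchanged mirrors what s/:[^:]*$// does when nothing matches.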
Re: Why doesn't this s/regex// work?
by tphyahoo (Vicar) on Dec 19, 2005 at 18:49 UTC
    I've never used the o option and am pretty clueless about what it does, but I don't think you want it here. Sounds like you want to delete everything from the last colon to the end of the line. s/:[^:]*$// should work, i.e., colon, any number of "not a colon", out to the end of the line.