conditional match in regex

John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Kanji suggested using enclosing angles for the pseudo-sigil to represent a file handle. I like that idea, since it's clear and obvious.

But it's less than elegant in the code. Everything has a prefix char, and here's one that has a postfix as well.

Consider how to simplify this regex:

/(^([$@*%]))|(^<.*>$)/

That is, what I really want is

/^([$@*%<])(.*)(>?)$/

with the stipulation that the $3 is present only if $1 was <, and absent otherwise.

I can certainly do it in a line or 3, especially with lots of comments for clarity. But I'm wondering if there is a really cool way to do it in one succinct bite (wishing for grammars like Perl6 here...).

On a more meditative note, I see that a more powerful pattern engine can make things clearer simply because you can, in analogy with English, sum up your selection with a simple statement of intent, rather than groping for an adjective but having to speak at length about this and that special case.

I see this in documentation, too. If the rule is simple it not only makes the code simple, but makes the documentation easy too.

—John


Re: conditional match in regex by tachyon (Chancellor) on Nov 04, 2002 at 21:45 UTC
`/^([$@*%])(.*)$/ or /^(<)(.*)(>)$/`
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
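A minimal sketch of this two-pattern approach, wrapped in a hypothetical helper called `parse_name` (the name is not from the thread). As assumptions of the sketch, `$` and `@` are escaped inside the character class so Perl cannot read them as a variable, `*` is included in the class as in the question, and the closing `>` is left uncaptured.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (sigil, name), or an empty list on failure.
sub parse_name {
    my ($text) = @_;

    # Plain sigil case: $foo, @foo, *foo, %foo
    return ($1, $2) if $text =~ /^([\$\@*%])(.*)$/;

    # File-handle case: <foo>, where the closing > is mandatory
    return ($1, $2) if $text =~ /^(<)(.*)>$/;

    return;
}

for my $text ('$foo', '<STDIN>', '<oops', 'bare') {
    my ($sigil, $name) = parse_name($text);
    if (defined $sigil) {
        print "$text: sigil=$sigil name=$name\n";
    } else {
        print "$text: no match\n";
    }
}
```

Trying the patterns in sequence keeps each one simple, at the cost of repeating the name part in both.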
Re: conditional match in regex by jryan (Vicar) on Nov 04, 2002 at 23:00 UTC
Well, here's how I would do it. This regex populates $1 with the sigil and $2 with the name if it is a variable like "$foo"; in the "<foo>" case, $1 will be undef and $2 will contain the name.

    /
      ^                       # start
      (?:([\$\@\*\%])|<)      # leading sigil or <
      (\w+)                   # name
      (?(?{$1}) |> )          # if there was a leading sigil, match nothing;
                              # otherwise, match >
      $                       # end
    /x;

Another way might be to use this one:

    /
      ^                       # start
      ([\$\@\*\%<])           # sigil
      (\w+)                   # name
      (?(?{$1 eq '<'}) >| )   # if the sigil is a '<', match the closing >;
                              # otherwise match nothing
      $                       # end
    /x;

This is different in that in the "<foo>" case, $1 will be '<'. In either case, there's no real reason to capture $3, since it can only ever be one thing (>), and $1 already tells you whether it was present.
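A small runnable sketch of the second variant, compiled with `qr//` and tried against a few inputs. The test strings and the variable name `$re` are assumptions for illustration; the pattern follows the reply, with `*` in the sigil class as in the original question.

```perl
use strict;
use warnings;

# Group 1 is the sigil (or < for a file handle), group 2 is the name.
my $re = qr/
    ^
    ([\$\@*%<])                  # sigil, or < for a file handle
    (\w+)                        # name
    (?(?{ $1 eq '<' }) > | )     # closing > required only after <
    $
/x;

for my $text ('$foo', '<foo>', '<foo', '$foo>') {
    if (my ($sigil, $name) = $text =~ $re) {
        print "$text: sigil=$sigil name=$name\n";
    } else {
        print "$text: no match\n";
    }
}
```

With this pattern an unclosed `<foo` and a stray `$foo>` should both be rejected, which is the stipulation from the question.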
Re: Re: conditional match in regex by petral (Curate) on Nov 05, 2002 at 01:23 UTC
This also seems to work (at least in perl 5.6.1):

    /
      ^
      ( [\$%@*] | (<) )
      ( .* )
      (?(2) > )     # (2) refers to capture group 2, the (<) above
      $
    /x;

p
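A sketch of the group-number conditional in runnable form. As assumptions of the sketch, the name is matched with `\w+` (borrowed from jryan's reply) instead of a greedy dot, and the test strings are invented for illustration.

```perl
use strict;
use warnings;

my $re = qr/
    ^
    ( [\$\@*%] | (<) )   # group 1: sigil or <; group 2 is set only for <
    ( \w+ )              # group 3: name
    (?(2) > )            # if group 2 took part in the match, require >
    $
/x;

for my $text ('$foo', '<foo>', '<foo') {
    if ($text =~ $re) {
        my $kind = defined $2 ? 'file handle' : 'variable';
        print "$text: $kind named $3\n";
    } else {
        print "$text: no match\n";
    }
}
```

No embedded code is needed here, since the condition only asks whether a group matched; the price is the extra capture group.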
Re: Re: Re: conditional match in regex by jryan (Vicar) on Nov 05, 2002 at 21:47 UTC
Yes, that will work, but yours has the problem that it creates $1, $2, and $3. I wanted to limit the regular expression so that $1 = type, and $2 = name.
Re: Re: Re: Re: conditional match in regex by petral (Curate) on Nov 05, 2002 at 23:08 UTC
Re: Re: Re: Re: Re: conditional match in regex by jryan (Vicar) on Nov 06, 2002 at 08:09 UTC
Some notes below your chosen depth have not been shown here
Re: Re: conditional match in regex by John M. Dlugosz (Monsignor) on Nov 05, 2002 at 16:31 UTC
Thanks, that example of using code in a regex is exactly what I was wondering about. The perlre page states that `(?{ code })` is always successful, but also says that it may be used in a conditional match. So I'm guessing that if used alone, the code has side-effects only and always succeeds. But if used as the condition of a `(?(condition)yes-pattern|no-pattern)`, then it does indeed use the result as the condition.
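A tiny sketch that checks this reading, using a code block that always returns 0. The variable names `$alone` and `$as_cond` and the test string are made up for the illustration.

```perl
use strict;
use warnings;

# Standalone: the block returns 0, yet the assertion itself still succeeds,
# so the overall match depends only on the rest of the pattern.
my $alone = 'abc' =~ /^abc(?{ 0 })$/ ? 1 : 0;

# As a condition: the block's value picks the branch. A false value selects
# the no-pattern, here a literal x, which the string does not contain.
my $as_cond = 'abc' =~ /^abc(?(?{ 0 })|x)$/ ? 1 : 0;

print "standalone: $alone, as condition: $as_cond\n";
```

The first match should succeed and the second should fail, which is consistent with the guess above.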
Re: Re: Re: conditional match in regex by jryan (Vicar) on Nov 05, 2002 at 19:56 UTC
Yep, you got it.