cghost23 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am fairly new to Perl and am having difficulty completing a regular expression that, when applied to a line of Perl source, will extract all variables and store them in an array. The basic regex I have is as follows: "@variables=m/\$[a-zA-Z0-9_]+/g". However, this expression fails when it encounters an escaped dollar sign in code, such as 'print "i made \$1.00 today"' (the regex would add '$1' to @variables). I am not sure how negation works in regexes, and was wondering if someone could assist me in writing a regular expression that does what mine does but DOESN'T place the variable name in the array if it starts with \$. Thank you in advance, cghost23

Re: regex query
by tadman (Prior) on Apr 19, 2001 at 07:45 UTC
    First, I've got to say that I didn't know that a global pattern match in list context returns a list of all the matches by default (i.e. without any capturing parentheses). I am only beginning to imagine what that could do in conjunction with map.

    So, all you really need is a better regexp:

    @stuff = m#(?:[^\\]|^)(\$(?:[a-zA-Z_][a-zA-Z0-9_]*|[0-9]))#g;

    This ensures that the variable is either preceded by a character other than a backslash, or sits at the beginning of the line ('^'). The 'questionable' brackets '(?: ...)' are just groupings that aren't captured, in case you aren't familiar with that construct.
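    To see how this pattern behaves, here is a minimal sketch run against a line like the one in the question. It assumes the character classes are written as [a-zA-Z_] and [a-zA-Z0-9_]; the sample lines and the names $line, $total and $count are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line; single quotes keep the backslash and dollar signs literal.
my $line = 'print "i made \$1.00 today, $total so far"; $count++;';

# Group 1 captures only the variable, so the character before it stays out of the result.
my @stuff = $line =~ m#(?:[^\\]|^)(\$(?:[a-zA-Z_][a-zA-Z0-9_]*|[0-9]))#g;
print join(', ', @stuff), "\n";    # the escaped \$1 is skipped

# Caveat: [^\\] consumes a character, so a variable that directly
# follows another one has nothing left in front of it to match.
my @adjacent = '$a$b' =~ m#(?:[^\\]|^)(\$(?:[a-zA-Z_][a-zA-Z0-9_]*|[0-9]))#g;
print join(', ', @adjacent), "\n"; # only the first variable is found
```

    The first list comes out as $total and $count. The second holds only $a, which is one reason a zero-width assertion can be a better fit than consuming the preceding character.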

    Still, you're open to scenarios where things that look like variables, but aren't, get picked up for whatever reason. This could include something you have in comments, single-quoted or qX-quoted strings, or weird stuff getting pulled out of regular expressions.

    So, the regexp above is intentionally a bit tame, and it won't pick up on some common things, such as:
    \$x;     # A reference to $x, but ignored by regexp
    ${x};    # Same as $x, but ignored
    $x[10];  # Shows up as $x, but is really @x
    So, the utility of this match is limited, but hopefully it will suit your needs.
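    A rough check of those three cases, again assuming the classes [a-zA-Z_] and [a-zA-Z0-9_] in the pattern. The source lines and the names $re, @lines and %found are hypothetical, chosen only for the demonstration.

```perl
use strict;
use warnings;

# The pattern from this reply, precompiled; group 1 holds the variable.
my $re = qr/(?:[^\\]|^)(\$(?:[a-zA-Z_][a-zA-Z0-9_]*|[0-9]))/;

# Hypothetical source lines; single quotes keep them literal.
my @lines = ('my $ref = \$x;', 'print ${x};', 'print $x[10];');

my %found;
for my $line (@lines) {
    # In list context, m//g with one group returns what the group captured.
    $found{$line} = [ $line =~ /$re/g ];
    my $shown = @{ $found{$line} } ? join(', ', @{ $found{$line} }) : '(nothing)';
    print "$line => $shown\n";
}
```

    The first line yields only $ref (the \$x reference is skipped), the second yields nothing, and the third reports $x even though the code is indexing @x.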
      Actually, in list context an m//g match without capturing parentheses returns the whole of each match, like $&. So you don't need any extra parentheses.
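      A small sketch of that list-context behaviour, with and without capturing groups. The string and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $text = 'a1 b2 c3';

my @whole  = $text =~ /[a-z][0-9]/g;      # no groups: each whole match
my @digits = $text =~ /[a-z]([0-9])/g;    # one group: only what it captured
my @pairs  = $text =~ /([a-z])([0-9])/g;  # two groups: captures, flattened

print "@whole\n";   # a1 b2 c3
print "@digits\n";  # 1 2 3
print "@pairs\n";   # a 1 b 2 c 3
```

      So parentheses are only needed when part of the match has to be kept out of the returned list.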
        In this case, though, I was trying to make certain that the non-backslash character preceding the variable did not get included in the match result.

        As you pointed out below, though, the look-behind assertion was what I had in mind. For some reason I was led to believe that look-behinds didn't work properly in some cases, but those cases seem to be pretty specific, or related to some earlier version of perl. merlyn had provided a demonstration on the perils of negative look-behind, but I can't find it.
Re: regex query
by MrNobo1024 (Hermit) on Apr 19, 2001 at 08:01 UTC
    Sounds like you need a negative look-behind assertion.

    @variables = m/(?<!\\)\$[a-zA-Z0-9_]+/g;
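
    A minimal sketch of the look-behind version at work. The sample lines and the names $line, $tricky and @missed are assumptions made for the demonstration, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical sample line; single quotes keep the backslash and dollar signs literal.
my $line = 'print "i made \$1.00 today, $total so far"; $a$b';

# (?<!\\) is zero-width: it only checks that the previous character
# is not a backslash, so nothing in front of the variable is consumed.
my @variables = $line =~ m/(?<!\\)\$[a-zA-Z0-9_]+/g;
print join(', ', @variables), "\n";

# Caveat: an escaped backslash right before a real variable looks
# the same to the look-behind, so that variable is skipped.
my $tricky = 'print "C:\\\\$dir";';    # the text is: print "C:\\$dir";
my @missed = $tricky =~ m/(?<!\\)\$[a-zA-Z0-9_]+/g;
print scalar(@missed), " found in the tricky line\n";
```

    Here the escaped \$1 is left out while $total, $a and $b are all found, including the adjacent pair. The second line shows that counting backslashes properly would need more than a one-character look-behind.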