Finding Variable References In Code

tadman has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to itemize variables referenced in a piece of code that is stored in a scalar? For example, if I had this:

my $foo = "$bar{foo}";
$foo .= $baz->method(@foo);
$foo .= join(',', @$bar{'foo','bar'});

I'd like to get back a list containing something like:

qw[ $foo @foo %bar $bar $baz ];

I have a feeling this is either very hard, or already in a module somewhere, though I've found nothing helpful.


Re: Finding Variable References In Code by chromatic (Archbishop) on Dec 06, 2002 at 06:29 UTC
PadWalker is the closest, but I'd probably use the `B::` modules. If you turn the subref into a B::OP descendant, you can walk the execution path, looking for ops that access either a pad or a stash. It's not terribly difficult to figure out which variable they want from there. The tricky part is figuring out whether it's a global or a lexical variable (and which pad holds the lexical).
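As a rough illustration of the "which pad holds the lexical" part, here is a minimal sketch that uses the core B module to list the names stored in a sub's pad. It is not the op-tree walk described above: it only reports lexicals that the sub declares or closes over, and it says nothing about globals. The helper name `lexical_names` is hypothetical.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return the lexical variable names found in the
# pad of a code reference. Package (stash) variables are not covered.
sub lexical_names {
    my ($code) = @_;
    my $cv = B::svref_2object($code);

    # The first element of the padlist holds the names; the rest hold values.
    my ($names) = $cv->PADLIST->ARRAY;

    my (%seen, @found);
    for my $entry ($names->ARRAY) {
        next unless $entry->can('PVX');    # skip placeholder entries
        my $name = $entry->PVX;
        next unless defined $name && $name =~ /^[\$\@%]\w+$/;
        push @found, $name unless $seen{$name}++;
    }
    return @found;
}

my $outer = 'closed over';
my $code  = sub { my ($x, @y, %z); $x = $outer; return $x };
my @names = lexical_names($code);
print join(' ', sort @names), "\n";
```

A real op walk would still be needed to tell which of these names are actually used, and to pick up package variables accessed through the stash.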
Re: Finding Variable References In Code by graff (Chancellor) on Dec 06, 2002 at 06:28 UTC
I'm sorry, but I'm a bit confused by the example. Do you mean that all of this:

my $foo = "$bar{foo}";
$foo .= $baz->method(@foo);
$foo .= join(',', @$bar{'foo','bar'});

represents a single string that has been assigned to some scalar variable, e.g. it's bracketed by something like:

my $code_string = <<'EOS';
... # your example here
EOS

If so, it wouldn't be all that hard to cook up a suitable tokenizer-plus-token-classifier that breaks up the contents of $code_string and tucks the relevant pieces into an array, based on looking at the sigils and brackets that bound each word. It might take a few hours of careful study and testing to get it mostly right, and there may even be some foibles of perl syntax that could defy the best attempts (abundant obfus to beat your head against...), but that's no reason not to try for something that handles the vast majority of stuff. (Anyway, sorry but I don't know of anything off-hand that already does this.)

If, on the other hand, you mean to pull those various array elements out of the value being assigned to "$foo" in your example, then I just don't get it.

update: The key point to simplify the task would be deciding that you only care about variables whose names match `/(\S)([\%\@\\$])(\w+)(\S)/`, if that's your intention.

another update: That regex for spotting variable names would probably be more reliable after using B::Deparse on the target code, so that whitespace is normalized to a consistent style before you start.
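The sigil-and-bracket idea can be sketched roughly as follows. This is a minimal example, not a full tokenizer: it uses its own simplified pattern rather than the regex quoted above, the helper name `find_variables` is hypothetical, and it knowingly ignores things like `${name}`, `$#array`, package-qualified names, and sigils that appear inside comments or single-quoted strings.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a string of Perl source and guess which
# variables it mentions, judging by the sigil and the bracket that follows.
sub find_variables {
    my ($code) = @_;
    my (%seen, @found);
    while ($code =~ /([\$\@%])(\$*)([A-Za-z_]\w*)([\[\{])?/g) {
        my ($sigil, $deref, $name, $bracket) = ($1, $2, $3, $4);
        my $var;
        if (length $deref) {
            # @$bar{...} or $$bar[...]: the underlying variable is a scalar ref
            $var = '$' . $name;
        }
        elsif (defined $bracket) {
            # $foo[0] and @foo[1,2] use @foo; $foo{a} and @foo{'a','b'} use %foo
            $var = ($bracket eq '[' ? '@' : '%') . $name;
        }
        else {
            $var = $sigil . $name;
        }
        push @found, $var unless $seen{$var}++;
    }
    return @found;
}

# Single-quoted heredoc so nothing is interpolated into the stored code.
my $code_string = <<'EOS';
my $foo = "$bar{foo}";
$foo .= $baz->method(@foo);
$foo .= join(',', @$bar{'foo','bar'});
EOS

print join(' ', find_variables($code_string)), "\n";
# prints: $foo %bar $baz @foo $bar
```

Names come back in order of first appearance, so the result matches the list asked for in the question apart from ordering.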
Re: Finding Variable References In Code by stefp (Vicar) on Dec 06, 2002 at 06:30 UTC
I would modify the routines in B::Deparse that deal with variable names: stash_variable, pp_aelem, pp_helem... In addition to printing, they would tally the encountered variable names. -- stefp
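One way to try this without editing B::Deparse itself is to subclass it and wrap a name-producing routine. The sketch below is an assumption-heavy illustration: it overrides `padname`, an internal and undocumented B::Deparse method that may change between Perl versions, and therefore tallies only lexical variables. The package name `TallyDeparse` and the helper `tally_variables` are hypothetical. Globals would need a similar wrapper around a routine such as stash_variable.

```perl
use strict;
use warnings;

my %tally;    # variable name => number of times the deparser asked for it

package TallyDeparse;
use parent 'B::Deparse';

# Assumption: B::Deparse resolves lexical names through its internal
# padname() method. Wrap it so each name is counted as well as returned.
sub padname {
    my ($self, @args) = @_;
    my $name = $self->SUPER::padname(@args);
    $tally{$name}++ if defined $name;
    return $name;
}

package main;

# Hypothetical helper: deparse a code reference and return the lexical
# names seen along the way. The deparsed text itself is discarded.
sub tally_variables {
    my ($code) = @_;
    %tally = ();
    TallyDeparse->new->coderef2text($code);
    return sort keys %tally;
}

my @names = tally_variables(sub { my $x = 1; my @y = (1, 2); return $x + @y });
print "@names\n";
```

Because the tally is a side effect of deparsing, the same subclass could also keep the generated text if a normalized copy of the code is wanted.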