I'm sorry, but I'm a bit confused by the example. Do you
mean that all of this:
my $foo = "$bar{foo}";
$foo .= $baz->method(@foo);
$foo .= join(',', @$bar{'foo','bar'});
represents a single string that has been assigned to some
scalar variable -- e.g. it's bracketed by something like:
my $code_string = <<EOS;
... # your example here
EOS
If so, it wouldn't be all that hard to cook up a
suitable tokenizer-plus-token-classifier that breaks up
the contents of $code_string and tucks the relevant pieces
into an array, based on the sigils and brackets
that bound each word. It might take a few hours of careful
study and testing to get it mostly right, and there may even be
some foibles of perl syntax that defy the best attempts
(plenty of obfuscated code to beat your head against...), but
that's no reason not to try for something that handles the
vast majority of cases. (Anyway, sorry, but I don't know of anything
off-hand that already does this.)
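Here is a minimal sketch of the kind of tokenizer-plus-classifier I have in mind. It rests on a few assumptions of mine: the code sits in a non-interpolating heredoc (<<'EOS'), only the $, @ and % sigils matter, and names start with a letter or underscore. The classify() helper is made up for illustration. It does not handle ${foo}, $#foo, whitespace before a bracket, or % used as the modulus operator.

```perl
use strict;
use warnings;

# Assumption: a non-interpolating heredoc, so the sigils survive in the string.
my $code_string = <<'EOS';
my $foo = "$bar{foo}";
$foo .= $baz->method(@foo);
$foo .= join(',', @$bar{'foo','bar'});
EOS

# classify() is a hypothetical helper: given the sigil run, the bare name and
# the bracket (if any) that follows, guess which variable is really involved.
sub classify {
    my ( $sigils, $name, $bracket ) = @_;
    return "\$$name" if length($sigils) > 1;    # @$x, $$x: $x holds a reference
    return "%$name"  if $bracket eq '{';        # $x{...} and @x{...} both use %x
    return "\@$name" if $bracket eq '[';        # $x[...] and @x[...] both use @x
    return "$sigils$name";                      # plain $x, @x or %x
}

my ( @vars, %seen );
while ( $code_string =~ /([\$\@%]+)([A-Za-z_]\w*)([\[\{]?)/g ) {
    my $var = classify( $1, $2, $3 );
    push @vars, $var unless $seen{$var}++;      # keep first-seen order, no repeats
}

print "$_\n" for @vars;
```

On the three lines of your example this collects $foo, %bar, $baz, @foo and $bar, in that order. Whether "$bar{foo}" should count as %bar is exactly the sort of decision the classifier has to make.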
If, on the other hand, you mean to pull those various array
elements out of the value being assigned to "$foo" in your
example, then I just don't get it.
Update:
The task gets much simpler if you decide that
you only care about chunks of non-whitespace that match
/(\S*)([\%\@\*\$])(\w+)(\S*)/
(the second capture being the sigil and the third the name),
if that's your intention.
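To see what that regex actually captures, here is a small sketch that runs it with /g over one line of the example. The hash keys (before, sigil, name, after) are just labels I picked for the four captures.

```perl
use strict;
use warnings;

my $line = '$foo .= $baz->method(@foo);';

my @hits;
while ( $line =~ /(\S*)([\%\@\*\$])(\w+)(\S*)/g ) {
    # Record the four captures under descriptive (made-up) labels.
    push @hits, { before => $1, sigil => $2, name => $3, after => $4 };
}

printf "before=[%s] sigil=[%s] name=[%s] after=[%s]\n",
    @{$_}{qw(before sigil name after)}
    for @hits;

# prints:
# before=[] sigil=[$] name=[foo] after=[]
# before=[$baz->method(] sigil=[@] name=[foo] after=[);]
```

Because the leading \S* is greedy, each run of non-whitespace yields only its last sigil-plus-name, so $baz lands in the "before" capture. One way around that would be to re-scan the "before" and "after" pieces with the same pattern.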
another update: That regex for spotting
variable names would probably
be more reliable after using B::Deparse on the target code,
so that whitespace is normalized to a consistent style
before you start.
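A rough sketch of that normalization step, assuming the code string is trusted (the string eval compiles it, and any BEGIN blocks in it would run). Wrapping it in an anonymous sub and turning off strict and warnings inside are my own choices to get an arbitrary snippet to compile. The exact layout of the output depends on your perl and B::Deparse versions.

```perl
use strict;
use warnings;
use B::Deparse;

# Deliberately messy spacing, to show it being normalized.
my $code_string = <<'EOS';
my $foo   =   "$bar{foo}";
$foo.=$baz->method( @foo );
EOS

# Wrap the string in an anonymous sub so perl compiles it without running it.
# strict and warnings are disabled inside because the snippet uses
# undeclared globals.
my $sub = eval "sub { no strict; no warnings; $code_string }"
    or die "compile failed: $@";

# coderef2text returns the body of the sub as perl source text.
my $normalized = B::Deparse->new->coderef2text($sub);
print $normalized, "\n";
```

The variable-spotting pass can then be run over $normalized instead of the raw string.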