fletcher_the_dog has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I'm looking for a module that can parse Perl code into variables, functions, control structures, whitespace, etc., and stick those parts into an array that would look something like this:
my $foo = bar('thing');
goes to
$tokens = [
    ['my'    => "function"],
    [' '     => "whitespace"],
    ['$foo'  => "scalar"],
    [' '     => "whitespace"],
    ['='     => "operator"],
    [' '     => "whitespace"],
    ['bar'   => "function"],
    ['('     => "delimiter"],
    ['thing' => "string"],
    [')'     => "delimiter"],
    [';'     => "delimiter"],
];
I've been poking around CPAN looking for something similar to this, but haven't found anything. I want to use it for a source filter that will rearrange some code for my own brand of OO programming.
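
To make the requested structure concrete, here is a minimal sketch of a regex-based tokenizer that produces an array of [text => type] pairs for the sample statement. It is a toy under stated assumptions, not a Perl parser: the rule table and the tokenize() name are made up for illustration, the rules cover only this one statement, and the string token keeps its quotes (unlike the hand-written list above) so that the tokens concatenate back to the original source.

```perl
use strict;
use warnings;

# Ordered rules: the first pattern that matches at the current position wins.
# Illustrative only; these cover the sample statement, not Perl in general.
my @rules = (
    [ 'whitespace' => qr/\s+/ ],
    [ 'string'     => qr/'(?:[^'\\]|\\.)*'/ ],
    [ 'scalar'     => qr/\$\w+/ ],
    [ 'function'   => qr/[A-Za-z_]\w*/ ],   # also catches 'my', as in the question
    [ 'operator'   => qr/=/ ],
    [ 'delimiter'  => qr/[();]/ ],
);

# Hypothetical helper: returns an array ref of [text => type] pairs.
sub tokenize {
    my ($code) = @_;
    my @tokens;
    pos($code) = 0;
    TOKEN: while (pos($code) < length $code) {
        for my $rule (@rules) {
            my ($type, $pattern) = @$rule;
            # \G anchors at the current position; /c keeps pos() on failure.
            if ($code =~ /\G($pattern)/gc) {
                push @tokens, [ $1 => $type ];
                next TOKEN;
            }
        }
        die sprintf "Cannot tokenize at offset %d\n", pos($code);
    }
    return \@tokens;
}

my $tokens = tokenize(q{my $foo = bar('thing');});
printf "%-10s '%s'\n", $_->[1], $_->[0] for @$tokens;
```

Anything beyond trivial statements (heredocs, regex literals, prototypes, a slash that may be division or the start of a match) defeats a table like this, which is the point the replies below make.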

Replies are listed 'Best First'.
Re: Module to parse perl
by halley (Prior) on May 22, 2003 at 19:47 UTC

    You might check some of the B::* modules and the O module, which come in the standard distribution, but as the saying goes, "nothing can parse Perl except perl." The tokenizer is black magic, and the complexity in some subtle cases is amazing.

    --
    [ e d @ h a l l e y . c c ]
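
As a rough illustration of what the B::* route gives you, here is a small sketch using B::Deparse, which ships with perl. It works on the compiled op tree rather than on source text, so it hands back regenerated code: the original whitespace and comments are gone, and there is no token list. The bar() sub is a hypothetical stand-in for the function in the question.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical stand-in for the function used in the question.
sub bar { return "got $_[0]" }

# Let perl itself compile the code, then turn the op tree back into source.
my $deparser = B::Deparse->new;
my $text     = $deparser->coderef2text(sub { my $foo = bar('thing'); });

# Prints a regenerated block; layout and pragmas may differ from the input.
print $text, "\n";
```

That is useful for seeing how perl understood a piece of code, but probably not a good fit for a source filter that needs to keep the original layout.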

Re: Module to parse perl
by talexb (Chancellor) on May 22, 2003 at 19:48 UTC

    Check out Parse::RecDescent for giggles. Apart from that, someone is bound to tell you that only Perl can parse perl.

    --t. alex
    Life is short: get busy!
Re: Module to parse perl
by ChemBoy (Priest) on May 22, 2003 at 21:48 UTC

    The comments you've already gotten are certainly correct, but for giggles you might look inside Perl::Tidy, which does (internally) something that looks not unlike what you describe. If you're lucky, you can even get perltidy (same code, executable front-end) to do your whole source-filtering job for you, and save yourself writing anything more complex than a config file.



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

•Re: Module to parse perl
by merlyn (Sage) on May 23, 2003 at 11:13 UTC
      Further to that, adamk's module mentioned in that node is now on CPAN under the name PPI, and its docs quote said node. Unfortunately it isn't in working order (without some module twiddling at least), but there's hope yet for parsing Perl in Perl (caveats acknowledged :)
      HTH

      _________
      broquaint
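
For readers arriving later: PPI releases since this thread do parse documents, so a sketch along the following lines should get close to the structure asked for in the original question. It assumes a reasonably recent PPI is installed from CPAN (it is not a core module), and the [text => class] pairing is just one way to mirror the question's layout. PPI reports its own class names (PPI::Token::Word, PPI::Token::Symbol and so on) rather than the labels used in the question.

```perl
use strict;
use warnings;
use PPI;    # assumption: installed from CPAN, not part of core perl

my $source = q{my $foo = bar('thing');};

# PPI parses the text as a document without executing or compiling it.
my $doc = PPI::Document->new(\$source)
    or die "PPI could not parse the source\n";

# Pair each token's text with its PPI class, in source order.
my @tokens = map { [ $_->content => ref $_ ] } $doc->tokens;

printf "%-24s '%s'\n", $_->[1], $_->[0] for @tokens;
```

Because the tokens keep whitespace, joining their contents gives back the original source, which is what a source filter that rearranges code would need.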