So far, I have learned a lot from this site and from playing around with grammars using Parse::RecDescent, but I haven't really moved forward with the problem at hand.

The Real Problem(s)

I have a JavaScript file, which may or may not have global vars at the top, middle and bottom of the page. This file also has function declarations, which may have local vars defined in them as well.
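To make the global/local distinction concrete: a var counts as global here when it sits outside every function body. The following is only a minimal sketch in plain Perl, not a Parse::RecDescent solution. The name global_vars is hypothetical, and it assumes that string and regex literals have already been replaced by placeholders (as with __QUOTE__ and __REGEX__ in the sample data) and that initializers contain no commas or semicolons of their own.

```perl
use strict;
use warnings;

# Collect [name, initializer] pairs for every 'var' outside any function.
# Assumptions (hypothetical helper, not part of the original script):
#   - literals are already placeholders, so every brace is a real brace
#   - initializers contain no commas or semicolons of their own
sub global_vars {
    my ($src) = @_;

    # Drop whole named function definitions. The recursive group (?1)
    # matches balanced braces and needs Perl 5.10 or later.
    $src =~ s/\bfunction\s+\w+\s*\([^)]*\)\s*(\{(?:[^{}]++|(?1))*\})//g;

    my @globals;
    while ($src =~ /\bvar\s+([^;]+);/g) {
        # One 'var' statement may declare several names: var g1, g2 = ...;
        for my $decl (split /\s*,\s*/, $1) {
            my ($name, $value) = $decl =~ /^\s*(\w+)\s*(?:=\s*(.*?))?\s*$/s;
            push @globals, [$name, $value] if defined $name;
        }
    }
    return @globals;
}
```

Each element is a two-element array reference, so $globals[1][0] would be the second variable's name and $globals[1][1] its initializer (undef when there is none). A var inside an if block at the top level is still reported, since it is not inside a function.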

  1. I'd like each of the global vars stored in an array (both the var name and its assignment), so I can iterate through all the global vars in a given file.
  2. I want an array returned that contains each function and its contents, not just the function names as I've already done. The grammar I've written goes through the entire file, ignoring anything that isn't a function, and is very slow on moderately sized files. This array would let me, say, jump to its 5th element and either print it out or do something to that function's contents.

    When I break this problem down, it's basically, (a) skip any code until you reach a function (b) return everything in the function (including the function name and braces and stuff) and (c) once out of the function, ignore everything until we get another function.

    For me to return the entire function string contents, I'll have to make modifications to the productions like stuff_we_ignore, paren_statement and bracket_statement, so that they return stuff... but that means they'll return stuff when not found within a function. I can't seem to figure out how to write a grammar that returns stuff when in a function, and nothing otherwise...
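Steps (a) to (c) can be sketched without a grammar at all, by scanning for the function keyword and counting braces. This is a minimal sketch under stated assumptions, not a fix for the grammar below: the name extract_functions is hypothetical, only named functions of the form "function name (...) {" are recognised, and string and regex literals are assumed to be replaced by placeholders already (as with __QUOTE__ and __REGEX__ in the sample data), so every brace counts.

```perl
use strict;
use warnings;

# Return the full text of every named function found in $src, in order.
# Hypothetical helper; assumes literals are already placeholders.
sub extract_functions {
    my ($src) = @_;
    my @functions;

    # (a) skip ahead to the next "function name (...) {"
    while ($src =~ /\bfunction\s+\w+\s*\([^)]*\)\s*\{/g) {
        my $start = $-[0];    # offset of the 'function' keyword
        my $depth = 1;        # the pattern consumed the opening brace

        # (b) walk brace by brace until the opening brace is closed
        while ($depth > 0 && $src =~ /([{}])/g) {
            $depth += $1 eq '{' ? 1 : -1;
        }
        last if $depth > 0;   # unbalanced braces: stop scanning

        push @functions, substr($src, $start, pos($src) - $start);
        # (c) the outer loop resumes after this function's closing brace
    }
    return @functions;
}
```

With the script's input string this could be called as my @funcs = extract_functions($localDeclaredVar); after which $funcs[4] is the text of the fifth function. A function defined inside another function comes back as part of the outer function's text rather than as its own element.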

Grammar to Parse a JavaScript file and return array of Function Names

I'm including the code I've written that returns the array of function names. There's got to be something simple that I'm missing that will let me do what I want...

    #!/usr/bin/perl
    use strict;    # Enforces safer, clearer code.
    use warnings;  # Detects common programming errors
    use Time::HiRes qw(gettimeofday);
    use Parse::RecDescent;
    #use Data::Dumper;

    #----------------------------------------------------------------------
    # Build the grammar.
    #----------------------------------------------------------------------
    my ($grammar);
    my ($startCompile,$startCompile2) = gettimeofday;
    $startCompile += ($startCompile2/1000000);

    $grammar = q {
        statement: ( function_method (';')(?) { $return = $item[1]; }
                   | brace_statement (';')(?) { $return = $item[1]; }
                   | stuff_we_ignore (';')(?) { $return = $item[1]; }
                   )(s)

        function_method: 'function' identifier paren_statement brace_statement
                         { $return = $item[2]; }

        brace_statement: '\{' statement '\}' { $return = $item[2]; }

        paren_statement: '(' statement ')'

        bracket_statement: '[' statement ']'

        stuff_we_ignore: ( paren_statement | bracket_statement | identifier | punctuators )(s?)
                         { $return = ""; }

        identifier: /\w+/

        punctuators: /[><=!~:&^%,\?\|\+\-\*\/\.]+/
    };

    #----------------------------------------------------------------------
    # Grab the data and parse.
    #----------------------------------------------------------------------
    my @localDeclaredVars = <DATA>;
    my $localDeclaredVar = join ' ', @localDeclaredVars;
    my $parser = new Parse::RecDescent ($grammar) or die "*** Bad grammar!\n";
    my $i = 1;
    my ($endCompile,$endCompile2) = gettimeofday;
    $endCompile += ($endCompile2/1000000);
    my $refParsedValues = $parser->statement($localDeclaredVar) || print "*** $localDeclaredVar\n";
    my ($parseEnd,$parseEnd2) = gettimeofday;
    $parseEnd += ($parseEnd2/1000000);

    #----------------------------------------------------------------------
    # Flatten the array and print contents.
    #----------------------------------------------------------------------
    #print Dumper(\@$refParsedValues);
    my (@flatarray);
    sub flatten_recurse { # Thanks Anomo
        map ref eq 'ARRAY' ? flatten_recurse(@$_) : $_, grep defined && length, @_;
    }
    @flatarray = flatten_recurse ($refParsedValues);
    print join "\n", @flatarray, "\n";
    my ($appEnd,$appEnd2) = gettimeofday;
    $appEnd += ($appEnd2/1000000);
    print '-'x72 . "\n";
    print "Compile Time: " . (sprintf "%2.3f", ($endCompile - $startCompile)) . " seconds\n";
    print "Parse Time: " . (sprintf "%2.3f", ($parseEnd - $endCompile)) . " seconds\n";
    print "Flatten Time: " . (sprintf "%2.3f", ($appEnd - $parseEnd)) . " seconds\n";
    print "__________________________\n";
    print "Total Time: " . (sprintf "%2.3f", ($appEnd - $startCompile)) . " seconds\n";

    #----------------------------------------------------------------------
    # End of program.
    #----------------------------------------------------------------------
    __END__
    function functStart () {};
    var g1, g2 = __QUOTE__;
    var g3 = 10000000;
    if (g1) { var XXXXXXX = __QUOTE__; }
    if ( ! defaultCookieCrumbNav ) { cookieCrumbNavBarHTML = __QUOTE__ ; } else { function funct1 () { }; var xxx = __QUOTE__ ; }
    if (true == false) { alert(var1); }
    if (1) { if (1) { function funct2 (X) { x = funct3 (1,2); }; function funct3 () { alert(1); } } }
    function funct4 (a,b) { alert (1,2,3,4); return (a + b); }
    function funct5 () { var aaa = 1; }
    var g4;
    var g7 = __QUOTE__;
    function funct6 () { var b = __REGEX__; c = __REGEX__; if (test333()) { return true; } }
    function funct7 () { var a = 111; }
    alert ( 3 );
    funct5 ( funct6 ( funct2 () ) );
    function functEnd () {}

Output

    functStart
    funct1
    funct2
    funct3
    funct4
    funct5
    funct6
    funct7
    functEnd
    ------------------------------------------------------------------------
    Compile Time: 0.270 seconds
    Parse Time: 0.691 seconds
    Flatten Time: 0.000 seconds
    __________________________
    Total Time: 0.961 seconds

Wishful Output (feels like in my dreams)

    function functStart () {};
    function funct2 (X) { x = funct3 (1,2); };
    function funct3 () { alert(1); }
    function funct4 (a,b) { alert (1,2,3,4); return (a + b); }
    function funct5 () { var aaa = 1; }
    function funct6 () { var b = __REGEX__; c = __REGEX__; if (test333()) { return true; } }
    function funct7 () { var a = 111; }
    function functEnd () {}
    ------------------------------------------------------------------------
    Compile Time: 0.000 seconds
    Parse Time: 0.000 seconds
    Flatten Time: 0.000 seconds
    __________________________
    Total Time: 0.000 seconds

2002-03-13 Edit by Corion : Added READMORE tag


In reply to Rec::Descent Woes - Parsing JavaScript Files and Other Issues by Incognito
