in reply to Re: Re: Regex for stripping variable names from a JavaScript file
in thread Regex for stripping variable names from a JavaScript file
There are two problems here. First, your grammar is not quite right. Second, you aren't setting the return value in your starting rule.
I'm not an expert with Parse::RecDescent, or with constructing grammars for YACC, Bison, etc. (far from it, actually), but IMHO you probably don't want the { $return = $item[1] } on the 'var_name:' rule.
Instead, I think you want a $return action on the 'statement:' AND 'varStatement:' rules (see below). Also, removing the 'comma:' rule and replacing its uses with literal commas keeps the commas out of the output (apologies for not using the lingo correctly). Here's my attempt:

    use Parse::RecDescent;

    my $grammar = q {
        varStatement: 'var' statements endofvar
            { $return = $item[2] }
        statements: <leftop: statement ',' statement>
        statement: var_name (operator assignvalue)(?)
            { $return = $item[1] }
        comma_values: <leftop: assignvalue ',' assignvalue>
        assignvalue: equality | escapedRegex | escapedQuote | array_declaration
                   | numeric_value | array_value | object_value | var_name
        array_declaration: 'new Array(' comma_values ')'
        array_value: array_name '[' integer ']'
        equality: '(' assignvalue equality_operator assignvalue ')'
        var_name: /\w+/
        array_name: /\w+/
        object_value: /[A-Za-z0-9_.]+/
        numeric_value: real_number | integer
        integer: /\d+/
        real_number: /\d+\.?\d*/
        escapedRegex: '__REGEX__'
        escapedQuote: '__QUOTE__'
        operator: '='
        equality_operator: '===' | '==' | '!='
        endofvar: ';'
    };

    my @localDeclaredVars = <DATA>;
    chomp @localDeclaredVars;
    print "\n\n";
    $parser = new Parse::RecDescent ($grammar) or die "*** Bad grammar!\n";
    foreach my $localDeclaredVar (@localDeclaredVars) {
        print "$localDeclaredVar\n";
        my $test = $parser->varStatement($localDeclaredVar) or print "*** Bad text!!!\n";
        if ( ref($test) eq 'ARRAY' ) {
            print "==> ( @$test )\n";
        } else {
            print "==> $test\n";
        }
    }
    __END__
    var myTest1 = 1;
    var myTest2 = 2, myTest3 = 3, myTest4;
    var myTest5 = new Array(__QUOTE__,__QUOTE__), myTest6;
    var myTest7 =__REGEX__;
    var myTest8 = myTest5.x;
    var myTest9 = myTest[0], myTest10 = myTest[0];
    var myTest11 = (myTest1 == myTest2);
    var myTest12 = (myTest1 == myTest2), myTest13 = 2;
    var myTest14 = (myTest1 == myTest2), myTest15;
    var myTest16 = new Array(1, 2);
    var myTest17, myTest18;

and here is the output:

    var myTest1 = 1;
    ==> ( myTest1 )
    var myTest2 = 2, myTest3 = 3, myTest4;
    ==> ( myTest2 myTest3 myTest4 )
    var myTest5 = new Array(__QUOTE__,__QUOTE__), myTest6;
    ==> ( myTest5 myTest6 )
    var myTest7 =__REGEX__;
    ==> ( myTest7 )
    var myTest8 = myTest5.x;
    ==> ( myTest8 )
    var myTest9 = myTest[0], myTest10 = myTest[0];
    ==> ( myTest9 myTest10 )
    var myTest11 = (myTest1 == myTest2);
    ==> ( myTest11 )
    var myTest12 = (myTest1 == myTest2), myTest13 = 2;
    ==> ( myTest12 myTest13 )
    var myTest14 = (myTest1 == myTest2), myTest15;
    ==> ( myTest14 myTest15 )
    var myTest16 = new Array(1, 2);
    ==> ( myTest16 )
    var myTest17, myTest18;
    ==> ( myTest17 myTest18 )

This still misses array_names and variables within parenthesized expressions, but hey, it's a step in the right direction, I suppose. How would I be helping you if I solved your whole problem for you? :) At least you now have a debuggable chunk of code.
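The point about swapping the 'comma:' rule for a literal comma can be tried in isolation. The following is a minimal sketch, not part of the original attempt: the two tiny grammars and the rule names `list` and `name` are made up for illustration, and it assumes Parse::RecDescent is installed. As far as I understand the `<leftop: ...>` directive, a separator written as a subrule has its value included in the returned list, while a literal string separator does not.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical two-rule grammars for illustration only.
# Variant A: the separator is a subrule ('comma').
my $with_subrule = Parse::RecDescent->new(q{
    list:  <leftop: name comma name>
    name:  /\w+/
    comma: ','
}) or die "*** Bad grammar (subrule variant)!\n";

# Variant B: the separator is a literal string.
my $with_literal = Parse::RecDescent->new(q{
    list: <leftop: name ',' name>
    name: /\w+/
}) or die "*** Bad grammar (literal variant)!\n";

my $text = 'a, b, c';

# With no explicit action, 'list' returns the value of its last item,
# which is the array reference built by <leftop: ...>.
my $names_subrule = $with_subrule->list($text);
my $names_literal = $with_literal->list($text);

print "subrule separator: ( @$names_subrule )\n";
print "literal separator: ( @$names_literal )\n";
```

If the behaviour matches what the post describes, the first line of output includes the commas and the second lists only the names, which is why the literal form is the handier one when all you want back is the variable names.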
dmm
If you GIVE a man a fish you feed him for a day
Re: Re(3): Regex for stripping variable names from a JavaScript file
by Incognito (Pilgrim) on Feb 26, 2002 at 19:08 UTC
by dmmiller2k (Chaplain) on Feb 27, 2002 at 16:35 UTC