There are two problems here. First, your grammar is not quite right. And secondly, you aren't setting the return in your starting rule.
I'm not an expert with Parse::RecDescent, or with constructing grammars for YACC, Bison, etc.(far from it, actually); but IMHO, you probably don't want the { $return = $item[1] } on the 'var_name:' rule.
Instead, I think you want it on the 'statement:' AND 'varStatement:' rules (see below). Also, removing the 'comma:' rule and replacing its use with literal commas, prevents getting commas in the output (apologies for not using the lingo correctly). Here's my attempt:
use Parse::RecDescent; my $grammar = q { varStatement: 'var' statements endofvar { $return = $item[2] + } statements: <leftop: statement ',' statement> statement: var_name (operator assignvalue)(?) { $return = + $item[1] } comma_values: <leftop: assignvalue ',' assignvalue> assignvalue: equality | escapedRegex | escapedQuote | array_declaration | numeric_value | array_value | object_value | var_name array_declaration: 'new Array(' comma_values ')' array_value: array_name '[' integer ']' equality: '(' assignvalue equality_operator assignvalue +')' var_name: /\w+/ array_name: /\w+/ object_value: /[A-Za-z0-9_.]+/ numeric_value: real_number | integer integer: /\d+/ real_number: /\d+\.?\d*/ escapedRegex: '__REGEX__' escapedQuote: '__QUOTE__' operator: '=' equality_operator: '===' | '==' | '!=' endofvar: ';' }; my @localDeclaredVars = <DATA>; chomp @localDeclaredVars; print "\n\n"; $parser = new Parse::RecDescent ($grammar) or die "*** Bad grammar!\n" +; foreach my $localDeclaredVar (@localDeclaredVars) { print "$localDeclaredVar\n"; my $test = $parser->varStatement($localDeclaredVar) or print "*** Ba +d text!!!\n"; if ( ref($test) eq 'ARRAY' ) { print "==> ( @$test )\n"; } else { print "==> $test\n"; } } __END__ var myTest1 = 1; var myTest2 = 2, myTest3 = 3, myTest4; var myTest5 = new Array(__QUOTE__,__QUOTE__), myTest6; var myTest7 =__REGEX__; var myTest8 = myTest5.x; var myTest9 = myTest[0], myTest10 = myTest[0]; var myTest11 = (myTest1 == myTest2); var myTest12 = (myTest1 == myTest2), myTest13 = 2; var myTest14 = (myTest1 == myTest2), myTest15; var myTest16 = new Array(1, 2); var myTest17, myTest18;
and here is the output:
var myTest1 = 1; ==> ( myTest1 ) var myTest2 = 2, myTest3 = 3, myTest4; ==> ( myTest2 myTest3 myTest4 ) var myTest5 = new Array(__QUOTE__,__QUOTE__), myTest6; ==> ( myTest5 myTest6 ) var myTest7 =__REGEX__; ==> ( myTest7 ) var myTest8 = myTest5.x; ==> ( myTest8 ) var myTest9 = myTest[0], myTest10 = myTest[0]; ==> ( myTest9 myTest10 ) var myTest11 = (myTest1 == myTest2); ==> ( myTest11 ) var myTest12 = (myTest1 == myTest2), myTest13 = 2; ==> ( myTest12 myTest13 ) var myTest14 = (myTest1 == myTest2), myTest15; ==> ( myTest14 myTest15 ) var myTest16 = new Array(1, 2); ==> ( myTest16 ) var myTest17, myTest18; ==> ( myTest17 myTest18 )
This still misses array_names and variables within parenthesized expressions, but hey, it's a step in the right direction, I suppose. How would I be helping you if I solved your whole problem for you? :) At least you now have a debuggable chuknk of code.
dmm
If you GIVE a man a fish you feed him for a dayIn reply to Re(3): Regex for stripping variable names from a JavaScript file
by dmmiller2k
in thread Regex for stripping variable names from a JavaScript file
by Incognito
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |