I've written a simple grammar for parsing SQL-like filter expressions. This is for the Persist library and for the Persist::Filter module in particular. (This is a library I wrote that bears some resemblance to Alzabo and an older Persistent library.) (Also, please excuse my verbosity; this problem has been somewhat difficult to explain concisely.)

Anyway, the library itself seems fine, but when I use it in another project I get some odd results. The grammar, in the latest version I'm working on, looks like this:

    filter: expression end_of_file { $return = $item[1] }
          | { $Persist::Filter::errors .= "ERROR: Could not parse filter.\n";
              foreach (@{$thisparser->{errors}}) {
                  $Persist::Filter::errors .= "ERROR Line $_->[1]: $_->[0]\n";
              }
              $thisparser->{errors} = undef;
            }

    end_of_file: /^\Z/

    expression: comparison logical_operator <commit> expression
                { $return = new Persist::Filter::Junction(@item{qw(comparison logical_operator expression)}) }
              | comparison
              | /not/i <commit> expression
                { $return = new Persist::Filter::Not(lc($item[1]), $item{expression}) }
              | <error?> <reject>

    comparison: first_operand <commit> comparison_operator second_operand
                { $return = new Persist::Filter::Comparison(@item{qw(first_operand comparison_operator second_operand)}) }
              | '(' <commit> expression ')'
                { $return = $item{expression} }
              | <error?> <reject>

    first_operand: operand

    second_operand: operand

    operand: identifier | literal | placeholder

    logical_operator: /and/i { $return = lc($item[1]) }
                    | /or/i { $return = lc($item[1]) }

    comparison_operator: '=' | '<>' | /<=?/ | />=?/
                       | /(?:not\s+)?i?like/i
                         { $item[1] =~ s/\s+/ /; $return = lc($item[1]) }

    identifier: table_name '.' <commit> column_name
                { $return = new Persist::Filter::Identifier("$item{table_name}.$item{column_name}") }
              | integer '.' <commit> column_name
                { $return = new Persist::Filter::Identifier("$item{table_name}.$item{column_name}") }
              | column_name
                { $return = new Persist::Filter::Identifier($item{column_name}) }
              | <error?> <reject>

    table_name: name

    column_name: name

    literal: string
             { $return = new Persist::Filter::String($item{string}) }
           | number
             { $return = new Persist::Filter::Number($item{number}) }

    placeholder: '?'
                 { $return = new Persist::Filter::Placeholder('?') }

    name: /[a-z_][a-z0-9_]*/i

    string: "'" <commit> character(s) "'"
            { $return = "'".(join '', @{$item{'character(s)'}})."'" }
          | <error?> <reject>

    integer: /\d+/

    number: /[+-]?[0-9]*\.[0-9]+(?:e[+-]?[0-9]+)?/i { $return = lc($item[1]) }
          | /[+-]?[0-9]+\.?(?:e[+-]?[0-9]+)?/i { $return = lc($item[1]) }

    character: "\\\\'" { $return = "\\\\'" }
             | /[^']/

There is no problem with the grammar itself, but Parse::RecDescent seems to have some problem when I generate the parser in different situations. Specifically, if I use the parser from the command-line like this:

    perl -MPersist::Filter=parse_filter,parse_errors -MData::Dumper -e \
      "(\$ast = parse_filter(q(namespace = 'http://hanenkamp.com/Contentlet/Blank'))) or die(parse_errors); print Dumper(\$ast)"

I have this result:

    $VAR1 = bless( [
        bless( do{\(my $o = 'namespace')}, 'Persist::Filter::Identifier' ),
        '=',
        bless( do{\(my $o = '\'http://hanenkamp.com/Contentlet/Blank\'')}, 'Persist::Filter::String' )
    ], 'Persist::Filter::Comparison' );

However, another project, which runs under mod_perl and goes through Persist::Driver::Memory to parse the exact same string, gives me:

    [Thu Oct 16 11:05:08 2003] [error] [client 127.0.0.1] Unable to create starter database 'Contentment/Starter/apache-test.pl': Cannot parse filter "namespace = 'http://hanenkamp.com/Contentlet/Blank'": ERROR: Could not parse filter.
     in /home/sterling/projects/contentment/Contentment/ApacheDirector/t/../../../blib/lib/Contentment/Manager.pm on line 935.
    Compilation failed in require at (eval 108) line 1.

I've not been able to find a difference between the generated parsers as yet. However, when I set $::RD_TRACE, running from the command-line produces trace lines such as 'Treating "filter:" as a rule declaration' and 'Treating "|" as a new production'. In the mod_perl environment, the same lines all have an empty string where the matched text should be: 'Treating "" as a rule declaration' and 'Treating "" as a new production'.
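For anyone who wants to compare traces the same way, here is a minimal sketch of switching the diagnostics on before the parser is built. The grammar is a hypothetical, much smaller stand-in for the real Persist::Filter grammar, and the two `warn` lines are just my guess at what is worth recording in each environment.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# The diagnostics have to be set before new() is called, because the
# grammar is compiled (and the "Treating ..." lines are emitted) there.
$::RD_TRACE = 1;    # trace grammar compilation and parsing on STDERR
$::RD_HINT  = 1;    # hints, which also turn on warnings and errors

# Hypothetical stand-in grammar, not the real Persist::Filter grammar.
my $grammar = q{
    filter: name '=' name /^\Z/ { $return = [ @item[1..3] ] }
    name:   /[a-z_][a-z0-9_]*/i
};

# Record versions so the two environments can be compared side by side.
warn "Parse::RecDescent version: $Parse::RecDescent::VERSION\n";
warn "Perl version: $]\n";

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

my $ast = $parser->filter('a = b');
print defined $ast ? "parsed: @$ast\n" : "parse failed\n";
```

Running the same script from the shell and from inside a mod_perl handler, and capturing STDERR from both, should make it easier to see where the two traces first diverge.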

Furthermore, when the parser is run, I find that a similar nullification of return values is occurring in the mod_perl environment that isn't happening on the command-line.

This is very weird. I'm still working out the details, but has anyone encountered this before, or does anyone have an idea of where I should look for the problem? I've written several parsers for Parse::RecDescent before and never experienced a problem like this.

My next recourse is to attempt to generate a PM file within each environment and compare the definitions to see if there are any differences. Anyway, when I find the solution, I will be sure to post it here, in case anyone else has a similar problem.
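A rough sketch of that comparison, assuming Parse::RecDescent's Precompile method is used to write the generated parser out as a module. The grammar is again a hypothetical stand-in, and the class name FilterParserCheck is made up for illustration.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical stand-in grammar; substitute the real grammar text here.
my $grammar = q{
    filter: name '=' name /^\Z/ { $return = [ @item[1..3] ] }
    name:   /[a-z_][a-z0-9_]*/i
};

# Hypothetical class name. Using the same name in both environments keeps
# the package lines identical, so they do not show up as differences.
my $class = shift(@ARGV) || 'FilterParserCheck';

# Precompile writes "$class.pm" into the current working directory.
Parse::RecDescent->Precompile($grammar, $class);

print "wrote $class.pm\n";
```

The idea is to run this once from the command line and once from within mod_perl, each in its own directory, and then diff the two FilterParserCheck.pm files. The working directory under mod_perl may not be writable, so a chdir to a scratch directory first may be needed.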


In reply to Where to find the source of this Parse::RecDescent oddness? by hanenkamp
