in reply to ParseRecDescent and csv-like data

Although I agree that it might be better to consider 'apple_pear' as one token rather than two (as suggested by Elian and dreadpiratepeter above), the code below should do what you want:

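#!/usr/bin/perl
use strict;
use warnings;
use Parse::RecDescent;

# Sample input: one term per line.
my $text = <<EOI;
apple_pear
cherry
munchy_nice_apricot
mint_
_banana
raspberry__pie
EOI

# Skip only spaces and tabs, so newlines stay visible to the grammar.
$Parse::RecDescent::skip = '[ \t]*';

# word(s /_/) matches one or more words separated by '_';
# ...endofline is a lookahead that requires the term to end the line.
my $grammar = q(
    { use Data::Dumper }
    data:      line(s)
    line:      term endofline
    term:      word(s /_/) ...endofline
               { print Dumper(\%item); }
    word:      /[a-z]+/
    endofline: /[\n\r]+/
);

my $parser = Parse::RecDescent->new($grammar);
if ($parser->data($text)) {
    print "ok\n";
}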

Hope this helps, -gjb-

Update: changed the code to suit rkg's requirement that '_apple', 'apple__juice' and 'apple_' should not be accepted. The grammar happens to be more elegant now, and I learned about separator patterns.
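
For anyone who wants to see what the separator pattern accepts without running Parse::RecDescent, here is a minimal sketch that mimics the rule term: word(s /_/) ...endofline with a plain regex. It is only an approximation for a single term, not the parser itself, and the helper name is_valid_term is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::RecDescent): a plain-regex
# stand-in for the rule  term: word(s /_/) ...endofline
# It requires one or more [a-z]+ words with exactly one '_' between
# neighbouring words, and nothing else before the end of the string.
sub is_valid_term {
    my ($term) = @_;
    return $term =~ /\A[a-z]+(?:_[a-z]+)*\z/ ? 1 : 0;
}

# Same sample terms as in the post above.
for my $term (qw(apple_pear cherry munchy_nice_apricot mint_ _banana raspberry__pie)) {
    printf "%-20s %s\n", $term, is_valid_term($term) ? 'accepted' : 'rejected';
}
```

The first three terms are reported as accepted and the last three (trailing underscore, leading underscore, doubled underscore) as rejected, which is the behaviour rkg asked for.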

Update 2: removed an unused token from the grammar.

Replies are listed 'Best First'.
Re: Re: ParseRecDescent and csv-like data
by rkg (Hermit) on Aug 21, 2003 at 16:05 UTC
    Thanks for your help. Your code, I think, accepts apple__pear (two underscores), _apple (leading underscore), and pear_ (trailing underscore) as valid; in my desired grammar, they shouldn't be. Any thoughts?

    rkg

    P.S. And yes, I like your approach of treating a term as multiple tokens rather than one. This is part of a larger grammar, and I need the flexibility of ParseRecDescent.