Although I agree that it might be better to consider 'apple_pear' as one token rather than two (as suggested by Elian and dreadpiratepeter above), the code below should do what you want:
#!/usr/bin/perl
use strict;
use warnings;
use Parse::RecDescent;

# Sample input: one candidate term per line, valid and invalid mixed.
my $text = <<EOI;
apple_pear
cherry
munchy_nice_apricot
mint_
_banana
raspberry__pie
EOI

# Skip only spaces and tabs between tokens, so newlines stay significant.
$Parse::RecDescent::skip = '[ \t]*';

my $grammar = q(
    { use Data::Dumper }
    data: line(s)
    line: term endofline
    term: word(s /_/) ...endofline
        { print Dumper(\%item); }
    word: /[a-z]+/
    endofline: /[\n\r]+/
);

my $parser = Parse::RecDescent->new($grammar);
if ($parser->data($text)) {
    print "ok\n";
}
Hope this helps, -gjb-
Update: changed the code to suit rkg's requirement that '_apple', 'apple__juice' and 'apple_' should not be accepted. The grammar happens to be more elegant now and I learned about separator patterns.
Update 2: removed an unused token from the grammar.
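To make the acceptance rule from the first update concrete, here is a minimal sketch that checks single terms with a plain regular expression instead of Parse::RecDescent. It is only an approximation of what word(s /_/) followed by an end of line accepts: the helper name is_valid_term is made up for illustration, and the sketch ignores the skip pattern, so it rejects whitespace around the underscore that the grammar would tolerate.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): accepts one or more
# lowercase words joined by single underscores, with nothing before or after.
sub is_valid_term {
    my ($term) = @_;
    return $term =~ /\A[a-z]+(?:_[a-z]+)*\z/ ? 1 : 0;
}

# Same sample terms as in the test data above.
for my $term (qw(apple_pear cherry munchy_nice_apricot mint_ _banana raspberry__pie)) {
    printf "%-20s %s\n", $term, is_valid_term($term) ? 'accepted' : 'rejected';
}
```

A leading underscore, a trailing underscore and a doubled underscore all fail because every underscore has to sit between two words, which is the same idea the separator pattern expresses in the grammar.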