I'd never really played with Parse::RecDescent until now, but here's what I came up with. Your date and time regexes had problems, and you were inconsistent about whether a rule returns an array reference or a hash reference. My first 'line' rule (below) shows how to label the 'type', if that's the sort of result you want. Also, you don't have start and end dates on every rule, so I'll leave that for you to fix if necessary. I sort of stole merlyn's idea of how to organize the top-level rules and ran with it :)

use Parse::RecDescent;
use strict;
use warnings;
use Data::Dumper;

# Make sure the parser dies when it encounters an error
$::RD_ERRORS = 1;
# Enable warnings. This will warn on unused rules &c.
$::RD_WARN   = 1;
# Give out hints to help fix problems.
$::RD_HINT   = 1;

# Create and compile the source file
my $parser = Parse::RecDescent->new( q(
    comma      : ","
    date       : /\b\d{1,2}\/\d{1,2}\/\d{1,2}\b/
    start_date : date
    end_date   : date
    time       : /\b\d\d:\d\d:\d\d\b/
    rate       : /\b\d+\.\d{4}\b/
    rates      : rate comma { $item{rate} }
    start_rate : rate
    end_rate   : rate
    change     : rate
    whitespace : /\s*/

    lines : line /\z/ { $item{line} }
    line  : "G017RATEBRKRL" comma rate comma start_date
      comma end_date comma time { $item{type} = $item[0]; \%item }
    line  : "G017CP111 D" comma start_rate comma end_rate comma
      change comma date comma time { \%item }
    line  : "G017RPAGO/N" comma rate comma whitespace comma
      whitespace comma date comma time { \%item }
    line  : "G017ONFD" comma rates(6) date comma time { \%item }
    line  : "G017PDFF" comma rates(4) date comma time { \%item }
) );

while ( my $quote_data = <DATA> ) {
    next if $quote_data !~ /\S/;
    my $result = $parser->lines( $quote_data );
 
    if ( defined $result ) {
        print Dumper $result;
    } else {
        print "Failed!\n";
    }
}

All that being said, I'm not sure I'd actually use Parse::RecDescent for this problem. I might just grab the first field and use it as a key into a hash of subroutines, each of which uses regexes to parse the rest of the line and return the results. Either way, I'd consider how much you care about efficiency in this routine.
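Here's a rough sketch of that hash-of-subroutines idea, untested against your real data. The field patterns are copied from the grammar above; `make_parser`, `parse_line`, `%parse_for`, and the sample line are names and values I made up for illustration, and only two of the record types are filled in.

```perl
use strict;
use warnings;

# Field patterns, mirroring the terminals in the grammar above.
my $rate = qr/\d+\.\d{4}/;
my $date = qr{\d{1,2}/\d{1,2}/\d{1,2}};
my $time = qr/\d\d:\d\d:\d\d/;

# Hypothetical helper: build a parser sub from a regex and a list of
# field names, one name per capture group.
sub make_parser {
    my ( $regex, @names ) = @_;
    return sub {
        my ($rest) = @_;
        my @values = $rest =~ $regex or return;    # no match: give up
        my %record;
        @record{@names} = @values;
        return \%record;
    };
}

# Dispatch table: record type => sub that parses the rest of the line.
my %parse_for = (
    'G017RATEBRKRL' => make_parser(
        qr/\A($rate),($date),($date),($time)\s*\z/,
        qw( rate start_date end_date time ),
    ),
    'G017CP111 D' => make_parser(
        qr/\A($rate),($rate),($rate),($date),($time)\s*\z/,
        qw( start_rate end_rate change date time ),
    ),
);

# Split off the first field, look up its parser, and label the result.
# Returns a hash reference, or undef for unknown or malformed lines.
sub parse_line {
    my ($line) = @_;
    my ( $type, $rest ) = split /,/, $line, 2;
    return unless defined $rest;
    my $parser = $parse_for{$type} or return;
    my $record = $parser->($rest)  or return;
    $record->{type} = $type;
    return $record;
}

# Made-up sample line, just to show the call.
my $record = parse_line("G017RATEBRKRL,5.2500,1/2/01,1/3/01,12:34:56\n");
print "$record->{type}: rate $record->{rate}\n" if $record;
```

Every parser returns a hash reference and every failure returns undef, so the caller can keep the same `defined $result` check as in the loop above. Adding a record type means adding one entry to `%parse_for`.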

In reply to Re: A Slough of ParseRecDescent Woes by runrig
in thread A Slough of ParseRecDescent Woes by Ovid
