Hi all, I am new to Perl and I have to use the Parse::RecDescent module to parse a complex design log file. I am having some trouble using the parser. Here is a simple example.
Say I have a file like the one below:
======= START OF FILE ==========
No Name Score Prize
1 Pig 100 red flower
2 Han 80 bread
3 Hen 50 ass kicked
======= END OF FILE ==========
I need to parse the fields of every row and print them one by one. My Perl script is:
############### START OF PERL ###################
#! /usr/local/bin/perl -sw
BEGIN {
    close STDERR and open STDERR, '>./STDERR' or die $!;
}
use Parse::RecDescent;
#============================================
# GRAMMAR DEFINITION HERE
#============================================
$grammar =
q{
    Para:  List(s) /\Z/
        | { use Data::Dumper 'Dumper';
            print "$_->[0]\n" for @{$thisparser->{errors}};
            exit;
          }
    List:  Order Name Score Prize
    Order: /\d+/ {print "@item\n";}
        | <error: 1>
    Name:  /\w+/ {print "@item\n";}
        | <error: Expecting a name!>
    Score: /\d+/ {print "@item\n";}
        | <error: 2>
    Prize: /.*$/ {print "@item\n";}
};
#============================================
# MAIN PROGRAM STARTS HERE
#============================================
$parse = new Parse::RecDescent ($grammar);
while (<DATA>)
{
    chomp;
    $parse->Para($_);
}
__DATA__
1 Pig 100 red flower
2 Han 80 bread
3 Hen 50 ass kicked
############### END OF PERL ###################
This works fine. But when the input data format changes to the following:
======= START OF FILE ==========
No Name Score Prize
1 Pig
100 red flower
2 Han 80
bread
3
Hen
50
ass kicked
======= END OF FILE ==========
The script no longer works on it. I added <skip: qr/[\s\t\n]*/> to the List rule, but it still fails to parse. How can I make the parser handle irregular text formats like this?
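One thing worth checking, offered as a sketch rather than a definitive fix: the while loop hands the parser one line at a time, so a record that is split across lines never reaches the List rule in one piece, whatever <skip> is set to. The variant below assumes the whole input can be read into a single string and parsed with one call to Para. It also assumes a prize never spans more than one line, so Prize matches up to the next newline instead of using /.*$/. The names parse_rows and $text are made up for illustration, and the rules return their values through $return instead of printing them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parse::RecDescent;

# Grammar sketch: the default skip pattern (optional whitespace,
# newlines included) lets a record continue on the next line.
my $grammar = q{
    Para:  List(s) /\Z/            { $return = $item[1] }
    List:  Order Name Score Prize  { $return = [ @item[1..4] ] }
    Order: /\d+/
    Name:  /\w+/
    Score: /\d+/
    Prize: /[^\n]+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# Hypothetical helper: parse the whole text in one call.
# Returns a reference to a list of [Order, Name, Score, Prize] rows,
# or undef if the text does not match the grammar.
sub parse_rows {
    my ($text) = @_;
    return $parser->Para($text);
}

# The irregular sample from the question, without the header line.
my $text = <<'END_LOG';
1 Pig
100 red flower
2 Han 80
bread
3
Hen
50
ass kicked
END_LOG

my $rows = parse_rows($text)
    or die "Could not parse input\n";

# Print one record per line, fields separated by " | ".
print join(' | ', @$rows), "\n" for @$rows;
```

When the data comes from a file handle instead of a string, the same idea applies: read everything first, for example with `my $text = do { local $/; <$fh> };`, and then call the start rule once.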