bugsbunny has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a question regarding Parse::RecDescent: how can I skip some of the input? The idea is to leave part of the text unparsed until I'm clear on how to parse the more important part... ugrhhh, restating :") For example, take the following text, repeated many times:
class htb 2:11 root leaf 13: prio 0 rate 3125bps ceil .....
 Sent 16822354 bytes 15272 pkts (dropped 0, overlimits 0)
 lended: 15272 borrowed: 0 giants: 0
 tokens: 38 ctokens: 38
At the moment the only interesting thing for me is the second line, i.e.:
Sent 16822354 bytes 15272 pkts (dropped 0, overlimits 0)
So how do I write the grammar so that I skip the other things and concentrate only on this one? (I don't mean I don't want to parse the other stuff, I just want to skip it temporarily until I finish the second row.) I've tried to use <resync> without success. Currently I'm trying the following, without much success. Where is my error? I'm missing something very basic!
our $tcClassGrammar = q{
    classStart : class(s)
    class      : 'class' classinfo stats stats2 stats3
                 #{print "$item[0]\n"; 1}
    classinfo  : /.+?\n/
    stats      : 'Sent' /\d+/ 'bytes' /\d+/ 'pkts' '(dropped' /\d+/ ',' 'overlimits' /d+/ ')'
                 { print "item[0]"; 1}
    stats2     : /.+?\n/
    stats3     : /.+?\n/
};

Re: RecDescent, flushing/skipping stuff
by halley (Prior) on Jul 11, 2003 at 13:49 UTC

    I guess I don't see why you'd want to use RecDescent for a sample that you don't fully parse. Your example is more easily scanned with a couple of regexes than with a recursive descent parsing grammar. Maybe you're just trying to learn the recursive-descent concept or package.

    Skip (or collect) all lines until you see the bit you're interested in, then collect all the lines that are a part of the interesting content, then parse that chunk in whatever fashion you like. I'd go with a one-line regex to collect fields, but if you want to use RecDescent on the chunk, have fun.
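
    The skip-then-collect approach described above could be sketched roughly as follows. This is only an illustration using core Perl: the helper name extract_sent_stats and the hash keys are made up for the example, and the regex assumes the "Sent" line always has exactly the layout shown in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one hash ref per
# "Sent ..." line and ignores every other line of the tc output.
sub extract_sent_stats {
    my ($text) = @_;
    my @stats;
    for my $line (split /\n/, $text) {
        # Skip anything that is not the line of interest.
        next unless $line =~ /^\s*Sent\s/;
        # A single regex collects the four counters.
        if ($line =~ /Sent\s+(\d+)\s+bytes\s+(\d+)\s+pkts\s+\(dropped\s+(\d+),\s+overlimits\s+(\d+)\)/) {
            push @stats, {
                bytes      => $1,
                pkts       => $2,
                dropped    => $3,
                overlimits => $4,
            };
        }
    }
    return @stats;
}

# Sample input copied from the question; the "ceil ....." part is elided there too.
my $sample = <<'END';
class htb 2:11 root leaf 13: prio 0 rate 3125bps ceil .....
 Sent 16822354 bytes 15272 pkts (dropped 0, overlimits 0)
 lended: 15272 borrowed: 0 giants: 0
 tokens: 38 ctokens: 38
END

my @stats = extract_sent_stats($sample);
printf "%d bytes, %d pkts\n", $stats[0]{bytes}, $stats[0]{pkts};
```

    Testing it first on input that contains only the "Sent" line, and then adding the junk lines, keeps the skipping logic and the field regex easy to debug separately.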

    To test this sort of thing, you might want to give it some input that doesn't have any extraneous junk first. Make sure the important line gets parsed the way you like. Then as you add line-skipping logic, add junk lines to your test input. It's easy to get confused when you always test with the entire input all at once.

    --
    [ e d @ h a l l e y . c c ]

Re: RecDescent, flushing/skipping stuff
by jryan (Vicar) on Jul 11, 2003 at 15:37 UTC

    Why not just parse it all and then just extract what you need? For instance, to get the stats rule:

    use Parse::RecDescent;
    use Data::Dumper;

    # capture all, trim rulename
    $::RD_AUTOACTION = q { [@item[1..$#item]] };
    undef $::RD_WARN;

    my $grammar = q{
        classStart : class(s)
        class      : 'class' classinfo stats stats2 stats3
        classinfo  : stuff
        stats      : 'Sent' /\d+/ 'bytes' /\d+/ 'pkts' '(dropped' /\d+?,/ 'overlimits' /\d+?\)/
        stats2     : stuff
        stats3     : stuff
        stuff      : /.*/
    };

    my $code = q{
        class htb 2:11 root leaf 13: prio 0 rate 3125bps ceil .....
         Sent 16822354 bytes 15272 pkts (dropped 0, overlimits 0)
         lended: 15272 borrowed: 0 giants: 0
         tokens: 38 ctokens: 38
    };
    $code =~ s/^\s*//g;

    my $parser = new Parse::RecDescent($grammar) or die $!;
    my $data = $parser->classStart($code)->[0];

    my @stats;
    # extract stats
    foreach (@$data) {
        push @stats, $_->[2];
    }

    print Dumper \@stats;

    It's just that simple.

Re: RecDescent, flushing/skipping stuff
by bugsbunny (Scribe) on Jul 11, 2003 at 10:45 UTC
    Oops, I had some simple errors, and the main problem was not in the grammar :"), sorry... But my question still stands: how do you do skipping, so that you can build the grammar step by step?
    our $tcClassGrammar = q{
        classStart : class(s)
        class      : 'class' classinfo stats stats2 stats3
                     {print "ww: $item[0]\n";}
        classinfo  : stuff
        stats      : 'Sent' /\d+/ 'bytes' /\d+/ 'pkts' '(dropped' /\d+?,/ 'overlimits' /\d+?\)/
                     #{ print "xx: item[1]"; $item[1]}
        stats2     : stuff
        stats3     : stuff
        stuff      : /.+\n/ {print "$item[1]\n"; $item[1]}
    };
Re: RecDescent, flushing/skipping stuff
by tzz (Monk) on Jul 11, 2003 at 13:55 UTC
    Why not just match the "stats" rule? You don't need to skip the rest of the data, the regex match will do that for you.
Re: RecDescent, flushing/skipping stuff
by bugsbunny (Scribe) on Jul 11, 2003 at 20:35 UTC
    Yep, I know I can use a regex to do that. What I also know is that, for example, I used regexes to parse a dhcpd.conf file. Yes, it worked fine, but even a small change to the file caused me trouble rewriting them. So yes, I'm trying this partly to get more comfortable with RecDescent, but also for easier support and extension... :")
    As you saw, this is output from Linux "Traffic control", and it can be quite different for classes, qdiscs, filters, and then iptables and so on...
    A couple of grammars will work better than regexes (not faster, but ...).

    Thanks for your comments, I've figured out how to do it...