in reply to Parse::RecDescent: problem with grammar and error reporting

I think that Parse::RecDescent is a bit overkill for such a simple, block+line based format. Here's my take that uses just regexes, and assembles a data structure from the input:

use 5.010;
use strict;
use warnings;

sub parse_file {
    my $fh = shift;
    local $/ = '';    # paragraph mode: each read returns one blank-line-separated block
    my @paragraphs;
    my $line_no = 0;  # counts lines inside blocks only, not the blank separator lines
    while (my $para = <$fh>) {
        push @paragraphs,
            [ map parse_line($_, ++$line_no), split /\n/, $para ];
    }
    return \@paragraphs;
}

sub parse_line {
    # A named lexical instead of "my $_", which was removed in Perl 5.24.
    my ($text, $line_no) = @_;
    return unless $text =~ /\S/;
    if ($text =~ /^(.*)\s\@(\w+)/) {
        my $line      = "$1";
        my $font_size = "$2";
        die "Invalid \@fontsize declaration in line $line_no ($font_size is not a number)\n"
            if $font_size =~ /\D/;
        return { text => $line, font_size => $font_size };
    }
    else {
        return { text => $text };
    }
}

use Data::Dumper;
print Dumper parse_file(\*DATA);

__DATA__
First slide

Test line
Second line

Line with explicit font size @20

You can decide for yourself if it's too much of an unmaintainable mess to use :-).
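If the format grows, one way to keep a regex-based line parser like the one above manageable might be a table of pattern/handler pairs, so that a new line type becomes a new table entry rather than another branch. This is only a sketch under assumptions: the name parse_line_with_rules and the "@image" directive are hypothetical and not part of the format above, and unlike the code above it treats a non-numeric "@size" as ordinary text instead of dying.

```perl
use strict;
use warnings;

# Pattern/handler pairs, tried in order; the first pattern that matches wins.
# Each handler receives the captures of its pattern as arguments.
# The "@image" directive is hypothetical and only illustrates adding a rule.
my @rules = (
    [ qr/^\@image\s+(\S+)\s*$/,
      sub { my ($file) = @_; return { image => $file } } ],
    [ qr/^(.*)\s\@(\d+)\s*$/,
      sub { my ($text, $size) = @_; return { text => $text, font_size => $size } } ],
    [ qr/^(.*\S.*)$/,
      sub { my ($text) = @_; return { text => $text } } ],
);

sub parse_line_with_rules {
    my ($line) = @_;
    for my $rule (@rules) {
        my ($pattern, $handler) = @$rule;
        # In list context a successful match returns its captures.
        if (my @captures = $line =~ $pattern) {
            return $handler->(@captures);
        }
    }
    return;    # blank line: no record
}

# Usage: print each parsed record as sorted key=value pairs.
for my $line ('Test line', 'Big line @20', '@image plot.png') {
    my $record = parse_line_with_rules($line);
    print join(', ', map { "$_=$record->{$_}" } sort keys %$record), "\n";
}
```

Passing the captures as arguments, rather than reading $1 inside the handlers, keeps each handler independent of whichever match happened to run last.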

Re^2: Parse::RecDescent: problem with grammar and error reporting
by kikuchiyo (Hermit) on Jan 20, 2012 at 11:23 UTC

    Yeah, I started out with something like this. Then I added provisions for embedding images, literal TeX syntax that could extend over more than one line, inlined gnuplot scripts, etc., and it became pretty much unmaintainable.

    But you are perhaps right, I should at least have a go at refactoring the existing script using idioms I'm familiar with, before I jump into rewriting it with a tool I don't know (P::RD).

      Not so fast. (IMHO...) It does not take very long for code such as this, written without P::RD, to become very unmaintainable. P::RD takes what you already know (regular expressions ...) and puts it into a very complete framework that you would otherwise have to build yourself. It rather sneaks up on you ... suddenly there is some little twist on the format that you need to support, and something inside your lovingly-fashioned code goes, “snap.”

      Ugh.™

      P::RD does take some time to get to know. (One thing that I quickly discovered is that you need to write a package of helper subroutines that you can call from within the body of your generated grammar, so that you do not repeat yourself. Grammar handlers should be short, should of course include use warnings; use strict;, and should generally call shared code rather than embed it verbatim.) But, having spent the time to learn it, I find that I use it constantly. Even “simple” requirements tend to grow, and it is not pleasant to discover that you have run out of tool. With P::RD, that will never happen.