I'm trying to solve a problem similar to Precompiled Reg Exps. The goal is to decouple the parsing and the processing of a mess of log files. The whole process would look like:

command(s) | parse.pl | process.pl

where process.pl is one of a few possible processing scripts.

For parse.pl, I'm pondering how to best apply a dispatch table approach. Here is what I have now (simplified example).

use strict;
use warnings;
use YAML;

my $header    = qr/^(?!START:|END:)([^*: ][^:]+):(.+)\n/;
my $starttime = qr/^START:\s*(.+)\n/;
my $endtime   = qr/^END:\s*(.+)\n/;
my $lineitem  = qr/^(?:\* | )([^:]+):(\d+):(.+)\n/;

my %dispatch = (
    $header    => \&header,
    $starttime => sub { $_[0]->{starttime} = $_[1] },
    $endtime   => sub { $_[0]->{endtime} = $_[1] },
    $lineitem  => sub { push @{shift->{items}}, \@_ },
);

sub header {
    my $r = shift;
    process_record($r);    # first time will be empty record...
    %$r = ();              # clear the record
    $r->{name} = $_[0];
    $r->{desc} = $_[1];
}

my $r = {};    # 'record'
LINE: while (my $line = <DATA>) {
    for my $re (keys %dispatch) {
        my @m = ();
        if (@m = $line =~ /$re/) {
            $dispatch{$re}->($r, @m);
            next LINE;
        }
    }
}
process_record($r);

sub process_record {
    my $r = shift;
    print YAML::Dump($r);
}

__DATA__
The first record:it has only 1 item
START: Tue Feb 1 00:09:30 2005
END: Tue Feb 1 00:19:32 2005
 Item1:10:comment1
The next record:might have more items
START: Tue Feb 1 00:39:07 2005
END: Tue Feb 1 00:42:46 2005
 Item1:4:comment2
* Item2:1:comment3

I'm happy to have removed the if/elsif madness, but now the dispatch table definition looks almost as ugly. How would you 'clean' it up? I've considered pulling some bits into a module or building the dispatch table out of a config file. The config file worries me because it would require putting data and code in the config. I don't see how a module could reduce the code substantially since only the dispatch loop is reusable. Thoughts?
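
To make the module idea concrete, here is a rough sketch of the reusable part only. It assumes an ordered list of [regex, handler] pairs instead of a hash, so the qr// objects are not stringified into hash keys and the matching order is fixed (which also removes the need for the lookahead in the header pattern). The names dispatch_line, @rules and the records/current state layout are made up for illustration; this is one possible shape, not a finished design.

```perl
use strict;
use warnings;

# Hypothetical helper: try each [regex, handler] pair in order and
# call the handler of the first regex that matches the line.
sub dispatch_line {
    my ($rules, $state, $line) = @_;
    for my $rule (@$rules) {
        my ($re, $handler) = @$rule;
        if (my @m = $line =~ $re) {
            $handler->($state, @m);
            return 1;    # line was handled
        }
    }
    return 0;            # no rule matched
}

# Rules are tried top to bottom, so the catch-all header rule goes last.
my @rules = (
    [ qr/^START:\s*(.+)\n/, sub { $_[0]{current}{starttime} = $_[1] } ],
    [ qr/^END:\s*(.+)\n/,   sub { $_[0]{current}{endtime}   = $_[1] } ],
    [ qr/^(?:\* | )([^:]+):(\d+):(.+)\n/,
      sub { my $s = shift; push @{ $s->{current}{items} }, [@_] } ],
    [ qr/^([^*: ][^:]+):(.+)\n/,
      sub {
          my ($s, $name, $desc) = @_;
          # Start a new record and remember it as the current one.
          $s->{current} = { name => $name, desc => $desc };
          push @{ $s->{records} }, $s->{current};
      } ],
);

# Sample input taken from the data above (item line indented by one space).
my @lines = (
    "The first record:it has only 1 item\n",
    "START: Tue Feb 1 00:09:30 2005\n",
    "END: Tue Feb 1 00:19:32 2005\n",
    " Item1:10:comment1\n",
);

my $state = { records => [] };
dispatch_line(\@rules, $state, $_) for @lines;

printf "%s: %d item(s)\n", $_->{name}, scalar @{ $_->{items} || [] }
    for @{ $state->{records} };
```

With this layout only dispatch_line would live in a module; the rule list stays in parse.pl as plain Perl, so nothing has to be pushed into a config file.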

--Solo

--
You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

In reply to Munging with a regex dispatch table by Solo
