snafu has asked for the wisdom of the Perl Monks concerning the following question:

This is long, so I will use a readmore but to start off let me 'splain what I am doing. NOTE: I am looking for suggestions and constructive criticism as this is quite helpful for me in learning how to code more properly (considering I am a self-taught coder).

Ok, so, moving on. I have a configuration file that I am parsing. I had an old parser but decided that the config file was just not human readable, or friendly for that matter. So I decided to go toward a new "look", which is posted below. The thing I would like suggestions on is my technique for parsing the config, and whether I am missing out on a module that already does this kind of parsing (why re-invent the wheel?).

I have the following configuration file (nowhere near the complete production config file, but it serves my purposes for this writeup):

define {
    destination = "/u90/gvc_archive/new";
    runonce = "port100";
}

#####*******#####
## default macros
#####*******#####
macro arbor_ama {
    regex = "/F.*?-P.*?\.(\d+)\.ama/";
    dfield = "$1";
}
macro dex {
    regex = "/^P.*?_DSC_.*?\.(\d+)\.ama$/";
    dfield = "$1";
}
macro rpt {
    regex = "";
    dfield = "";
}
macro rptnull {
    regex = "";
    dfield = "";
}
macro rtcd_everything {
    regex = "/.*?/";
    dfield = "2__";
}
macro arbor1_1 {
    regex = "/^F.*?_D(\d+)_.*?PRI_1_1\.ama$/";
    dfield = "$1";
}
macro arbor {
    regex = "/^F.*?\.(\d+)\.ama$/";
    dfield = "$1";
}
macro usl1 {
    regex = "/^USL_.*?_(\d{6})_.*?$/";
    dfield = "$1";
}
macro usl2 {
    regex = "/^USL_.*?_(\d{6})$/";
    dfield = "$1";
}
macro uslnull {
    regex = "/^USL.*?$/";
    dfield = "NULL";
}

#####*******#####
## Individual Rulesets here
#####*******#####

## port 11
# P11_02-04-02_01:02:00_020001.030001.41062.01.2
#10!11,41,61,77,85!rtcd_everything <-- old way
rule 10 {
    port = "port11,port41, port61,port77,port85"; # space in here on purpose
    regex = ; # left blank on purpose
    dfield = "NULL";
    macro = "rtcd_everything";
}
rule 60 {
    port = "87"; # didn't use "port##" on purpose
    regex = "/F.*?\-P.*?_FCC_(\d+)_.*?\.cdr/"
    dfield = "$1";
    macro = "usl1";
    macro = "usl2";
}

## port 100 stuff
# P040_PRI_487460_487559.0204.ama
# 100!100!/P\d+_(PRI|SEC|TPP)_.*?\.(\d{4})\.ama/!$2 <-- old
rule 100 {
    port = "100";
    regex = /P\d+_(PRI|SEC|TPP)_.*?\.(\d{4})\.ama/;
    dfield = "$2";
}

Now, the following is the code I am using so far to parse through this file. It is nowhere *near* complete, but I would like comments on the direction I am heading so far. Basically, I don't want to get too far into it unless I am going in the right direction.

#!/usr/local/bin/perl -w
# test parser prior to plugging into larger script
# to replace the old function.
use strict;
use Env;    # imports environment variables such as $HOME

my $file = "$HOME/archive_bin/configs/dtfr_archiver.conf";
my ($class,$type,$var,$rval,%config);

open(F,$file) or die("Can't open it: $!\n");
while ( <F> ) {
    chomp;
    my $current_line = $_;
    my $ok = "^(macro|define|rule)";
    next if ( /^#/ );       # skip comments
    next if ( /^\s*$/ );    # skip blank and whitespace-only lines

    # flip-flop: true from a block header line through the closing brace
    my $check = (/$ok\s+\w+\s+\{/ .. /^\}/);
    do {
        if ( $check == 1 ) {
            # first line of the range: the block header
            $current_line =~ /^([a-z]+)/;
            $class = $1;
            print "class: $class\n";              # do more checking here
            $current_line =~ /^.*? (\w+)\s+\{/;   # do more checking here
            $type = $1;                           # do more checking here
            print "type: $type\n\n";              # do more checking here
        }
        # body lines only; the last line of the range carries an "E0" suffix
        if ( $check !~ /E0/ and $check > 1 ) {
            if ( $current_line =~ /=/ ) {
                ($var,$rval) = split(/=/,$_);
                $var =~ tr/ //d;
                $rval =~ tr/" ;//d;
                print "var: $var and rval: $rval\n";
                #$config{$type}{$var} = $rval;
            }
        }
    } if $check;
}

output from the above code:

class: macro
type: rtcd_everything

var: regex and rval: /.*?/
var: dfield and rval: 2__
class: macro
type: arbor1_1

var: regex and rval: /^F.*?_D(\d+)_.*?PRI_1_1\.ama$/
var: dfield and rval: $1
class: macro
type: arbor

var: regex and rval: /^F.*?\.(\d+)\.ama$/
var: dfield and rval: $1
class: macro
type: usl1

var: regex and rval: /^USL_.*?_(\d{6})_.*?$/
var: dfield and rval: $1
class: macro
type: usl2

var: regex and rval: /^USL_.*?_(\d{6})$/
var: dfield and rval: $1
class: macro
type: uslnull

var: regex and rval: /^USL.*?$/
var: dfield and rval: NULL
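
For anyone following along, the parser above leans on Perl's range operator in scalar context (the "flip-flop"). It returns false outside the range, a sequence number starting at 1 inside it, and appends the string "E0" to the number on the line that closes the range. Here is a small standalone sketch of just that behaviour. The sample lines and the @seen array are made up for the demo and are not part of the real script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, standing in for the real config file.
my @lines = (
    'macro dex {',
    '    regex  = "/x/";',
    '    dfield = "$1";',
    '}',
    'stray line outside any block',
);

my @seen;
for my $line (@lines) {
    # Scalar-context ".." is a flip-flop: false outside the range,
    # 1, 2, 3, ... inside it, with "E0" appended on the closing line.
    my $seq = ( $line =~ /^(?:macro|define|rule)\s+\w+\s+\{/ .. $line =~ /^\}/ );
    next unless $seq;

    my $where = $seq == 1     ? 'open'
              : $seq =~ /E0$/ ? 'close'
              :                 'body';
    push @seen, "$seq:$where";
}
print join( ' ', @seen ), "\n";
```

Run as-is, this should print `1:open 2:body 3:body 4E0:close`, which shows why testing for the "E0" suffix is the way to spot the closing brace.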

There we go. There is still a lot to do, since I need to add a lot of sanity checking. The end result will be assigning the values to a multidim hash or a hash of lists (or something along those lines) and passing that on to some other functions to do the work. As you can tell, some of this is still conceptual; I usually work out my concepts by coding through them.
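
To make the "multidim hash or hash of lists" idea concrete, here is a rough sketch of one possible shape: $config{class}{name}{var} holding a list of values, so that a repeated key such as the two "macro" lines in rule 60 is not lost. The parse_config name, the "_" placeholder key for the unnamed define block, and the inlined sample text are all assumptions made for this sketch, and it does none of the sanity checking mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical excerpt of the config, inlined so the sketch runs on its own.
my $text = <<'END_CONF';
define {
    destination = "/u90/gvc_archive/new";
}
macro usl1 {
    regex  = "/^USL_.*?_(\d{6})_.*?$/";
    dfield = "$1";
}
rule 60 {
    port  = "87";   # trailing comments are tolerated
    macro = "usl1";
    macro = "usl2";
}
END_CONF

sub parse_config {
    my ($text) = @_;
    my ( %config, $class, $name );

    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:#.*)?$/;    # skip blank and comment-only lines

        if ( $line =~ /^(macro|define|rule)(?:\s+(\w+))?\s*\{\s*$/ ) {
            # Block header; "define" has no name, so file it under "_".
            ( $class, $name ) = ( $1, defined $2 ? $2 : '_' );
        }
        elsif ( $line =~ /^\s*\}/ ) {
            ( $class, $name ) = ( undef, undef );    # block closed
        }
        elsif ( defined $class
            and $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*;\s*(?:#.*)?$/ )
        {
            # Lines without a terminating ";" fall through silently here;
            # a real parser would want to report them.
            my ( $var, $val ) = ( $1, $2 );
            $val =~ s/^"(.*)"$/$1/;                  # drop surrounding quotes
            # A key may repeat, so keep a list of values per key.
            push @{ $config{$class}{$name}{$var} }, $val;
        }
    }
    return \%config;
}

my $config = parse_config($text);
print "$_\n" for @{ $config->{rule}{60}{macro} };
```

With this layout, the other functions can walk $config->{rule} and look up each macro name in $config->{macro} without re-reading the file.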

Any thoughts are greatly appreciated.

TIA guys

_ _ _ _ _ _ _ _ _ _
- Jim
Insert clever comment here...

Replies are listed 'Best First'.
Re: Request For a lil Direction...
by patgas (Friar) on Mar 19, 2002 at 22:03 UTC

    Well, someone's going to mention it eventually, so it might as well be me...

    It sounds like you have control over how the configuration file looks. Have you considered using XML? There are already lots of parsers for it. It'll make things easier on your coworkers and future maintainers too.

    "As information travels faster in the modern age, as our days are crawling by so slowly." -- DCFC

      OOooooooo.....good idea! :) I have only *one* problem. I haven't ever even touched XML in my life. Strange, I know, but true. *sigh*, I should break out the docs on XML then? Frankly, I am not sure anyone here in my shop has even touched XML. I will look into it though! Thanks a bunch! Any good web references for casual learning/reading on XML that you recommend?

      _ _ _ _ _ _ _ _ _ _
      - Jim
      Insert clever comment here...

        I had never actually worked with XML until I started writing quickrep, and the docs for XML::Simple explained everything I needed to know. I was able to start working with it pretty quickly. Give it a try, and good luck!
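
        As a rough idea of what that could look like for this config, here is a minimal sketch using XML::Simple (a CPAN module, not part of core Perl). The XML layout, with its config, macro_ref and name names, is invented for illustration and is not taken from quickrep or from the original config. KeyAttr and ForceArray are documented XMLin options.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;    # exports XMLin by default

# Hypothetical XML rendering of two blocks from the config; the element
# and attribute names are made up for this sketch.
my $xml = <<'END_XML';
<config>
  <macro name="usl1">
    <regex>/^USL_.*?_(\d{6})_.*?$/</regex>
    <dfield>$1</dfield>
  </macro>
  <rule name="60">
    <port>87</port>
    <macro_ref>usl1</macro_ref>
    <macro_ref>usl2</macro_ref>
  </rule>
</config>
END_XML

my $config = XMLin(
    $xml,
    # Fold <macro> and <rule> lists into hashes keyed on their name attribute.
    KeyAttr    => { macro => 'name', rule => 'name' },
    # Always treat these as lists, even when only one element is present.
    ForceArray => [ 'macro', 'rule', 'macro_ref' ],
);

print $config->{macro}{usl1}{regex}, "\n";
print join( ',', @{ $config->{rule}{60}{macro_ref} } ), "\n";
```

        The structure that comes back is a plain nested hash, so the rest of the script would not need to know the data came from XML.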

        "As information travels faster in the modern age, as our days are crawling by so slowly." -- DCFC

(jeffa) Re: Request For a lil Direction...
by jeffa (Bishop) on Mar 20, 2002 at 03:21 UTC