Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that I'm attempting to generalize to the most general generalization :) In order to make my script as extensible and configurable as possible I've turned to an external config file to set up some data structures, general variables, etc. I originally skimmed CPAN and found a couple of modules that looked applicable, but after taking a look at them decided to just parse the file myself. All fine and good, but now I'm getting to the point where it's becoming unmanageable, and it's exhausting my parsing skills ;).

The question: What is the best practice out there for parsing a complex configuration file? I'm talking some nastily complex nested configuration directives with options dependent on other options, repeating blocks of "elements", etc.

A couple of points:

Please spare some brain power here and I'll lend you mine when you need it :)
thx,
-r0

Replies are listed 'Best First'.
Re: Configuration file parsing?
by Q*bert (Sexton) on Jun 22, 2000 at 10:18 UTC
    You can use Data::Dumper to write out arbitrary data structures to a file. Then you just require the file when you want to read the data back. This is my preferred way of constructing config files.
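
    A minimal sketch of that round trip, assuming a made-up %config and a temporary file (both invented for illustration, not taken from the original poster's script). It uses do FILE rather than require so the file is re-read on every call; require would load it only once per process.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical settings, invented for this example.
my %config = (
    hostname => 'www.perlmonks.org',
    mirrors  => [ 'mirror1', 'mirror2' ],
    limits   => { max_users => 50, timeout => 30 },
);

my $file = File::Spec->catfile(tempdir(CLEANUP => 1), 'config.pl');

# Write the structure out as Perl source; by default Dumper emits "$VAR1 = {...};".
{
    local $Data::Dumper::Sortkeys = 1;    # stable key order, easier to diff
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} Dumper(\%config);
    close $out or die "Cannot close $file: $!";
}

# Read it back: the value of the file's last statement is the hash reference.
my $loaded = do $file;
die "Cannot load $file: " . ($@ || $!) unless ref($loaded) eq 'HASH';

print "$loaded->{hostname}\n";
print "$loaded->{limits}{max_users}\n";
```

    Bear in mind that a config file loaded this way is executed as Perl code, so it should only come from a source you trust.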

    chromatic wrote a good node about data persistence in general called Object Serialization Basics. It includes the Data::Dumper method as well as two others.

Re: Configuration file parsing?
by eduardo (Curate) on Jun 22, 2000 at 19:25 UTC
    heh, it seems like every damned post I put on here is touting the greatness of Parse::RecDescent, but yes, I agree with the previous poster 100% that you should look into this, and have Parse::RecDescent generate a parser module for you that you can then distribute without requiring your users to have Parse::RecDescent installed.

    However, I think it is a little bit more involved than that. The most important thing that I learned in both my compiler class and my programming language theory class in college was that, believe it or not, we would ALL be using compiler theory and programming language theory for the rest of our professional lives. The professor made it very clear to us that things such as UIs and configuration files are really just a subset of the concept of a programming language, and if we didn't take the rules of proper programming language design into account, our users and those who maintained our code would curse our names 'till the very end.

    If you are at a point where you are willing to scrap your current configuration system, then I suggest you do so, and take a few days to think about a few things. This configuration language is going to have to be expressive enough for you to accomplish what you need accomplished, and extensible enough for you to be able to grow it out however it needs to be grown out in the future. Things to take into account are:

    • orthogonality
    • consistency
    • conciseness
    • determinism
    If your "meta language" fits all of those, then you will be able to use it for a long time successfully. My second bit of advice is to spend some time upfront defining the BNF for it, so that if the environment changes, you will always have the formal language definition to fall back on to create a parser. If you take all of these things into account, and you use the proper modules (I'm telling you, Parse::RecDescent is a gift from whatever god or gods may or may not exist) you should have no problems expanding this into the future.
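
    To make the "write the BNF first" advice concrete, here is a rough sketch. The grammar, the sample settings, and the sub names parse_config and parse_items are all invented for illustration; this is a hand-written recursive-descent parser in plain Perl, not what Parse::RecDescent would generate, but it shows how each grammar rule maps onto a piece of parsing code.

```perl
use strict;
use warnings;

# Hypothetical grammar, written down before any code:
#   config  ::= item*
#   item    ::= setting | block
#   setting ::= NAME "=" VALUE ";"
#   block   ::= NAME "{" item* "}"
#   NAME    ::= /[A-Za-z_][A-Za-z0-9_]*/
#   VALUE   ::= /[^;{}]+/

# config ::= item*  (and nothing but whitespace may follow)
sub parse_config {
    my ($text) = @_;
    pos($text) = 0;
    my $conf = parse_items(\$text);
    $text =~ /\G\s*/gc;
    die "Unexpected input at offset " . pos($text) . "\n"
        if pos($text) < length $text;
    return $conf;
}

# item* : every match continues from pos() on the shared string (\G with /gc)
sub parse_items {
    my ($ref) = @_;
    my %items;
    while ($$ref =~ /\G\s*([A-Za-z_][A-Za-z0-9_]*)\s*/gc) {
        my $name = $1;
        if ($$ref =~ /\G=\s*([^;{}]+?)\s*;/gc) {      # setting
            $items{$name} = $1;
        }
        elsif ($$ref =~ /\G\{/gc) {                    # block
            # repeated blocks with the same name accumulate in a list
            push @{ $items{$name} }, parse_items($ref);
            $$ref =~ /\G\s*\}/gc
                or die "Missing '}' for block '$name'\n";
        }
        else {
            die "Expected '=' or '{' after '$name'\n";
        }
    }
    return \%items;
}

my $conf = parse_config(<<'CONF');
name = demo;
server { host = a.example; port = 80; }
server { host = b.example; port = 81; }
CONF

print "name: $conf->{name}\n";
print "second server: $conf->{server}[1]{host}\n";
```

    Because the grammar is written down separately, the same rules could later be handed to a parser generator without changing the file format.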
Re: Configuration file parsing?
by btrott (Parson) on Jun 22, 2000 at 10:25 UTC
    Why don't you take one slight step further and just write your configuration file in XML? Then take a look at XML::Simple, which was originally built, I believe, for parsing XML config files.

    It uses XML::Parser internally, so you'll need to install that as well.

        use XML::Simple;
        my $conf = XMLin("/foo/bar.xml");

    To see the format of the data you get back, use Data::Dumper:

        use Data::Dumper;
        print Dumper $conf;
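
    As a sketch of how repeating blocks come back from XMLin, assuming a made-up config (the element and attribute names below are invented, and the XML is passed as a string instead of a file name so the example is self-contained):

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical config; XMLin accepts a string of XML as well as a file name.
my $xml = <<'XML';
<config>
  <logdir>/var/log/demo</logdir>
  <server name="www" port="80"/>
  <server name="mail" port="25"/>
</config>
XML

# ForceArray keeps <server> a list even when only one is present;
# KeyAttr => [] turns off the default folding of lists into a hash keyed on "name".
my $conf = XMLin($xml, ForceArray => ['server'], KeyAttr => []);

print "$conf->{logdir}\n";
print "$_->{name}:$_->{port}\n" for @{ $conf->{server} };
```

    Without those two options the shape of the result can change depending on how many <server> elements the file happens to contain, which is awkward for config-reading code.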
Re: Configuration file parsing?
by swiftone (Curate) on Jun 22, 2000 at 18:57 UTC
    lhoward pointed me to Parse::RecDescent when I was reading in a config file. It allows you to define just about any parsable text. It also lets you generate a module to parse the specified format, so if you distribute your program, the end user doesn't need to have Parse::RecDescent installed.
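
    A small sketch of what a Parse::RecDescent grammar for a config file can look like. The "key = value" format and the rule names (config, setting, key, value) are assumptions made up for this example, not the format from the original question.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar for simple "key = value" lines.
# Whitespace (including newlines) before each token is skipped by default.
my $grammar = q{
    config  : setting(s) /^\Z/   { $return = { map { @$_ } @{ $item[1] } } }
    setting : key '=' value      { $return = [ $item{key}, $item{value} ] }
    key     : /[A-Za-z_]\w*/
    value   : /[^\n]+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# The start rule is called as a method; it returns undef if the text does not parse.
my $conf = $parser->config("host = example.org\nport = 8080\n");
die "Config did not parse\n" unless defined $conf;

print "$conf->{host}\n";
print "$conf->{port}\n";
```

    The module's documentation describes a Precompile class method for writing a grammar out as a standalone parser module; check the documentation of the version you have installed for the details of what the generated module needs at run time.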
Re: Configuration file parsing?
by cleen (Pilgrim) on Jun 22, 2000 at 13:07 UTC
    Personally, I think a better question is not "what's the best way to parse config files" but "How can I make my config file better for parsing?". Yes, it might be the longer route (in terms of thought process) but it can be truly helpful in the future.

    The way I always go about thinking about making a config file is saying to myself "how can I get this all in a hash?" hehe yeah it might not be good to think that way, but it's how _I_ think.

    When I want to make an easily understood, easily extendable, strong config file, I make use of labels, static information, and alternatives. OK, so that doesn't make much sense... Let's put it this way:

        Label:<variable-label>
        Static-Info-1:<variable-label>: <some-info-1>
        Static-Info-2:<variable-label>: <some-info-2>
        Alternative:<variable-label>: Static-alt-Info-1 <static-alt-info-1>
        Alternative:<variable-label>: Static-alt-Info-2 <static-alt-info-2>

    So basically the label becomes my holder for each static and alternative setting in my config file, so I can refer back to it at a future point in time in my program. In essence even the Label: part is a static piece of information. To show how this technique can actually minimize parsing to almost nothing, I will give a small example:

        # Sample Config File
        Label:perl-monks:perl-monks
        Hostname:perl-monks:www.perlmonks.org
        Description:perl-monks:The Perl Monks Webpage
        Alt:perl-monks:MonkVar-1 SomeSubVar-1 SomeVar-1-value
        Alt:perl-monks:MonkVar-2 SomeSubVar-2 SomeVar-2-value

        Label:ackers:ackers
        Hostname:ackers:www.ackers.net
        Description:ackers:Just my Page
        Alt:ackers:AckVar-1 SomeSubVar-1 SomeVar-1-value
        Alt:ackers:AckVar-2 SomeSubVar-2 SomeVar-2-value

    and the perl code to read that

        #!/usr/bin/perl

        # Here I will define all the static parts of the config file in an array
        # this is a necessity, not only for easily adding features to your config
        # but to keep code changes down to a minimum :)
        my @carray = ("Label", "Hostname", "Description");

        # pretend config.conf holds the information above hehe.
        open (CONF, "config.conf");
        while (<CONF>) {
            s/\s+$//g;          # remove trailing whitespace
            s/\s/ /g;           # replace whitespace by space
            next if /^\s*\#/;   # ignore comment lines
            s/\s*\#$//g;        # ignore trailing comments
            next if /^\s*$/;    # ignore empty lines
            ($one,$two,$three) = split(/:/);
            if ($one eq "Alt") {
                $alt = $one;
                $label = $two;
                $var = $three;
                foreach($three) {
                    $$alt{$var} = "$label";
                }
            } else {
                foreach ($one) {
                    $$one{$two} = "$three";
                }
            }
        }

        foreach $thing (keys %Label) {
            foreach $heh (@carray) {
                while ( ($k,$v) = each %$heh ) {
                    print "Static Setting: $heh: $v\n" if ($k eq $thing);
                }
            }
            while ( ($k,$v) = each %Alt ) {
                print "Alternate Setting: $k\n" if ($v eq $thing);
            }
            print "\n";
        }

    the Alt:whatever:vars thing is there because you can take the third part of that and make it into another hash..

    but I probably sound pretty stupid all the way through this... I guess I'm just trying to give examples of easy extensibility in configuration.

  Re: Configuration file parsing?
  by davorg (Chancellor) on Jun 22, 2000 at 11:41 UTC
      You're making heavy use of symbolic references in this code which means that it won't work under use strict. Symbolic references are usually best avoided for reasons discussed in perldoc perlref. A better solution might be to store all of the configuration information in a global hash.

      Also there are a couple of places where you loop over a scalar value, e.g.

      foreach ($one) { $$one{$two} = "$three"; }

      which seems a bit of a strange thing to do!

      --
      <http://www.dave.org.uk>

      European Perl Conference - Sept 22/24 2000
      <http://www.yapc.org/Europe/>
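
      One possible shape for the "single hash" suggestion, as a rough sketch only: it reads the Key:label:value format from cleen's post into one nested hash, so it runs under use strict with no symbolic references. The sub name read_config and the layout of the resulting hash are assumptions made for this example.

```perl
use strict;
use warnings;

# Read "Key:label:value" lines into one nested hash instead of
# creating %Label, %Hostname, ... through symbolic references.
sub read_config {
    my ($fh) = @_;
    my %config;
    while (my $line = <$fh>) {
        $line =~ s/\s+$//;                 # strip trailing whitespace and newline
        next if $line =~ /^\s*(?:#|$)/;    # skip comment lines and empty lines
        # limit of 3 so that a value may itself contain colons
        my ($key, $label, $value) = split /:/, $line, 3;
        next unless defined $value;        # ignore malformed lines
        if ($key eq 'Alt') {
            # "Alt" lines hold "name subname value"; collect them per label
            push @{ $config{$label}{Alt} }, [ split ' ', $value, 3 ];
        }
        else {
            $config{$label}{$key} = $value;
        }
    }
    return \%config;
}

# In-memory sample in the format from the post above
my $sample = <<'CONF';
# Sample Config File
Label:perl-monks:perl-monks
Hostname:perl-monks:www.perlmonks.org
Description:perl-monks:The Perl Monks Webpage
Alt:perl-monks:MonkVar-1 SomeSubVar-1 SomeVar-1-value
CONF

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $config = read_config($fh);

print "$config->{'perl-monks'}{Hostname}\n";
print "$config->{'perl-monks'}{Alt}[0][2]\n";
```

      Adding a new static setting then needs no code change at all, since any key other than Alt is stored under its label as-is.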
RE: Configuration file parsing?
by radixzer0 (Beadle) on Jun 22, 2000 at 09:23 UTC
    DOH! Forgot which workstation I was on and didn't log in! This question was actually posted by me, so flame me and not the poor anony-s ;b
    -r0