three18ti has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

We have a system that dumps out, for lack of a better term, config files. They don't adhere to any external standard, but are standardized within themselves.

Essentially, I have a colon-delimited list of key-value pairs, one per line:

foo: bar
baz: 1234+54q - bar bar
who: bob
where: October
Some other parameter: 42
What are we doing today Brain: Same thing we do every day Pinky

Each "config" file produces the same "keys", while the values change.

I suppose it would be easy enough to use some kind of regex trickery, but I'm trying to avoid that: the config file format is likely to change, and I want someone who isn't me (or a Perl person, for that matter) to be able to update the tool to parse the new format. This is where the idea of a "parsing template" came up. I'd like to be able to do something like:

foo: <string>
baz: <number><string><number>
who: <name>
where: <month>
Some other parameter: <int>
What are we doing today Brain: <varchar>
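To make the idea concrete, here is a minimal sketch of how such a template might be compiled into one regex per key and then used to check a config file. The placeholder names and their patterns (`%TYPE_PATTERN`) and the helper names `compile_template` and `parse_config` are assumptions made up for illustration; they are not from any existing module. The baz line is left out because it isn't specified how its placeholders should divide that value.

```perl
use strict;
use warnings;

# Hypothetical placeholder types: these names and patterns are assumptions
# for illustration, not taken from any existing module.
my %TYPE_PATTERN = (
    string  => qr/\S+/,
    name    => qr/[A-Za-z]+/,
    int     => qr/-?\d+/,
    month   => qr/(?:January|February|March|April|May|June|July|August|September|October|November|December)/,
    varchar => qr/.+/,
);

# Compile template text ("key: <type>" per line) into { key => regex }.
sub compile_template {
    my ($template_text) = @_;
    my %rule_for;
    for my $line (split /\n/, $template_text) {
        next unless $line =~ /:/;    # skip blank lines
        my ($key, $spec) = split /:\s*/, $line, 2;
        $spec =~ s/\s+\z//;
        # Placeholders become their patterns; any other text must match literally.
        my $pattern = '';
        for my $piece (grep { length } split /(<\w+>)/, $spec) {
            if ($piece =~ /\A<(\w+)>\z/) {
                my $type = $TYPE_PATTERN{$1}
                    or die "Unknown type <$1> in template\n";
                $pattern .= $type;
            }
            else {
                $pattern .= quotemeta $piece;
            }
        }
        $rule_for{$key} = qr/\A$pattern\z/;
    }
    return \%rule_for;
}

# Split each config line at the first colon and check the value
# against the rule compiled for that key.
sub parse_config {
    my ($rule_for, $config_text) = @_;
    my %value_for;
    for my $line (split /\n/, $config_text) {
        next unless $line =~ /:/;    # skip blank lines
        my ($key, $value) = split /:\s*/, $line, 2;
        $value =~ s/\s+\z//;         # also drops a stray "\r"
        my $rule = $rule_for->{$key}
            or die "Unexpected key '$key'\n";
        $value =~ $rule
            or die "Value '$value' for '$key' does not match the template\n";
        $value_for{$key} = $value;
    }
    return \%value_for;
}

my $rules = compile_template(<<'END_TEMPLATE');
foo: <string>
who: <name>
where: <month>
Some other parameter: <int>
What are we doing today Brain: <varchar>
END_TEMPLATE

my $config = parse_config($rules, <<'END_CONFIG');
foo: bar
who: bob
where: October
Some other parameter: 42
What are we doing today Brain: Same thing we do every day Pinky
END_CONFIG

print "$_ => $config->{$_}\n" for sort keys %$config;
```

With something like this, a format change would mean editing the template text (and perhaps adding a type to the table) rather than touching the parsing code, which is the property I'm after.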

I had looked at Marpa::R2 previously for parsing the sudoers file, but it was a bit cumbersome and I never could quite get it to work properly (someday... but I have more pressing projects at the moment). I don't know that a full parser is really necessary, but it does sort of fit the "parse from template" bill...

Does anyone have any better ideas? I'm certainly not married to the idea of using some kind of (EE)BNF for parsing, but it seems like it would be the most direct route to write a tool that can consume arbitrary "parsing templates".

Thanks for the advice

Replies are listed 'Best First'.
Re: Parsing an Arbitrary "config" file (based on a "template"?)
by roboticus (Chancellor) on Dec 31, 2014 at 22:22 UTC

    three18ti:

    If it's always that format, you don't need anything fancy. Just read the file, split each line into two chunks at the first colon, and package it all up into a hash. It should be something like this:

    sub read_config_file {
        my $file_name = shift;
        open my $FH, '<', $file_name
            or die "Can't read $file_name: $!\n";
        my $result = {};
        while (my $line = <$FH>) {
            next unless $line =~ /:/;    # skip blank or malformed lines
            # Split at the first colon only, so values may contain colons;
            # \s* drops the space that follows the colon.
            my ($key, $value) = split /:\s*/, $line, 2;
            $value =~ s/\r?\n$//;        # strip an LF or CRLF line ending
            $result->{$key} = $value;
        }
        return $result;
    }

    my $config_data = read_config_file('config_file_name.cfg');
    print "Daily mission: ", $config_data->{'What are we doing today Brain'}, "\n";
    print "Location: ", $config_data->{where}, "\n";

    (Note: untested...)

    Update: I actually like Marpa::R2 a good bit. I just did a project at work with it last month where I had to write a parser for a programming language. I found it to be pretty nice to use in that capacity. But it's overpowered, in my opinion, if you have simple tag:value pairs in a configuration file.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      OK, I got a little bored at work, so I thought I'd code it up in Marpa just as an example. There's a minor bug in it, but I'll leave it in there. Source code:

      When I run it, I get:

      $ perl parse_cfg_file.pl
      Daily mission: Same thing we do every day Pinky
      Location: October
      All configuration data:
      {
        "baz" => "1234+54q - bar bar",
        "foo" => "bar",
        "Some other parameter" => 42,
        "What are we doing today Brain" => "Same thing we do every day Pinky",
        "where" => "October",
        "who" => "bob",
      }

      ...roboticus


        This is great thanks!

        I do see what you mean about it being a bit overpowered, but it is clean.

        I especially like:

        $value =~ s/\r?\n$//;

        from your first example.
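        What I like about it is that it copes with both Unix and Windows line endings, where a plain chomp (with the default $/) would leave the "\r" behind. A quick sketch of the difference, using made-up sample lines and two throwaway helper subs of my own naming:

```perl
use strict;
use warnings;

# Throwaway helpers for the comparison; the names are made up here.
sub strip_with_chomp { my ($s) = @_; chomp $s;          return $s }
sub strip_with_regex { my ($s) = @_; $s =~ s/\r?\n$//;  return $s }

# Made-up sample lines: one with an LF ending, one with a CRLF ending.
my @lines = ("where: October\n", "where: October\r\n");

for my $line (@lines) {
    printf "chomp leaves %d chars, regex leaves %d chars\n",
        length(strip_with_chomp($line)),
        length(strip_with_regex($line));
}
```

        For the CRLF line the chomp version comes out one character longer, because the carriage return is still attached to the value.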

        I wonder which will be more maintainable in the long run.

        Thanks for the examples!