Parsing Config File with Multi-line Elements

pileofrogs has asked for the wisdom of the Perl Monks concerning the following question:

I'm parsing a config file (dhcpd.conf to be precise), and I'm not sure of a good way to do it. Basically, a statement can span several lines, so a newline is not a reliable statement separator. Like this:

option foo bar, baz;
option foo
  bar, baz;
subnet 192.68.0.0 netmask 255.255.0.0 {
    option foo blat, boff;
}
subnet 192.68.0.0 netmask 255.255.0.0
    { option foo blat,boff; }

I need to know that the global option 'foo' equals 'bar, baz' and the option 'foo' associated with subnet 192.68.0.0 is 'blat, boff'.

It looks like I can separate my problem into key-value pairs, à la option foo bar;, and containers, à la subnet ... { ... }.

I figure I'm going to have to slurp the whole file into a scalar and then walk through it with some regex magic, but I don't know the regex kung-fu.

Any suggestions?

Thanks -Pileofrogs

Re: Parsing Config File with Multi-line Elements by saintmike (Vicar) on Apr 10, 2006 at 23:33 UTC
Config::Scoped claims to be able to parse dhcpd config files.
Re: Parsing Config File with Multi-line Elements by ikegami (Patriarch) on Apr 11, 2006 at 04:12 UTC
I actually worked with dhcpd.conf back in '98 or so. We were converting extensive BOOTP tables to a dhcpd.conf. I don't remember the spec at all, though. OK, enough reminiscing. :)

You could use Parse::RecDescent. A start would be:

my $grammar = <<'__END_OF_GRAMMAR__';

   {
      use strict;
      use warnings;
   }

   parse       : item(s?) /\Z/
                 { $item[1] }

   item        : subnet
               | option

   subnet      : SUBNET IP_ADDR NETMASK IP_ADDR subnet_blk
                 { [ @item[0,2,4,5] ] }

   subnet_blk  : option(s?)

   option      : OPTION option_list
                 { [ $item[0], @{$item[2]} ] }

   option_list : ...

   # Reserved Words
   # ==============

   #SUBNET  : IDENT { $item[1] eq 'subnet'  ? $item[1] : undef }
   #NETMASK : IDENT { $item[1] eq 'netmask' ? $item[1] : undef }
   #OPTION  : IDENT { $item[1] eq 'option'  ? $item[1] : undef }

   # The negative lookahead keeps a keyword from matching
   # the start of a longer identifier.
   SUBNET   : /subnet(?![a-zA-Z0-9_])/
   NETMASK  : /netmask(?![a-zA-Z0-9_])/
   OPTION   : /option(?![a-zA-Z0-9_])/

   # Tokens
   # ======

   IDENT    : /[a-zA-Z][a-zA-Z0-9_]*/
   IP_ADDR  : /\d+\.\d+\.\d+\.\d+/

__END_OF_GRAMMAR__

Update: Optimized reserved words.
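
For comparison, the slurp-and-regex approach from the original question can be sketched in core Perl with no extra modules. This is only a rough sketch under some loud assumptions: it handles just "option NAME VALUE;" statements and un-nested "subnet ADDR netmask MASK { ... }" blocks, it treats # as a comment marker, and it assumes no '#' or ';' appears inside a quoted value. The function name parse_dhcpd_subset and the shape of the returned hash are made up for illustration; they are not part of any module. The \G anchor with the /gc flags lets each regex pick up exactly where the previous one stopped, so newlines are just whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): parses a small subset of
# dhcpd.conf syntax out of a string holding the whole file.
sub parse_dhcpd_subset {
    my ($text) = @_;

    # Assumption: '#' starts a comment and never occurs inside a value.
    $text =~ s/#.*//g;

    my %config = ( options => {}, subnets => {} );
    my $scope  = \%config;    # hash that currently receives "option" statements

    while (1) {
        $text =~ /\G\s+/gc;                # skip whitespace, newlines included
        last if $text =~ /\G\z/gc;         # end of input

        if ( $text =~ /\Goption\s+(\S+)\s+([^;]*?)\s*;/gc ) {
            my ( $name, $value ) = ( $1, $2 );
            $value =~ s/\s*,\s*/, /g;      # normalise "a,b" and "a ,  b" to "a, b"
            $value =~ s/\s+/ /g;           # collapse embedded newlines and runs of spaces
            $scope->{options}{$name} = $value;
        }
        elsif ( $text =~ /\Gsubnet\s+(\S+)\s+netmask\s+(\S+)\s*\{/gc ) {
            die "Nested subnet blocks are not handled by this sketch\n"
                if $scope != \%config;
            $scope = $config{subnets}{$1} = { netmask => $2, options => {} };
        }
        elsif ( $text =~ /\G\}/gc ) {
            die "Unbalanced '}'\n" if $scope == \%config;
            $scope = \%config;             # back to the global scope
        }
        else {
            my $offset = pos($text) // 0;
            die "Unrecognized input at offset $offset\n";
        }
    }

    die "Missing '}' at end of input\n" if $scope != \%config;
    return \%config;
}

# Usage, with sample data modelled on the question.
my $sample = <<'END_CONF';
option foo
  bar, baz;
subnet 192.68.0.0 netmask 255.255.0.0
    { option foo blat,boff; }
END_CONF

my $cfg = parse_dhcpd_subset($sample);
print "global foo: ", $cfg->{options}{foo}, "\n";
print "subnet foo: ", $cfg->{subnets}{'192.68.0.0'}{options}{foo}, "\n";
```

For a real file, slurp it into a scalar first and pass that in. Anything beyond this toy subset (nested declarations, quoted strings, the rest of the dhcpd.conf statement types) is probably better served by a proper grammar such as the Parse::RecDescent start above.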