Something earlier in this thread seemed to indicate that you have some control over the syntax and formatting of the config file. If that is the case, you will certainly save yourself a *lot* of time by using a pre-existing data serialization format (e.g., YAML, XML, JSON, or WDDX) instead of inventing your own.

The benefits of using a pre-established syntax (tested parsers, existing documentation, tools in many languages) are too numerous to list in full here, and the only *disadvantage* is that you don't get the 'personal growth' experience of going through the tedium of the inventing/parsing/debugging cycle yourself. Learning how to write your own parsing code can be educational, but do you really want to go through all that if all you are doing is reading config files?
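To give a feel for how little code a ready-made format needs, here is a minimal sketch that reads one record from JSON with JSON::PP, which ships with Perl 5.14 and later. The JSON layout and the choice of keys are my own assumption, modeled on the sample data; nothing says your config has to look like this.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

### Hypothetical JSON rendering of one record from the sample data.
my $sJson = <<'END_JSON';
[
  {
    "libname": "foo",
    "pathname": "/path/to/metadata/foo",
    "owner": "someuser",
    "datapath": ["/data/path1", "/data/path2", "/data/path3"]
  }
]
END_JSON

### One call replaces all hand-written parsing.
my $oDomains = JSON::PP->new->decode($sJson);

for my $oDomain (@$oDomains) {
    printf "%s owned by %s, %d data paths\n",
        $oDomain->{libname}, $oDomain->{owner},
        scalar @{ $oDomain->{datapath} };
}
```

The same idea applies to YAML or XML; only the module and the decode call change.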

Even if you cannot choose a pre-established syntax, you are probably still better off simply *converting* the "custom" syntax into a pre-existing one. For example, here is some code that converts your sample data into YAML:

### begin_: init perl
use strict;
use warnings;

### p__: standard perl libraries
use YAML;
use Data::Dumper;

### begin_: get sample data
my $sRaw = join '', <DATA>;

### begin_: convert to YAML
for ($sRaw) {
    ### p__: scrub the top part
    s/libname=/\n- domain: begin\n  libname: /gms;
    s/pathname=([^\s]+)/\n  pathname: "$1"/gms;
    s/owner=([^\s]+)/\n  owner: "$1"/gms;
    s/libaclinherit=([^\s]+)/\n  libaclinherit: "$1"/gms;
    s/dynlock=([^\s]+)/\n  dynlock: "$1"/gms;
    s/roptions=\x22//gms;

    ### p__: scrub the roption stuff
    ### (the list needs its own parentheses; a bare qw() here is a syntax error on perl 5.18+)
    for my $sOpt (qw(datapath indexpath workpath metapath)) {
        s/\s+$sOpt=\x28([^\x29]+)\x29/\n  $sOpt: [$1]/gms;
    }

    ### p__: scrub the oddball stuff
    ### (continuation lines indented four or more spaces become comma-separated list items)
    s/\n^\x20{4,}/,/gms;
    s/,\x2e{3}//gms;
    s/\x22;//gms;
    s/\x5d[\x2c\x20]+/\x5d/gms;
    $_ .= "\n";
}

### begin_: display result
### p__: show raw converted to yaml
print $sRaw;
print "\n---\n";

### p__: show yaml converted to perl
my $oData = YAML::Load($sRaw);
print Data::Dumper->Dump([$oData], [qw(oDomains)]);

### begin_: end_perl
1;
__END__
libname=foo pathname=/path/to/metadata/foo owner=someuser libaclinherit=no dynlock=no roptions="
    datapath=('/data/path1'
              '/data/path2'
              '/data/path3'
              ...)
    indexpath=('/indx/path1'
               '/indx/path2'
               '/indx/path3'
               ...)
    workpath=('/work/path1'
              '/work/path2'
              '/work/path3'
              ...)
    metapath=('/meta/path1'
              '/meta/path2'
              '/meta/path3'
              ...)";
libname=foo pathname=/path/to/metadata/foo owner=someuser libaclinherit=no dynlock=no roptions="
    datapath=('/data/path1'
              '/data/path2'
              '/data/path3'
              ...)
    indexpath=('/indx/path1'
               '/indx/path2'
               '/indx/path3'
               ...)
    workpath=('/work/path1'
              '/work/path2'
              '/work/path3'
              ...)
    metapath=('/meta/path1'
              '/meta/path2'
              '/meta/path3'
              ...)";

The Raw-To-YAML conversion gives you something like this:
- domain: begin
  libname: foo
  pathname: "/path/to/metadata/foo"
  owner: "someuser"
  libaclinherit: "no"
  dynlock: "no"
  datapath: ['/data/path1','/data/path2','/data/path3']
  indexpath: ['/indx/path1','/indx/path2','/indx/path3']
  workpath: ['/work/path1','/work/path2','/work/path3']
  metapath: ['/meta/path1','/meta/path2','/meta/path3']

- domain: begin
  libname: foo
  pathname: "/path/to/metadata/foo"
  owner: "someuser"
  libaclinherit: "no"
  dynlock: "no"
  datapath: ['/data/path1','/data/path2','/data/path3']
  indexpath: ['/indx/path1','/indx/path2','/indx/path3']
  workpath: ['/work/path1','/work/path2','/work/path3']
  metapath: ['/meta/path1','/meta/path2','/meta/path3']

The YAML-To-Perl conversion gives you something like this (the YAML module does all of it for you, no parsing necessary):
$oDomains = [
  {
    'owner' => 'someuser',
    'indexpath' => [
      '/indx/path1',
      '/indx/path2',
      '/indx/path3'
    ],
    'libaclinherit' => 'no',
    'libname' => 'foo',
    'workpath' => [...]
    ...
];
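
Once the data is in a Perl structure, using it is ordinary hash and array access. The sketch below hard-codes a structure shaped like the dump above as a stand-in for what YAML::Load returns (only a few of the fields and paths from the sample are included), then lists the paths for each option. The variable @aReport is just a name I picked for illustration.

```perl
use strict;
use warnings;

### Stand-in for the result of YAML::Load($sRaw); values copied from the sample.
my $oDomains = [
    {
        libname   => 'foo',
        owner     => 'someuser',
        datapath  => [ '/data/path1', '/data/path2', '/data/path3' ],
        indexpath => [ '/indx/path1', '/indx/path2', '/indx/path3' ],
    },
];

my @aReport;
for my $oDomain (@$oDomains) {
    for my $sOpt (qw(datapath indexpath)) {
        ### Each option holds an array reference of paths.
        my $sPaths = join ', ', @{ $oDomain->{$sOpt} };
        push @aReport, "$oDomain->{libname} $sOpt: $sPaths";
    }
}
print "$_\n" for @aReport;
```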

Even if you cannot store the config files as YAML, you can still use simple regex code to convert them. Sure, you will still have to do a little tweaking and debugging to make sure the YAML output is well-formed, but the leverage you get from an existing parser makes the task much simpler, *especially* if your Perl skills are a tad rusty.

=oQDlNWYsBHI5JXZ2VGIulGIlJXYgQkUPxEIlhGdgY2bgMXZ5VGIlhGV

In reply to Re: Parsing a complex config file by dimar
in thread Parsing a complex config file by solitaryrpr
