Gorio3721 has asked for the wisdom of the Perl Monks concerning the following question:

This might be trivial and possibly a dumb question but I'm seeking the wisdom of my fellow perl monks. Consider the following script:
#!/usr/bin/perl -w
# use strict;

while (<DATA>) {
    print if my $que_def = /^QueueManager:\s*$/ ... /^QueueManager:\s*$/;
    # print if my $que_def = /^QueueManager:\s*$/ .. /^QueueManager:\s*$/;
}

exit 0;

__END__
ClientExitPath:
   ExitsDefaultPath=/var/mqm/exits
LogDefaults:
   LogPrimaryFiles=3
   LogSecondaryFiles=2
   LogFilePages=1024
   LogType=CIRCULAR
   LogBufferPages=0
   LogDefaultPath=/var/mqm/log
QueueManager:
   Name=venus.queue.manager
   Prefix=/MQHA/venus.queue.manager/data
   Directory=venus!queue!manager
QueueManager:
   Name=craig.queue.manager
   Prefix=/MQHA/craig.queue.manager/data
   Directory=craig!queue!manager
QueueManager:
   Name=ha.qmgr1
   Prefix=/MQHA/ha.qmgr1/data
   Directory=ha!qmgr1
QueueManager:
   Name=test.manager
   Prefix=/MQHA/test.manager/data
   Directory=test!manager

What I'm trying to do is parse the <key> = <value> fields following each "QueueManager:" line. I am trying to understand and use the '..' and/or the '...' range operators to extract only the lines between the "QueueManager:" tags.

I inserted an "EndQueueManager:" line before each "QueueManager:" line in the configuration file and my script worked fine, but I don't have control over the format of the config file. (It's created by IBM's WebSphere/MQ Series software.) I have managed to accomplish my objective with a more brute-force method, but now I'm trying to educate myself and learn a more elegant solution.

I need to start processing a given queue manager stanza on the first "QueueManager:" line and end it on the line immediately preceding the next "QueueManager:" line. I also want to handle an arbitrary number of <key> = <value> pairs. If I use the '..' operator I get only the "QueueManager:" lines and not the data between them. If I use the '...' operator then I get every odd stanza and skip over the even stanzas. HELP!

Replies are listed 'Best First'.
Re: parsing a config file
by artist (Parson) on May 21, 2003 at 17:33 UTC
    You might want to look at Config Modules instead of re-inventing the wheel. There are several config file parsers; you can choose your style and work accordingly.

    artist

      I checked out the Config Modules and there is one specifically designed for MQ Series software. It looks like that will do exactly what I want. Thanks for the suggestion. However, I really DO want to re-invent the wheel in this case, if only to better understand how wheels are made.
Re: parsing a config file
by arturo (Vicar) on May 21, 2003 at 18:53 UTC

    The following is untested; it's also not uber-leet, but then what do I care about that? =)

    # place to put everything
    my @queue_managers;
    # a hash to store the key/value pairs for each queue manager
    my %conf = ();

    while (<DATA>) {
        # see if we have a new "queuemanager" line and if there's any data to save
        if ( /^QueueManager/ && %conf ) {
            # save what we have already stored
            push @queue_managers, { %conf };
            # clear the hash for the next one
            %conf = ();
            next;
        }
        # now check key/value pair lines and if we've got one, put its
        # data into the hash. I'm assuming that values are bracketed by
        # spaces, so I just grab the longest set of non-spaces
        if ( my ($key, $value ) = /^\s*(\w+)\s*=\s*(\S+)/ ) {
            $conf{$key} = $value;
        }
    }
    # ahh, the last entry probably hasn't been saved
    push @queue_managers, {%conf} if %conf;

    What this does is put each configuration entry, stored as a hash reference, into an array. If you just want to print the darn thing out, you could print out the hash instead of pushing an anonymous copy (which is what the { %conf } syntax does) onto the array.

    What this code gives you is an array, each of whose members corresponds to one of the "QueueManager" entries (where that means "follows a QueueManager line and ends at a new QueueManager line or the end of the file, whichever comes first").

    Managing to do something with this data structure, or changing it to a different kind of structure, is something I'll let you figure out for yourself, with a tip of the hat to References quick reference, perlref, perllol, and perldsc. (It strikes me that what you'd really like, if you're using this as a config file, is a hash of hashes, where the key of each hash is the name of the queue manager, and whose values are the other values.)
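For what it's worth, here is a minimal sketch of the hash-of-hashes idea mentioned above. It assumes every entry has a Name key, as in the sample data; the helper name index_by_name and the hand-written @queue_managers stand-in are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the @queue_managers array built by the loop above.
my @queue_managers = (
    {
        Name      => 'venus.queue.manager',
        Prefix    => '/MQHA/venus.queue.manager/data',
        Directory => 'venus!queue!manager',
    },
    {
        Name      => 'ha.qmgr1',
        Prefix    => '/MQHA/ha.qmgr1/data',
        Directory => 'ha!qmgr1',
    },
);

# Turn a list of hash references into a hash keyed by each entry's Name.
# index_by_name is an illustrative name, not something from the thread.
sub index_by_name {
    my @entries = @_;
    my %by_name;
    for my $entry (@entries) {
        next unless defined $entry->{Name};   # skip entries without a Name key
        my %rest = %$entry;                   # shallow copy, leaves the input alone
        my $name = delete $rest{Name};        # the name becomes the outer key
        $by_name{$name} = \%rest;
    }
    return %by_name;
}

our %by_name = index_by_name(@queue_managers);
print "$_ => $by_name{$_}{Prefix}\n" for sort keys %by_name;
```

With that in place a lookup reads as $by_name{'ha.qmgr1'}{Directory}, which is probably closer to how a config file gets used than walking an array.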

    HTH and good luck!

    If not P, what? Q maybe?
    "Sidney Morgenbesser"

      This mostly works, but it also picks up all the <key> = <value> pairs before the first QueueManager: stanza. I did actually use a hash of hashes in my working solution. I just wanted to satisfy my curiosity by getting this to work. Thanks, Gorio
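One possible way around the stray pairs, sketched here under the assumption that every stanza header is a single word followed by a colon on a line of its own (true for the sample data, not verified against other MQ config files). The helper name parse_queue_managers is made up for illustration.

```perl
use strict;
use warnings;

# Collect key/value pairs only while inside a QueueManager stanza.
# parse_queue_managers is a hypothetical helper name, not from the thread.
sub parse_queue_managers {
    my @lines = @_;
    my @queue_managers;
    my $current;    # hash ref for the stanza being filled, undef outside one

    for my $line (@lines) {
        if ( $line =~ /^(\w+):\s*$/ ) {
            # Any header line ends the previous stanza; only a
            # QueueManager header starts a new record.
            if ( $1 eq 'QueueManager' ) {
                $current = {};
                push @queue_managers, $current;
            }
            else {
                $current = undef;
            }
            next;
        }
        next unless $current;    # ignore pairs that belong to other stanzas
        if ( my ( $key, $value ) = $line =~ /^\s*(\w+)\s*=\s*(\S+)/ ) {
            $current->{$key} = $value;
        }
    }
    return @queue_managers;
}

my @sample = split /\n/, <<'END_SAMPLE';
LogDefaults:
   LogType=CIRCULAR
QueueManager:
   Name=venus.queue.manager
   Directory=venus!queue!manager
QueueManager:
   Name=ha.qmgr1
   Directory=ha!qmgr1
END_SAMPLE

my @found = parse_queue_managers(@sample);
print scalar(@found), " queue manager stanzas found\n";
```

Because a non-QueueManager header switches collection off again, this should also cope with the QueueManager stanzas appearing somewhere other than the end of the file.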
Re: parsing a config file
by TomDLux (Vicar) on May 21, 2003 at 19:17 UTC

    Basic pattern matching is not the best solution to this situation, but I'll focus on correct pattern matching.

    See Programming Perl, Chapter 3, "The Range Operator" (pg 103 in the third edition).

    '..' is a boolean operator, which is false by default. Once the left operand returns true, the '..' evaluates to true until the right operand evaluates to true.

    As soon as the left operand evaluates to true, stating that the beginning of the range has been reached, the right operand is tested, to see if the end of the active region has been reached. That's why you don't get the regions between header lines with '..': as soon as you detect the beginning of the region, it also triggers the end of the region.

    With '...', the end of the region is not tested on the line where the start has just been detected, so you do get the regions between the header lines. But since the start and end patterns are the same, each QueueManager: line alternately opens and closes the range, so you only get every other region.
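A small sketch of the difference, run over a shortened sample in the same shape as the config file. The variable names and the shortened data are made up for illustration.

```perl
use strict;
use warnings;

# Shortened sample in the shape of the config file from the question.
my @sample = (
    "LogDefaults:",
    "   LogType=CIRCULAR",
    "QueueManager:",
    "   Name=venus.queue.manager",
    "QueueManager:",
    "   Name=craig.queue.manager",
    "QueueManager:",
    "   Name=ha.qmgr1",
    "QueueManager:",
    "   Name=test.manager",
);

our ( @two_dot, @three_dot );
for (@sample) {
    # Two dots: the right operand is tested on the very line that opened
    # the range, so each range is exactly one line long.
    push @two_dot, $_ if /^QueueManager:\s*$/ .. /^QueueManager:\s*$/;

    # Three dots: the right operand is first tested on the following line,
    # so the next header closes the range instead of opening a new one.
    push @three_dot, $_ if /^QueueManager:\s*$/ ... /^QueueManager:\s*$/;
}

print "two-dot:\n",   map { "  $_\n" } @two_dot;
print "three-dot:\n", map { "  $_\n" } @three_dot;
```

The two-dot list holds only the four header lines. The three-dot list holds the venus and ha.qmgr1 stanzas plus the header lines that closed them, and skips the craig and test.manager stanzas.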

    Your file has the structure:

    1. ClientExitPath:
    2. LogDefaults:
    3. QueueManager:
    4. QueueManager:
    5. QueueManager:
    6. QueueManager:

    You want to begin printing at header section 3, the first QueueManager. You want to process key/value pairs until you are no longer in a QueueManager section.

    Detecting the beginning is fine the way you have it. The end of the region you want is marked by encountering a header line which is not a QueueManager header (or, of course, end of file). So a simple pattern you want is:

    print if /^QueueManager:/ ... (/:/ && ! /^QueueManager:/);

    i.e.: Begin printing when you encounter a QueueManager, and continue until you reach a non-QueueManager header line. But this is clumsy and relatively complicated. You COULD be daring and use a closing condition like /^[^Q]\w*:/, a header line which does not begin with a Q, but what if there's a QueryManager: header line? What you really want is a header line which is not a QueueManager header line. A negative look-behind assertion is perfect for this:

    print if /^QueueManager:/ ... /^(?<=:QueueManager):/;

    The closing condition is a ':' which follows any string ... any string other than 'QueueManager'.

    As a minor style difference, I don't bother matching the '\s*$' after the colon.

    I typed "negative look-behind assertion", but then in the code I made a typo and used a positive look-behind assertion. The code should be /^(?<!:QueueManager):/ ... BUT either a negative or a positive look-behind assertion works. How did that happen? Tom
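A possible answer to "How did that happen?", offered as a sketch rather than a definitive diagnosis: with the leading ^ there is nothing in front of the match position for a look-behind to inspect. The positive form can then never match, and the negative form matches only a line that starts with a colon. Neither matches any line in the sample, so the range never closes and printing runs to the end of the file, which happens to be right because the QueueManager stanzas come last. The look-ahead pattern below is a suggested alternative, not something from the original post, and the QueryManager: header is the hypothetical one mentioned above.

```perl
use strict;
use warnings;

my @lines = (
    "ClientExitPath:",
    "LogDefaults:",
    "QueueManager:",
    "QueryManager:",                # hypothetical header from the post above
    "   Name=venus.queue.manager",
);

# The two look-behind patterns from the post, both anchored at ^.
my $positive = qr/^(?<=:QueueManager):/;
my $negative = qr/^(?<!:QueueManager):/;

# Suggested alternative (an assumption, not from the thread): a negative
# look-ahead for "a header line that is not the QueueManager header".
my $lookahead = qr/^(?!QueueManager:)\w+:\s*$/;

# Return the lines that match the given pattern.
sub matching {
    my ( $re, @candidates ) = @_;
    return grep { $_ =~ $re } @candidates;
}

our @pos_hits  = matching( $positive,  @lines );
our @neg_hits  = matching( $negative,  @lines );
our @look_hits = matching( $lookahead, @lines );

print "positive look-behind matched: ", scalar(@pos_hits),  " lines\n";
print "negative look-behind matched: ", scalar(@neg_hits),  " lines\n";
print "negative look-ahead matched:  ", join( ", ", @look_hits ), "\n";
```

Used as the right-hand side of the range, the look-ahead version would close the region on ClientExitPath:, LogDefaults: or QueryManager:, but never on QueueManager: itself. Note that the flip-flop is still true on the closing line, so that header line would be printed too.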

      Thanks Tom. This is exactly what I was looking for. I guess I have to dig out my 'Mastering Regular Expressions' book and figure out exactly what the heck a negative look-behind assertion really is and then I'll be on my way to becoming a guru!

      At the top of your post you said that basic pattern matching is not the best way to go about this (although it has educated me, which was my real intent here). What would be the best approach in your opinion?

      Thanks for the help!!!

Re: parsing a config file
by Not_a_Number (Prior) on May 21, 2003 at 20:15 UTC

    Or you could use this far from elegant solution:

    while (<DATA>) { print if /^QueueManager:$/ .. /^QueueManager$/; }
    (note no colon on the RHS)

    d

      So the RHS never matches, which works because the desired section is at the end of the file. But what if someone sets some configuration value which causes a new section in the config file, or causes the QueueManager sections to become the first instead of the last?

      I don't see anything elegant about that. Please explain?

      Tom

        Sorry if my semi-joke led to misunderstanding. If you read my message again, you will see that I actually describe my solution as "far from elegant", which means pretty much the opposite of elegant.

        In Real Life, I think that this thread could have finished after answer #1 :-)

        d

Re: parsing a config file
by naChoZ (Curate) on May 22, 2003 at 12:54 UTC
    One thing I did (it was mostly to improve my advanced data structure understanding) was make the config file a perl module.
    package conf;

    %enable = (
        #qmail
        'alias' => {
            'directory' => $qmaildir,
            'command'   => "putdiff.sh /var/qmail/alias alias"
        },
        'aliases' => {
            'directory' => $qmc,
            'command'   => "newaliases"
        },
        ::snip::
    );

    Then, in the main program, I just require "/usr/local/etc/mymodule.pm"; and then I can refer to it as:

    $conf::enable{'aliases'}->{'command'}
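A self-contained sketch of the same idea, for anyone who wants to try it without touching /usr/local/etc. It writes a tiny stand-in module to a temporary file and loads it with require. The directory value and the helper name command_for are made up for illustration, and the trailing 1; is there because require expects the file to return a true value.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small stand-in for the config module to a temporary file.
# The values are placeholders, not the real settings from the post.
my ( $fh, $path ) = tempfile( SUFFIX => '.pm', UNLINK => 1 );
print {$fh} <<'END_CONF';
package conf;

our %enable = (
    'aliases' => {
        'directory' => '/tmp/example',   # hypothetical directory
        'command'   => "newaliases",
    },
);

1;    # require needs a true return value
END_CONF
close $fh or die "cannot close $path: $!";

# Load the config file by path, as in the post above.
require $path;

# Hypothetical convenience wrapper around the package hash.
sub command_for {
    my ($name) = @_;
    return $conf::enable{$name}{'command'};
}

print "aliases command: ", command_for('aliases'), "\n";
```

The obvious trade-off is that the config file is now executable Perl, so it should only be used where whoever edits the file is trusted.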

    ~~

    naChoZ