while (<CONFIG>) {
    if (/^\s*#/) {                            # ignore comment line
    } elsif (/^\s*$/) {                       # ignore blank line
    } elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) {     # heredoc
        (my $name, local $/) = ($1, "\n$2");  # ++ysth
        $config{$name} = <CONFIG>;
        chomp $config{$name};                 # as etcshadow points out.
    } elsif (/(\w+)\s*=\s*(.*?)\s*$/) {       # regular pair
        $config{$1} = $2;
    } else {
        warn "Ptooey: Could not parse config line: $_\n";
    }
}
This does not handle the sorts of heredocs where
the type of quoting is specified (e.g.,
<<'HEREDOC'), however. That could
be a future improvement, if you need it.
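If you do need it, recognizing the quoted forms is mostly a matter of the opening regex. Here is a minimal sketch, not a tested extension of the parser above: parse_heredoc_opener is a hypothetical helper name, and the sample lines are made up. It only captures which quote character (if any) surrounds the terminator; what the quoting should mean for the body is left open.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (name, quote, tag) for a heredoc opener,
# or an empty list if the line is not one.
sub parse_heredoc_opener {
    my ($line) = @_;
    # (['"]?) captures an optional quote; \2 requires the same quote
    # (or nothing) after the terminator word.
    my @parts = $line =~ /(\w+)\s*=\s*<<(['"]?)(\w+)\2/;
    return @parts;
}

for my $line ("motd = <<EOT", "motd = <<'EOT'", 'motd = <<"EOT"') {
    my ($name, $quote, $tag) = parse_heredoc_opener($line);
    print "$name: tag=$tag quote=[$quote]\n";
}
```

A mismatched pair such as <<'EOT" does not match at all, because the backreference insists on the same character on both sides.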
If you have a regex, could you explain how it works?
The first couple are pretty basic, assuming you
know that \s matches whitespace (spaces, tabs, and
so forth), so I'll let you figure those out on
your own. The other two bear more explaining...
I'll start with the last one:
/(\w+)\s*=\s*(.*?)\s*$/
\w matches a word character (letters, digits,
underscore, ...). + means one or more, and
the parens capture those word characters to
$1. Then you have an equal sign (possibly
surrounded by zero or more whitespace characters).
After that, (.*?) slurps forward, taking as few
characters as possible (that's what the ? is for,
to make it non-greedy) for $2, so that any
whitespace at the end of the line is matched by
the trailing \s* and left out of the value.
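To make that concrete, here is a small sketch that runs the same regex over a few lines. The sample lines and the %pairs hash are made up for illustration; the brackets in the output are only there to show that trailing whitespace does not end up in $2.

```perl
use strict;
use warnings;

# Hypothetical sample lines; two of them carry trailing whitespace.
my @samples = (
    "color = blue",
    "path=/usr/local/bin   ",
    "title = My Config File\t",
);

my %pairs;
for my $line (@samples) {
    if ($line =~ /(\w+)\s*=\s*(.*?)\s*$/) {
        # $1 is the key; $2 is the value. The lazy .*? stops as soon as
        # the rest of the line is nothing but whitespace.
        $pairs{$1} = $2;
        print "[$1] => [$2]\n";
    }
}
```

Note that whitespace inside the value (as in the title line) is kept; only the run at the very end of the line is dropped.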
The one you're probably most interested in is
the one that does the here document:
} elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) { # heredoc
    (my $name, local $/) = ($1, $2);
    $config{$name} = <CONFIG>;
The first part is the same, matching the
name of the config option and the equal sign,
with any surrounding whitespace. I put the
less-than symbol in a character class because
I couldn't remember whether it's a special
character in the main part of a regular
expression. (I don't think so, but I wanted
to be safe and give you code I knew would
work.) The {2} is just a quantifier, telling
how many times we want to match the preceding
atom, so altogether [<]{2} matches two
less-than symbols in a row. Then, as before,
it matches a series of one or more word
characters. Now, the trick is that I didn't
use the regex to match the rest of the here
document: I grabbed the key from the regex
and also the string used to mark the end of
the here document, then I set the input
record separator ($/), which causes any read
on the filehandle to go forward until it hits
that point. This does have a weakness, in that
a true here document can have that string in
the document as long as it's not on a line by
itself, but for config file purposes I figured
I'd take the shortcut. The local qualifier
on the assignment to $/ ensures that when the
elsif block is exited the input record separator
returns to its normal state, so that subsequent
reads on the filehandle work as per normal.
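If you want to watch the $/ trick work in isolation, here is a self-contained sketch. The config text, the in-memory filehandle, and the lexical $fh (standing in for the bareword CONFIG) are my assumptions for illustration. It uses the "\n$2" form of the separator from the full listing, so the read stops at a newline followed by the terminator.

```perl
use strict;
use warnings;

# Hypothetical config text, read through an in-memory filehandle.
my $text = <<'END_CONFIG';
# a comment
name = demo
motd = <<EOT
Welcome.
Second line.
EOT
port = 8080
END_CONFIG

open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my %config;
while (<$fh>) {
    if (/^\s*#/ or /^\s*$/) {
        next;                                 # comment or blank line
    } elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) {
        # Inside this block only, a record ends at newline + terminator.
        (my $name, local $/) = ($1, "\n$2");
        $config{$name} = <$fh>;               # reads the whole heredoc body
        chomp $config{$name};                 # chomp strips the current $/
    } elsif (/(\w+)\s*=\s*(.*?)\s*$/) {
        $config{$1} = $2;                     # $/ is back to "\n" here
    }
}
close $fh;

print "$_ => [$config{$_}]\n" for sort keys %config;
```

The port line after the heredoc is still read as an ordinary line, which is the local doing its job: once the elsif block ends, $/ is a plain newline again. The leftover newline after the terminator shows up as a blank line and is skipped.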
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/