monks,
Quite often I find myself needing to keep chunks of text along with a module or script - not enough to warrant extra files, but enough to make using quotes and heredocs quite ugly. After reinventing my favourite simple solution (again), I found myself wanting to encapsulate the functionality in a module - but being put off by the simplicity of the implementation. I realise modules don't have to be magic, but still...
Cut to the chase - I wrote the module. I have included the start of some POD covering the general idea. It's all a bit rough'n'ready, but I would appreciate any opinions at this stage (or just tales of weird uses of __DATA__ you like to employ ;-).
I have not come across a similar solution on CPAN - but I am not really sure how such a module would be categorized (Tie::*, Text::* or Data::*?).
My meditation is this:
cheers
edit: updated POD to match code posted later
    use Tie::DATA;

    foreach(keys %DATA) {
        print "$_ = $DATA{$_}\n";
    }

    __DATA__
    __foo__
    yadda yadda yadda...
    __bar__
    ee-aye ee-aye oh
    __baz__
    woof woof

    use Tie::DATA(':xml'); # predefined format

    foreach(keys %DATA) {
        print "$_ = $DATA{$_}\n";
    }

    __DATA__
    <foo>yadda yadda yadda...</foo>
    <bar>
    ee-aye ee-aye oh
    </bar>
    <baz>
    woof woof
    </baz>

    use Tie::DATA (
        sub{ ... }, # parse key/values from DATA
        sub{ ... }  # process pairs
    );
    ...
    __DATA__
    ...
Tie::DATA provides a means to break a module's or script's __DATA__ handle into named segments, accessible via the read-only package variable %DATA. Tie::DATA is not intended for configuration variables, but for medium-sized bodies of text that should be kept with the code (without being embedded in variable declarations).
%DATA's entries are created lazily; that is, when it is first used.
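To make "created lazily" a bit more concrete, here is a minimal sketch of a tied hash that only builds its contents the first time it is touched. This is not the module's code: the package name `Lazy::Segments` and the loader callback are made up for illustration, and the real implementation may well differ.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only; not Tie::DATA's own code.
package Lazy::Segments {
    sub TIEHASH {
        my ($class, $loader) = @_;
        return bless { loader => $loader, data => undef }, $class;
    }

    # Run the loader once, on the first access of any kind.
    sub _load {
        my ($self) = @_;
        $self->{data} //= { $self->{loader}->() };
        return $self->{data};
    }

    sub FETCH    { $_[0]->_load->{ $_[1] } }
    sub EXISTS   { exists $_[0]->_load->{ $_[1] } }
    sub FIRSTKEY {
        my $d     = $_[0]->_load;
        my $reset = scalar keys %$d;    # reset the hash iterator
        each %$d;
    }
    sub NEXTKEY  { each %{ $_[0]->_load } }
}

my $loads = 0;
tie my %segments, 'Lazy::Segments',
    sub { $loads++; return (foo => "yadda\n", bar => "woof\n") };

print "loads before first use: $loads\n";
print "foo = $segments{foo}";
print "loads after first use: $loads\n";
```

The point of the design is that a script which never looks at its segments never pays for parsing them.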
There are two stages to execution, parsing and processing, both of which can be customized by arguments to use Tie::DATA. In the parsing stage, segment markers can take forms such as:

    [foo] baz bar etc

    <foo>baz bar etc</foo>

    #define foo baz bar etc

    <![foo[ baz bar etc ]]>
It is important to remember that by default, segments cannot be nested - in particular, :xml cannot have attributes.
Full customization of parsing can be gained by passing either a regex or sub reference as the first argument:
    use Tie::DATA qr(<<<<<<<<(\w*?)>>>>>>>>);

    use Tie::DATA sub{split(/\s*:SEGMENT\s+(\w+)\s*/, shift);};

    use Some::Mad::Parser;
    use Tie::DATA \&Some::Mad::Parser::parse;

The subroutine reference should return a list of key/value pairs.
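As a rough illustration of how a marker regex with one capture group can yield key/value pairs, the sketch below leans on split returning the captured text between the pieces. `parse_segments` is a hypothetical helper written for this post, not something Tie::DATA exports, and the text is a plain string here rather than a real DATA handle.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Tie::DATA: split a block of text into
# key/value pairs using a marker regex with exactly one capture group.
sub parse_segments {
    my ($text, $marker) = @_;

    # With a capture group, split returns the captured keys in between
    # the pieces of text: (leading, key1, body1, key2, body2, ...).
    # The -1 limit keeps a trailing empty body, so the list stays paired.
    my ($leading, @pairs) = split /$marker/, $text, -1;
    return @pairs;
}

my $text = "__foo__\nyadda yadda yadda...\n__bar__\nee-aye ee-aye oh\n";
my %seg  = parse_segments($text, qr/^__(\w+)__\n/m);

print "$_ => $seg{$_}" for sort keys %seg;
```

Anything before the first marker ends up in `$leading` and is thrown away, which seems the least surprising choice for a sketch.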
After parsing, if a callback has been registered as the second argument, then each Key-Value pair is passed to the callback function for further processing. This function is expected to return the actual Key-Value pair that will be used in %DATA.
For example, if you wanted to control how whitespace was treated for each segment individually, you might use something like:
    use Tie::DATA(':ini', 'proc_kv');

    foreach(keys %DATA) {
        print "$_ = $DATA{$_}\n";
    }

    # our processing function, checks for
    # and removes processing hints in our keys
    # (see __DATA__)
    sub proc_kv {
        my ($k, $v) = @_;
        if($k =~ /:/) {
            my ($tag, $hint) = split(/:/, $k);
            $k = $tag;
            if($hint eq 'nowhitespace') {
                $v = ...
            }
            else {
                $v = ...
            }
        }
        return($k,$v);
    }

    __DATA__
    [foo:nowhitespace]
    yadda yadda yadda...
    [bar]
    ee-aye ee-aye oh
    [baz]
    woof woof

There is no reason why the processing subroutine need be in the current module:

    use My::Big::Routine;
    use Tie::DATA(':ini', 'My::Big::Routine::go');

    foreach(keys %DATA) {
        print "$_ = $DATA{$_}\n";
    }
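For anyone who wants to see the processing stage spelled out, here is a small self-contained sketch of passing each pair through a callback and keeping whatever pair comes back. Both `process_pairs` and the `:trim` hint are invented for this example; they are not the module's names, and the hint is deliberately different from the `nowhitespace` one above, whose behaviour is left open.

```perl
use strict;
use warnings;

# Hypothetical helper: run each key/value pair through a callback and
# collect whatever pair the callback returns.
sub process_pairs {
    my ($callback, %raw) = @_;
    my %out;
    for my $key (keys %raw) {
        my ($k, $v) = $callback->($key, $raw{$key});
        $out{$k} = $v;
    }
    return %out;
}

# Example callback: a ":trim" hint on the key (an assumption made for
# this sketch) strips leading and trailing whitespace from the value.
sub trim_hint {
    my ($k, $v) = @_;
    if ($k =~ s/:trim\z//) {
        $v =~ s/\A\s+|\s+\z//g;
    }
    return ($k, $v);
}

my %data = process_pairs(\&trim_hint,
    'foo:trim' => "  yadda  \n",
    bar        => "woof\n",
);

print "[$_] = [$data{$_}]\n" for sort keys %data;
```

Since the callback returns the key as well as the value, it can rename a segment as well as rewrite its text.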
%DATA is read-only. Any attempt to modify it after the processing stage will cause the program to croak.
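One way to get that behaviour, sketched under the assumption that the hash is tied: have every mutating method croak. `ReadOnly::Hash` is a hypothetical package name, and the error message is mine rather than the module's.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only.
package ReadOnly::Hash {
    use Carp qw(croak);

    sub TIEHASH {
        my ($class, %data) = @_;
        my $self = { %data };
        return bless $self, $class;
    }

    sub FETCH    { $_[0]{ $_[1] } }
    sub EXISTS   { exists $_[0]{ $_[1] } }
    sub FIRSTKEY { my $reset = scalar keys %{ $_[0] }; each %{ $_[0] } }
    sub NEXTKEY  { each %{ $_[0] } }

    # Every mutating operation is refused.
    sub STORE  { croak "Attempt to modify a read-only hash" }
    sub DELETE { croak "Attempt to modify a read-only hash" }
    sub CLEAR  { croak "Attempt to modify a read-only hash" }
}

tie my %ro, 'ReadOnly::Hash', foo => 'yadda';

# The assignment croaks, so eval returns undef and $@ holds the message.
my $ok  = eval { $ro{foo} = 'changed'; 1 };
my $err = $@;

print "write refused: $err" unless $ok;
print "foo is still $ro{foo}\n";
```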
In reply to getting more from __DATA__ by Ctrl-z