gamache has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a program which I'd like to be able to reload parts of itself on the fly. I'm doing this by storing some perl code after the __DATA__ marker, which the program will slurp and eval when requested. This is not a problem. My question is this: is the following a legitimate way to allow multiple reads of <DATA>?
    my $datapos = tell DATA;
    my $code = join '', <DATA>;
    seek DATA, $datapos, 0;
tell and seek seem to do the right thing, but I just want to ensure that I am not relying on a private/internal feature. Thanks. -pete
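For reference, the idiom in the question can be exercised with a small self-contained script. This is only a minimal sketch, not gamache's actual program: the sub name `read_data_section` and the one-line payload after `__DATA__` are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remember where the DATA handle starts out: just past the __DATA__ line.
my $datapos = tell DATA;

# Hypothetical helper: slurp the data section, then rewind for the next call.
sub read_data_section {
    my $text = join '', <DATA>;
    seek(DATA, $datapos, 0) or die "Cannot seek DATA: $!";
    return $text;
}

my $first  = read_data_section();
my $second = read_data_section();

# Both reads should see the same text because the handle was rewound in between.
print $first eq $second ? "identical\n" : "different\n";
print $first;

__DATA__
sub greet { return "hello from the data section" }
```

If the handle behaves as the replies below describe, the two reads compare equal and the payload line is printed once.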

Replies are listed 'Best First'.
Re: __DATA__, seek, and tell
by chromatic (Archbishop) on Oct 16, 2007 at 17:38 UTC

    I agree with some of the other monks that this approach isn't ideal, but your use of seek and tell is indeed legitimate and valid, not just for Perl but for Unix.

Re: __DATA__, seek, and tell
by shmem (Chancellor) on Oct 16, 2007 at 20:33 UTC
    Your usage is fine. See perldata:
    Text after __DATA__ may be read via the filehandle "PACKNAME::DATA", where "PACKNAME" is the package that was current when the __DATA__ token was encountered. The filehandle is left open pointing to the contents after __DATA__. It is the program's responsibility to "close DATA" when it is done reading from it. For compatibility with older scripts written before __DATA__ was introduced, __END__ behaves like __DATA__ in the toplevel script (but not in files loaded with "require" or "do") and leaves the remaining contents of the file accessible via "main::DATA".

    See SelfLoader for more description of __DATA__, and an example of its use. Note that you cannot read from the DATA filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding __DATA__ (or __END__) token has not yet been seen.

    So, there's even a core module using that feature. I'm not sure about scoping, though, i.e. what happens when you have multiple files of the same package (e.g. via AutoLoader) and thus multiple __DATA__ tokens present. I guess __DATA__ is file scoped, but I didn't try it. I've used that before, too. See Re: $. - smarter than you might think and follow-ups.

    You might want to check RFC: Sub::Auto - lazy loading revisited (now: AutoReloader) for another way of reloading changed files automatically, although that code is still alpha.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: __DATA__, seek, and tell
by Joost (Canon) on Oct 16, 2007 at 20:16 UTC
    As far as I know it should be safe to seek() and tell() on the handle without any problems under normal conditions. *

    It's possible that other modules (like source filters) may mess with this strategy, but using it in your own code should work.

    * update: it's even possible to seek to the beginning of the file and read the source before the __DATA__ line, and as far as I know that's a documented feature (though there probably isn't really a *good* reason to use it).

    update 2: for some more caveats see the beginning of the SelfLoader documentation.
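The behavior mentioned in the first update can be checked with a short script. This is a sketch under the assumption that the script is run from a regular file on disk; the payload line after `__DATA__` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Offset just past the __DATA__ line, saved before anything is read.
my $datapos = tell DATA;

# Offset 0 is the top of the script file, not the top of the data section.
seek(DATA, 0, 0) or die "Cannot seek DATA: $!";
my $first_line = <DATA>;
print "offset 0 gives: $first_line";

# The saved offset leads back to the first line after __DATA__.
seek(DATA, $datapos, 0) or die "Cannot seek DATA: $!";
my $first_data = <DATA>;
print "saved offset gives: $first_data";

__DATA__
payload line
```

The first read should return the script's own first line and the second the payload, which is why a saved `tell` position (or a scan for the marker) is needed to get back to the data.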

Re: __DATA__, seek, and tell
by ikegami (Patriarch) on Oct 16, 2007 at 15:41 UTC
    Why not just load the data into a variable once and reuse the variable instead of reusing DATA?
      I'm using DATA because I would like to be able to edit the Perl file on disk, and have it reload the changes when I tell it to.

        Replacing parts of your program sounds really ugly and dangerous. I can easily envision problems with global my variables, for example. It seems to me that a fresh start using exec $^X, $0, @ARGV; would be a much better way of achieving that.

        If you're determined to stay on your course, why not keep the reloadable code in a separate file and just do that file instead of messing with DATA? Keep in mind that what you are doing with DATA is an undocumented side-effect of the implementation of DATA.
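The separate-file suggestion could look roughly like the following. It is a sketch, not ikegami's code: `reload_plugin` and `Plugin::revision` are hypothetical names, and a temporary file stands in for the reloadable file so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: (re)load a file of Perl code with do, reporting failures.
sub reload_plugin {
    my ($path) = @_;
    my $result = do $path;
    die "Cannot compile $path: $@" if $@;
    die "Cannot read $path: $!"    unless defined $result;
    return $result;
}

# Stand-in for the reloadable file; a real program would use a fixed path.
my ($fh, $plugin_path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} "sub Plugin::revision { return 1 }\n1;\n";
close $fh or die "Cannot close $plugin_path: $!";

reload_plugin($plugin_path);
my $before = Plugin::revision();

# Simulate editing the file on disk, then reload on request.
open my $out, '>', $plugin_path or die "Cannot rewrite $plugin_path: $!";
print {$out} "sub Plugin::revision { return 2 }\n1;\n";
close $out or die "Cannot close $plugin_path: $!";

reload_plugin($plugin_path);
my $after = Plugin::revision();

print "revision before reload: $before, after reload: $after\n";
```

Because `do` opens the file by path each time, this does not depend on the position of any already-open handle.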

Re: __DATA__, seek, and tell
by eric256 (Parson) on Oct 16, 2007 at 16:13 UTC

    I think you are fine as long as you make sure not to move the __DATA__ line in the file. If you move that line while editing the file, you are going to be in trouble. If there is a chance this could happen, you probably want to open the file, scan down to __DATA__, and then read in everything after it. I would highly recommend building any code that might update like this as a plugin; then you can reload the plugin, and the plugin can have specific hooks for being unloaded and reloaded.
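The open-and-scan approach could be sketched as follows. `read_after_marker` is a hypothetical helper, and a temporary file stands in for the script on disk; the scan is deliberately naive and would also stop at a `__DATA__` line inside a here-document or POD block.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return everything after the first line consisting of
# __DATA__ in the file at $path, or undef if no such line exists.
sub read_after_marker {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (defined(my $line = <$fh>)) {
        next unless $line =~ /^__DATA__\s*$/;
        local $/;                          # slurp the remainder in one read
        my $rest = <$fh>;
        return defined $rest ? $rest : '';
    }
    return;                                # reached EOF without a marker
}

# Stand-in for the script on disk; a real program might pass $0 instead.
my ($tmp, $script_path) = tempfile(UNLINK => 1);
print {$tmp} "print qq{main program\\n};\n", "__DATA__\n", "sub one { 1 }\n", "sub two { 2 }\n";
close $tmp or die "Cannot close $script_path: $!";

my $code = read_after_marker($script_path);
print defined $code ? $code : "no __DATA__ marker found\n";
```

Reopening by path on every reload also sidesteps one concern with a long-lived handle: on Unix-like systems, an editor that saves by writing a new file and renaming it into place leaves an already-open handle pointing at the old contents.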


    ___________
    Eric Hodges
      First, thank you for trying to answer my question rather than dissuade me from doing what I'm doing. :)

      I realized that moving the __DATA__ line would cause problems, and posted another snippet deep in my stack of replies to ikegami above:

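      my $code = join '', <DATA>;            # slurp everything after __DATA__
      seek DATA, 0, 0;                       # rewind to the top of the script
      while (defined(my $line = <DATA>)) {   # skip back down to the marker; stop at EOF
          last if $line =~ /^__DATA__$/;
      }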
      This code does what I want. I feel a little better about whether I'm stomping on internals, too. I'm going to use it.

      As you identified, this approach is going to be used on plugin-style code.
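Putting the pieces together, a reload routine along these lines might rewind, skip to the marker, and eval whatever follows. This is a sketch under assumptions, not gamache's actual code: `reload_from_data` and `Plugin::hello` are hypothetical names, and redefinition warnings are silenced locally on the assumption that reloading is expected to replace existing subs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: re-read the data section from the top of the file and
# eval it, so edits that move the __DATA__ line are still picked up.
sub reload_from_data {
    seek(DATA, 0, 0) or die "Cannot seek DATA: $!";
    while (defined(my $line = <DATA>)) {
        last if $line =~ /^__DATA__$/;
    }
    my $code = join '', <DATA>;
    no warnings 'redefine';                # reloading replaces existing subs
    my $ok = eval "$code\n; 1";
    die "Reload failed: $@" unless $ok;
    return 1;
}

reload_from_data();
my $message = Plugin::hello();
print "$message\n";

__DATA__
package Plugin;
sub hello { return "hello from reloaded code" }
```

Checking the result of `eval` means a syntax error in the edited section is reported instead of silently leaving the old subs in place.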

Re: __DATA__, seek, and tell
by zentara (Cardinal) on Oct 16, 2007 at 17:34 UTC
    I've done it before too. You have to be careful because seeking DATA to offset 0 puts you at the start of the script, not at the start of the data section. You may be safer using Inline::Files.

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum