bliako has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

All of a sudden, whenever my script calls warn, the message ends with this: <DATA> line 290. In particular, try:

use Geography::Countries::LatLong; warn "xyz";

And true enough, this module reads its __DATA__ section. But is there such a thing as forgetting to close it? Because a) I don't want my scripts to mention that someone else's __DATA__ is open at some unrelated line and b) I spent 5 minutes investigating and that's 5 minutes too much right now.

In this particular module, __DATA__ is read in a loop and that's it, no further reference to it.

So the questions are: in general can I close __DATA__ which my module/script opened? And should any module close its __DATA__ when done with it? Otherwise, will I have to live with <DATA> line 290 messages all my Perl life?

bw, bliako

Replies are listed 'Best First'.
Re: Do I need/want to close __DATA__?
by choroba (Cardinal) on Apr 13, 2020 at 13:02 UTC
    warn should include both details: the position in the script/module and the position in the input:

    #! /usr/bin/perl
    use warnings;
    use strict;

    print "" while <DATA>;
    warn 'here';
    __DATA__
    1
    2
    3

    Stderr:

    here at /home/choroba/1.pl line 6, <DATA> line 3.
    # in the script: ~~~~~~~~~~~
    # in the input:  ~~~~~~~~~~~~~

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      I agree that messages intended for the developer should include both. However, messages intended for the end-user normally should not contain either. This is the choice that we do have.
      Bill
Re: Do I need/want to close __DATA__?
by haukex (Archbishop) on Apr 13, 2020 at 14:57 UTC

    As per Special Literals:

    The program should close DATA when it is done reading from it. (Leaving it open leaks filehandles if the module is reloaded for any reason, so it's a safer practice to close it.)

    So yes, any module reading from its DATA should close it. BTW, it's not Geography::Countries::LatLong at fault here, it doesn't have any __DATA__ sections, it's its parent, Geography::Countries. A bug/patch could be filed with that module, although it appears unmaintained.
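A minimal sketch of the effect of closing the handle, assuming an in-memory filehandle as a stand-in for a module's DATA (the variable names and the warning capture via $SIG{__WARN__} are illustrative, not taken from Geography::Countries). Once the handle is closed, its line counter is reset and warn has nothing to append:

```perl
use strict;
use warnings;

# Capture warnings instead of printing them, so the text can be inspected.
my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };

# An in-memory filehandle standing in for a module's DATA handle (assumption).
my $text = "1\n2\n3\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @lines = <$fh>;

warn "before close";    # message also names the handle and its line count
close $fh or die "Cannot close handle: $!";
warn "after close";     # only the script position remains

print for @messages;
```

In a module that reads its own __DATA__ section, the equivalent step would be a plain close(DATA) right after the read loop.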

    As for the general feature of appending the information to the error/warning messages, personally, I think it's a useful feature. I believe what it's using is the internal "last filehandle read", the same that eof and tell use when called without arguments, and that $. uses; see e.g. Filehandle, last accessed. Although not extra pretty, and maybe I'm missing an easier way to do this, this works:

    use warnings;
    use strict;

    sub resetfh { eof do { local *HANDLE; *HANDLE } }

    my @x = <DATA>;
    resetfh;
    warn "test";
    __DATA__
    1
    2
    3
      It can be even shorter (but a bit more confusing) when using a name that's exempt from the "use only once" warning:
      sub refresh { eof local *% }
      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

        Nice, I like it! ;-) This idiom (if I may call it that) is short enough that, IMHO, it doesn't even really need a sub and fits comfortably on a line with a comment explaining it:

        {do eof local *%} # reset Perl's internal "last accessed" filehandle
        

        (I had to put the do in there to get rid of a "Useless use of eof in void context"...) I find it funny that there are so many alternatives to *%, like ** and *! :-)

        ... but of course a sub is the more "best practice" solution here.

        Update: See replies.

Re: Do I need/want to close __DATA__?
by Tanktalus (Canon) on Apr 13, 2020 at 17:43 UTC

    A lot of responses have focused on the warn aspect. And, while they're interesting, I think they're missing the bigger question, though haukex touched on it (first, where it belongs, IMNSHO).

    YES you should close filehandles you don't need. Modules that reference DATA should close DATA when they're done. IMO, this is a bug in the module you're using - it is leaving open a filehandle, which is a (relatively) scarce kernel resource, instead of closing it. With normal filehandles, you can (mostly) get away with using a lexical, and letting perl auto-close it (though I've read a few places indicating this is bad - you should explicitly close it and handle the errors that may result gracefully). But since DATA is a global, you get no such help. And so it stays open for the life of the program.
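As a rough illustration of the "explicitly close it and handle the errors" advice, here is a sketch that uses a lexical filehandle and checks the return value of close. The temporary file from File::Temp is only there to make the example self-contained; the variable names are hypothetical:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the example is self-contained.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "1\n2\n3\n";
close $tmp_fh or die "Cannot close $path after writing: $!";

my $count = 0;
{
    open my $in, '<', $path or die "Cannot open $path: $!";
    $count++ while <$in>;
    # Explicit close with error handling; without it, perl would close
    # the lexical handle silently when $in goes out of scope.
    close $in or die "Cannot close $path after reading: $!";
}

print "read $count lines\n";
```

A global handle such as DATA never goes out of scope, so only the explicit close applies there.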

    For many programs, this turns out to be not that bad - it's a waste of resources, but the program runs, it dies, and everything gets cleaned up. That's nice. But if your app that uses said module has use for many filehandles (I had one managing stdin, stdout, and stderr for many parallel subprocesses simultaneously - 3 times the number of subprocesses right there, and we could run dozens of subprocesses), you may find that a bunch of modules chewing up filehandles needlessly could prove limiting. Now, again, in my case, I was probably fine - the systems were large which meant we didn't need as many subprocesses, but if I had to deal with a cluster of hundreds of machines (each subprocess was an ssh to another machine), this could have been annoying, without adding a burden from modules' bugs. And the fact you can get away with it for so long is why most developers don't even think about it.

    So, all that is nice and all, but what to do in your particular case? Well, some have helpfully suggested a warn workaround. To me, that's still a workaround, albeit quite cheap. (It's not so cheap when you start to lose the extra information that could help debug other problems when it isn't giving you grief about this one.) Better would be to go to the original module owner with a bug report. However, given that the module hasn't been updated in over a decade, this may not get you anywhere. In this case, you may need to look at taking over ownership of the module so you can fix it and post it. Or you can just use a forked version locally, but I prefer the giving-back option over the not-giving-back option.

    Best of luck,

Re: Do I need/want to close __DATA__?
by BillKSmith (Monsignor) on Apr 13, 2020 at 11:52 UTC
    Use warn "xyz\n"; instead. From the documentation for warn:
    warn LIST
    Prints the value of LIST to STDERR. If the last element of LIST does not end in a newline, it appends the same file/line number text as die does.
    Bill
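A small sketch of the difference described above, with the warnings captured through $SIG{__WARN__} so the two messages can be compared side by side (the capture array is an illustrative addition):

```perl
use strict;
use warnings;

# Capture warnings instead of printing them, so the text can be inspected.
my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };

warn "xyz";      # no trailing newline: perl appends the file/line text
warn "xyz\n";    # trailing newline: the message is passed through as is

print for @messages;
```

The second form drops both the script position and any filehandle position, which is usually what an end-user message wants.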

      I use warn because I need that line number (the "line 2", not the "<DATA> line 290"). I mean, it is helpful in knowing what's going on.

        I have problems understanding your problem.

        you said in the OP

        > I don't want to have my scripts mention that someone else's __DATA__ is opened on some line unrelated

        • So either the warn is for an end-user, then hide all coordinates by appending "\n"
        • Or it's for you, then ignore the DATA part.
        This might be a little irritating, but IMHO it's a not too tragic side-effect of Perl trying to be explicit.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        warn "xyz at line ", __LINE__, "\n";


        Give a man a fish:  <%-{-{-{-<
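Building on the __LINE__ suggestion, one possible sketch is a tiny helper that uses caller to add the file and line of the call site and ends the message with a newline, so no filehandle position is appended. The name warn_here is hypothetical, and the in-memory handle only stands in for an unclosed DATA:

```perl
use strict;
use warnings;

# Capture warnings instead of printing them, so the text can be inspected.
my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };

# Hypothetical helper: report the caller's file and line, nothing else.
sub warn_here {
    my ($text) = @_;
    my (undef, $file, $line) = caller;
    warn "$text at $file line $line.\n";
}

# Stand-in for a module that read its DATA and left it open (assumption).
my $text = "a\nb\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my @lines = <$fh>;

warn_here("xyz");    # script position only, despite the open handle

print for @messages;
```

The trade-off is the one mentioned elsewhere in the thread: the input position is lost even in cases where it would have been useful.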

      yes, but shouldn't that append file name and line number where the warn or die statement is, e.g. "example.pl line 2", but definitely not "<DATA> line 290"...
Re: Do I need/want to close __DATA__?
by LanX (Saint) on Apr 13, 2020 at 13:14 UTC
    Warn tells you not only the origin of the exception but also the coordinates of the input you are reading.

    Closing DATA might help, but I'm not sure about side effects, because it's the handle from which the compiler read the source code.

    (If you seek DATA to 0 you can read your source again)

    Furthermore, every file has its own DATA section, so I'd expect this to be gone if it happens in a module. Hm ... OTOH it is possible to deduce the file from the first coordinates.

    Anyway, I wouldn't care much about this, because a module shouldn't throw warnings in normal conditions or risk confusing the user anyway.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      Closing DATA might help, but I'm not sure about side effects, because it's the handle from which the compiler read the source code.

      Come on now

Re: Do I need/want to close __DATA__?
by bliako (Abbot) on Apr 14, 2020 at 06:57 UTC

    Thank you for the help and information. From now on I will expect myself and others to close DATA when finished with it.

    Here is what happens when 2 separate packages leave their DATA opened. warn reports only the last one being opened. Is it because the interpreter will detect opened DATA and localise it?

    package X;
    while(<DATA>){ print "X:".$_ }
    #close(DATA);
    sub check_data { return defined fileno(DATA) }
    1;
    __DATA__
    1
    2
    3

    package Y;
    while(<DATA>){ print "Y:".$_ }
    #close(DATA);
    sub check_data { return defined fileno(DATA) }
    1;
    __DATA__
    10
    20
    30
    40
    50
    60

    # main
    use lib '.';
    use strict;
    use warnings;
    use X;
    use Y;
    warn "hehh";
    print "X opened: ".X::check_data() ."\n"."Y opened: ".Y::check_data() ."\n";
    X:1
    X:2
    X:3
    Y:10
    Y:20
    Y:30
    Y:40
    Y:50
    Y:60
    hehh at x.pm line 7, <DATA> line 6.
    X opened: 1
    Y opened: 1
      Here is what happens when 2 separate packages leave their DATA opened. warn reports only the last one being opened. Is it because the interpreter will detect opened DATA and localise it?

      As I said, the interpreter simply internally keeps track of the most recently accessed filehandle; "localise" is the wrong term here IMHO because it sounds like you mean something to do with local.
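To illustrate the "most recently accessed filehandle" point, here is a sketch with two in-memory handles standing in for X::DATA and Y::DATA (an assumption made to keep everything in one file). Both stay open and each keeps its own line counter, but warn only mentions the one that was read last:

```perl
use strict;
use warnings;
use IO::Handle;

# Capture warnings instead of printing them, so the text can be inspected.
my @messages;
local $SIG{__WARN__} = sub { push @messages, $_[0] };

# Two in-memory handles standing in for the two packages' DATA handles.
my $first  = "1\n2\n3\n";
my $second = "10\n20\n30\n40\n50\n60\n";
open my $x, '<', \$first  or die "Cannot open first handle: $!";
open my $y, '<', \$second or die "Cannot open second handle: $!";

my @from_x = <$x>;
my @from_y = <$y>;    # $y is now the last handle read from

warn "hehh";          # the suffix refers to $y only

print $messages[0];
# Both handles are still open, and each has its own line counter.
printf "x open: %d, lines read: %d\n", ($x->opened ? 1 : 0), $x->input_line_number;
printf "y open: %d, lines read: %d\n", ($y->opened ? 1 : 0), $y->input_line_number;
```

So nothing is being localised; the other handle is simply not the one perl consults when it builds the message.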

        Yes, I meant making DATA local, just because I did not see it being reported by warn. But as you said, warn reports only the most recently accessed filehandle, and of course both DATA handles are open, just not both reported by warn.