in reply to Re: Identifying unmatched data in a database
in thread Identifying unmatched data in a database

Most esteemed prior, a humble pilgrim seeks to barge in and benefit from your wisdom.

you should get into the habit of [...] calling close on the filehandle when the file is no longer being used, and always checking open and close for success or failure

I hear your advice, but I don't understand why these are good habits.

I always regarded calling close as superfluous, unless I was either a) going to open the same file again (perhaps with different parameters, perhaps not), or b) concerned that the system itself would run out of file descriptors for open files (or perhaps c) opening pipes rather than files, but I've never done that). What does explicitly calling close -- much less at the very end of a script, with no further code following it -- accomplish?

On the same note, while it's of course always a good idea to check for errors, what would an inability to close a file signify for the script? I assume that the worst that could happen is that the file remains open; if close is not necessary to begin with, as above, this would not be a problem (since not calling close would leave the file open, anyway). Even if an explicit close is advisable, I'd expect that failure would at most warrant a warning in most situations.

But I'm not an experienced monk of Perl. Please enlighten me, brother!


Re^3: Identifying unmatched data in a database
by hippo (Archbishop) on Jun 29, 2014 at 11:16 UTC
    What does explicitly calling close -- much less at the very end of a script, with no further code following it -- accomplish?

    The short answer is: very little. However, we all know that code has a tendency to both grow and propagate over time. It may be that later on either you or someone else will add a whole heap of extra processing onto the end of your script, at which point it would be prudent to close the file first. By having the close in there anyway, you (or the other programmer) can add code after it without even thinking about what other housekeeping might be advisable.

    While you are smart enough to consider limitations on the number of file descriptors, some other programmer who cargo-cults your routine into a massive loop over N different filehandles may not. So, the close does no harm and helps to avoid potential problems later.
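    To illustrate the point about descriptor limits, here is a small sketch (the file names and counts are invented for the demonstration) of the pattern hippo describes: closing each handle inside the loop keeps the number of open descriptors constant, however many files the loop visits.

    ```perl
    #!/usr/bin/env perl
    # Hypothetical sketch: processing many files in one run. Closing each
    # handle as soon as it is finished keeps the number of open file
    # descriptors constant instead of growing with the number of files.
    use strict;
    use warnings;

    # Create a few throwaway input files for the demonstration.
    my @files = map { "demo_$_.txt" } 1 .. 5;
    for my $f (@files) {
        open my $out, '>', $f or die "Can't create $f: $!";
        print {$out} "contents of $f\n";
        close $out or die "Can't close $f: $!";
    }

    my $total_lines = 0;
    for my $f (@files) {
        open my $in, '<', $f or die "Can't open $f: $!";
        $total_lines++ while <$in>;
        close $in or warn "Can't close $f: $!";   # descriptor freed each iteration
    }
    print "$total_lines lines read\n";

    unlink @files;   # clean up the demonstration files
    ```

    Without the close inside the loop, a run over thousands of files would eventually hit the per-process descriptor limit; with it, the loop uses one descriptor at a time.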

    while it's of course always a good idea to check for errors, what would an inability to close a file signify for the script?

    More usually it is not the closing of the file per se which it is desirable to test, but rather the implicit flush and/or lock release. A failure of those may have serious consequences for data integrity so it is as well to inform the user of such a failure. Whether that constitutes a fatal error would depend on the context.
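    A concrete sketch of that implicit flush, assuming a Linux system where the /dev/full device reports "No space left on device" on every write: print() appears to succeed because the data only reaches Perl's buffer, and the real write error only surfaces when close flushes.

    ```perl
    #!/usr/bin/env perl
    # Sketch: the write error is invisible at print() time because the data
    # is merely buffered; it surfaces at the flush inside close().
    # Assumes Linux's /dev/full, which fails every write with ENOSPC.
    use strict;
    use warnings;

    open my $fh, '>', '/dev/full' or die "Can't open /dev/full: $!";
    my $print_ok = print {$fh} "x" x 100;   # "succeeds": data is only buffered
    my $close_ok = close $fh;               # flush happens here and fails

    print "print returned: ", ($print_ok ? "ok" : "fail"), "\n";
    print "close returned: ", ($close_ok ? "ok" : "fail ($!)"), "\n";
    ```

    A script that never checks close would report success here despite having written nothing, which is exactly the silent data-integrity failure hippo warns about.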

    All just my opinion, of course.

      the implicit flush and/or lock release

      I think this is the most important part. If you work with Windows (or cross-platform), having a file open for reading means that no one (including the same process) is allowed to rename or delete that file.
      Probably seldom relevant, but it makes e.g. PDF::API2::Simple fail one of its tests on Windows...

Re^3: Identifying unmatched data in a database
by Athanasius (Archbishop) on Jun 29, 2014 at 14:57 UTC

    Hello AppleFritter,

    I don’t have much to add to hippo’s excellent answer. But:

    I always regarded calling close as superfluous, unless I was either a) going to open the same file again (perhaps with different parameters, perhaps not), ...

    Well, according to close:

    You don't have to close FILEHANDLE if you are immediately going to do another open on it, because open closes it for you. ... However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not.
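    That quoted $. behaviour can be demonstrated directly; this is a small sketch (the file name "lines.txt" is just for the demonstration) contrasting the implicit close done by re-opening a handle with an explicit close:

    ```perl
    #!/usr/bin/env perl
    # Sketch of the $. behaviour quoted from the close documentation:
    # re-open (implicit close) leaves the line counter alone,
    # an explicit close resets it.
    use strict;
    use warnings;

    open my $out, '>', 'lines.txt' or die "Can't create lines.txt: $!";
    print {$out} "one\ntwo\nthree\n";
    close $out or die "Can't close: $!";

    open my $fh, '<', 'lines.txt' or die "Can't open: $!";
    1 while <$fh>;                  # read to EOF
    my $count_before = $.;          # 3: three lines read

    open $fh, '<', 'lines.txt' or die "Can't re-open: $!";
    my $after_implicit = $.;        # still 3: implicit close does not reset $.

    close $fh or die "Can't close: $!";
    my $after_explicit = $.;        # 0: explicit close resets the counter

    print "$count_before $after_implicit $after_explicit\n";
    unlink 'lines.txt';
    ```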

    But the real issue for me is (usually) not whether the file is closed, but whether errors are detected. A well-written programme should make it clear:

    • that an error has occurred, so that data is not silently corrupted; and
    • where it occurred, to focus the programmer’s attention on the real problem and facilitate the debugging process.

    If a file error occurs after a successful call to open, an explicit close may be the best, or only, location in the code where it can be detected. This is so even when using autodie. For example (assuming the file “fred.txt” does not exist):

    0:50 >perl -wE "open(my $fh, '<', 'fred.txt'); close $fh;"
    0:51 >perl -Mautodie -wE "open(my $fh, '<', 'fred.txt'); close $fh;"
    Can't open 'fred.txt' for reading: 'No such file or directory' at -e line 1
    0:51 >perl -wE "open(my $fh, '<', 'fred.txt'); use autodie; close $fh;"
    Can't close(GLOB(0x3bbb68)) filehandle: 'Bad file descriptor' at -e line 1
    0:51 >perl -wE "open(my $fh, '<', 'fred.txt'); use autodie;"
    0:52 >

    Hope that helps,

    Athanasius <°(((>< contra mundum

Re^3: Identifying unmatched data in a database
by Laurent_R (Canon) on Jun 29, 2014 at 15:24 UTC
    Hi AppleFritter,

    Granted, Perl does its best to do what you mean. This means, inter alia, that it will flush the write buffers and close the file when the filehandle goes out of scope or when the program completes. So, most of the time, explicitly closing a filehandle seems unnecessary. But I still think it is good practice to close your filehandles explicitly (in Perl, and in other languages that automatically close filehandles on exit), because:

    - It makes your intent clearer to the chap who will have to maintain your code (and we all know that, six months from now, that chap may be you or me);

    - The earlier a file is closed, the earlier resources associated to it are freed;

    - The earlier a file is closed, the smaller the risk of using it wrongly;

    - The earlier an output filehandle is closed, the earlier the written file is in a stable form. If your program crashes violently, it might not be able to flush the write buffer and close the file properly before aborting. If the file was closed cleanly beforehand, everything is fine.
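    The last point can be sketched like this (the file name "report.txt" is illustrative): close the output file, and check the close, as soon as the writing is done, so the file is complete on disk before any later, possibly crash-prone work begins.

    ```perl
    #!/usr/bin/env perl
    # Sketch: closing (and checking) an output file immediately after the
    # last write guarantees the data is flushed to disk, so a later crash
    # cannot lose it.
    use strict;
    use warnings;

    open my $out, '>', 'report.txt' or die "Can't open report.txt: $!";
    print {$out} "result: 42\n";
    close $out or die "Can't close report.txt: $!";   # data is now on disk

    # ... long-running (and possibly crash-prone) work would continue here;
    # report.txt is already in a stable form regardless of what happens next.
    open my $in, '<', 'report.txt' or die "Can't re-open: $!";
    my $line = <$in>;
    close $in or die "Can't close: $!";
    print $line;
    unlink 'report.txt';
    ```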

    For these reasons, I (almost) always close explicitly my files, especially the output files, as soon as I no longer need them.

    Having said that, I must admit that I usually don't test the result of the close function.

    Edit:

    I had not seen Athanasius's answer when I wrote mine. I might not have answered if I had seen it.