jkeenan1 has asked for the wisdom of the Perl Monks concerning the following question:

I've been asked to do a review of a Perl module which a colleague has extensively refactored. In general, it's very good work. But there are points where I'd like to be able to distinguish between "This is not a good way to do it" and "This is not the way I myself would do it, but it's otherwise okay."

The module in question is object-oriented and, inside its constructor, several FileHandle objects are created and assigned as data members to the object created by the constructor.

sub new {
    my $class = shift;
    my $self  = {};
    $self->{fhalpha} = new FileHandle(">alpha");
    $self->{fhbeta}  = new FileHandle(">beta");
    $self->{fhgamma} = new FileHandle(">gamma");
    ...
    bless $self, $class;
    return $self;
}

Note that the FileHandle objects are opened within the scope of this subroutine but are not closed within its scope. Instead, the filehandles are used for print calls inside a second subroutine and closed inside a third subroutine called by the second.

Do any dangers lurk therein?

Personally, I don't often use FileHandle, IO::File or similar packages to open and close filehandles. I'm content to use Perl's built-in open and close functions and -- since I've begun to drink the PBP kool-aid -- I always use lexically-scoped filehandles. So I'm always very conscious of the scope in which I open and close filehandles and would never open a handle without closing it in the same scope.
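    A minimal sketch of the style described above -- a lexical filehandle opened with three-arg open and closed in the same scope (the filename is hypothetical):

```perl
use strict;
use warnings;

# Lexical filehandle, three-arg open, opened and closed in one scope.
my $path = 'alpha.txt';
open my $fh, '>', $path or die "Cannot open $path: $!";
print {$fh} "some data\n";
close $fh or die "Cannot close $path: $!";
```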

But, while I think my own practice is good, I don't want to advise my colleague to change his approach unless there are clearly things wrong with it. Are there?

TIA.

Replies are listed 'Best First'.
Re: FileHandle objects as data members of another object
by xdg (Monsignor) on Oct 22, 2006 at 03:21 UTC

    FileHandle appears to inherit from IO::File, which inherits from IO::Handle. In the docs for IO::Handle, there is this comment:

    # There is no need for DESTROY to do anything, because when the
    # last reference to an IO object is gone, Perl automatically
    # closes its associated files (if any). However, to avoid any
    # attempts to autoload DESTROY, we here define it to do nothing.
    sub DESTROY {}

    So when the object goes out of scope and is destroyed, those FileHandle objects will be destroyed as well, closing the files. So the scope of the open files won't be longer than the scope of the object.
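    A quick way to see this behaviour -- a toy class (names are hypothetical, not the colleague's module) whose FileHandle member is closed and flushed when the object is destroyed:

```perl
use strict;
use warnings;
use FileHandle;

# Toy class holding a FileHandle as a data member.
package Writer;
sub new {
    my $class = shift;
    my $fh    = FileHandle->new('>demo.txt')
        or die "Cannot open demo.txt: $!";
    return bless { fh => $fh }, $class;
}
sub say_hi { my $self = shift; $self->{fh}->print("hi\n") }

package main;
{
    my $w = Writer->new;
    $w->say_hi;
}   # $w destroyed here: the FileHandle is closed and its buffer flushed

open my $in, '<', 'demo.txt' or die "Cannot read demo.txt: $!";
print scalar <$in>;   # prints "hi"
```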

    I think the underlying question is the classic one of concurrency -- what if multiple processes are reading from or writing to the same file? Keeping a file open for a limited scope minimizes the chances of conflict, but doesn't eliminate them. There still needs to be thought given to file locking, file pointers, buffering, etc.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: FileHandle objects as data members of another object
by chromatic (Archbishop) on Oct 22, 2006 at 03:00 UTC

    Besides the obvious problem (there's no keyword new in Perl, and indirect method invocation is broken and probably unfixable), is there a possibility that people using the API could call the methods that use and close the filehandles in the wrong order?

    Is there a need to check the return value of close on the filehandles?

    Are there locking implications to holding filehandles open so long?
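    A sketch of the constructor with the direct method syntax and error checking that these points suggest (filenames as in the original post; the loop is my own consolidation):

```perl
use strict;
use warnings;
use FileHandle;

sub new {
    my $class = shift;
    my $self  = {};
    for my $name (qw(alpha beta gamma)) {
        # Direct method call instead of "new FileHandle(...)",
        # and check that the open actually succeeded.
        $self->{"fh$name"} = FileHandle->new(">$name")
            or die "Cannot open $name for writing: $!";
    }
    return bless $self, $class;
}

# Later, check close() too -- a failed close can hide a full disk:
# $self->{fhalpha}->close or die "Cannot close alpha: $!";
```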

Re: FileHandle objects as data members of another object
by jbert (Priest) on Oct 22, 2006 at 12:17 UTC
    On the plus side:
    1. the ctor establishes an invariant for the object, that it has open filehandles for its other methods to work with
    2. as pointed out elsewhere, the dtor (DESTROY) will undo the work of the ctor, releasing the resources
    3. the resource usage scope tracks that of the object, which fits the 'Resource Acquisition Is Initialisation' (RAII) idiom -- admittedly more of a C++ thing, but also easy to understand.
    On the minus side:
    1. You're consuming a resource (3 open filehandles). This is 'small' (~1024, configurable) and finite on some platforms (most Unices, not Windows). This might not matter, depending on how many of these objects you have, their lifetimes etc.
    2. You've added state to the object. This doesn't really matter if the filehandles are only closed by the dtor, but if a method can close them then the object now has 'open' and 'closed' states, with some methods only being valid in some states ('open'). This can lead to maintenance issues, since you've added an (implicit) ordering to the object methods.
    If there is a single, required ordering to the methods of the object (e.g. instantiate, call 'print_data', call 'close', destroy) then one thing to consider is making that ordering explicit - with a good old-fashioned procedure (or class method). To avoid an over-long sub, you can leave the functionality broken out into methods, but make them 'private' (in Perl, this is traditionally done with a leading underscore on the name and by not documenting them as part of the object's interface). I guess the file handles could then live as lexically-scoped vars of that procedure.
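    One possible shape for that explicit ordering -- a class method driving underscore-"private" steps (all names here are hypothetical):

```perl
use strict;
use warnings;

package Report;

# Class method that makes the required ordering explicit:
# open, print, close, in that sequence, every time.
sub run {
    my ($class, $path) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    my $self = bless { fh => $fh }, $class;
    $self->_print_data;
    $self->_close;
    return 1;
}

# "Private" by the leading-underscore convention.
sub _print_data { my $self = shift; print { $self->{fh} } "data\n" }

sub _close {
    my $self = shift;
    close $self->{fh} or die "Cannot close: $!";
}

package main;
Report->run('report.txt');
```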

    From what you say, it's possible that the object has been created to break up a long subroutine, pushing some of the sub's locals into object member vars. If so, bear in mind that the original solution had the sequencing of operations and that is lost (except at the calling site) by this refactoring. Since the ordering is part of the object's knowledge, it should live within it (as a class method) and not at the call site.

    Lastly, it can be worth thinking of the member vars as 'has-a'. If the object *has* filehandles, then make them member vars (e.g. a logging class). If it just interacts with them (e.g. opens file so object can be serialised) then perhaps that resource is better owned by something else and passed in as a parameter.

      You're consuming a resource (3 open filehandles). This is 'small' (~1024, configurable) and finite on some platforms (most Unices, not Windows).
      C:\temp\test>perl -e "{open my $f,'>',++$a or die qq|$a: $!|;push @a,$f;redo}"
      2046: Too many open files at -e line 1.
        Is that cygwin? I don't think Windows puts a limit on the number of file handles a process can open (other than limits regarding system memory and other mad upper bounds).
Re: FileHandle objects as data members of another object
by blazar (Canon) on Oct 22, 2006 at 16:48 UTC
    Personally, I don't often use FileHandle, IO::File or similar packages to open and close filehandles. I'm content to use Perl's built-in open and close functions and -- since I've begun to drink the PBP kool-aid -- I always use lexically-scoped filehandles. So I'm always very conscious of the scope in which I open and close filehandles and would never open a handle without closing it in the same scope.

    One of the good points of lexical filehandles is that, just like IO::Handle objects (as xdg points out), you do not need to explicitly close them in the same scope: they are closed automatically when they go out of scope, or more precisely when the last reference to them evaporates. That is not to say you never have to close them explicitly -- in some situations you will -- just that you don't always have to, and with regular files you usually don't.
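    The "last reference" part matters: the handle survives the end of its opening scope if something else still holds it. A small sketch (filename is hypothetical):

```perl
use strict;
use warnings;

# The handle closes when the *last* reference goes away,
# not merely when the opening scope ends.
my $copy;
{
    open my $fh, '>', 'out.txt' or die "Cannot open out.txt: $!";
    print {$fh} "still open\n";
    $copy = $fh;        # a second reference keeps the handle alive
}                       # scope ends, but $copy still holds the handle
print {$copy} "more\n"; # still a valid, open handle
undef $copy;            # last reference gone: handle closed, buffer flushed
```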