John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Consider this simple function, which takes a file name and calls another function which processes a string. This wrapper sucks in the entire file and then calls the other function.

sub scanfile ($) { my $filename= shift; local $/; undef $/; my $handle; open $handle, '<', $filename or croak "can't open file [$filename] + "; return scan (<$handle>); }
Comments, anyone?

—John

Replies are listed 'Best First'.
Re: Code Review on Several Interesting Points
by chipmunk (Parson) on Nov 12, 2001 at 09:34 UTC
    The file will automatically be closed when there are no more references in memory to the filehandle. In your code snippet above, this occurs when scanfile() returns. However, if you had return $handle; at the end of your sub, then the filehandle would be returned to the calling code, and the file wouldn't be closed at that point.

    open my $handle, ... will work; you'll often see it written this way. In general, my $variable can be used anywhere. Keep in mind that any other occurences of $variable in the same expression refer to the previous $variable, e.g. $variable = 7; my $variable = 1 + $variable; assigns 8 to the lexical $variable. Additionally, avoid using my $variable with a statement modifier, such as my $variable = 1 if something();. The behavior will probably not be what you expect. :)

    Yes, the value of $/ has effect when <FILE> is executed, and you can change it between reads. (In Perl6, the $/ equivalent will be local to each filehandle.) BTW, local $/; undef $/; is redundant, because localizing $/ sets it to undef anyway.

    I usually use single quotes to distinguish a file name in error messages, but square brackets look good.

(jeffa) Re: Code Review on Several Interesting Points
by jeffa (Bishop) on Nov 12, 2001 at 10:09 UTC
    "I like the three-argument form of open."

    Some of us like *gasp* IO::File . . . :O

    my $handle = IO::File->new($filename,'<') or croak ...
    And i like the idea of using brackets instead of quotes here, a lot of log messages uses brackets as well.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    F--F--F--F--F--F--F--F--
    (the triplet paradiddle)
    
Re: Code Review on Several Interesting Points
by dws (Chancellor) on Nov 12, 2001 at 09:35 UTC
    Another way to handle $/ is to localize it within a block (which is kind of silly in this case, since the function is so short, but here it is as an example). Localizing a variable also sets it to undef, so that additional assign is redundant.

    It often helps to know why an open fails, so printing $! helps. The error messages are usually verbose enough to render "can't open" redundant.

    sub scanfile ($) { my $filename= shift; open my $handle, '<', $filename or croak "[$filename] $!"; return do { local $/; scan(<$handle>) }; }
Re (tilly) 1: Code Review on Several Interesting Points
by tilly (Archbishop) on Nov 12, 2001 at 15:19 UTC
    Points.

    1. If you have already called local on $/, it is already undefined. So that line is redundant.
    2. You can indeed move the my into the open.
    3. I prefer seeing parentheses there. You don't actually need them, but I write them anyways. Besides which, if you aren't in the habit of writing them, it is too easy for someone copying you to decide they like writing or as || and not know they are losing the error check.
    4. Speaking of the error check, you aren't testing $!.
    5. If you are using a 3 argument open already so your code documents what you are doing, then I like having your error message say whether you are reading, etc.
    6. Your factoring of work bothers me. Opening a file and slurping it in are things you are likely to have to write a lot. Slurping it just to scan the file is less likely. So I would prefer to see a function to slurp in the file and a separate one to scan through the slurped in file. Then call one with input from the other. An advantage to this arrangement is that your global is not in an altered state while scan is being called...
      global in an altered state while scan is being called

      Good catch! Thanks.

Re: Code Review on Several Interesting Points
by CharlesClarkson (Curate) on Nov 12, 2001 at 10:07 UTC
    It bugs me that $/ is global.

    You could avoid $/ by using read which, I believe, is more efficient.

    sub scanfile ($) { my $filename = shift; open my $handle, '<', $filename or croak "$filename: $!"; read $handle, my $buffer, -s $handle; return scan ($buffer); }



    HTH,
    Charles K. Clarkson

    And he puzzled three hours, till his puzzler was sore. Then the Grinch thought of something he hadn't before!
    - Dr. Seuss

(tye)Re: Code Review on Several Interesting Points
by tye (Sage) on Nov 12, 2001 at 22:13 UTC

    I think it is a mistake to keep $/ undefined during the call to scan() and I'd close the file before that point as well. So I get something more like:

    sub scanfile { croak "Usage: scanfile(\$filename)" unless 1 == @ARGV; my $data= do { my $filename= shift; require 5.6; # Autovivification of file handle: open my $handle, '<', $filename or croak "Can't read file [$filename], $!"; local $/; <$handle>; }; return scan( $data ); }
    I don't like autovivification of file handles. Putting the my inside of the open and flagging the script as requiring 5.6 makes them tolerable. I'd put "use 5.6" in there but I think "use 5.6" gives strange errors on rather recent versions of Perl, so I'm holding off on using it for a while longer.

    At first I just had a normal block and set $data inside of the block. That is fine. I just switched to a do block so that it becomes clear that the point of the block is to set $data. YMMV.

            - tye (but my friends call me "Tye")