perlmeditation
haukex
<h1><c>open</c> Best Practices</h1>
<h2>TL;DR: <c>open my $fh, '<', $filename or die "$filename: $!";</c></h2>
<p>You will see styles of [doc://open] such as "<c>open FILE, $filename;</c>"
or "<c>open(LOG, ">$filename") || die "Could not open $filename";</c>" in many places.
These mainly come from versions of Perl before 5.6.0 (released in 2000), because
that version of Perl introduced lexical filehandles and the three-argument [doc://open].
Since then, both features have become best practice, for the reasons below.</p>
<h2>1. Use Lexical Filehandles</h2>
<p>Instead of <c>open FILE, ...</c>, <b>say: <c>open my $fh, ...</c></b>.</p>
<p>Lexical filehandles have the advantage of not being global variables,
and such filehandles will be automatically closed when the
variable goes out of scope. You can use them just like any other filehandle,
e.g. instead of <c>print FILE "Output"</c>, you just say <c>print $fh "Output"</c>.
They're also more convenient to pass as parameters to <c>sub</c>s.
Also, "bareword" filehandles like <c>FILE</c> have a potential for conflicts
with package names <small>(see [choroba]'s [id://11102698|reply] for details)</small>,
and they don't protect against typos like lexical filehandles do!
<small>(For two recent discussions on lexical vs. bareword filehandles,
see [id://11115734|this] and [id://11117517|this] thread.)</small></p>
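<p>As a minimal sketch of the points above: the helper <c>write_greeting</c> is a hypothetical name invented for this illustration, and an in-memory file (a reference to a scalar) is used only so that the example is self-contained. It shows a lexical filehandle being passed to a <c>sub</c> and being closed automatically at the end of its scope.</p>

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: accepts any filehandle as a
# normal parameter and writes one line to it.
sub write_greeting {
    my ($fh, $name) = @_;
    print {$fh} "Hello, $name!\n" or die "print failed: $!";
}

# An in-memory file keeps this example independent of the filesystem.
my $buffer = '';
{
    open my $fh, '>', \$buffer or die "in-memory file: $!";
    write_greeting($fh, 'World');
    # No explicit close here: $fh is closed when it goes out of scope
    # at the end of this block.
}

print $buffer;
```

<p>When writing to real files, an explicit <c>close $fh or die ...</c> is still worth considering, since the automatic close at the end of the scope gives you no chance to check for errors.</p>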
<h2>2. Use the Three-Argument Form</h2>
<p>Instead of <c>open my $fh, ">$filename"</c>, <b>say: <c>open my $fh, '>', $filename</c></b>.</p>
<p>In the two-argument form of [doc://open], the filename has to be parsed
for the presence of mode characters such as <c>></c>, <c><+</c>, or <c>|</c>.
If you say <c>open my $fh, $filename</c>, and <c>$filename</c> contains
such characters, the [doc://open] may not do what you want, or worse, if
<c>$filename</c> is user input, this may be a security risk! The two-argument
form can still be useful in rare cases, but I strongly recommend playing it
safe and using the three-argument form instead.</p>
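<p>To illustrate the risk, here is a rough sketch that runs entirely inside a temporary directory. The file name <c>important.txt</c> and the variable names are made up for this example, and the "user input" is simulated by prepending <c>></c> to the path.</p>

```perl
use strict;
use warnings;
use File::Temp qw/tempdir/;
use File::Spec;

# Everything happens in a temporary directory that is removed on exit.
my $dir    = tempdir(CLEANUP => 1);
my $victim = File::Spec->catfile($dir, 'important.txt');   # hypothetical name

open my $out, '>', $victim or die "'$victim': $!";
print {$out} "precious data\n";
close $out or die "'$victim': $!";

# Pretend this string came from user input.
my $filename = ">$victim";

# Three-argument form: the whole string is taken as a filename. No file
# with that literal name is expected to exist, so the open simply fails
# and the original file is untouched.
my $ok3 = open(my $fh3, '<', $filename);
close $fh3 if $ok3;
my $size_after_3arg = (-s $victim) || 0;

# Two-argument form: the leading ">" is parsed as a mode, so the file is
# opened for writing and truncated.
my $ok2 = open(my $fh2, $filename);
close $fh2 if $ok2;
my $size_after_2arg = (-s $victim) || 0;

print "size after three-argument open: $size_after_3arg\n";
print "size after two-argument open:   $size_after_2arg\n";
```

<p>The same mechanism applies to the other mode characters, which is why a filename from an untrusted source should not be handed to the two-argument form.</p>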
<p>In the three-argument form, <c>$filename</c> will always be taken as
a filename. Plus, the mode can include "[doc://PerlIO|layers]", so instead
of having to do a [doc://binmode] after the [doc://open], you can just
say e.g. <c>open my $fh, "<:raw", $filename</c>, or you can specify an encoding
such as <c>open my $fh, ">:encoding(UTF-8)", $filename</c>.
<b>Note:</b> As [doc://PerlIO|documented], <c>:encoding(UTF-8)</c> should be preferred
over <c>:utf8</c>, and on Windows, to decode UTF-16 properly, you need to say
<c>":raw:encoding(UTF-16):crlf"</c>, because otherwise the default <c>:crlf</c>
layer will incorrectly mangle the Unicode characters <c>U+0D0A</c> or <c>U+0A0D</c>.
Be aware that if you don't specify any layers, the layers in <tt>[doc://${^OPEN}]</tt>
are used (see that link for details).</p>
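<p>The effect of layers can be sketched as follows. The sample text and the temporary file are assumptions made for this illustration: the same file is written with <c>:encoding(UTF-8)</c>, then read back once as raw bytes and once as decoded characters.</p>

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# A temporary file, removed automatically at exit.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh or die "'$filename': $!";

# Six characters, two of them outside ASCII (e-acute and a smiley).
my $text = "caf\x{e9} \x{263A}";

# Encode on output via a layer in the mode argument.
open my $out, '>:encoding(UTF-8)', $filename or die "'$filename': $!";
print {$out} $text;
close $out or die "'$filename': $!";

# Read the undecoded bytes.
open my $raw, '<:raw', $filename or die "'$filename': $!";
my $bytes = do { local $/; <$raw> };
close $raw or die "'$filename': $!";

# Read the decoded characters.
open my $in, '<:encoding(UTF-8)', $filename or die "'$filename': $!";
my $chars = do { local $/; <$in> };
close $in or die "'$filename': $!";

printf "%d bytes, %d characters\n", length($bytes), length($chars);
```

<p>No separate <c>binmode</c> call is needed in either direction, since the layers are part of the mode argument.</p>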
<h2>3. Check and Handle Errors</h2>
<pre>
open my $fh, '<', $filename; # Bad: No error handling!
open my $fh, '<', $filename || die ...; # <b>WRONG!</b><sup>1</sup>
open my $fh, '<', $filename or die "open failed"; # error is missing info
<b>open my $fh, '<', $filename or die "$filename: $!"; # good
open(my $fh, '<', $filename) or die "$filename: $!"; # good</b>
open(my $fh, '<', $filename) || die "$filename: $!"; # works, <b>but risky!</b><sup>1</sup>
use autodie qw/open/; # at the top of your script / code block
open my $fh, '<', $filename; # ok, but read [doc://autodie]!
</pre>
<p>You should check the return value of the [doc://open] function,
and if it returns a false value, report the error that is available
in the [doc://$!] variable. It is best to report the filename
as well, and of course you're free to customize the message as needed (see the tips below for some suggestions).</p>
<p><sup>1</sup> It is a common mistake to use
<c>open my $fh, '<', $filename || die ...</c> -
because of the higher precedence of <c>||</c>, it actually means
<c>open( my $fh, '<', ($filename || die ...) )</c>. So to avoid mistakes,
I would suggest just staying away from <c>||</c> in this case <small>(as is
also highlighted in [id://11102770|these] replies by AM and [eyepopslikeamosquito])</small>.</p>
<p>Note that [doc://open] failing does not necessarily have to be
a fatal error; see some examples of alternatives [id://11101709|here].
Also, note that the effect of [doc://autodie] is limited to its lexical
scope, so it's possible to turn it on for only smaller blocks of code
<small>(as discussed in [kcott]'s [id://11102717|reply])</small>.</p>
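<p>A small sketch of both points, under the assumption that the (hypothetical) path in <c>$missing</c> does not exist: first a failed <c>open</c> is handled without dying, then <c>autodie</c> is enabled for a single block only, and finally a plain <c>open</c> outside that block is shown to return false again.</p>

```perl
use strict;
use warnings;

my $missing = 'no/such/dir/file.txt';   # hypothetical path, assumed not to exist
my @messages;

# 1. A failed open handled as a non-fatal condition.
if ( open my $fh, '<', $missing ) {
    close $fh;
}
else {
    push @messages, "skipped '$missing': $!";
}

# 2. autodie is lexically scoped: it only affects this block.
{
    use autodie qw/open/;
    my $ok = eval { open my $fh, '<', $missing; 1 };
    push @messages, "inside the block, open threw an exception" if !$ok;
}

# 3. Outside the block, open is back to returning false on failure.
my $ret = open(my $fh, '<', $missing);
push @messages, "outside the block, open returned false" if !$ret;

print "$_\n" for @messages;
```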
<h2>4. Additional Tips</h2>
<ul>
<li>Make sure that the filename you're opening <i>always</i> matches the filename in the error message. One easy way to accomplish this is to use a single variable to hold the filename, like <c>$filename</c> in the above examples <small>(as described in [Eily]'s [id://11102693|reply])</small>.</li>
<li>Consider putting the filename in the error message in quotes or similar, such as <c>"'$filename': $!"</c>, so that it's easier to see issues arising from whitespace at the beginning or end of the filename <small>(as suggested in [Discipulus]'s [id://11102715|reply])</small>.</li>
<li>In addition, consider adding even more useful details to your error message, such as whether you're trying to read from or write to the file, and put quotes around <c>$!</c> as well, so it's easier to tell everything apart <small>(as [id://11102747|suggested] by [haj])</small>.</li>
<li>On Windows, consider also displaying [doc://$^E] as part of the error message for more information <small>(as suggested in [Discipulus]'s [id://11102715|reply])</small>.</li>
<li>If you're setting global variables that will affect reading the file, like [doc://$/], it's best to use [doc://local] in a new block <small>(as mentioned in [stevieb]'s [id://11102687|reply])</small>.</li>
<li>Remember that it's possible for multiple processes to access the same file at the same time, and you may need to consider a way to coordinate that, such as file locking <small>(as mentioned in [davido]'s [id://11102708|reply])</small>.</li>
<li>For even more discussion, see Chapter 10, "I/O", in the book [https://www.oreilly.com/library/view/perl-best-practices/0596001738/|Perl Best Practices] by [TheDamian].
The book [http://modernperlbooks.com/books/modern_perl_2016/index.html|Modern Perl] by [chromatic] is also a great resource on more modern Perl.</li>
</ul>
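<p>Several of the tips above can be combined in one place. In this sketch, <c>slurp_file</c> is a hypothetical helper written for this illustration, and the wording of the messages is just one possible choice. It uses a single variable for the filename, quotes both the filename and <c>$!</c>, states the intended operation, localizes <c>$/</c> in a small block, and takes a shared lock while reading.</p>

```perl
use strict;
use warnings;
use Fcntl qw/:flock/;
use File::Temp qw/tempfile/;

# A temporary file with known content, removed automatically at exit.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "line 1\nline 2\n";
close $tmp_fh or die "Failed to close '$filename': '$!'";

# Hypothetical helper: returns the whole content of a file as one string.
sub slurp_file {
    my ($file) = @_;
    # The same variable is used for the open and for the message.
    open my $fh, '<', $file
        or die "Failed to open '$file' for reading: '$!'";
    # Advisory lock: it only helps if the other processes lock as well.
    flock($fh, LOCK_SH)
        or die "Failed to lock '$file': '$!'";
    # $/ is only changed inside this do block.
    my $content = do { local $/; <$fh> };
    close $fh or die "Failed to close '$file': '$!'";
    return $content;
}

my $content = slurp_file($filename);

# Trigger the error message with a name that is assumed not to exist.
my $error = eval { slurp_file("$filename.missing"); 1 } ? '' : $@;

print $content;
print $error;
```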
<hr>
<small>
<p><i>Fellow Monks:</i> I wrote this so I would have something to link
to instead of repeating these points again and again. If there's something
you think is worth adding, please feel free to suggest it!</p>
<p><i>Update 2019-07-12:</i> Added section "Additional Tips",
mentioned bareword filehandles, and added a bit more on <c>autodie</c>.
Thanks to everyone for your suggestions!
<i>2019-07-13:</i> Added more suggestions from replies, thanks!
<i>2020-04-19:</i> Added mention of typo prevention, as inspired by [id://11115734].
<i>2020-06-07:</i> Added links to threads about bareword vs. lexical handles and added note about <c>:crlf</c> and UTF-16 interaction on Windows.
<i>2022-02-08:</i> Updated notes on layers.
</p>
</small>