Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

I am often torn when it comes to recommending Perl's glob (aka the <...> operator, when it isn't readline). On the one hand, it's built in and often shortens code, on the other, it has several caveats one should be aware of.

  1. glob does not list filenames beginning with a dot by default. For someone coming from a unixish shell, this might make perfect sense, but for someone coming from, for example, a readdir implementation, this might be surprising, and so it should at least be mentioned.

  2. Probably the biggest problem I see with glob is variables interpolated into the pattern. The default glob splits its argument on whitespace, which means that, for example, glob("$dir/*.log") is a problem when $dir is 'c:/program files/foo'. This can be avoided by doing use File::Glob ':bsd_glob'; (Update: except on Perls before v5.16, please see Tux's reply and below for alternatives), but that doesn't help with the next problem:

  3. If a variable interpolated into the pattern contains glob metacharacters (\[]{}*?~), this will cause unexpected results for anyone not aware of this list and expecting the characters to be taken literally.

  4. Lastly, File::Glob can override glob globally. If, for example, you use it in a module, and someone else overrides the default glob, then suddenly your code might not behave the way you expected.

  5. <update> glob in scalar context with a variable pattern also suffers from surprising behavior, as choroba pointed out in his reply - thank you! (additional info) </update>

That's why I think advising the use of glob without mentioning the caveats is potentially problematic. Perhaps one wants to create a backup of a folder and don't want to miss any files, say for example, .htaccess? And I also often see things like glob("$dir/*") going without comment.

Personally I find readdir, in combination with some of the functions from File::Spec, to be a decent, if slightly complicated, tool (one example). One better alternative among several is children from Path::Class::Dir, or methods from one of the other modules like Path::Tiny. (Modules like File::Find::Rule often get mentioned as alternatives, except that those of course recurse into subdirectories by default.)

use Path::Class qw/dir/; my @files = dir('foo', 'bar quz', 'baz')->children; # @files includes .dot files, but not . and .. # and its elements are Path::Class objects print "<$_>\n" for @files;

Now, of course this isn't to say glob is all bad, I've certainly used and recommended it plenty of times. If one has read all of its documentation, including File::Glob, and is aware of all the caveats, and especially if one is using fixed strings for the patterns, it can be perfectly fine. But I still think it should not be blindly used or recommended.

In reply to To glob or not to glob by haukex

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (3)
As of 2022-05-25 07:46 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (85 votes). Check out past polls.