seattlejohn has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks,

I've got a command-line script that needs to run on multiple platforms, specifically Windows (both with and without Cygwin) and Un*x. The script takes a list of files and does some munging on them:

munge.pl a.txt b.txt c.txt

The issue arises when I process command-line args that contain wildcards:

munge.pl *.txt *.xml

bash seems to expand the wildcards into matching filenames before passing them to the perl script, whereas Windows just passes the arguments with wildcards intact.

Since I'd like behavior to be consistent (DWIM!) on both platforms, I experimented with using this code at the beginning of my script:

push @files, glob($_) foreach @ARGV;

This appears to have the desired effect: on Windows it forces the wildcards to expand, and on Cygwin/Linux it doesn't change anything because the wildcards have already been expanded.

My question for the wise folks here in the Monastery is: Is this a reasonably safe approach? Or have I opened myself up to sneaky problems I haven't thought about? Are there better ways to accomplish this task? (Search here didn't turn up much, but maybe I'm not asking the question in the right way...)


Replies are listed 'Best First'.
Re: portable globbing behavior for command-line tool
by rsteinke (Scribe) on Jul 04, 2002 at 06:28 UTC

    This ignores the possibility of escaping wildcards in bash. For example, say you have a file foo*bar. Someone using your script in bash would do

    munge.pl foo\*bar

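    Bash would replace the \* with * and pass the script the filename foo*bar. Your script would then expand the wildcard a second time, and could pick up other files that happen to fit the pattern (fooXbar, say) in addition to the one intended.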
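A minimal sketch of what the blanket re-glob does with such an argument. It is Unix-only (Windows filesystems reject `*` in filenames), and the scratch directory and the bystander file `fooXbar` are made up for illustration:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Unix-only demo: Windows filesystems do not allow "*" in a filename.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "cannot chdir to $dir: $!";

# One file literally named foo*bar, plus a bystander that fits the pattern.
for my $name ('foo*bar', 'fooXbar') {
    open my $fh, '>', $name or die "cannot create $name: $!";
    close $fh;
}

# What the script receives after bash has processed:  munge.pl foo\*bar
my @args = ('foo*bar');

# The blanket re-glob from the original post.
my @files;
push @files, glob($_) foreach @args;

print "$_\n" for @files;

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "cannot chdir back to $start: $!";
```

In this setup the re-glob hands back both names, so the user who carefully escaped the asterisk gets one more file munged than they asked for.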

    Ron Steinke rsteinke@w-link.net
      Aaah, good point. Perhaps I need to check $^O and do the globbing only on Windows...
Re: portable globbing behavior for command-line tool
by bronto (Priest) on Jul 04, 2002 at 08:28 UTC
    bash seems to expand the wildcards into matching filenames before passing them to the perl script

    You got the point. In general, UNIX shells expand metacharacters if they find a match.

    I don't see anything dangerous in your push; it's just that under a UNIX shell the glob is redundant. You could skip it there by testing $^O (or $OSNAME if you use English) in a conditional.

    I don't know what $OSNAME is under Windogs; assuming it begins with "Windows" you could just do something like:

    use English;
    if ($OSNAME =~ /^Windows/i) {
        push @files, glob($_) foreach @ARGV;
    }

    Ciao!
    --bronto

    # Another Perl edition of a song:
    # The End, by The Beatles
    END {
      $you->take($love) eq $you->made($love) ;
    }

      $^O is MSWin32 under Windows (at least recent ones)... unfortunately, as Abigail points out above, using $^O turns out not to be a completely reliable way to determine whether the globbing behavior is safe. To take one simple example, some shells on MSWin32 (Cygwin bash) expand wildcards and others (MS-DOS) don't.
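For reference, a sketch of the $^O test with the value corrected to MSWin32. The helper name `maybe_glob` is made up for illustration, and the caveat in the comment is exactly why this cannot be relied on:

```perl
use strict;
use warnings;

# Hypothetical helper: re-glob only when the OS name says native Windows.
# Caveat from the thread: a native Windows perl started from Cygwin bash
# also reports MSWin32, even though bash has already expanded wildcards.
sub maybe_glob {
    my ($osname, @args) = @_;
    return @args unless $osname eq 'MSWin32';
    return map { glob($_) } @args;
}

my @files = maybe_glob($^O, @ARGV);
```

Passing the OS name in as a parameter, rather than reading $^O inside the helper, is only there so that both branches can be exercised from one machine.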

      The thing is that I would like to be able to do something that Does What I Mean -- or more precisely, Does What A Potentially Naive User Means. I suspect that somebody who uses DOS/Win and is familiar with commands like copy *.txt \over_here would expect to be able to write munge *.txt and have it do what he'd expect, not report File not found: *.txt.

      I suppose one kludgish approach would be to perform the globbing if ($filename =~ /[?*]/ && !-e $filename), for example. Since this particular munging process is non-destructive, there's probably no danger in trying to glob if and only if the filename passed in does not exist. (Yes, I realize that testing for /[?*]/ is a very DOS-centric approach because Unix shells may support more sophisticated wildcarding... this is just a "for example".)
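A minimal sketch of that kludge. The helper name `expand_args` is hypothetical, and it uses `bsd_glob` from File::Glob rather than the built-in `glob` (a swap on my part, so that an argument containing spaces is treated as one pattern instead of being split on whitespace):

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: expand an argument only when it contains a
# DOS-style wildcard AND no file of that exact name exists.
sub expand_args {
    my @args = @_;
    my @files;
    for my $arg (@args) {
        if ($arg =~ /[?*]/ && !-e $arg) {
            my @matches = bsd_glob($arg);
            # Keep the raw argument when nothing matches, so the caller
            # can still report something like "File not found: *.txt".
            push @files, @matches ? @matches : $arg;
        }
        else {
            # Plain names, and names that exist as given, pass through.
            push @files, $arg;
        }
    }
    return @files;
}

my @files = expand_args(@ARGV);
```

An existing file named foo*bar is left alone because the `-e` test succeeds first; the price is that a pattern is never expanded when a file of exactly that name happens to exist.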

      Interesting that this turns out to be a harder problem to solve than it seemed at first. As Abigail says, the real answer is probably to use a globbing shell rather than a non-globbing one. I was just hoping there would be a way to make the behavior essentially consistent across platforms and shells without requiring that much thought on the part of a user. (I guess that's what GUIs are for ;-)

Re: portable globbing behavior for command-line tool
by rinceWind (Monsignor) on Jul 04, 2002 at 08:27 UTC
    I think that this is something specific to Windows, and MS-DOS. Other environments expect the command interpreter to glob out arguments. IIRC the POSIX standard defines this behaviour, so any operating system purporting to be POSIX compliant will exhibit this behaviour.

    Anyway, why shouldn't the Win32 perl kernel resolve this anomaly and make Windows conform to the expected behaviour? (Perl 6? wishlist?)

    By the by, VMS implements the globbing via the C runtime library. Command lines themselves are not globbed, but by the time they have become cast into (int argc, char *argv[]) they have been globbed.

    My $0.02 --rW

      Anyway, why shouldn't the Win32 perl kernel resolve this anomaly and make Windows conform to the expected behaviour?
      That would be too hard. There *are* shells available on Windows that do globbing. Furthermore, even on Unix, it's possible to pass in arguments that haven't been subject to shell expansion (for instance, when the program is started through one of the exec family of calls from C, or, in Perl, when exec is called with more than one argument). Any attempt to figure out when to glob and when not to is bound to get it wrong sometimes.
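A small sketch of the no-shell case, assuming a Unix-like system. It uses the list form of a pipe open, which, like exec with more than one argument, starts the child without going through a shell; the helper name `args_seen_by_child` is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: run a child perl with the given arguments and
# return the arguments exactly as the child received them.
sub args_seen_by_child {
    my @args = @_;

    # List form of pipe open: no shell is involved, so nothing expands
    # wildcards on the way to the child.
    open my $ph, '-|', $^X, '-e', 'print "$_\n" for @ARGV', @args
        or die "cannot run child: $!";
    chomp(my @seen = <$ph>);
    close $ph or die "child failed: $?";
    return @seen;
}

print "$_\n" for args_seen_by_child('*.txt');
```

The child gets the literal string `*.txt` whatever files are in the current directory, so a script that assumes "I'm on Unix, so my arguments are already expanded" can be wrong too.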

      The bottom line is, if you want globbing, use a globbing shell.

      Abigail