http://qs1969.pair.com?node_id=1187638

yulivee07 has asked for the wisdom of the Perl Monks concerning the following question:

Hi fellow perlmonks,
I noticed a strange behaviour when using glob to expand filenames. I am trying to identify directories which contain nothing besides one special sort of file. I noticed that the return statement in my script is not executed, even when glob returns something.

#!/usr/bin/perl -w
use strict;
use warnings;
use File::Basename;

sub remove_files_in_empty_directories($) {
    my ($directory) = @_;

    # do nothing if the directory is not empty
    # This is the problematic part
    return if glob("$directory/*");

    # directory is empty, remove the .Ecurep-Lenovo* files
    foreach my $file ( glob("$directory/.EcuRep-Lenovo_*") ) {
        print $file,"\n";
    }
}

my $list_file = shift @ARGV;
die "list_file is missing - Usage: $0 list_file\n" unless $list_file;

open(F,"<",$list_file) || die "can not open $list_file: $!\n";
while (my $line = <F> ) {
    chomp $line;
    my $dir = dirname $line;
    remove_files_in_empty_directories($dir);
}

I pass this script a file containing a list of directories (filename is test):
/ecurep/hw/7/7042/21/EC/21EC3EC/.EcuRep-Lenovo_outofscope
/ecurep/hw/7/7042/21/EC/21EC41C/.EcuRep-Lenovo_outofscope
/ecurep/hw/7/7042/21/EC/21EC4DC/.EcuRep-Lenovo_outofscope

I call the script like this: perl test_perl test

Output:
/ecurep/hw/7/7042/21/EC/21EC4DC/.EcuRep-Lenovo_outofscope


This is kinda unexpected, as all three directories contain other directories. When I step through with the debugger, I can evaluate the glob with x, and it returns a list of directories every single time. If I evaluate x scalar glob("$directory/*"), it also evaluates to the first entry from the list every time. But in the case of the last pathname, execution continues into the function anyway instead of executing my return statement. I cannot wrap my head around this. What is happening here?

Further testing of the problem:
If I delete the first line of the test file, making the list of directories one entry shorter, it works.

test2:
/ecurep/hw/7/7042/21/EC/21EC3EC/.EcuRep-Lenovo_outofscope
/ecurep/hw/7/7042/21/EC/21EC41C/.EcuRep-Lenovo_outofscope

perl test_perl test2
No output (as expected)

File Permissions to the directories:
drwxr-s--- 4 root swsupt 512 Apr 08 04:53 21EC3EC
drwxr-s--- 4 root swsupt 512 Apr 08 05:24 21EC41C
drwxr-s--- 5 root swsupt 512 Apr 11 12:12 21EC4DC

[1:root@itcaix23:]/ecurep/tmp/perl_test # ls -l /ecurep/hw/7/7042/21/EC/21EC41C
total 0
-rw-r--r-- 1 root swsupt 25 Jun 24 2016 .EcuRep-Lenovo_outofscope
drwxr-s--- 2 root swsupt 512 Mar 31 09:06 2017-03-31
drwxr-s--- 2 root swsupt 512 Apr 07 09:05 2017-04-07

[1:root@itcaix23:]/ecurep/tmp/perl_test # ls -l /ecurep/hw/7/7042/21/EC/21EC4DC
total 0
-rw------- 1 root swsupt 29 Apr 11 12:12 .EcuRep-Lenovo_outofscope
drwxr-s--- 2 root swsupt 512 Mar 28 09:48 2017-03-28
drwxr-s--- 2 root swsupt 512 Apr 04 09:49 2017-04-04
drwxr-s--- 2 root swsupt 512 Apr 11 09:48 2017-04-11

[1:root@itcaix23:]/ecurep/tmp/perl_test # ls -l /ecurep/hw/7/7042/21/EC/21EC3EC
total 0
-rw-r--r-- 1 root swsupt 25 Nov 04 05:26 .EcuRep-Lenovo_outofscope
drwxr-s--- 2 root swsupt 512 Mar 31 06:26 2017-03-31
drwxr-s--- 2 root swsupt 512 Apr 07 06:26 2017-04-07

I am working with perl on AIX here. I have tested with perl 5.20 on AIX7.2 and perl 5.8.8 on AIX6.1. The behaviour is the same.
I tried the following variations on the code (just substituting the problematic part):
# makes no difference
return if glob "'${directory}/*'";

# with this construct it works!
my @files = glob("$directory/*");
return if scalar @files;

# doesn't work either - prints "hi hi <directory-name>
if ( glob("$directory/*") ) {
    print "hi\n";
    return;
}

I am not sure if I am misusing glob here or if there is a problem with my Perl version. Can someone help me understand what is going on?

Kind regards, Yulivee

Replies are listed 'Best First'.
Re: Strange behaviour when using glob in if condition
by haukex (Archbishop) on Apr 11, 2017 at 18:02 UTC

    I looked into this further and was able to reproduce the issue using your code, and find an explanation and workaround. glob in scalar context acts like an iterator. However, the iterator state is attached to the glob call site, even when the argument to glob changes. Here's a simple way to reproduce the issue:

    $ touch foo bar
    $ perl -MData::Dump -e '
        sub myglob {scalar glob($_[0])}
        dd myglob($_) for qw/foo bar/
      '
    "foo"
    undef

    One might expect the output here to be "foo" and "bar". There is an excellent discussion in RT#123404, and it doesn't sound like this behavior is going to change, as potentially confusing as it is.
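Applied to the script in the question, this would explain the output: the first directory has two visible entries, so the first two calls each consume one of them and return early, while the third call finds the iterator exhausted, gets undef, and falls through to the print. The following is only a sketch of that scenario, not the original setup: the temporary directories (created with File::Temp) and the helper looks_non_empty are hypothetical stand-ins, and it assumes the temporary paths contain no whitespace, since glob splits its pattern on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-ins for the three directories in the question,
# holding 2, 2 and 3 visible entries respectively.
my @dirs;
for my $count (2, 2, 3) {
    my $dir = tempdir(CLEANUP => 1);
    for my $n (1 .. $count) {
        mkdir "$dir/sub$n" or die "cannot create $dir/sub$n: $!\n";
    }
    push @dirs, $dir;
}

# Same shape as the problematic test: glob in scalar (boolean) context,
# always evaluated at this one call site.
sub looks_non_empty {
    my ($directory) = @_;
    return glob("$directory/*") ? 1 : 0;
}

# Calls 1 and 2 each consume one entry of the FIRST directory;
# call 3 hits the exhausted iterator and gets undef.
my @seen = map { looks_non_empty($_) } @dirs;
print "@seen\n";    # 1 1 0, given the iterator behaviour described above
```

With only two directories in the list, both calls would still be served from the first directory's two entries, which matches the "no output" result reported for the shorter list file.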

    A simple workaround is to force list context:

    return if ()=glob("$directory/*");
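As a rough illustration of why the list-context form helps, the sketch below counts the matches with the empty-list assignment idiom. The helper has_visible_entries and the temporary directories are hypothetical, introduced only for this example, and the temporary paths are assumed to contain no whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data: three directories with one visible entry each.
# With the scalar-context test, every second call would see an exhausted
# iterator and wrongly report "empty".
my @dirs;
for (1 .. 3) {
    my $dir = tempdir(CLEANUP => 1);
    mkdir "$dir/sub" or die "cannot create $dir/sub: $!\n";
    push @dirs, $dir;
}

sub has_visible_entries {
    my ($directory) = @_;
    # Assigning to an empty list runs glob in list context, so the whole
    # expansion is produced on every call; the outer scalar assignment
    # then yields the number of matches.
    my $count = () = glob("$directory/*");
    return $count > 0 ? 1 : 0;
}

my @checked = map { has_visible_entries($_) } @dirs;
print "@checked\n";    # 1 1 1
```

Because the full list is generated each time, no iterator state carries over from one directory to the next.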
      Brilliant analysis!

      Sounds like we need a warning if glob is used in scalar context with a pattern holding interpolated variables. :/

      I for my part will now only use for glob() constructs for looping over a static list.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Je suis Charlie!

Re: Strange behaviour when using glob in if condition
by vrk (Chaplain) on Apr 11, 2017 at 13:55 UTC

    Is it just a difference between scalar and list context? See the glob help: In list context (like my @files = glob(...)), you get a list of filenames, or the empty list if nothing matches. In scalar context, you get an iterator. When you evaluate the iterator, it yields the next matching filename, or undef when it reaches the end. Since your code works correctly when glob is in list context, maybe some of the glob calls in scalar context actually share an iterator?
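A minimal sketch of the two contexts described above, using a hypothetical temporary directory with three files (the file names are made up for the example, and the temporary path is assumed to contain no whitespace):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data: one directory with three .txt files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.txt)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $dir/$name: $!\n";
    close $fh;
}

# List context: all matching filenames at once.
my @all = glob("$dir/*.txt");

# Scalar context: one filename per evaluation, undef once exhausted.
my @one_by_one;
while (defined(my $file = glob("$dir/*.txt"))) {
    push @one_by_one, $file;
}

print scalar(@all), " ", scalar(@one_by_one), "\n";    # 3 3
```

The while loop works because it keeps evaluating the same glob until it returns undef; a single boolean test, as in the question, evaluates it only once per call and leaves the rest of the list pending.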

Re: Strange behaviour when using glob in if condition
by haukex (Archbishop) on Apr 11, 2017 at 13:31 UTC
    # do nothing if the directory is not empty
    return if glob("$directory/*");

    glob skips filenames beginning with a dot; try: return if glob("$directory/{*,.*}");

    Update: Actually, upon re-reading the question I think I misunderstood what you are trying to do. Update 2: Yep, see my other reply for the solution.
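A different technique from anything proposed in the thread is to read the directory with opendir/readdir instead of glob. That sees dot files directly and involves no glob iterator state. The sketch below is only one possible shape for such a check: the helper name only_marker_files, the prefix argument, and the temporary test directories are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns 1 when $directory contains nothing except
# entries whose names start with $prefix ('.' and '..' are ignored).
sub only_marker_files {
    my ($directory, $prefix) = @_;
    opendir(my $dh, $directory) or die "cannot open $directory: $!\n";
    my @others = grep {
        $_ ne '.' && $_ ne '..' && index($_, $prefix) != 0
    } readdir $dh;
    closedir $dh;
    return @others ? 0 : 1;
}

# Hypothetical test data: one directory with only a marker file,
# one with a marker file plus a subdirectory.
my $prefix     = '.EcuRep-Lenovo_';
my $marker_dir = tempdir(CLEANUP => 1);
my $mixed_dir  = tempdir(CLEANUP => 1);
for my $dir ($marker_dir, $mixed_dir) {
    open my $fh, '>', "$dir/${prefix}outofscope"
        or die "cannot create marker file: $!\n";
    close $fh;
}
mkdir "$mixed_dir/2017-04-11" or die "cannot create subdirectory: $!\n";

print only_marker_files($marker_dir, $prefix), "\n";    # 1
print only_marker_files($mixed_dir,  $prefix), "\n";    # 0
```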