LNEDAD has asked for the wisdom of the Perl Monks concerning the following question:

When using map to iterate over an array, it appears that the elements which are processed are the same no matter how the array is changed during the map operation. Is this always going to be the case with map? On every system? See example code below.

I am writing a script to interact with a vendor program. I need to take an array of files and directories, iterate over the elements, and (1) add files to the array so that they are NOT "processed" by map and (2) add directories to the array so that they are "processed". The process step takes a directory, calls a program given by the vendor and returns a list of files and directories. I know this process sounds like recursion, but given a directory it should take one call to the vendor's program to get all the desired files that I need in my final output.

The result will be an array of files. Should I use something other than map to get my desired result?

Code:

my @a = qw( a1 a2 a3 a4 a5 );
map(&process, @a);
print join " ", @a;

sub process {
    push(@a, "x");
    unshift(@a, "y");
    print "array has " . scalar @a . " elements. This one is " . "====$_\n";
}

Output:

array has 7 elements. This one is ====a1
array has 9 elements. This one is ====a2
array has 11 elements. This one is ====a3
array has 13 elements. This one is ====a4
array has 15 elements. This one is ====a5
y y y y y a1 a2 a3 a4 a5 x x x x x

Replies are listed 'Best First'.
Re: Unshift and push inside of map operation.
by jhourcle (Prior) on Sep 23, 2008 at 02:52 UTC

    I'd probably use two arrays -- one to be processed, and one of what's already been processed:

    my @toprocess = qw (a1 a2 a3 a4 a5);
    my @processed = ();
    my $i = 10; # avoid an infinite loop
    while (my $item = shift @toprocess) {
        push @toprocess, 'x' ;
        unshift @processed, 'y' ;
        push @processed, $item;
        print "array has ". scalar @toprocess. " items left. This one is $item\n";
        last unless $i--; # break out of infinite loop for this example
    }

      while (my $item = shift @toprocess)

      will fail if one of the items evaluates to false. You want to check "if an item has been returned", not "if an item has been returned and it's true". Fix:

      while (my ($item) = shift @toprocess)

      A list assignment in scalar context returns the number of items being assigned.

      Update: Oops, I'm used to dealing with iterators that return () when out of data. As per tye's reply, my fix makes things worse. The following allows false values without introducing an infinite loop:

      while (@toprocess) {
          my $item = shift @toprocess;

      Due to the extra complexity, it might make more sense to use the simpler original if it's safe.

        I suspect you haven't tested this. It might seem reasonable to assume that shift on an empty array returns an empty list. However, my testing on Perl 5.8.8 and 5.10 shows that shift @empty returns a single undef, not an empty list and your example produces an endless loop.

        - tye        
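
The point about shift on an empty array can be checked with a short script. This is only a sketch of the behaviour described above; the queue contents are made up for illustration.

```perl
use strict;
use warnings;

my @empty;

# In list context, shift on an empty array still yields one element: undef.
my @got   = ( shift @empty );
my $count = () = shift @empty;    # list assignment in scalar context counts the items

print "shift returned ", scalar(@got), " item(s)\n";

# Testing the array itself lets false items such as '0' or '' through
# and still stops once the queue is empty.
my @queue = ( 'a1', '0', '', 'a2' );
my @seen;
while (@queue) {
    my $item = shift @queue;
    push @seen, $item;
}
print "processed ", scalar(@seen), " items\n";
```

Because the count is 1 rather than 0, a loop of the form while (my ($item) = shift @queue) never sees a false condition, which is why testing the array directly is the safer form.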

Re: Unshift and push inside of map operation.
by jethro (Monsignor) on Sep 23, 2008 at 02:52 UTC

    map and also foreach don't work well with changing arrays. You might use something like this:

    while (@a) {
        $dirtocheck = shift @a;
        ($dirs, $files) = vendorprogram($dirtocheck);
        unshift @a, @$dirs;
        push @result, $dirtocheck, @$files;
    }

    This would give you a list of all the dirs and files the vendorprogram finds ordered depth first.
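
A self-contained version of that loop might look like the sketch below. The real vendor program is unknown here, so vendorprogram is a hypothetical stub backed by a made-up %tree hash; only the loop structure comes from the reply above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the vendor program: maps a directory name to
# the sub-directories and files it reports. All names are invented.
my %tree = (
    'top'     => { dirs => ['top/sub'], files => ['top/a.txt'] },
    'top/sub' => { dirs => [],          files => ['top/sub/b.txt'] },
);

sub vendorprogram {
    my ($dir) = @_;
    my $entry = $tree{$dir} or return ( [], [] );
    return ( $entry->{dirs}, $entry->{files} );
}

my @todo = ('top');
my @result;
while (@todo) {
    my $dirtocheck = shift @todo;
    my ( $dirs, $files ) = vendorprogram($dirtocheck);
    unshift @todo, @$dirs;    # put sub-directories at the front: depth first
    push @result, $dirtocheck, @$files;
}
print "$_\n" for @result;
```

Using push instead of unshift for the sub-directories would turn the traversal into breadth first without any other change.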

      map and also foreach don't work well with changing arrays.

      That's an odd thing to say about map since map has nothing to do with arrays. map iterates over a list, and it has no problem working with a list built from an array. Changes to the array are ignored because map doesn't iterate over arrays.

      foreach has optimizations which can cause unreliable behaviour when modifying the array over which it iterates, but map doesn't. It's very reliable in its behaviour.
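
A small script shows the behaviour being described, using the same push and unshift as the original question. It reflects what the posted output shows, not a documented guarantee; the @visited array is added here only to record what map saw.

```perl
use strict;
use warnings;

my @a = qw(a1 a2 a3 a4 a5);
my @visited;

# The block grows @a on every call, yet map still visits only the five
# elements that were in the list when it started.
map { push @a, 'x'; unshift @a, 'y'; push @visited, $_ } @a;

print "visited: @visited\n";
print "array has ", scalar(@a), " elements\n";
```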

        foreach doesn't, but that's an odd thing to say about map since map has nothing to do with arrays. map iterates over a list, and it has no problem working with a list built from an array.

        You are assuming a lack of optimizations and levels of optimization are very much subject to change. I strongly suspect that there are older versions of Perl that didn't optimize foreach over a single array. I suspect that there will be future versions of Perl that can optimize map/grep over a single array. Indeed, I'm not convinced that there aren't already versions of Perl that can optimize such (nor have I seen evidence that there are such, but I prefer to not assume things about levels of optimization exactly because they are so subject to change).

        It is merely an optimization that makes foreach treat a single array differently than some other list. There is little to stop map or grep from acquiring a similar optimization. So I find it "odd" for you to make such a stark distinction between them.

        - tye        

        I meant "doesn't work well" in the sense LNEDAD wanted to use it.
Re: Unshift and push inside of map operation.
by wol (Hermit) on Sep 23, 2008 at 13:56 UTC
    In my opinion (or "I personally believe" if you must) modifying an array with push/pop inside a map statement is one of those hints that there's a better way to go about a problem! :-)

    The canonical form of map (according to me) is used to return a mapped version of the input:

    @a = map {
        # A function of $_, eg:
        $_ . ".txt"
    } @a;

    One of the things I find particularly useful is that the block of code can evaluate to a list, including an empty one. Ie the output of map need not be the same length as the input. Eg

    # Emulate grep with map
    @a = map { elementShouldBeIncluded($_) ? $_ : () } @a;

    # Double up the list
    @a = map { ($_, $_) } @a;

    In this particular case, this feature may be exactly what you're after, depending on what exactly your vendor-supplied program does. Consider the following structure:

    @a = map {
        # Is it a dir?
        (-d $_) ?
            # Expand it (somehow) into a list of files
            turnDirIntoFilesUsingVendorProg($_) :
            # It's a file - pass it straight through
            $_
    } @a;

    Hope that helps.
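
To make the structure above runnable without a real vendor program, here is a minimal sketch. The %listing hash and the is_dir and expand_dir helpers are hypothetical stand-ins for the -d test and the vendor call.

```perl
use strict;
use warnings;

# Hypothetical listing standing in for the vendor program's output.
my %listing = (
    'dirA' => [ 'dirA/f1', 'dirA/f2' ],
    'dirB' => [],
);

sub is_dir     { exists $listing{ $_[0] } }
sub expand_dir { @{ $listing{ $_[0] } } }

my @a = ( 'file1', 'dirA', 'dirB', 'file2' );

# A directory is replaced by its files (possibly none); a file passes through.
@a = map { is_dir($_) ? expand_dir($_) : $_ } @a;
print "@a\n";
```

The empty directory simply disappears from the result, because the block returns an empty list for it.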

      The map solution was so cool and perlish that I had to try recursion with it!

      use Cwd qw(getcwd realpath);

      sub perlish {
          chomp;
          $_ = realpath($_);    # follows links, makes full path
          ( -d ) ? (
              $processed{$_} ? () : do {
                  my $basedir = $_;
                  $processed{$_} = 1;    # prevent infinite recursion into links
                  # turn results of ls into relative paths before recursion
                  # and avoids using chdir
                  map &perlish, ( map {"$basedir/$_"} `ls $_` );
              }
          ) : $_;
      }
      %processed = ( getcwd() => 1 );
      print join "\n", (map &perlish, `ls`),"\n";

      I love recursive maps now! It is MUCH prettier if you don't have to worry about symbolic links to dirs you already checked or relative paths:

      sub perlish {
          chomp;
          ( -d ) ? map &perlish, `find $_/* -maxdepth 0` : $_;
      }
      my @files = map &perlish, `find . -maxdepth 0`;

Re: Unshift and push inside of map operation.
by JavaFan (Canon) on Sep 23, 2008 at 10:45 UTC
    If you want to iterate over an array you are modifying (of which I won't say it's a smart idea), it may be best to use a C-style for loop. Just remember to increment the loop variable when you are unshifting. Here's an example:

    my @a = qw [a1 a2 a3 a4 a5];
    for (my $i = 0; $i < @a; $i++) {
        my $element = $a[$i];
        if ($element =~ /[135]/) {
            push @a, "x";
        }
        else {
            unshift @a, "y";
            $i++;   # Don't forget this!
        }
        print "array has " . @a . " elements. This one is '$element'\n";
    }
    print "array is now (@a)\n";
    __END__
    array has 6 elements. This one is 'a1'
    array has 7 elements. This one is 'a2'
    array has 8 elements. This one is 'a3'
    array has 9 elements. This one is 'a4'
    array has 10 elements. This one is 'a5'
    array has 11 elements. This one is 'x'
    array has 12 elements. This one is 'x'
    array has 13 elements. This one is 'x'
    array is now (y y y y y a1 a2 a3 a4 a5 x x x)

Re: Unshift and push inside of map operation.
by juster (Friar) on Sep 23, 2008 at 08:52 UTC

    It seems like you are trying to use some kind of priority queue or stack or something, with directories getting processed out and files being stuck in. So you take a directory out of the queue (let's say on the right), process the directory and stick the files in the queue (on the left), where they are never taken out. You keep doing this until there are no more directories, and then you have a list of file results? Interesting!

    I made a recursive version before (I hope) I realized what was going on then tried to emulate what you were talking about.