ritontor has asked for the wisdom of the Perl Monks concerning the following question:

I have to admit up front, I'm not a terribly good Perl programmer, and this little attempt at being "perlish" is stumping the hell out of me. Essentially, I'm trying to take the values in an array, split them on whitespace, append a newline to each piece, and then store it all in another array. I'm trying to be what I consider to be "tricky" by using map and an anonymous subroutine, but I seem to be missing something entirely.

here's a sample set of data:

foo bar
   munch
bar
foo bar

which should end up with each word stored in a separate array element. This is the code I've tried:

@temp = map { join '\n', split /\s/, $_ }, @file;

anyone feel like lending a helping hand? I could do it with a bunch of foreach loops and temp variables, but i have this itch to do it the hard way that really needs to be scratched. ;)

Re: using map and anonymous subroutines
by dragonchild (Archbishop) on Apr 20, 2004 at 12:56 UTC
    Think about it this way - you're dealing with a group of stuff. If you want to remove things from that group, you use grep. If you want to transform each thing in that group, you use map. If you want to reorder the group, you use sort.

    Every single grep and map should be doing one and only one thing. You will want to chain them together, right-to-left, to make them do what you want.

    @temp = map { "$_\n" } map { split } @file;

    Read it from right to left.

    1. I'm starting with @file
    2. I take each element and apply split to it. (The default is to split $_ on whitespace.)
    3. I take each element of the new list and add a newline to it.
    4. Put this into @temp.

    Remember - you're going to have more things in your group after the first map.
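
    The steps above can be tried out with a minimal sketch. It assumes @file holds the four sample lines from the question; the intermediate @words array is only there to make the two stages visible and is not part of the original one-liner.

```perl
use strict;
use warnings;

# Sample lines from the original question (the second has leading spaces).
my @file = ("foo bar\n", "   munch\n", "bar\n", "foo bar\n");

# Right to left: first split every line into words...
my @words = map { split ' ', $_ } @file;

# ...then append a newline to every word.
my @temp = map { "$_\n" } @words;

printf "%d lines in, %d words out\n", scalar @file, scalar @temp;
print @temp;
```

    With this input the first map turns 4 lines into 6 words, which is the "more things in your group" effect.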

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

Re: using map and anonymous subroutines
by diotalevi (Canon) on Apr 20, 2004 at 12:49 UTC

    This isn't tricky, it just takes into account that the first map doing the split will create more elements and then the next map just appends newlines to everything it gets.

    @result = map "$_\n", map split, @file;
Re: using map and anonymous subroutines
by periapt (Hermit) on Apr 20, 2004 at 14:13 UTC
    Basically, I think dragonchild has it right. The "trick" is to build the map expression piece by piece. I'm unclear about the use of the array @file. If you are reading data from a file, or line by line in some way, and loading it into @file, you might consider processing the file directly as
    use strict;
    use warnings;

    my @tmp01 = ();
    push @tmp01, map { $_ . "\n" } split " " while <DATA>;
    exit;

    __DATA__
    foo bar
       munch
    bar
    foo bar

    The split " " construct ensures that the function splits on whitespace, /\s+/, after skipping any leading whitespace, although when I tried split with and without the " " argument, it worked correctly both ways.
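
    A small sketch of why split behaved the same with and without the " " argument, and where a regex pattern differs. With no arguments, split works on $_ using the special " " pattern, so the two calls are equivalent. The variable names below are made up for illustration.

```perl
use strict;
use warnings;

my $line = "   munch\n";

my @awk_mode = split ' ',   $line;   # special case: leading whitespace is skipped
my @regex    = split /\s+/, $line;   # plain regex: an empty leading field is kept

print scalar(@awk_mode), " field(s) with ' '\n";
print scalar(@regex),    " field(s) with /\\s+/\n";

# Bare split is shorthand for split ' ', $_
my @bare;
for ($line) { @bare = split }
print "bare split matches ' '\n" if "@bare" eq "@awk_mode";
```

    For lines without leading whitespace the two patterns give the same words, which is why the difference is easy to miss.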

      The following variant of the above is how I would write it.
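      # SNIP #
      my @temp = map { $_ . "\n" } map { split(/\s+/) } <DATA>;
      # SNIP #
      It is functionally the same for lines that start with a word (unlike " ", the /\s+/ pattern keeps an empty leading field when a line begins with whitespace). I just find it easier to read, but then my views of readability don't always align with others.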
Re: using map and anonymous subroutines
by flyingmoose (Priest) on Apr 20, 2004 at 13:36 UTC
    I love my map and anonymous subs! That's about the most fun part of Perl for me... :) Go go functional programming!

    Essentially, i'm trying to take the values in an array, split them on white space, append a newline, and then store it all in another array

    I'd do it like this:

    my @file = ("eddie van halen", "david lee roth", "alex van halen", "michael anthony");
    my @stuff = map { "$_\n"; } map { split /\s+/, $_; } @file;
    print @stuff;

    edit: doh! dragon beat me to it (that's what I get for not reading responses first), but ah well at least I arrived at the exact same solution (I'm just overly explicit -- call it paranoia) so I guess it proves we are likely equally insane or something like that :)

    If you want to remove things from that group, you use grep
    Not just that, but for those that don't know grep, it's also very nice for searching and counting! Essentially you are counting by removing all of the "non-hits" and then checking the scalar value of the array result, aka cardinality of the set. R0XX0R! (err, sorry, 1337 speak outbreak...)
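
    A quick sketch of the counting idea, assuming a made-up word list: in list context grep returns the matching elements, and in scalar context it returns how many there were.

```perl
use strict;
use warnings;

my @words = qw(foo bar munch bar foo bar);

my @bars  = grep { $_ eq 'bar' } @words;   # list context: the matching elements
my $count = grep { $_ eq 'bar' } @words;   # scalar context: the number of matches

print "found $count of: @bars\n";
```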
Re: using map and anonymous subroutines
by Fletch (Bishop) on Apr 20, 2004 at 14:10 UTC

    Technically map takes either an expression or a block of code, not an anonymous sub.</pedant>
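
    To illustrate the distinction, here is a sketch of the two forms of map next to an actual anonymous sub. The names @words and $add_newline are hypothetical and only serve the example.

```perl
use strict;
use warnings;

my @words = qw(foo bar munch);

my @with_block = map { "$_\n" } @words;   # map BLOCK LIST: no comma after the block
my @with_expr  = map "$_\n", @words;      # map EXPR, LIST: comma required

# A real anonymous sub has to be called explicitly inside the block.
my $add_newline = sub { "$_[0]\n" };
my @with_sub    = map { $add_newline->($_) } @words;

print "all three agree\n"
    if "@with_block" eq "@with_expr" and "@with_expr" eq "@with_sub";
```

    This also explains the syntax error in the original attempt: a comma after the block is only valid in the EXPR form.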

Re: using map and anonymous subroutines
by ihb (Deacon) on Apr 20, 2004 at 18:21 UTC

    A little note on efficiency.

    First create a file (win32):

    perl -wle"for ('00001'..'10000') { print join ' ', ($_) x 20; }" > temp.tmp

    Then get @temp in various ways and look at peak memory consumption.

    open my $fh, 'temp.tmp' or die $!;

    # Alternative 1: 40 984 K
    my @temp = map { "$_\n" } map { split ' ' } <$fh>;

    # Alternative 2: 29 076 K
    my @temp = map { split ' ' } <$fh>;
    $_ .= "\n" for @temp;

    # Alternative 3: 14 816 K
    local $_;
    my @temp;
    push @temp => map "$_\n", split ' ' while <$fh>;

    Alternative 2 is less memory hungry (as well as faster) because Perl doesn't have to build up an extra list. Alternative 3 is even less memory hungry because it doesn't slurp the file before building up @temp. Of course, there could've been a fourth alternative having alternative 2's for loop, but I think the point is made already.

    (I use ActivePerl v5.8.0 built for MSWin32-x86-multi-thread, build 806.)
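
    The fourth alternative mentioned above might look roughly like the sketch below: read line by line as in alternative 3, then append the newlines in place as in alternative 2. It was not part of the measurements, so no memory figure is given, and the in-memory filehandle is only a stand-in for temp.tmp so that the sketch runs on its own.

```perl
use strict;
use warnings;

# Stand-in for temp.tmp: an in-memory filehandle with the sample data.
my $data = "foo bar\n   munch\nbar\nfoo bar\n";
open my $fh, '<', \$data or die $!;

my @temp;
while (my $line = <$fh>) {
    push @temp, split ' ', $line;   # one line at a time, no slurping
}
$_ .= "\n" for @temp;               # append in place, no second list

close $fh;
print @temp;
```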

    Hope this helps,
    ihb

      We all have to have uses for our overly powerful pentium toaster units, don't we ? :)

      No, seriously, thanks for the feedback, though I would have hoped case 1 and case 2 would be equivalent. Essentially map is going to create an array, so are we saying that assignment to an array is slower than a push, or are we instead saying that the blocks { } are the slowdown? Either way, it seems we could get more granularity from this test to learn more about the internals.

      Anyhow, this all really matters when dealing with the size of input, usually memory is cheap and in surplus, so you just want speed ... yet not always...

      As a last note, the fastest way is not always the best way, given that readability and maintainability are important. Hence you might have to add blocks later, or maybe blocks add clarity, and definitely the push appears hackish -- if not indicative of a bug somewhere. I wouldn't leave production code using that final push.

      This feels very wrong to me.

      When you have data in an array, generally there should be no "\n" in the data.

      Once you are ready to output your data to the screen or to a file, then you might introduce newlines, either by using join, or by temporarily setting $" or by other similar means.
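
      A minimal sketch of both suggestions, assuming the words are kept without newlines in a hypothetical @words array:

```perl
use strict;
use warnings;

my @words = qw(foo bar munch);

# Option 1: join adds the separators only at output time.
my $joined = join "\n", @words;
print $joined, "\n";

# Option 2: $" is the separator used when an array is interpolated in a string.
# local restores the default (a single space) at the end of the block.
{
    local $" = "\n";
    print "@words\n";
}
```

      Either way the data in @words stays free of newlines and can be reused for other output formats.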

      --
      TTTATCGGTCGTTATATAGATGTTTGCA

Re: using map and anonymous subroutines
by ritontor (Acolyte) on Apr 20, 2004 at 16:30 UTC
    Thanks everyone, once again perlmonks has proven itself to be the greatest site on the entire internet. I think I am going to go absolutely out of my way to use maps for as many things as I can from now on, of course never breaking the "don't use in a void context" rule ;)
      I believe map in void context was fixed a while back in Perl version something-or-other; someone else might remember when.

      As for maps, don't forget the elegance of a simple for/foreach for aliasing .. often it is more readable if you don't need to capture an array but just need to mangle one.

      $_++ for @array;

      Is the same as:

      @array = map { $_ + 1 } @array;   # not $_++, which would return the old values
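
      A sketch of the aliasing point with a small numeric array (all names are illustrative). One thing to watch: inside map the block has to return the new value, and a post-increment returns the old one while still modifying the source through the alias.

```perl
use strict;
use warnings;

my @in_place = (1, 2, 3);
$_++ for @in_place;                       # $_ aliases each element: now (2, 3, 4)

my @original = (1, 2, 3);
my @copy     = map { $_ + 1 } @original;  # new list, @original is untouched

my @source = (1, 2, 3);
my @old    = map { $_++ } @source;        # yields the old values (1, 2, 3),
                                          # while @source itself becomes (2, 3, 4)

print "@in_place | @copy | @original | @old | @source\n";
```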

      And also this can be occasionally cool:

      print for (
          "Henry the eighth I am I am\n",
          "Henry the eighth I am\n",
          "I got married to the window next door\n",
          "She's been married 7 times before\n"
      );

      So anyway, map is cool, it has many good places, but often a foreach (even if it is more code) can be just as cool, and often more readable. I'm currently in the process of simplifying a functionally-styled CLI interface to use more foreach and less map, because after going home and coming back in the morning (and not having my Mountain Dew yet), the foreach is easier to read.

      But in your original example, yes, map seems to be the trick because the cardinality (err, size) of the list is changing and you don't want to mangle the original.