in reply to pushing similar lines into arrays

Use a hash to hold your arrays:
    while (<>) {
        my ($key, $rest) = split;
        $hash{$key} ||= [];
        push(@{$hash{$key}}, $_);
    }
This'll give you a hash like:
    (
        243_405 => [ "243_405 35 23 13", "243_405 46 21 15" ],
        241_333 => [ "241_333 65 32 20", "241_333 52 44 11" ]
    )
------------ :Wq Not an editor command: Wq

Re: Re: pushing similar lines into arrays
by davido (Cardinal) on Mar 18, 2004 at 06:28 UTC
    Just a few observations that might clean up your implementation.

    • Don't forget to chomp.

    • No need for the $hash{$key} ||= []; line. Yes, it creates a key pointing to an anonymous array, but so does the push on the next line, through autovivification. Even under use warnings; you don't need the ||= step.

    • I don't see any advantage to pushing $_ into the anonymous array while discarding $rest. I would think the way to go would be to use the three-argument form of split, so that you can limit split's output to two items: one containing the key, and one containing everything else. Your current code puts the key in $key, the first value in $rest, tosses out all the rest of the values, and then ultimately tosses out $rest too, only to push the whole line into the anon-array. Kinda weird, IMHO.
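The three points above could be sketched roughly as follows. This is only an illustration, assuming the OP wants the values stored without the key; group_lines and %groups are hypothetical names, not from the thread, and the sample lines are the ones shown in the parent post.

```perl
use strict;
use warnings;

# Group lines by their first whitespace-separated field.
# group_lines() and %groups are illustrative names, not from the thread.
sub group_lines {
    my @lines = @_;
    my %groups;
    for my $line (@lines) {
        chomp $line;                       # drop the trailing newline
        # Limit of 2: the key, then everything else as one string.
        my ($key, $rest) = split ' ', $line, 2;
        next unless defined $key;          # skip blank lines
        # No "||= []" needed: push autovivifies the array reference.
        push @{ $groups{$key} }, $rest // '';
    }
    return \%groups;
}

my $groups = group_lines(
    "243_405 35 23 13\n",
    "241_333 65 32 20\n",
    "243_405 46 21 15\n",
);
print "$_: ", join(' | ', @{ $groups->{$_} }), "\n" for sort keys %$groups;
```

Reading from a file handle instead would mean replacing the argument list with a while (<>) loop; the body stays the same.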

    See later in this thread for what I believe is an implementation closer to the OP's needs.


    Dave

      Eh, whatever. Those points are really just style issues. There are almost certainly some tiny performance ramifications to the way the split is done... but hardly an issue in such an example (really... very tiny ramifications).

      As for chomping the line... he said he wanted the line. He didn't say anything about the newline, so I didn't assume anything about the newline.

      As for the unnecessary ||= []: yes, I know that's unnecessary, but I do it anyways, always. It's just personal style. In anything other than a one-liner (where economy of characters is important) I avoid the implicit autovivification of undef into anonymous references, just because (I believe) it's clearer to someone reading the code if you are explicit. They don't have to wonder if that autovivification was an accident or not.
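For readers weighing the two styles, here is a small sketch showing that both build the same structure under strict and warnings. The hash names %implicit and %explicit and the sample keys are made up for the illustration.

```perl
use strict;
use warnings;

my %implicit;
my %explicit;

for my $key (qw(a b a)) {
    # Implicit: dereferencing the undef slot in push creates the array ref.
    push @{ $implicit{$key} }, "line for $key";

    # Explicit: create the array ref first, then push onto it.
    $explicit{$key} ||= [];
    push @{ $explicit{$key} }, "line for $key";
}

# Both hashes end up with the same number of elements under each key.
for my $key (sort keys %implicit) {
    printf "%s: implicit=%d explicit=%d\n",
        $key, scalar @{ $implicit{$key} }, scalar @{ $explicit{$key} };
}
```

Neither form emits a warning, so the choice comes down to how explicit you want the intent to be.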

      Finally, for the splitting... (again personal style), I find the 3-arg form of split to be ugly. Split and join are very pure and beautiful functional notions (string <=> list), and unless there's a compelling reason to mess with split's default behavior, I don't. Again, taking the unwanted stuff in and explicitly discarding it is also for readability and clarity of intent. If anything, looking at it now, I should have said my ($key, @rest) = split; to be clear that it was a list of arbitrary length and I was only interested in the first item.

      Anyway, no need to get into an argument over style. I just wanted to make clear that I didn't do that because I failed to understand... it was merely how I like to do it. ++ to you for your attention to detail, though.

      ------------ :Wq Not an editor command: Wq
        I ++'ed your original response anyway. The concepts were sound. I just questioned the implementation. I still wonder about the reason for splitting on whitespace into ($key, $rest) when $rest really isn't getting the "rest"... it's only getting one element out of several; the rest are falling into the bit bucket. But then again $rest falls into the bit bucket too, since it's never used later on. As you mention, ( $key, @rest ) makes more sense. But even more clear to the reader would be ( $key, undef ) = split;
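To make the difference between these forms concrete, here is a short sketch. It passes the pattern and string to split explicitly instead of relying on $_, and the variable names ($key1, $tail, and so on) are invented for the comparison.

```perl
use strict;
use warnings;

my $line = "243_405 35 23 13";

# Two scalars: $rest gets only the second field; later fields are dropped.
my ($key1, $rest) = split ' ', $line;

# Scalar plus array: @rest collects every remaining field.
my ($key2, @rest) = split ' ', $line;

# Scalar plus undef placeholder: states outright that the rest is unwanted.
my ($key3, undef) = split ' ', $line;

# Three-argument form with a limit of 2: the remainder stays one string.
my ($key4, $tail) = split ' ', $line, 2;

print "scalar rest: $rest\n";      # 35
print "array rest:  @rest\n";      # 35 23 13
print "limit 2:     $tail\n";      # 35 23 13
```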

        As for the chomp issue, I brought it up because without chomp you're not getting the same output that your post suggested you would get.
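A quick way to see the difference, assuming lines arrive with a trailing newline as they do from <>; the variable names here are made up:

```perl
use strict;
use warnings;

my $raw = "243_405 35 23 13\n";    # a line as <> would return it

my $chomped = $raw;
chomp $chomped;                    # removes the trailing newline

# The string shown in the parent post has no newline, so only the
# chomped copy matches it.
my $shown = "243_405 35 23 13";
printf "raw matches:     %s\n", ($raw     eq $shown ? "yes" : "no");
printf "chomped matches: %s\n", ($chomped eq $shown ? "yes" : "no");
```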

        Anyway, as you mentioned, these are mostly style issues, though we did have slightly different interpretations as to what kind of output the OP desired.


        Dave