fleurdmiller has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a little script that reads in a CSV file and stuffs each line into an array, as below:
[...]
# Read in the file
while (my $line = <FILE>) {
    if ($csv->parse($line)) {
        @row = $csv->fields();
        push @data, [ @row ];
    }
}
One row in @data looks roughly like this (I produced the line below with this code):
for ($i=0; $i < $lines; $i++) {
    for (my $j=0; $j < 6; $j++) {
        print "$data[$i][$j] ";
    }
    print "\n";
}

QP 2/27/2008 555 1000.00 2493819320
Anyway, what I need to do is:
1) Generate the sum of element 3 where element 1 reoccurs in the array, i.e., tally the day's sales.
2) Print out a line with the sum of the day's sales.
3) Print the row with each individual sale, with element 3 made negative.
When I was merely producing one entry per row in the array, I used the code below, which worked just fine, but summing the sales by date is throwing me.
for ($i=0; $i < $lines; $i++) {
    $amount = $data[$i][3] * -1;
    print OUTFILE "some header information\n";
    print OUTFILE "some more header info\n";
    print OUTFILE "$data[$i][1]\t1350 Account\t$data[$i][3]\t$data[$i][0]\t$data[$i][4]\n";
    print OUTFILE "$trnsid\tt$data[$i][1]\t2600Other:$data[$i][2]\t$amount\t$data[$i][0] $data[$i][4]\n";
}

Replies are listed 'Best First'.
Re: Silly array-ish question.
by gwadej (Chaplain) on Oct 23, 2008 at 16:37 UTC

    Whenever I see a "sum this if that matches" type problem, I always think hash.

    Given your 2-d array above, how about:

    my %sums;
    for( my $i = 0; $i < $lines; ++$i ) {
        $sums{$data[$i][1]} += $data[$i][3];
    }

    The %sums hash now contains the sums keyed by the day (if I'm understanding what field 1 is).
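
    As a minimal sketch of how these sums could feed the output the question asks for (a total line per day, then each sale with the amount negated): this assumes the field layout of the sample row, with the date in field 1 and the amount in field 3. The sample rows, the %rows_for and @dates helpers, and the output layout are made up for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample rows in the same layout as the question's @data.
    my @data = (
        [ 'QP', '2/27/2008', '555', '1000.00', '2493819320' ],
        [ 'QP', '2/27/2008', '556', '250.50',  '2493819321' ],
        [ 'QP', '2/28/2008', '557', '75.25',   '2493819322' ],
    );

    my ( %sums, %rows_for, @dates );
    for my $row (@data) {
        my $date = $row->[1];
        push @dates, $date unless exists $sums{$date};   # remember first-seen order
        $sums{$date} += $row->[3];                       # running total per day
        push @{ $rows_for{$date} }, $row;                # keep the day's rows together
    }

    my @out;
    for my $date (@dates) {
        # One total line per day, then that day's sales with the amount negated.
        push @out, sprintf "%s\tTOTAL\t%.2f", $date, $sums{$date};
        for my $row ( @{ $rows_for{$date} } ) {
            push @out, sprintf "%s\t%s\t%.2f", $date, $row->[2], -1 * $row->[3];
        }
    }
    print "$_\n" for @out;
    ```

    Keeping the rows grouped by date alongside the sums means the file only has to be walked once before printing.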

    Personally, I would tend to write this much more compactly, but this should work and is easier to understand.
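
    One possible compact form, offered as a guess at what "more compactly" might look like rather than what the author necessarily had in mind, iterates over the rows directly with a statement modifier. The sample rows are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample rows in the same layout as the question's @data.
    my @data = (
        [ 'QP', '2/27/2008', '555', '1000.00', '2493819320' ],
        [ 'QP', '2/27/2008', '556', '250.50',  '2493819321' ],
    );

    # Same summation as the explicit loop, without the index variable.
    my %sums;
    $sums{ $_->[1] } += $_->[3] for @data;

    printf "%s %.2f\n", $_, $sums{$_} for sort keys %sums;
    ```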

    Update: fixed stupid typos. (Thanks JadeNB).

    G. Wade

      Indeed. Though you are into some form of date processing if:

      • you cannot depend on every instance of the date for a given day being in exactly the same form.
      • you want to extract the sum for any given day.
      • you want to spit out the sums in any particular order.

      You're right. For a complete solution, I'd probably have done more. At the very least, I should have shown some code to walk the hash in some reasonable order to print the results.

      foreach my $date (sort by_date keys %sums) {
          print "$sums{$date}\n";
      }

      where by_date is a subroutine that can compare the two dates in $a and $b appropriately for sort.
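
      A minimal sketch of what such a by_date could look like, assuming every key has the M/D/YYYY form shown in the question's sample row. The body of the subroutine and the sample sums are illustrative, not taken from the original post.

      ```perl
      use strict;
      use warnings;

      # Hypothetical comparator: assumes every key looks like M/D/YYYY.
      # Compares year first, then month, then day, all numerically.
      sub by_date {
          my ( $am, $ad, $ay ) = split m{/}, $a;
          my ( $bm, $bd, $by ) = split m{/}, $b;
          return $ay <=> $by || $am <=> $bm || $ad <=> $bd;
      }

      # Made-up sums, keyed by date as in the earlier snippet.
      my %sums = ( '10/1/2008' => 5, '2/27/2008' => 1250.5, '12/31/2007' => 7 );

      my @sorted = sort by_date keys %sums;
      print "$_\t$sums{$_}\n" for @sorted;
      ```

      A plain string sort would put '10/1/2008' before '2/27/2008', which is why the fields are compared numerically.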

      For a more robust solution, you could also take the date values and convert them into some canonical form before using them as keys in the hash. (If that's easier to sort, so much the better.)
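
      As a rough sketch of that idea, assuming the dates arrive as M/D/YYYY with or without leading zeros: the canonical_date helper below is hypothetical, and it leaves any other format untouched rather than guessing.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: turn M/D/YYYY into YYYY-MM-DD. Anything that
      # does not match that pattern is returned unchanged.
      sub canonical_date {
          my ($date) = @_;
          if ( $date =~ m{^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*$} ) {
              return sprintf '%04d-%02d-%02d', $3, $1, $2;
          }
          return $date;
      }

      # Hypothetical sample rows; note the two spellings of the same day.
      my @data = (
          [ 'QP', '2/27/2008',  '555', '1000.00', '2493819320' ],
          [ 'QP', '02/27/2008', '556', '250.50',  '2493819321' ],
          [ 'QP', '10/1/2008',  '557', '75.25',   '2493819322' ],
      );

      my %sums;
      $sums{ canonical_date( $_->[1] ) } += $_->[3] for @data;

      # YYYY-MM-DD keys sort chronologically with the default string sort.
      my @dates = sort keys %sums;
      printf "%s\t%.2f\n", $_, $sums{$_} for @dates;
      ```

      With keys in this form, both spellings of a day land in the same bucket and no custom comparator is needed.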

      To start with, though, I really didn't want to generate a response that was too complicated to understand in the simple case.
      G. Wade