Re^3: array of arrays

OK, glad that you saw my two solutions for your two different data formats. This is easier if you first build an AoA or a HoA and then, as a second step, do the sums. In my HoA solution the "number of numbers" is just @{$HoA{$key}} evaluated in scalar context. Something like print "elements in array=".@{$HoA{$key}}."\n" should work, because the concatenation operator puts the array in scalar context.
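To make the scalar-context point concrete, here is a minimal sketch. The %HoA contents and the count_and_sum() helper are made up for illustration; they are not from the earlier solutions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes an array reference and returns
# (number of elements, sum of elements).
sub count_and_sum
{
    my ($aref) = @_;
    my $count = @$aref;        # array in scalar context = element count
    my $sum   = 0;
    $sum += $_ for @$aref;
    return ($count, $sum);
}

# Hypothetical sample data, keyed by "line no".
my %HoA = (
    1 => [10, 20, 30, 40, 50],
    2 => [15, 25, 35],
);

foreach my $key (sort { $a <=> $b } keys %HoA)
{
    my ($count, $sum) = count_and_sum($HoA{$key});
    print "Line $key: elements in array=$count sum=$sum\n";
}
```

With this sample data it should report 5 elements summing to 150 for line 1 and 3 elements summing to 75 for line 2.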

Now of course it is not necessary to even fiddle with an AoA or a HoA. You can just keep a running sum as you go. When the "line no" changes, print the current line results and start a "new line". The disadvantage is that the program logic is a bit more complicated, because you have to figure out "on the fly" when a new "line" starts and when it finishes.

As an example, I coded one way to do this without creating the intermediate AoA or HoA. This code of course uses less memory, but that is probably not even a remote consideration for your application. Nowadays a temporary data structure of hundreds of MB is nothing! The "expense" of using less memory is the extra complication of more decisions. Not all lines of code are "equal". Lines that make decisions are more error prone than ones that don't. For short, non-critical "utilities" I prefer the simplest program logic that "gets the job done", because the code is less likely to have a bug. Sometimes I work on a module that does something "simple" but must be made very efficient for the overall system to work (maybe it is used often or processes a lot of data). In that situation a lot more work in coding and testing is required. Programming is part science and part art.

So here is yet another way... If you want a count of the "number of numbers" in each line, then set up a variable that is incremented every time that $line_total is changed (either by assignment or by addition of an additional value). I leave that as an exercise should you desire. When looping, there are often 3 phases to consider: a) how to get the loop started, b) what the loop normally does and c) what happens to finish the loop. Rather than starting the coding with (a), with experience you will code (b) first and then figure out how to make (a) and (c) happen.
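If you want to check your answer to that exercise, here is one possible sketch, not the only way to do it. The summarize() sub and its "bucket:count:sum" return format are hypothetical, chosen only to keep the demo short. The (a), (b) and (c) comments mark the three loop phases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes "bucket|number" lines (already grouped by
# bucket) and returns one "bucket:count:sum" string per bucket.
sub summarize
{
    my @lines = @_;
    my @report;
    my ($current_bucket, $line_total, $line_count);

    foreach my $line (@lines)
    {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                       # skip malformed lines

        if (defined $current_bucket and $bucket == $current_bucket)
        {
            $line_total += $num;           # (b) the "normal" case
            $line_count++;
            next;
        }

        # (c) finish the previous bucket, if there was one
        push @report, "$current_bucket:$line_count:$line_total"
            if defined $current_bucket;

        # (a) start a new bucket with the line already read
        ($current_bucket, $line_total, $line_count) = ($bucket, $num, 1);
    }

    # (c) again: finish the last bucket
    push @report, "$current_bucket:$line_count:$line_total"
        if defined $current_bucket;

    return @report;
}

print "$_\n" for summarize("0|10", "1|10", "1|20", "2|15");
```

The counter is touched in exactly the two places where $line_total changes: set to 1 on assignment and incremented on addition.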

I do hope that my point about avoiding indices when possible sank in. Anyway, as a demo exercise, here is an algorithm that does not create a full in-memory representation of the data, but rather calculates as it goes:

#!/usr/bin/perl
use strict;
use warnings;

my $line_total=0;
my $total = 0;
my $current_bucket = undef;

while (my $line = <DATA>)
{ 
   my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
      or next;   # skip blank or malformed lines
   
   if (!defined($current_bucket))  # start the first "bucket". 
                                   # use of defined() instead of zero
                                   # as a flag allows for a "zero" 
                                   # bucket which I added as a 
                                   # test case.
   {
      $line_total = $num;
      $current_bucket = $bucket;
   }
   elsif ($bucket == $current_bucket) # "normal" case
   {
      $line_total += $num;    
   }
   else # a new "bucket" starts...
   {
      # output current bucket's results

      print "Line $current_bucket = $line_total\n";
      $total += $line_total;

      # We've already read a line for the next bucket.
      # Adjust values to start $line_total running for this
      # new "bucket"
      
      $line_total = $num;
      $current_bucket = $bucket;
   }
}

# print the last bucket's results to finalize output:

print "Line $current_bucket = $line_total\n";
$total += $line_total;

## This is the total result

print "total=$total\n";

=Prints
Line 0 = 10
Line 1 = 150
Line 2 = 75
Line 3 = 55
total=290
=cut


__DATA__
0|10
1|10
1|20
1|30
1|40
1|50
2|15
2|25
2|35
3|1
3|2
3|3
3|4
3|5
3|6
3|7
3|8
3|9
3|10