in reply to array of arrays

Yes, kevbot's solution with a hash is good. Here is what I typed while kevbot was also typing. This version preserves the numbers in each row in case more calculations are needed. If there are only 30 rows, a hash of arrays is a good idea; with 100,000 rows, that advice would change. Here the hash removes the need to deal with index [0] of each row of the array of arrays (2-D array). A more memory-efficient approach is to process the input row by row and output a result line whenever the first number changes.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my %HoA;   # a Hash of Arrays

while (my $line = <DATA>)
{
   my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;
   push @{$HoA{$bucket}},$num;
}

my $total;
foreach my $key (sort {$a<=>$b} keys %HoA)
{
   my $line_total;
   foreach my $num (@{$HoA{$key}})
   {
      $line_total += $num;
   }
   print "Line $key total = $line_total\n";
   $total += $line_total;
}
print "Grand Total = $total\n";

=Prints
Line 1 total = 150
Line 2 total = 75
Line 3 total = 55
Grand Total = 280
=cut

__DATA__
1|10
1|20
1|30
1|40
1|50
2|15
2|25
2|35
3|1
3|2
3|3
3|4
3|5
3|6
3|7
3|8
3|9
3|10
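
Because the hash of arrays keeps every number, further per-row calculations are easy to bolt on afterwards. The following is only a sketch of that idea: build_hoa and row_stats are hypothetical helper names that are not part of the code above, and the input is passed in as a list instead of being read from DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum min max);

# Hypothetical helper: build a hash of arrays from "bucket|number" lines.
sub build_hoa {
    my @lines = @_;
    my %HoA;
    foreach my $line (@lines) {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                      # skip lines that do not match
        push @{ $HoA{$bucket} }, $num;
    }
    return %HoA;
}

# Hypothetical helper: one report line per bucket, computed from the
# numbers that were preserved in each row.
sub row_stats {
    my ($hoa) = @_;
    my @report;
    foreach my $key (sort { $a <=> $b } keys %$hoa) {
        my @nums  = @{ $hoa->{$key} };
        my $count = @nums;                # array in scalar context = size
        my $sum   = sum(@nums);
        push @report, sprintf("Line %d: count=%d sum=%d min=%d max=%d mean=%.2f",
            $key, $count, $sum, min(@nums), max(@nums), $sum / $count);
    }
    return @report;
}

my %HoA = build_hoa(qw(1|10 1|20 1|30 1|40 1|50 2|15 2|25 2|35
                       3|1 3|2 3|3 3|4 3|5 3|6 3|7 3|8 3|9 3|10));
print "$_\n" for row_stats(\%HoA);
```

With the same data as above, the sums in the report should match the 150, 75 and 55 totals already shown.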

Re^2: array of arrays
by Anonymous Monk on Jun 12, 2017 at 15:23 UTC

    Hi Marshall:

    Thanx for this solution. I was going from bottom to top through all the suggestions, which is why, under your next solution, I asked for what you had already coded here (I didn't see it till now).

    I wonder if you can help me print the size of each anonymous array ($count) and all the results (number, sub total and grand total) in one line inside the while loop something like this:

    my $e = '0';
    while (my $line = <DATA>)
    {
       $e++;
       my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;
       push @{$HoA{$bucket}},$num;
       my $grand_total += $num;
       ## Sub Total = the running total per line up to 150, 75 and 55.?
       my $sub_total = "????";
       ## the size of each anonymous array
       ## @Count[$bucket] += $num; ## I know this is super wrong
       ## print"Row $e / Number Array ($bucket) / Num ($num) / Array Size ($Count[$bucket])/ Sub Total ($sub_total) / Grand_Total ($grand_total)/n";
    }


    Thanx beforehand

      Ok, glad that you saw my 2 solutions for your 2 different data formats. This is easier if you first build an AoA or a HoA and then, as a second step, do the sums. In my HoA solution the "number of numbers" is just the value of @{$HoA{$key}} in scalar context. Something like print "elements in array=".@{$HoA{$key}}."\n" should work.
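
      If the per-row line really has to come out of the loop that reads the data, one possible sketch is below. It assumes running totals are kept in variables declared outside the loop; running_report and %sub_total are hypothetical names that appear nowhere in the code above, and the input is passed in as a list instead of being read from DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one report line per input row, tracking the
# array size, the per-bucket subtotal and the grand total as it goes.
sub running_report {
    my @lines = @_;
    my (%HoA, %sub_total);
    my $grand_total = 0;    # declared outside the loop so the value persists
    my $row         = 0;
    my @report;
    foreach my $line (@lines) {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                      # skip lines that do not match
        $row++;
        push @{ $HoA{$bucket} }, $num;
        $sub_total{$bucket} += $num;      # running total for this bucket
        $grand_total        += $num;      # running total over all rows
        my $size = @{ $HoA{$bucket} };    # array in scalar context = size
        push @report, "Row $row / Number Array ($bucket) / Num ($num) / "
            . "Array Size ($size) / Sub Total ($sub_total{$bucket}) / "
            . "Grand Total ($grand_total)";
    }
    return @report;
}

print "$_\n" for running_report(qw(1|10 1|20 1|30 2|15 2|25 3|1));
```

      The difference from declaring a variable with my inside the loop body is that such a variable is created fresh on every pass, so it can never accumulate anything.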

      Now of course it is not necessary to even fiddle with an AoA or a HoA. You can just keep a running sum as you go. When the "line no" changes, print the current line results and start a "new line". The disadvantage is that the program logic is a bit more complicated, because you have to figure out "on the fly" when a new "line" starts and when it finishes.

      As an example, I coded one way to do this without creating the intermediate AoA or HoA. This code of course uses less memory, but that is probably not even a remote consideration for your application. Nowadays a temporary data structure of hundreds of MB is nothing! The "expense" of using less memory is the extra complication of more decisions. Not all lines of code are "equal": lines that make decisions are more error-prone than ones that don't. For short, non-critical "utilities" I prefer the simplest program logic that "gets the job done", because the code is less likely to have a bug. Sometimes I work on a module that, although what it does is "simple", must be made very efficient for the overall system to work (maybe it is used often or processes a lot of data). In that situation a lot more work in coding and testing is required. Programming is part science and part art.

      So here is yet another way... If you want a count of the "number of numbers" in each line, set up a variable that is incremented every time $line_total changes (either by assignment or by the addition of another value). I leave that as an exercise should you desire. When looping, there are often 3 phases to consider: a) how to get the loop started, b) what the loop normally does and c) what happens to finish the loop. Rather than starting the coding with (a), with experience you will code (b) first and then figure out how to make (a) and (c) happen.

      I do hope that my point about avoiding indices when possible sank in. Anyway, as a demo exercise, here is an algorithm that does not create a full in-memory representation of the data, but rather calculates as it goes:

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $line_total=0;
      my $total = 0;
      my $current_bucket = undef;

      while (my $line = <DATA>)
      {
         my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;

         if (!defined($current_bucket))  # start the first "bucket".
                                         # use of defined() instead of zero
                                         # as a flag allows for a "zero"
                                         # bucket which I added as a
                                         # test case.
         {
            $line_total = $num;
            $current_bucket = $bucket;
         }
         elsif ($bucket == $current_bucket) # "normal" case
         {
            $line_total += $num;
         }
         else  # a new "bucket" starts...
         {
            # output current bucket's results
            print "Line $current_bucket = $line_total\n";
            $total += $line_total;

            # We've already read a line for the next bucket.
            # Adjust values to start $line_total running for this
            # new "bucket"
            $line_total = $num;
            $current_bucket = $bucket;
         }
      }
      # print the last bucket's results to finalize output:
      print "Line $current_bucket = $line_total\n";
      $total += $line_total;

      ## This is the total result
      print "total=$total\n";

      =Prints
      Line 0 = 10
      Line 1 = 150
      Line 2 = 75
      Line 3 = 55
      total=290
      =cut

      __DATA__
      0|10
      1|10
      1|20
      1|30
      1|40
      1|50
      2|15
      2|25
      2|35
      3|1
      3|2
      3|3
      3|4
      3|5
      3|6
      3|7
      3|8
      3|9
      3|10
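
      For anyone who would rather compare notes than do the counting exercise from scratch, one possible sketch follows. It keeps the same three-way decision as the code above and adds a counter that changes whenever $line_total changes. stream_totals and $count are hypothetical names, the input is assumed to be grouped by bucket, and it is passed in as a list instead of being read from DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: streams over "bucket|number" lines (assumed to be
# grouped by bucket) and returns one result line per bucket plus the total.
sub stream_totals {
    my @lines = @_;
    my ($current_bucket, $line_total, $count);
    my $total = 0;
    my @out;
    foreach my $line (@lines) {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                          # skip lines that do not match
        if (!defined $current_bucket) {       # start the first bucket
            ($current_bucket, $line_total, $count) = ($bucket, $num, 1);
        }
        elsif ($bucket == $current_bucket) {  # "normal" case
            $line_total += $num;
            $count++;
        }
        else {                                # a new bucket starts
            push @out, "Line $current_bucket = $line_total (count $count)";
            $total += $line_total;
            ($current_bucket, $line_total, $count) = ($bucket, $num, 1);
        }
    }
    if (defined $current_bucket) {            # finish the last bucket
        push @out, "Line $current_bucket = $line_total (count $count)";
        $total += $line_total;
    }
    push @out, "total=$total";
    return @out;
}

print "$_\n" for stream_totals(qw(0|10 1|10 1|20 1|30 1|40 1|50 2|15 2|25 2|35));
```

      The defined() check after the loop is an addition of this sketch: it avoids printing an undefined bucket when the input is empty.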