Re: array of arrays
by kevbot (Vicar) on Jun 11, 2017 at 06:02 UTC
|
This can be done using a hash.
#!/usr/bin/env perl
use strict;
use warnings;
open my $fh, '<', 'test_file.txt' or die "Cannot open test_file.txt: $!";
my %data;
while (<$fh>) {
    chomp $_;
    my ($key, $val) = split(/\|/, $_);
    $data{$key} += $val;    # running total per key
}
foreach my $key ( sort { $a <=> $b } keys %data ) {
    print "the total is: (".$data{$key}.")\n";
}
exit;
| [reply] [d/l] |
|
|
Hello kevbot:
Thank you for your solution. Since this task is for me to learn, it took me a while to study the replies one by one from the bottom up. Yours was the last.
I would like to know if it is possible for you to show me how to print, on the fly, each row of the while loop with the array size and the sum total of each array.
Something like this:
my %data;
$e = '0';
while(<$fh>){
    $e++;
    chomp $_;
    my ($key, $val) = split(/\|/, $_);
    push @{$data{$key}},$val;
    $Grand_Total += "$val";
    my $size = scalar(@{$data{$key}}); ## WRONG doesn't print
    $Sub_Total[$key] += $data{$key}[$val]; ## WRONG doesn't print
    print"Row ($e) / Array Size [$size] key ($key) / Amount ($val) / Sub Total ($Sub_Total[$key]) / Grand_Total = ($Grand_Total)<hr>";
}
Hope you can do it
Thanx
virtualweb | [reply] [d/l] |
|
|
Replace
$Sub_Total[$key] += $data{$key}[$val];
with
$Sub_Total[$key] += $val;
You should then come back and tell us what you did wrong in the original.
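To make the substitution concrete, here is a minimal sketch of the loop with that one line changed. It assumes the same pipe-delimited input with numeric keys; the sub name running_rows and the in-memory list of lines are assumptions for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: takes "key|value" lines, returns one report string per row.
sub running_rows {
    my @lines = @_;
    my (%data, @Sub_Total, @rows);
    my $Grand_Total = 0;
    my $e = 0;
    for my $line (@lines) {
        chomp $line;
        $e++;
        my ($key, $val) = split /\|/, $line;
        push @{ $data{$key} }, $val;
        $Grand_Total += $val;
        my $size = scalar @{ $data{$key} };   # number of values seen for this key
        $Sub_Total[$key] += $val;             # add the value itself
        push @rows, "Row ($e) / Array Size [$size] key ($key) / Amount ($val) / "
                  . "Sub Total ($Sub_Total[$key]) / Grand_Total = ($Grand_Total)";
    }
    return @rows;
}

print "$_\n" for running_rows("1|10", "1|20", "2|15");
```

In the original line, $data{$key}[$val] uses the amount as an array index, so with amounts like 10 or 20 it looks up an element that was never stored.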
| [reply] [d/l] [select] |
|
|
Hello virtualweb,
My solution did not use an array. The contents of $data{$key} are a single scalar value (the running total of values for the given key). So, my code does not keep track of how many data entries are encountered for a given value of $key.
Here is a modified version of my code that will print out the information you request. Note, I'm still not using arrays. In this code, $data{$key} contains a hash reference with keys size and total. The value of size is the current number of elements found for the given $key. The value of total is the current total of values that have been encountered for the given $key.
#!/usr/bin/env perl
use strict;
use warnings;
open my $fh, '<', 'test_file.txt' or die "Cannot open test_file.txt: $!";
my %data;
my $row_number = 1;
my $grand_total = 0;
while (<$fh>) {
    chomp $_;
    my ($key, $val) = split(/\|/, $_);
    $data{$key}->{'size'}++;            # number of values seen for this key
    $data{$key}->{'total'} += $val;     # running total for this key
    $grand_total += $val;
    print "Row ($row_number) ".
          "/ Array Size [$data{$key}->{'size'}] key ($key) ".
          "/ Amount ($val) / Sub Total ($data{$key}->{'total'}) ".
          "/ Grand_Total = ($grand_total)\n";
    ++$row_number;
}
foreach my $key ( sort { $a <=> $b } keys %data ) {
    print "the total is: (".$data{$key}->{'total'}.")\n";
}
exit;
| [reply] [d/l] [select] |
Re: array of arrays
by Marshall (Canon) on Jun 11, 2017 at 06:49 UTC
|
Yes, kevbot's solution with a hash is good. Here is what I typed while kevbot was also typing. It preserves the numbers in each row in case more calculations are needed. If there are only 30 rows, a hash of arrays is a good idea. If there are 100,000 rows, that advice would change. Here the hash removes the need to deal with index [0] of the array of arrays (2-D array). A more efficient approach is to process the input row by row and output a result line whenever the first number changes.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %HoA;   # a Hash of Array
while (my $line = <DATA>)
{
    my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;
    push @{$HoA{$bucket}},$num;
}
my $total;
foreach my $key (sort {$a<=>$b} keys %HoA)
{
    my $line_total;
    foreach my $num (@{$HoA{$key}})
    {
        $line_total += $num;
    }
    print "Line $key total = $line_total\n";
    $total += $line_total;
}
print "Grand Total = $total\n";
=Prints
Line 1 total = 150
Line 2 total = 75
Line 3 total = 55
Grand Total = 280
=cut
__DATA__
1|10
1|20
1|30
1|40
1|50
2|15
2|25
2|35
3|1
3|2
3|3
3|4
3|5
3|6
3|7
3|8
3|9
3|10
| [reply] [d/l] [select] |
|
|
Hi Marshall:
Thanx for this solution.. I was going from bottom to top analyzing all suggestions and that's why in your next solution I asked for what you coded here. (I didn't see it till now).
I wonder if you can help me print the size of each anonymous array ($count) and all the results (number, sub total and grand total) in one line inside the while loop something like this:
my $e = '0';
while (my $line = <DATA>)
{
    $e++;
    my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;
    push @{$HoA{$bucket}},$num;
    my $grand_total += $num;
    ## Sub Total = the running total per line up to 150, 75 and 55.?
    my $sub_total = "????";
    ## the size of each anonymous array ##
    @Count[$bucket] += $num; ## I know this is super wrong ##
    print"Row $e / Number Array ($bucket) / Num ($num) / Array Size ($Count[$bucket])/ Sub Total ($sub_total) / Grand_Total ($grand_total)/n";
}
Thanx beforehand
| [reply] [d/l] |
|
|
Ok, glad that you saw my 2 solutions with your 2 different data formats. This is easier if you make an AoA or a HoA and then as a second step, do the sums. In my HoA solution the "number of numbers" is just the scalar value of @{$HoA{$key}}. Something like print "elements in array=".@{$HoA{$key}}."\n" should work.
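As a rough sketch of that idea applied row by row, the snippet below builds the HoA and reports the array size, sub total and grand total after each line. The sub name hoa_report and the in-memory input list are assumptions made for illustration; sum comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: builds the HoA line by line and returns one report string per row.
sub hoa_report {
    my @lines = @_;
    my %HoA;
    my $grand_total = 0;
    my @report;
    for my $line (@lines) {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                                 # skip lines that do not match
        push @{ $HoA{$bucket} }, $num;
        my $count     = @{ $HoA{$bucket} };          # array in scalar context = element count
        my $sub_total = sum @{ $HoA{$bucket} };      # re-sums this bucket on every row
        $grand_total += $num;
        push @report, "Bucket $bucket / Num $num / Array Size $count / "
                    . "Sub Total $sub_total / Grand Total $grand_total";
    }
    return @report;
}

print "$_\n" for hoa_report("1|10", "1|20", "2|15");
```

Re-summing the bucket on every row is simple but repeats work; for large inputs a separate running sub total per bucket would avoid that.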
Now of course it is not necessary to even fiddle with an AoA or a HoA. You can just keep a running sum as you go. When the "line no" changes, print the current line results and start a "new line". The disadvantage is that the program logic is a bit more complicated, because you have to figure out "on the fly" when a new "line" starts and when it finishes.
As an example, I coded one way to do this without creating the intermediate AoA or HoA. This code of course uses less memory, but that is probably not even a remote consideration for your application. Nowadays a temporary data structure of hundreds of MB is nothing! The "expense" of using less memory is the extra complication of more decisions. Not all lines of code are "equal": lines that make decisions are more error prone than ones that don't. For short, non-critical "utilities" I prefer the simplest program logic that "gets the job done", because the code is less likely to have a bug. Sometimes I work on a module that does something "simple" but must be made very efficient for the overall system to work (maybe it is used often or processes a lot of data). In that situation a lot more work in coding and testing is required. Programming is part science and part art.
So here is yet another way... If you want a count of the "number of numbers" in each line, set up a variable that is incremented every time $line_total is changed (either by assignment or by the addition of another value). I leave that as an exercise should you desire. When looping, there are often 3 phases to consider: a) how to get the loop started, b) what the loop normally does, and c) what happens to finish the loop. Rather than starting the coding with (a), with experience you will code (b) first and then figure out how to make (a) and (c) happen.
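One possible take on the counting exercise, sketched as a single pass over lines that are already grouped by bucket. The sub name bucket_summaries and the [bucket, count, total] return shape are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over "bucket|num" lines grouped by bucket.
# Returns a list of [bucket, count, total] triples.
sub bucket_summaries {
    my @lines = @_;
    my @done;
    my ($current_bucket, $line_total, $count);
    for my $line (@lines) {
        my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/
            or next;                                  # skip lines that do not match
        if (defined $current_bucket and $bucket == $current_bucket) {
            $line_total += $num;                      # (b) the normal case
            $count++;
        }
        else {
            # close the previous bucket, if there was one, then start a new one
            push @done, [$current_bucket, $count, $line_total]
                if defined $current_bucket;
            ($current_bucket, $line_total, $count) = ($bucket, $num, 1);
        }
    }
    # (c) finish: the last bucket has not been reported yet
    push @done, [$current_bucket, $count, $line_total] if defined $current_bucket;
    return @done;
}

printf "Line %s: %d numbers, total %d\n", @$_
    for bucket_summaries("0|10", "1|10", "1|20", "2|15");
```

The counter is set to 1 whenever $line_total is assigned and incremented whenever a value is added, so the two always change together.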
I do hope that my point about avoiding indices when possible sank in. Anyway, as a demo exercise, here is an algorithm that does not create a full in-memory representation of the data, but rather calculates as it goes:
#!/usr/bin/perl
use strict;
use warnings;
my $line_total = 0;
my $total = 0;
my $current_bucket = undef;
while (my $line = <DATA>)
{
    my ($bucket, $num) = $line =~ m/^\s*(\d+)\s*\|\s*(\d+)/;
    # start the first "bucket".
    # use of defined() instead of zero as a flag allows for a "zero"
    # bucket which I added as a test case.
    if (!defined($current_bucket))
    {
        $line_total = $num;
        $current_bucket = $bucket;
    }
    elsif ($bucket == $current_bucket)   # "normal" case
    {
        $line_total += $num;
    }
    else                                 # a new "bucket" starts...
    {
        # output current bucket's results
        print "Line $current_bucket = $line_total\n";
        $total += $line_total;
        # We've already read a line for the next bucket.
        # Adjust values to start $line_total running for this
        # new "bucket"
        $line_total = $num;
        $current_bucket = $bucket;
    }
}
# print the last bucket's results to finalize output:
print "Line $current_bucket = $line_total\n";
$total += $line_total;
## This is the total result
print "total=$total\n";
=Prints
Line 0 = 10
Line 1 = 150
Line 2 = 75
Line 3 = 55
total=290
=cut
__DATA__
0|10
1|10
1|20
1|30
1|40
1|50
2|15
2|25
2|35
3|1
3|2
3|3
3|4
3|5
3|6
3|7
3|8
3|9
3|10
| [reply] [d/l] [select] |
Re: array of arrays
by Marshall (Canon) on Jun 11, 2017 at 09:02 UTC
|
I re-looked at your original code and my brain hurts!
It is of course possible to use indices to access a 2-D array in Perl, however this is not the normal situation. A far, far more normal situation is to access each row as an array of values. This is also true in C albeit with different syntax than this.
I re-wrote your code below.
These integer index buddies of i and j just don't appear that often in Perl code. Of course Perl allows that syntax. Note that by "not often", I do not mean "never". The most common errors in programming are memory allocation errors and "off by one" errors when using array indices or when looping. Perl for the most part takes care of memory allocation for you in a very efficient way - you don't have to worry about it unless you are doing something really fancy. This "off by one error" stuff can be much more problematic. In general don't use i or j indices unless you have to.
#!/usr/bin/perl
use strict;
use warnings;
my @AoA;
while (my $line = <DATA>)
{
    my @tmp = split ' ',$line;
    push @AoA, [ @tmp ];
}
## Print the totals for each line ###
## and the final grand_total ###
my $grand_total;
foreach my $row_ref (@AoA)
{
    my $line_total;
    foreach my $num (@$row_ref)
    {
        $line_total += $num;
    }
    print "Line Total: $line_total\n";
    $grand_total += $line_total;
}
print "Grand Total: $grand_total\n";
=Prints:
Line Total: 150
Line Total: 75
Line Total: 55
Grand Total: 280
=cut
__DATA__
10 20 30 40 50
15 25 35
1 2 3 4 5 6 7 8 9 10
| [reply] [d/l] |
Re: array of arrays
by BillKSmith (Monsignor) on Jun 11, 2017 at 14:11 UTC
|
Your existing code will work with very little change.
while (<TESTFILE>) {
    #@tmp = split;            # Split elements into an array.
    #push @AoA, [ @tmp ];     # Add an anonymous array reference to @AoA.
    @tmp = split /\|/;
    $AoA[$tmp[0]-1] = [] if !defined $AoA[$tmp[0]-1];
    push @{$AoA[$tmp[0]-1]}, $tmp[1];
}
| [reply] [d/l] |
|
|
Modifying existing software to support new requirements is called "Maintenance". (A poor choice of words, but we're stuck with it.) Few of us ever get the luxury of starting over, even when it would be cheaper in the long term. In the short term, it is almost always faster to cram in one more change. In that spirit, I suggest:
my $Grand_Total = 0;
for $i ( 0 .. $#AoA ) {
    $row = $AoA[$i];
    for $j ( 0 .. $#{$row} ) {
        $Total_Balance[$i] += "$row->[$j]";
    }
    print "the total is: ($Total_Balance[$i])<hr>\n";
    $Grand_Total += $Total_Balance[$i];
}
print "Grand Total is: ($Grand_Total)<hr>\n";
But wouldn't you prefer to write:
use strict;
use warnings;
use List::Util qw(sum);
my @AoA;
while (<DATA>) {
    my ($index, $value) = split /\|/;
    push @{ $AoA[$index-1] }, $value;
}
my @Total_Balance = map {sum( @$_ )} @AoA;
print "The total is: ($_)\n" foreach @Total_Balance;
print "Grand total is: (", sum( @Total_Balance ), ")\n";
__DATA__
1|10
1|20
1|30
1|40
1|50
2|15
2|25
2|35
3|1
3|2
3|3
3|4
This design is neither fast nor small. Its merit is that each pass through the data can only do one thing. You can understand, validate, or modify any section without concern about side effects.
| [reply] [d/l] [select] |
Re: array of arrays
by Anonymous Monk on Jun 11, 2017 at 01:47 UTC
|
$ perl -F'\|' -nale '$x[$F[0]-1]+=$F[1]}{print"the total is: ($_)"for@x' input.txt
the total is: (150)
the total is: (75)
the total is: (55)
You might need different quoting on -F depending on your shell. Add the option -MO=Deparse to see the longer code.
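For readers who prefer to see the one-liner spelled out, here is a hand-written approximation of what the switches do. It is not the literal -MO=Deparse output, and the sub name totals_by_index and the in-memory input are assumptions for illustration.

```perl
use strict;
use warnings;

# Hand-written equivalent of the one-liner; not the literal -MO=Deparse output.
sub totals_by_index {
    my @lines = @_;
    my @x;
    for my $line (@lines) {                # -n wraps the code in a read loop
        chomp $line;                       # -l removes the trailing newline
        my @F = split /\|/, $line;         # -a with -F'\|' fills @F
        $x[ $F[0] - 1 ] += $F[1];          # the loop body from the one-liner
    }
    return @x;                             # the code after "}{" runs once, after the loop
}

print "the total is: ($_)\n" for totals_by_index("1|10", "1|20", "2|15");
```

The "}{" in the one-liner closes the implicit loop early, so the print statement runs only after all input has been read.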