Accumulating Column Total From a CSV for a Common Key Value

GeneV1 has asked for the wisdom of the Perl Monks concerning the following question:

I'm working with hundreds of PS Account 'MONTHLY USAGE'
reports. I have been able to parse the reports and do a
table lookup to match a Login ID with an Application
Name. My CSV up to this point is as follows:

Server,Date,Login ID,Application,CPU Minutes,Percent of Total
xsd00544,05/2007,s2pe,CTI Production ,8455,21.7%
xsd00544,05/2007,sybprd0l,OMS,5326,13.7%
xsd00544,05/2007,pctip01,CTI,5098,13.1%
xsd00544,05/2007,pitip01,CTI,1742,4.5%
xsd00544,05/2007,pipmp01,SSG-IPM,1742,4.5%
xsd00544,05/2007,maestro,SSG-RAC-Maestro ,1020,2.6%
xsd00544,05/2007,systro,SSG-RAC-Maestro ,836,2.1%
xsd00544,05/2007,f2pa,"DB2DARI  ""stored procedures"" prod",836,2.1%
xsd00544,05/2007,pomgp01,SSG-PMD-Omegamon,56,0.1%
xsd00544,05/2007,pptip01,PTI - Private Client Services,16,0.0%
xsd00544,05/2007,pbmwp01,BMW,3,0.0%
xsd00544,05/2007,s2pv,Merva,1,0.0%

I am trying to accumulate the Percent of Total values by
Application. A real example will have multiple Login
IDs (the 3rd field) per Application (the 4th field).
The code I am currently testing is as follows:

#!/usr/bin/perl
use strict;
use warnings;

my $i = 0;
my $mcpupct=0;
my (%total, $total);

 for my $file ("PSAMARTst1.csv") {

  open (my $SORTED,"<",$file) or die "Can't open file $file: $!";
  open (my $OUTCSV,">","PSAMARTst1out.csv") or die "Can't open OUT file: $!";

while (my $line = <$SORTED>) {
  chomp($line);

  $line =~ s/%//g;     ## Remove % Signs so that Percentages can be Operated On

 next if $line =~ /^Server,Date/;

  my ($server,$date,$login,$appl,$cpumin,$cpupct) = (split(",",$line));

##  if ($. == 0) {
##    print $OUTCSV "$server $date, \n";
##  print $OUTCSV " \n";
##  }

  $mcpupct += $cpupct;

  $total{$appl} += $cpupct;
}

foreach my $appl (sort keys %total) {
  printf $OUTCSV ("$appl,%3.1f%% \n", $total{$appl});
##  printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$appl});
}

  if (eof($SORTED)) {
     printf $OUTCSV ("TOTAL,%3.1f%% \n", $mcpupct);
##     printf $OUTCSV ("$server,$date,TOTAL,%3.1f%% \n", $mcpupct);
  }

close $SORTED or die "Can't close input SORTED file: $!";
close $OUTCSV or die "Can't close PSAMARTst1out.csv data file: $!";
}

##########################################################
#
#  Sort PSAMARTst1out.csv
#     Descending by CPU Percentage
#
##########################################################

 system("sort -t, -n -r -k2 -o PSASortTst2.dat PSAMARTst1out.csv");

The problem I am having is that I need to retain the
$server and $date variables in the output csv created
prior to the system call to the unix sort command.

Since I am accumulating these totals in a hash and
printing them in a foreach loop outside of the while loop
(where the $server and $date variables are assigned),
there is no way for me to output these variables with the
rest of the output record.

I am relatively new to Perl (SAS is my weapon of
choice). Can these totals be accumulated in another
manner, so that I will be able to retain these 2
variables in my output? I have also read various
documentation concerning global variables and the our
statement in an attempt to somehow retain these 2
variables. I have had no luck in determining how to
solve this problem up to this point. Any assistance
would be kindly appreciated.


Replies are listed 'Best First'.
Re: Accumulating Column Total From a CSV for a Common Key Value by agianni (Hermit) on Aug 29, 2007 at 19:34 UTC
Instead of:

    $total{$appl} += $cpupct;

try:

    $total{$server}->{$date}->{$appl} += $cpupct;

Then you'll just need three nested for loops to go through the three levels of the hash:

    for my $server ( keys %total ){
      for my $date ( keys %{$total{$server}} ){
        for my $appl ( keys %{$total{$server}->{$date}} ){
          printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$server}->{$date}->{$appl});
        }
      }
    }

This will allow you to sum up `$cpupct` by date and by server, and it will add those two columns to your output.

*Update:* code updated to correctly output total percentage.

`perl -e 'split//,q{john hurl, pest caretaker}and(map{print @_[$_]}(join(q{},map{sprintf(qq{%010u},$_)}(2*23074993,51016415261,7597936997,13177145131,3*26789167*181))=~/\d{2}/g));'`
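For readers who want to see the nested-hash idea end to end, here is a minimal sketch. It is not the original poster's script: the sub name `accumulate_totals` and the in-memory `@sample` array are made up for illustration, the sample rows are copied from the report in the question, and it assumes no field contains an embedded comma (a plain `split` would break on those).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sum "Percent of Total" into a three-level hash
# keyed by server, then date, then application.
# Takes a reference to an array of CSV lines, returns a hash reference.
sub accumulate_totals {
    my ($lines) = @_;
    my %total;
    for my $line (@$lines) {
        my $row = $line;                    # copy, so the caller's data is untouched
        chomp $row;
        next if $row =~ /^Server,Date/;     # skip the header row
        $row =~ s/%//g;                     # strip % so the value is numeric
        # Assumption: no quoted fields with embedded commas
        my ($server, $date, $login, $appl, $cpumin, $cpupct) = split /,/, $row;
        $total{$server}{$date}{$appl} += $cpupct;
    }
    return \%total;
}

# Sample rows taken from the report in the question
my @sample = (
    "Server,Date,Login ID,Application,CPU Minutes,Percent of Total\n",
    "xsd00544,05/2007,pctip01,CTI,5098,13.1%\n",
    "xsd00544,05/2007,pitip01,CTI,1742,4.5%\n",
    "xsd00544,05/2007,sybprd0l,OMS,5326,13.7%\n",
);

my $total = accumulate_totals(\@sample);

# Walk the three levels; server and date are now available on every output row
for my $server (sort keys %$total) {
    for my $date (sort keys %{ $total->{$server} }) {
        for my $appl (sort keys %{ $total->{$server}{$date} }) {
            printf "%s,%s,%s,%3.1f%%\n",
                $server, $date, $appl, $total->{$server}{$date}{$appl};
        }
    }
}
```

Because the server and date are part of the key path, they are recovered from the hash at print time rather than from variables that went out of scope when the while loop ended.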
Re^2: Accumulating Column Total From a CSV for a Common Key Value by GeneV1 (Initiate) on Aug 30, 2007 at 12:30 UTC
agianni, I ran your modified code with the input that I sent initially. When I run the script I get the "use of uninitialized value" message at the printf statement:

    printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$appl});

The `$total{$appl}` must be wrong, as the output is as follows:

    xsd00544,05/2007,PTI - Private Client Services,0.0%
    xsd00544,05/2007,CTI,0.0%
    xsd00544,05/2007,OMS,0.0%
    xsd00544,05/2007,CTI Production ,0.0%
    xsd00544,05/2007,SSG-RAC-Maestro ,0.0%
    xsd00544,05/2007,SSG-IPM,0.0%
    xsd00544,05/2007,Merva,0.0%
    xsd00544,05/2007,"DB2DARI ""stored procedures"" prod",0.0%
    xsd00544,05/2007,SSG-PMD-Omegamon,0.0%
    xsd00544,05/2007,BMW,0.0%
Re^3: Accumulating Column Total From a CSV for a Common Key Value by agianni (Hermit) on Aug 30, 2007 at 13:06 UTC
Sorry, that was just off the top of my head, not tested. Hopefully you've figured this out on your own already, but just replace `$total{$appl}` with `$total{$server}->{$date}->{$appl}` and that should give you the values in the pct column you're looking for.

If you didn't figure that out on your own, you should really read up on Perl data structures. perldsc is a good place to start.
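To illustrate why the flat lookup printed 0.0% with a warning, here is a small sketch using two values from the report in the question. Once the totals are stored three levels deep, the top-level keys of `%total` are server names, so `$total{$appl}` refers to a key that was never set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %total;

# Store two CTI percentages under server -> date -> application
$total{xsd00544}{'05/2007'}{CTI} += 13.1;
$total{xsd00544}{'05/2007'}{CTI} += 4.5;

# The only top-level key is the server name
print "top-level keys: ", join(",", sort keys %total), "\n";

# A flat lookup by application finds nothing; under "use warnings",
# printf with that undefined value warns and formats it as 0.0
print exists $total{CTI}
    ? "flat key CTI exists\n"
    : "no top-level key named CTI\n";

# The full key path reaches the accumulated value
printf "CTI via full path: %3.1f%%\n", $total{xsd00544}{'05/2007'}{CTI};
```

The same reasoning applies to any nested structure described in perldsc: the lookup has to follow the same key path that was used when storing the value.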
Re^4: Accumulating Column Total From a CSV for a Common Key Value by GeneV1 (Initiate) on Sep 06, 2007 at 13:10 UTC
Re: Accumulating Column Total From a CSV for a Common Key Value by moritz (Cardinal) on Aug 29, 2007 at 20:53 UTC
With DBI and DBD::CSV you can access the CSV file like a database. If you know a bit of SQL, you can save a lot of programming work with these modules.
Re: Accumulating Column Total From a CSV for a Common Key Value by Limbic~Region (Chancellor) on Aug 29, 2007 at 22:40 UTC
GeneV1, It is considered bad form to post nearly the same question (Accumulating Column Total for a Particular Key) without referencing it. This is especially true when you don't bother to explain why the responses do not solve your problem. Cheers - L~R