GeneV1 has asked for the wisdom of the Perl Monks concerning the following question:


I'm working with hundreds of PS Account 'MONTHLY USAGE'
reports. I have been able to parse the reports and do a
table lookup to match each Login ID with an Application
Name. My CSV up to this point is as follows:
Server,Date,Login ID,Application,CPU Minutes,Percent of Total
xsd00544,05/2007,s2pe,CTI Production ,8455,21.7%
xsd00544,05/2007,sybprd0l,OMS,5326,13.7%
xsd00544,05/2007,pctip01,CTI,5098,13.1%
xsd00544,05/2007,pitip01,CTI,1742,4.5%
xsd00544,05/2007,pipmp01,SSG-IPM,1742,4.5%
xsd00544,05/2007,maestro,SSG-RAC-Maestro ,1020,2.6%
xsd00544,05/2007,systro,SSG-RAC-Maestro ,836,2.1%
xsd00544,05/2007,f2pa,"DB2DARI ""stored procedures"" prod",836,2.1%
xsd00544,05/2007,pomgp01,SSG-PMD-Omegamon,56,0.1%
xsd00544,05/2007,pptip01,PTI - Private Client Services,16,0.0%
xsd00544,05/2007,pbmwp01,BMW,3,0.0%
xsd00544,05/2007,s2pv,Merva,1,0.0%

I am trying to accumulate the Percent of Total values by
Application. A real example will have multiple Login
IDs (the 3rd field) per Application (the 4th field).
The code I am currently testing is as follows:
#!/usr/bin/perl
use strict;
use warnings;

my $i = 0;
my $mcpupct=0;
my (%total, $total);

for my $file ("PSAMARTst1.csv") {
    open (my $SORTED,"<",$file) or die "Can't open file $file: $!";
    open (my $OUTCSV,">","PSAMARTst1out.csv") or die "Can't open OUT file: $!";

    while (my $line = <$SORTED>) {
        chomp($line);
        $line =~ s/%//g;   ## Remove % Signs so that Percentages can be Operated On
        next if $line =~ /^Server,Date/;
        my ($server,$date,$login,$appl,$cpumin,$cpupct) = (split(",",$line));
        ## if ($. == 0) {
        ##     print $OUTCSV "$server $date, \n";
        ##     print $OUTCSV " \n";
        ## }
        $mcpupct += $cpupct;
        $total{$appl} += $cpupct;
    }

    foreach my $appl (sort keys %total) {
        printf $OUTCSV ("$appl,%3.1f%% \n", $total{$appl});
        ## printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$appl});
    }

    if (eof($SORTED)) {
        printf $OUTCSV ("TOTAL,%3.1f%% \n", $mcpupct);
        ## printf $OUTCSV ("$server,$date,TOTAL,%3.1f%% \n", $mcpupct);
    }

    close $SORTED or die "Can't close input SORTED file: $!";
    close $OUTCSV or die "Can't close PSAMARTst1out.csv data file: $!";
}

##########################################################
#
# Sort PSAMARTst1out.csv
# Descending by CPU Percentage
#
##########################################################
system("sort -t, -n -r -o PSASortTst2.dat +1 PSAMARTst1out.csv");

The problem I am having is that I need to retain the
$server and $date variables in the output csv created
prior to the system call to the unix sort command.


Since I am accumulating these totals in a hash and
printing them in a foreach loop outside of the while
loop (where the $server and $date variables are
assigned), there is no way for me to output these
variables with the rest of the output record.


I am relatively new to Perl (SAS is my weapon of
choice). Can these totals be accumulated in another
manner, so that I will be able to retain these 2
variables in my output? I have also read various
documentation concerning global variables and the our
statement in an attempt to somehow retain these 2
variables. I have had no luck in determining how to
solve this problem up to this point. Any assistance
would be kindly appreciated.

Replies are listed 'Best First'.
Re: Accumulating Column Total From a CSV for a Common Key Value
by agianni (Hermit) on Aug 29, 2007 at 19:34 UTC

    Instead of:

    $total{$appl} += $cpupct;

    try:

    $total{$server}->{$date}->{$appl} += $cpupct;

    Then you'll just need to do three nested for loops to go through the three levels of the hash:

    for my $server ( keys %total ){
        for my $date ( keys %{$total{$server}} ){
            for my $appl ( keys %{$total{$server}->{$date}} ){
                printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$server}->{$date}->{$appl});
            }
        }
    }

    Update: code updated to correctly output total percentage.

    This lets you sum $cpupct by server and by date, and adds those two columns to your output.
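
A minimal, self-contained sketch of the nested-hash approach, using a few rows copied from the question instead of reading PSAMARTst1.csv. The helper names accumulate() and report_lines() are hypothetical, introduced only for illustration. Sorting each group by descending percentage inside Perl is an assumption about the desired order, based on the external sort call in the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows copied from the question (header plus a few data lines).
my @lines = (
    'Server,Date,Login ID,Application,CPU Minutes,Percent of Total',
    'xsd00544,05/2007,s2pe,CTI Production ,8455,21.7%',
    'xsd00544,05/2007,pctip01,CTI,5098,13.1%',
    'xsd00544,05/2007,pitip01,CTI,1742,4.5%',
    'xsd00544,05/2007,maestro,SSG-RAC-Maestro ,1020,2.6%',
    'xsd00544,05/2007,systro,SSG-RAC-Maestro ,836,2.1%',
);

# Hypothetical helper: sum the percentage column into a hash keyed
# server -> date -> application, and return a reference to it.
sub accumulate {
    my @rows = @_;
    my %total;
    for my $line (@rows) {
        next if $line =~ /^Server,Date/;      # skip the header row
        (my $clean = $line) =~ s/%//g;        # strip % so the value is numeric
        my ($server, $date, $login, $appl, $cpumin, $cpupct) = split /,/, $clean;
        $total{$server}{$date}{$appl} += $cpupct;
    }
    return \%total;
}

# Hypothetical helper: walk the three hash levels and build output records.
sub report_lines {
    my ($total) = @_;
    my @out;
    for my $server (sort keys %$total) {
        for my $date (sort keys %{ $total->{$server} }) {
            my $apps = $total->{$server}{$date};
            # Assumption: largest percentage first within each server/date.
            for my $appl (sort { $apps->{$b} <=> $apps->{$a} } keys %$apps) {
                push @out, sprintf('%s,%s,%s,%3.1f%%',
                                   $server, $date, $appl, $apps->{$appl});
            }
        }
    }
    return @out;
}

print "$_\n" for report_lines(accumulate(@lines));
```

Because $server and $date are now hash keys, they are still available in the output loop, which is what the single-level %total could not provide.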

    perl -e 'split//,q{john hurl, pest caretaker}and(map{print @_[$_]}(join(q{},map{sprintf(qq{%010u},$_)}(2**2*307*4993,5*101*641*5261,7*59*79*36997,13*17*71*45131,3**2*67*89*167*181))=~/\d{2}/g));'

      agianni,


      I ran your modified code with the input that I sent
      initially. When I run the script I get a "Use of
      uninitialized value" warning at the printf
      statement.

      printf $OUTCSV ("$server,$date,$appl,%3.1f%% \n", $total{$appl});

      The $total{$appl} must be wrong, as the output is as
      follows:
      xsd00544,05/2007,PTI - Private Client Services,0.0%
      xsd00544,05/2007,CTI,0.0%
      xsd00544,05/2007,OMS,0.0%
      xsd00544,05/2007,CTI Production ,0.0%
      xsd00544,05/2007,SSG-RAC-Maestro ,0.0%
      xsd00544,05/2007,SSG-IPM,0.0%
      xsd00544,05/2007,Merva,0.0%
      xsd00544,05/2007,"DB2DARI ""stored procedures"" prod",0.0%
      xsd00544,05/2007,SSG-PMD-Omegamon,0.0%
      xsd00544,05/2007,BMW,0.0%
        Sorry, that was just off the top of my head, not tested. Hopefully you've figured this out on your own already, but just replace $total{$appl} with $total{$server}->{$date}->{$appl} and that should give you the values you're looking for in the pct column. If you didn't figure that out on your own, you should really read up on Perl data structures. perldsc is a good place to start.
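
A small sketch of why the old lookup printed 0.0%, assuming the nested server -> date -> application layout suggested above. The values are the two CTI rows from the question, and the variable names $wrong and $right are illustrative only.

```perl
use strict;
use warnings;

# Build the nested structure for the two CTI rows from the question.
my %total;
$total{'xsd00544'}{'05/2007'}{'CTI'} += 13.1;
$total{'xsd00544'}{'05/2007'}{'CTI'} += 4.5;

my ($server, $date, $appl) = ('xsd00544', '05/2007', 'CTI');

# The top-level keys are server names, so an application name is not
# found there. Printing that missing value with %3.1f gives 0.0 and
# triggers the uninitialized-value warning.
my $wrong = exists $total{$appl} ? 'present' : 'missing';

# Following the full key path reaches the accumulated number.
my $right = sprintf '%3.1f', $total{$server}{$date}{$appl};

print "top-level lookup by application: $wrong\n";
print "full path lookup: $right\n";
```

The two forms $total{$server}->{$date}->{$appl} and $total{$server}{$date}{$appl} are equivalent, since the arrow is optional between adjacent subscripts.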
Re: Accumulating Column Total From a CSV for a Common Key Value
by moritz (Cardinal) on Aug 29, 2007 at 20:53 UTC
Re: Accumulating Column Total From a CSV for a Common Key Value
by Limbic~Region (Chancellor) on Aug 29, 2007 at 22:40 UTC