Accumulating Column Total for a Particular Key

GeneV1 has asked for the wisdom of the Perl Monks concerning the following question:

I'm working with hundreds of PS Account 'MONTHLY USAGE'
reports. I have been able to parse the reports and do a
table lookup to match a Login ID with an Application
Name. My output CSV at this point is as follows:

Login ID,Application,CPU Minutes,Percent of Total
s2pe,CTI Production ,8455,21.7%
sybprd01,OMS,5326,13.7%
pctip01,CTI,5098,13.1%
lxsadr54,CTI,1742,4.5%
pipmp01,SSG-IPM,1742,4.5%
maestro,SSG-RAC-Maestro ,1020,2.6%
f2pa,"DB2DARI  ""stored procedures"" prod",836,2.1%
pomgp01,SSG-PMD-Omegamon,56,0.1%
pptip01,PTI - Private Client Services,16,0.0%
pbmwp01,BMW,3,0.0%
s2pv,Merva,1,0.0%

I am trying to accumulate the CPU Minutes and/or the
Percent of Totals by Application. A real example will
have multiple Login ID's per Application (the 2nd
field). I know how to do this using SAS, but am unsure
of how I might accomplish this in Perl.


Replies are listed 'Best First'.
Re: Accumulating Column Total for a Particular Key by Limbic~Region (Chancellor) on Aug 27, 2007 at 14:34 UTC
GeneV1, You probably want to take a look at DBD::CSV. Here is an untested example of doing it "by hand".

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV;

    use constant APP => 1;
    use constant CPU => 2;

    my $file = $ARGV[0] or die "Usage: $0 <input_file>";
    open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";
    <$fh>; # Throw away header

    my $csv = Text::CSV->new();
    my %data;
    while (<$fh>) {
        chomp;
        if ($csv->parse($_)) {
            my @field = $csv->fields;
            $data{$field[APP]} += $field[CPU];
        }
        else {
            warn "parse() failed on '$_'\n";
        }
    }

    for my $app (keys %data) {
        print "$app\t$data{$app}\n";
    }

Cheers - L~R
Re^2: Accumulating Column Total for a Particular Key by GeneV1 (Initiate) on Aug 27, 2007 at 15:27 UTC
Thanks for the help, but unfortunately for some reason most of the CPAN modules are not installed on this server.
Re^3: Accumulating Column Total for a Particular Key by jZed (Prior) on Aug 27, 2007 at 16:54 UTC
That is true of most servers. You can install them yourself in your own directory. On Windows, it's as simple as `ppm install DBD-CSV`. If you need more help installing, there are many useful nodes on PerlMonks, including "A Guide to Installing Modules" and "A guide to installing modules for Win32".
Re^3: Accumulating Column Total for a Particular Key by GertMT (Hermit) on Aug 28, 2007 at 06:54 UTC
Without using modules, something like this?

Gert

    #!/usr/bin/perl -w
    use diagnostics;
    use strict;

    my $sum0 = 0;
    my $sum1 = 0;

    printf "%-9s\t%-35s\t%8s\t%14s\n", "Login ID", "Application", "CPU Minutes", "Percent of Total";
    print "-" x 88 . "\n";

    while (<DATA>) {
        tr/%//d;
        chomp;
        next if /^Login/;
        my @F = split /,/, $_;
        $sum0 += $F[2];
        $sum1 += $F[3];
        printf "%-9s\t%-35s\t%5s\t\t%4.1f\n", "$F[0]", "$F[1]", "$sum0", "$sum1";
    }

    __DATA__
    Login ID,Application,CPU Minutes,Percent of Total
    s2pe,CTI Production ,8455,21.7%
    sybprd01,OMS,5326,13.7%
    pctip01,CTI,5098,13.1%
    lxsadr54,CTI,1742,4.5%
    pipmp01,SSG-IPM,1742,4.5%
    maestro,SSG-RAC-Maestro ,1020,2.6%
    f2pa,"DB2DARI  ""stored procedures"" prod",836,2.1%
    pomgp01,SSG-PMD-Omegamon,56,0.1%
    pptip01,PTI - Private Client Services,16,0.0%
    pbmwp01,BMW,3,0.0%
    s2pv,Merva,1,0.0%
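The script above keeps one running total across all rows. For per-Application totals without any CPAN modules, a hash keyed on the Application field is one option. The following is only a minimal sketch: `parse_csv_line` and `accumulate_by_app` are hypothetical helper names (not from this thread), the parser assumes each record fits on one line, and trailing spaces in names such as "CTI Production " are assumed to be insignificant.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one CSV line into fields, honouring
# double-quoted fields and doubled quotes ("") inside them.
# Assumption: a record never spans more than one line.
sub parse_csv_line {
    my ($line) = @_;
    my @fields   = ('');
    my $in_quote = 0;
    my @chars    = split //, $line;
    for (my $i = 0; $i < @chars; $i++) {
        my $c = $chars[$i];
        if ($in_quote) {
            if ($c eq '"') {
                if ($i + 1 < @chars && $chars[$i + 1] eq '"') {
                    $fields[-1] .= '"';    # doubled quote is a literal quote
                    $i++;
                }
                else {
                    $in_quote = 0;         # closing quote
                }
            }
            else {
                $fields[-1] .= $c;
            }
        }
        elsif ($c eq '"') { $in_quote = 1 }
        elsif ($c eq ',') { push @fields, '' }
        else              { $fields[-1] .= $c }
    }
    return @fields;
}

# Hypothetical helper: sum CPU minutes and percentages per Application.
# Returns two hash references keyed by application name.
sub accumulate_by_app {
    my (@lines) = @_;
    my (%cpu, %pct);
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^Login ID,/ || $line !~ /\S/;   # skip header and blank lines
        my (undef, $app, $cpu, $pct) = parse_csv_line($line);
        $app =~ s/\s+\z//;    # assumption: trailing spaces are not significant
        $pct =~ tr/%//d;      # "13.7%" becomes "13.7"
        $cpu{$app} += $cpu;
        $pct{$app} += $pct;
    }
    return (\%cpu, \%pct);
}

# A few rows taken from the sample data in the question.
my @sample = split /\n/, <<'END';
Login ID,Application,CPU Minutes,Percent of Total
sybprd01,OMS,5326,13.7%
pctip01,CTI,5098,13.1%
lxsadr54,CTI,1742,4.5%
f2pa,"DB2DARI  ""stored procedures"" prod",836,2.1%
END

my ($cpu, $pct) = accumulate_by_app(@sample);
for my $app (sort { $cpu->{$b} <=> $cpu->{$a} } keys %$cpu) {
    printf "%-35s %8d %5.1f%%\n", $app, $cpu->{$app}, $pct->{$app};
}
```

Summing the already-rounded percentages carries their rounding error forward, so recomputing each percentage from the summed CPU minutes may be more accurate.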
Re: Accumulating Column Total for a Particular Key by jZed (Prior) on Aug 27, 2007 at 17:17 UTC
Since Limbic~Region's excellent response mentions DBD::CSV but then gives an example that doesn't use it, here's an example that does use it:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use DBI;

    my $dbh = DBI->connect(
        'dbi:CSV:', undef, undef, { RaiseError => 1, PrintError => 0 }
    );
    $dbh->{csv_tables}{Log} = { file => 'yourfile.csv' };

    my ($oms_cpu_total) = $dbh->selectrow_array("
        SELECT SUM(cpu_minutes)
        FROM   Log
        WHERE  application = ?
    ", {}, 'OMS');

    printf "Application %s used %d CPU minutes\n", 'OMS', $oms_cpu_total;
    __END__

Note: you will need to change the column headings to be valid SQL column names. The easiest way is to just change spaces to underscores, e.g. "cpu_minutes".
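The heading change described in the note can be done in Perl when the CSV is written. Below is a rough sketch: `sql_friendly_header` is a hypothetical helper name, lower-casing is an extra step beyond what the note asks for, and the code assumes the header line contains no quoted commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a CSV header line into SQL-friendly column
# names by lower-casing each name and replacing every run of non-word
# characters (spaces included) with a single underscore.
# Assumption: no header field contains a quoted comma.
sub sql_friendly_header {
    my ($header) = @_;
    chomp $header;
    my @names = map {
        my $name = lc $_;
        $name =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        $name =~ s/\W+/_/g;         # "cpu minutes" becomes "cpu_minutes"
        $name;
    } split /,/, $header;
    return join ',', @names;
}

print sql_friendly_header("Login ID,Application,CPU Minutes,Percent of Total"), "\n";
# prints: login_id,application,cpu_minutes,percent_of_total
```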