PerlMonks  

Re: Summing repeated counts for items stored in separate file

by gam3 (Curate)
on Jul 28, 2007 at 13:22 UTC


in reply to Summing repeated counts for items stored in separate file

Here is a really simple solution that will not use too much memory.
use strict;
use vars qw(%data);

open(KEYS, "keys.txt")   or die "error: keys.txt $!";
open(VALS, "values.txt") or die "error: values.txt $!";

# Read both files in lockstep: line N of keys.txt pairs with line N of values.txt.
while (my $key = readline(*KEYS)) {
    my $val = readline(*VALS);
    chomp $key;
    chomp $val;
    print "$key = $val\n";
    unless (exists $data{$key}) {
        # First time this key is seen: seed the running statistics.
        $data{$key}{min} = $data{$key}{max} = $data{$key}{total} = $val;
        $data{$key}{count} = 1;
    }
    else {
        # Update the running statistics; no per-key list of values is kept.
        $data{$key}{min} = $val if $data{$key}{min} > $val;
        $data{$key}{max} = $val if $data{$key}{max} < $val;
        $data{$key}{total} += $val;
        $data{$key}{count}++;
    }
}

use Data::Dumper;
print Dumper \%data;
-- gam3
A picture is worth a thousand words, but takes 200K.

Replies are listed 'Best First'.
Re^2: Summing repeated counts for items stored in separate file
by dmorgo (Pilgrim) on Jul 28, 2007 at 21:10 UTC
    Thanks. Nice looking code.

    Why readline(*KEYS) instead of <KEYS> ?

      The readline function is what is used "behind the scenes" to implement the <> operator, and takes a typeglob, hence readline(*KEYS) instead of <KEYS>.
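      To make the equivalence concrete, here is a small sketch that reads the same data once with readline(*KEYS) and once with <KEYS>. The in-memory string and the two array names are made up for illustration so that the snippet runs without any files on disk.

```perl
use strict;
use warnings;

# Hypothetical sample data held in memory, so no keys.txt is needed.
my $text = "alpha\nbeta\n";

# First pass: the readline function, which takes a typeglob.
open(KEYS, '<', \$text) or die "cannot open in-memory file: $!";
my @via_function;
while (my $line = readline(*KEYS)) {
    chomp $line;
    push @via_function, $line;
}
close(KEYS);

# Second pass: the <> operator on the same bareword filehandle.
open(KEYS, '<', \$text) or die "cannot open in-memory file: $!";
my @via_operator;
while (my $line = <KEYS>) {
    chomp $line;
    push @via_operator, $line;
}
close(KEYS);

print "readline(*KEYS): @via_function\n";
print "<KEYS>:          @via_operator\n";
```

      Both loops should collect the same lines, so the two print statements list identical values.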

      Keeping track of the min, max, total and count (for average value) may be faster than using the List utilities described earlier, or at least should take less memory, since you only need to keep track of a few values for each key. Although the utilities in question are extremely fast, you might need an array of thousands of values for each key, growing in size as you process the files.
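      A rough sketch of the two approaches side by side, assuming the "List utilities" are the min, max and sum functions from List::Util (that is my guess, not something stated in this thread). The key/value pairs are invented sample data standing in for keys.txt and values.txt.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Hypothetical key/value pairs; in the real problem these come from two files.
my @pairs = ([a => 3], [b => 10], [a => 7], [a => 1], [b => 4]);

my (%running, %stored);
for my $pair (@pairs) {
    my ($key, $val) = @$pair;

    # Approach 1: one fixed-size record per key, updated as values arrive.
    if (my $r = $running{$key}) {
        $r->{min} = $val if $val < $r->{min};
        $r->{max} = $val if $val > $r->{max};
        $r->{total} += $val;
        $r->{count}++;
    }
    else {
        $running{$key} = { min => $val, max => $val, total => $val, count => 1 };
    }

    # Approach 2: keep every value per key and summarise afterwards.
    # This array grows with the input, which is the memory cost in question.
    push @{ $stored{$key} }, $val;
}

for my $key (sort keys %stored) {
    my @vals = @{ $stored{$key} };
    my $r    = $running{$key};
    printf "%s: running min=%d max=%d avg=%.2f | stored min=%d max=%d avg=%.2f\n",
        $key, $r->{min}, $r->{max}, $r->{total} / $r->{count},
        min(@vals), max(@vals), sum(@vals) / @vals;
}
```

      Both halves of each output line should agree; the difference is only that %running holds four numbers per key while %stored holds every value ever read.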
        You said:
        The readline function is what is used "behind the scenes" to implement the <> operator, and takes a typeglob, hence *KEYS instead of <KEYS>.
        OK, I follow you, but what's the benefit of using readline instead of simply <>?
