Hey All-
I want to build one large hash, keyed by ZIP, that totals the
values from a large data file that looks like this:
ZIP, Money, Date, ID
12345, 200, 11062000, C1234
12345, 50, 11062000, C1234
67890, 50, 11072000, D5555
The end product should look something like this:
ZIP, Total (11062000), Total (11072000), ID
12345, 250, 0, C1234
67890, 0, 50, D5555
Essentially, I need to aggregate the money by ZIP and create a new column for each date.
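To make the target shape concrete, here is a minimal sketch of that reshaping on the comma-separated sample above. It assumes the input really is comma-separated, and the helper name pivot_rows and the nested-hash layout (ZIP, then ID, then date) are made up for illustration; they are not part of my current code.

```perl
use strict;
use warnings;

# Hypothetical helper: pivot "ZIP, Money, Date, ID" rows into one row per
# ZIP/ID pair, with one total column per date seen in the input.
sub pivot_rows {
    my @rows = @_;
    my (%total, %seen_date);
    for my $row (@rows) {
        chomp $row;
        my ($zip, $money, $date, $id) = split /\s*,\s*/, $row;
        $total{$zip}{$id}{$date} += $money;    # sum per ZIP, ID and date
        $seen_date{$date} = 1;
    }
    my @dates = sort keys %seen_date;          # plain string sort of the dates
    my @out = (join ', ', 'ZIP', (map "Total ($_)", @dates), 'ID');
    for my $zip (sort keys %total) {
        for my $id (sort keys %{ $total{$zip} }) {
            # A date with no money for this ZIP/ID shows up as 0.
            my @cells = map { $total{$zip}{$id}{$_} // 0 } @dates;
            push @out, join ', ', $zip, @cells, $id;
        }
    }
    return @out;
}

print "$_\n" for pivot_rows(
    '12345, 200, 11062000, C1234',
    '12345, 50, 11062000, C1234',
    '67890, 50, 11072000, D5555',
);
```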
Here is the code I have thus far:
# Read a file that has the dates and IDs I need. Each line of the file
# looks like: 11062000C12345678
open(DATA1, "<", "dates.txt") or die $!;
while (<DATA1>) {
    my $dateindex = substr $_, 0, 8;
    my $candID    = substr $_, 8, 9;
    # Now that I have the first date/ID, start aggregating the money by
    # ZIP code. This requires opening a second file, which is formatted
    # like this: ZIP/MONEY/DATE/ID
    open(DATA2, "<", "indv2000.txt") or die $!;
    my %hash;
    while (<DATA2>) {
        my $zip   = substr $_, 82, 5;
        my $money = substr $_, 130, 7;
        my $date  = substr $_, 122, 8;
        my $cand1 = substr $_, 0, 9;
        # If the date and candidate ID equal the ones fed by the first
        # while loop, aggregate the money by ZIP code
        if ($date == $dateindex && $cand1 eq $candID) {
            $hash{$zip} += $money;
        }
    }
    close(DATA2);
    # Now that I have the total money per ZIP code for the day and
    # candidate ID in question, print the hash separated by commas with an
    # indication of the date/candidate ID used. I use the DOS command > to
    # dump the output into a text file.
    print "ZIP, MONEY, DATE, CANDID \n";
    while ( my ($k, $v) = each %hash ) {
        print "$k,$v, $dateindex, $candID \n";
    }
    # Repeat until the dates file is done...
}
The output from this code looks like this:
ZIP, MONEY, DATE, CANDID
38401,250, 11062000, C00003418
77024,200, 11062000, C00003418
ZIP, MONEY, DATE, CANDID
75711,1000, 11072000, C00003418
33480,5000, 11072000, C00003418
And so on for every date and candidate. Does anyone have any
suggestions about how to modify this code to get the output in
this format instead:
ZIP, MONEY (11062000), MONEY (11072000), ...., CANDID
75711, 0, 1000, C00003418
33480, 0, 5000, C00003418
I would appreciate any help one can give.
Thanks!
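One possible direction, offered as a rough sketch rather than a tested solution: read dates.txt once into a lookup hash, make a single pass over indv2000.txt, and keep the totals in a hash keyed by ZIP, then candidate ID, then date. The helper name aggregate_by_zip and passing the file contents as array refs are assumptions made so the sketch can be tried without the real files; the substr offsets are the ones from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of dates.txt and indv2000.txt as
# array refs and returns the pivoted report as a list of lines.
sub aggregate_by_zip {
    my ($date_lines, $indv_lines) = @_;

    # Remember every date/ID pair of interest and the order dates first appear.
    my (%wanted, %seen, @dates);
    for my $line (@$date_lines) {
        my $date = substr $line, 0, 8;
        my $id   = substr $line, 8, 9;
        $wanted{$id}{$date} = 1;
        push @dates, $date unless $seen{$date}++;
    }

    # Single pass over the contribution records (offsets as in the post).
    my %total;
    for my $line (@$indv_lines) {
        next if length $line < 137;            # skip short or blank records
        my $id    = substr $line, 0, 9;
        my $zip   = substr $line, 82, 5;
        my $date  = substr $line, 122, 8;
        my $money = substr $line, 130, 7;
        next unless $wanted{$id} && $wanted{$id}{$date};
        $total{$zip}{$id}{$date} += $money;
    }

    # One row per ZIP/ID pair, one MONEY column per date, 0 where nothing matched.
    my @out = (join ', ', 'ZIP', (map "MONEY ($_)", @dates), 'CANDID');
    for my $zip (sort keys %total) {
        for my $id (sort keys %{ $total{$zip} }) {
            my @cells = map { $total{$zip}{$id}{$_} // 0 } @dates;
            push @out, join ', ', $zip, @cells, $id;
        }
    }
    return @out;
}
```

With the real files, the two array refs would come from reading dates.txt and indv2000.txt into arrays. Because the big file is read only once, this should also avoid reopening indv2000.txt for every line of dates.txt.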