Greetings once again,
Extracting valuable information from custom Apache logs generates files of about 30 MB on a daily basis. What I need to do
is come up with weekly, monthly, etc. summaries for visits, category visits, returning users, and so on, ideally on the fly (the next day).
The format of the daily files is as follows:
UniqueUser LastVisitTime PrimaryCategory SecondaryCategory PageViews MerchantClicks Sessions MarketingMode ZIP
0ce46e475f94ecb9 01/Jan/2003:16:00:08 Computers Computers 1 0 1 no_mode 00000
188c0530ac92475a 01/Jan/2003:16:00:02 Computers Computers 1 1 1 no_mode 44614
189a4a75d0cbad03 01/Jan/2003:16:00:01 No_category No_category 1 0 1 no_mode 00000
189e45678964fcf6 01/Jan/2003:16:00:07 Electronics Electronics 1 0 1 no_mode 00000
18a416ba3d3c7a8d 01/Jan/2003:16:00:12 No_category No_category 2 0 1 no_mode 00000
18aa11982e30e1ef 01/Jan/2003:16:00:07 No_category No_category 1 0 1 no_mode 00000
The daily files are the output of:
print OUTFILE join("\t", $ut, @{ $users{$ut} }[0 .. 7]) . "\n";
As I go through the daily files, I'd like to keep each unique "ut" as a key, replace its date with the
last-seen date, and add up all the numeric values such as page views, clicks, and sessions.
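The consolidation described above could be sketched roughly as follows. This is only a sketch under some assumptions: the fields are tab-separated as in the print statement, the file contains no header line (anything that does not parse is skipped), and the non-numeric fields (categories, marketing mode, ZIP) are taken from the most recent record, which is a guess rather than a stated requirement. The names sortable_time, merge_record and %totals are made up for illustration.

```perl
use strict;
use warnings;

my %MONTH = (Jan => '01', Feb => '02', Mar => '03', Apr => '04',
             May => '05', Jun => '06', Jul => '07', Aug => '08',
             Sep => '09', Oct => '10', Nov => '11', Dec => '12');

# Turn "01/Jan/2003:16:00:08" into the sortable string "20030101160008",
# so that timestamps can be compared with plain string comparison.
sub sortable_time {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $m, $s) =
        $stamp =~ m{^(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2})$}
        or return;
    return unless exists $MONTH{$mon};
    return "$y$MONTH{$mon}$d$h$m$s";
}

# Fold one line of a daily file into the running totals (a hash ref
# keyed by the unique user id). Returns 1 if the line was merged.
sub merge_record {
    my ($totals, $line) = @_;
    chomp $line;
    my @f = split /\t/, $line;
    return unless @f == 9;                 # not a data line
    my ($ut, $time, $cat1, $cat2, $views, $clicks, $sessions, $mode, $zip) = @f;
    my $key = sortable_time($time);
    return unless defined $key;            # unparseable date, skip it

    my $rec = $totals->{$ut} ||= { sort_key => '', views => 0,
                                   clicks => 0, sessions => 0 };

    # Numeric columns are summed across all days.
    $rec->{views}    += $views;
    $rec->{clicks}   += $clicks;
    $rec->{sessions} += $sessions;

    # Assumption: descriptive columns follow the most recent visit.
    if ($key ge $rec->{sort_key}) {
        @$rec{qw(sort_key last_visit cat1 cat2 mode zip)} =
            ($key, $time, $cat1, $cat2, $mode, $zip);
    }
    return 1;
}

# Read every daily file named on the command line, then print one
# consolidated line per user in the same column order as the input.
my %totals;
for my $file (@ARGV) {
    open my $in, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        merge_record(\%totals, $line);
    }
    close $in;
}

for my $ut (sort keys %totals) {
    my $r = $totals{$ut};
    print join("\t", $ut,
               @$r{qw(last_visit cat1 cat2 views clicks sessions mode zip)}) . "\n";
}
```

It would be run with the daily files as arguments and the summary redirected to a file of your choosing. Because the timestamps are compared rather than trusted to arrive in order, the files can be given in any order.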
So here I am, trying to "consolidate" all of this information into a giant hash. I can already sense the disapproval.
This, however, is the only thing that comes to mind. If anyone has any suggestions on this, they would be greatly
appreciated. And if there is another way to do it, which by definition there is, I'd love to find out about it.
Thank you in advance,
~vili
Addicted to sniffing 802.11
Minor typo / author's req. - dvergin 2003-08-22