Re^3: hash referencing...best approach?

Ok, let's see if this gets you started. I'm going to write this as mostly pseudocode comments. You get to fill in the code.

my %session_hash;
my %departments;
# for each of the 6 session files,
#     open and read line-by-line
#         parse out user_id, session_id, dept_code
          $session_hash{$session_id} = $dept_code;
          $departments{$dept_code}++;
#
# for each apache log
#     open and read line-by-line
#         parse out session_id
#         append the line to the file associated with $session_hash{$session_id}
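
One way the first half of the outline might be filled in is sketched below. The session file layout is not given in this thread, so the sketch assumes each line holds `user_id session_id dept_code` separated by whitespace; the sub name `load_sessions` is made up for illustration.

```perl
use strict;
use warnings;

my (%session_hash, %departments);

# Read one already-opened session file and record session_id => dept_code.
# ASSUMPTION: each line is "user_id session_id dept_code", whitespace-separated.
sub load_sessions {
    my ($fh) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        my ($user_id, $session_id, $dept_code) = split ' ', $line;
        next unless defined $dept_code;    # skip lines with fewer than 3 fields
        $session_hash{$session_id} = $dept_code;
        $departments{$dept_code}++;
    }
}
```

Call it once per session file, e.g. `open my $fh, '<', $file or die "$file: $!"; load_sessions($fh);`. If the real files use a different layout, only the `split` line needs to change.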

If you have a relatively small number of departments (keys %departments), you can keep all the output files open for writing. Otherwise, you'll need to open for append each time you want to write a line of output. (You could also hold some number of lines in memory and write them out every so often, to cut down on the opening and closing.)
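
A minimal sketch of the keep-them-open option follows. It caches one append-mode filehandle per department in a hash. The sub names `fh_for_dept` and `close_all`, the `$dir` argument, and the `<dept>_access.log` file naming are assumptions for illustration, not something the outline above prescribes.

```perl
use strict;
use warnings;

my %fh_for;    # dept_code => open filehandle (the cache)

# Return an append-mode filehandle for a department, opening it on first use.
# ASSUMPTION: output files are named "<dept>_access.log" inside $dir.
sub fh_for_dept {
    my ($dir, $dept) = @_;
    unless ($fh_for{$dept}) {
        my $path = "$dir/${dept}_access.log";
        open my $fh, '>>', $path or die "unable to open $path: $!\n";
        $fh_for{$dept} = $fh;
    }
    return $fh_for{$dept};
}

# Close every cached handle once all logs have been processed.
sub close_all {
    for my $dept (keys %fh_for) {
        close $fh_for{$dept} or die "unable to close log for $dept: $!\n";
    }
    %fh_for = ();
}
```

Writing a line then looks like `print { fh_for_dept('.', $dept) } "$line\n";`, with a single `close_all();` at the end of the run. This only pays off while the number of departments stays under the system's open-file limit.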

HTH.

The PerlMonk tr/// Advocate


Re: Re^3: hash referencing...best approach? by Anonymous Monk on Nov 25, 2003 at 00:06 UTC
I have about 40 possible departments, so I just wrote a small subroutine:

`sub write_it{ my ($dept,$output)=@_; my $output_file=$dept."_access.log"; open (DATA,">> $output_file")||die ("unable to open $output_file $!\n"); print DATA $output."\n"; close DATA; }`

This works pretty well: I was able to read through one of the access logs and create the department-specific files in about 4 minutes. The one problem I have now is appending the user_code to the end of the access log line. I tried to change the way I 'write' to the session_hash while reading the session logs to:

`$session_hash{$session_id}{$user}=$dept_code;`

But this just screwed me up later down the line when I was reading through the access logs. For this part I currently have:

`open (HTTP,$access_log)||die ("unable to open $access_log $!\n"); while (my $line2=<HTTP>) { chomp $line2; my @fields=split /\s+/, $line2; my $session=@fields[6]; my $session=substr($session,(index($session,"?")+12),(index($session,"|"))-(index($session,"?")+12) ); if (length($session) ==52) { &write_it($session_hash{$session},$line2); }`

I tried to incorporate the user_code part into this and ended up getting the hash address everywhere. In other words, my file names became hash addresses and my user_code values were null. Surely I am missing something minor here. Thanks for all the help thus far; it has proven most superb.
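
The "hash address" symptom described above is what Perl produces when a reference is used as a string. A small sketch with made-up sample values (`abc123`, `jdoe`, `SALES` are not from the real logs) shows where it comes from and how one more level of dereferencing gets back to the data:

```perl
use strict;
use warnings;

my %session_hash;
# Hypothetical sample values, for illustration only.
my ($session_id, $user, $dept_code) = ('abc123', 'jdoe', 'SALES');

# Two-level store: the value under $session_id is now a hash reference.
$session_hash{$session_id}{$user} = $dept_code;

# Using that value as a string yields the reference's address,
# something of the form "HASH(0x...)", not the department code.
my $bad_name = $session_hash{$session_id} . "_access.log";

# Dereference one more level to reach the actual user and department.
my ($found_user) = keys %{ $session_hash{$session_id} };
my $good_name    = $session_hash{$session_id}{$found_user} . "_access.log";

print "$bad_name\n$good_name\n";
```
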
Re: ^3+: hash referencing...best approach? by Roy Johnson (Monsignor) on Nov 25, 2003 at 02:51 UTC
You don't need `$user` as part of your key, since `$session_id` is unique. If you want to retrieve something, store it in the value part of the hash. The key is for stuff you want to look up by. So: `$session_hash{$session_id} = [$dept_code, $user]; #array ref` and in `write_it`, the magical print will be `print DATA join(',', @$output), "\n"; # Or something like that` Review `perldoc perlreftut` to get a better handle on how complex data structures are built in Perl.
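
As a rough sketch of how the array-ref value could be unpacked on the reading side: the helper name `route_line`, the sample session data, and the choice to append the user after a single space are all assumptions for illustration, not part of the suggestion above.

```perl
use strict;
use warnings;

my %session_hash;

# Store both values under the session id as an array reference.
# The sample id and codes are made up for illustration.
$session_hash{'abc123'} = ['SALES', 'jdoe'];

# Work out the output file name and the annotated log line for one entry.
# Returns an empty list when the session id is unknown.
sub route_line {
    my ($session, $line) = @_;
    my $entry = $session_hash{$session} or return;
    my ($dept_code, $user) = @$entry;    # unpack the array ref
    return ("${dept_code}_access.log", "$line $user");
}
```

The two returned values can then be handed to whatever does the writing, so the file name is built from the department code alone and never from the reference itself.
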
Re: Re: Re^3: hash referencing...best approach? by Roger (Parson) on Nov 25, 2003 at 00:25 UTC
Change your `write_it` line from `&write_it($session_hash{$session},$line2);` to `write_it($session_hash{$session}{$_},$line2) for keys %{$session_hash{$session}}; # retrieve user names` and it will work.