I've done similar things, and I find it best to make a single pass through the file. To do so, here's the approach I take:
First, I use a hash to contain all current transactions. (I'm assuming that each thread is handling only one task at a time, so within a thread you're not getting multiple transactions intermingled.) In this case, I'd use the thread ID as the key.
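To make the data structure concrete, here's a minimal sketch of what that hash might hold partway through a file. The thread IDs, timestamps, and field values are invented for illustration; only the shape (thread ID as key, a record of the fields seen so far as value) matters.

```perl
use strict;
use warnings;

# Hypothetical snapshot: two threads each have one transaction in flight.
my %TxnQ = (
    1234 => { timestamp => '2010-03-01 12:00:00.001', type => 'Request' },
    5678 => { timestamp => '2010-03-01 12:00:00.004', type => 'Request' },
);

# A later line from thread 1234 only touches that thread's record,
# no matter how many lines from other threads came in between.
$TxnQ{1234}{'Acct-Session-Id'} = 'abc123';

# Dump what we have so far, one thread per line.
for my $tid ( sort keys %TxnQ ) {
    my $rec = $TxnQ{$tid};
    print "$tid: ", join( ', ', map { "$_=$rec->{$_}" } sort keys %$rec ), "\n";
}
```

Because each thread only ever writes to its own slot, interleaving in the log never mixes data between transactions.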
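Next, read each line. You're going to find that the line is one of: a transaction header, which starts a new transaction for its thread; an intermediate line, which adds data to that thread's current transaction; a transaction terminator, which completes the transaction; or a line we don't care about.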
    use strict;
    use warnings;

    my %TxnQ;

    while (<DATA>) {

        ##### TRANSACTION HEADERS #####

        # HEADER: emit previous transaction (if any), start new one
        if ( m/(.{10}\s.{12})\s\((\d+)\)Authentication Request/ ) {
            # Emit previous transaction, if any
            complete_transaction($TxnQ{$2}) if exists $TxnQ{$2};
            # Delete previous data by replacing with new data
            # (note the braces: we store a hash reference)
            $TxnQ{$2} = { timestamp => $1, type => 'Request' };
        }
        # ...etc...

        ##### INTERMEDIATE LINES #####

        elsif ( m/.{10}\s.{12}\s\((\d+)\)Acct-Session-Id : String Value = (.*$)/ ) {
            # Just add the additional data to the thread's transaction record
            $TxnQ{$1}{'Acct-Session-Id'} = $2;
        }
        # ...etc...

        ##### TRANSACTION TERMINATORS #####

        elsif ( m/.{10}\s.{12}\s\((\d+)\)User-Name : String Value = (.*$)/ ) {
            # Add the final data item(s) (if req'd)
            $TxnQ{$1}{'User-Name'} = $2;
            # Process the transaction
            complete_transaction($TxnQ{$1});
            # Delete the data
            delete $TxnQ{$1};
        }
        # ...etc...

        ##### LINES WE DON'T CARE ABOUT #####

        elsif ( m/frammistat/ || m/^\s*$/ || m/^#/ ) {
            # DO NOTHING: we're explicitly ignoring these lines
        }
        else {
            print "LINE $.: Unrecognized line. Complete text:\n$_";
        }
    }

    # Complete remaining transactions (hopefully complete transactions
    # that don't have explicit transaction terminator lines)
    for (keys %TxnQ) {
        complete_transaction($TxnQ{$_});
    }

    # Dispatch on transaction type. complete_request() and
    # complete_response() are yours to write.
    sub complete_transaction {
        my $hr = shift;
        if (!defined $hr->{type}) {
            print "Incomplete transaction found!\n";
        }
        elsif ($hr->{type} eq 'Request')  { complete_request($hr) }
        elsif ($hr->{type} eq 'Response') { complete_response($hr) }
        # ...etc...
        else {
            print "ERROR: Unexpected transaction type: $hr->{type}!\n";
        }
    }
Obviously, you'd need to add error handling and such as you see fit. Standard disclaimers apply: Untested code, use at your own risk, if it breaks you can keep all the pieces, etc. ad nauseam.
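As a quick sanity check of the approach, here's a small self-contained sketch that runs the same header / intermediate / terminator logic over a few interleaved lines. The log lines are made up to fit the regexes above, not taken from a real log, and parse_lines is a hypothetical helper that collects finished transactions instead of calling complete_transaction, so treat it as a sketch rather than a drop-in.

```perl
use strict;
use warnings;

# Made-up log lines: threads 101 and 202 with interleaved transactions.
my @lines = (
    '2010-03-01 12:00:00.001 (101)Authentication Request',
    '2010-03-01 12:00:00.002 (202)Authentication Request',
    '2010-03-01 12:00:00.003 (101)Acct-Session-Id : String Value = AAA',
    '2010-03-01 12:00:00.004 (202)Acct-Session-Id : String Value = BBB',
    '2010-03-01 12:00:00.005 (202)User-Name : String Value = bob',
    '2010-03-01 12:00:00.006 (101)User-Name : String Value = alice',
);

# Hypothetical helper: returns finished transactions in completion order.
sub parse_lines {
    my @input = @_;
    my ( %txn, @done );
    for my $line (@input) {
        # Header: flush any unfinished transaction for this thread, start anew
        if ( $line =~ m/^(.{10}\s.{12})\s\((\d+)\)Authentication Request/ ) {
            my ( $ts, $tid ) = ( $1, $2 );
            push @done, delete $txn{$tid} if exists $txn{$tid};
            $txn{$tid} = { thread => $tid, timestamp => $ts, type => 'Request' };
        }
        # Intermediate line: just record the value
        elsif ( $line =~ m/^.{10}\s.{12}\s\((\d+)\)Acct-Session-Id : String Value = (.*)$/ ) {
            $txn{$1}{'Acct-Session-Id'} = $2;
        }
        # Terminator: record the value, then hand the transaction off
        elsif ( $line =~ m/^.{10}\s.{12}\s\((\d+)\)User-Name : String Value = (.*)$/ ) {
            $txn{$1}{'User-Name'} = $2;
            push @done, delete $txn{$1};
        }
    }
    # Anything still open at end of input
    push @done, map { $txn{$_} } sort keys %txn;
    return @done;
}

for my $t ( parse_lines(@lines) ) {
    print "thread $t->{thread}: $t->{'User-Name'} / $t->{'Acct-Session-Id'}\n";
}
```

With the sample lines above, thread 202 finishes first even though thread 101 started first, and each transaction comes out with only its own session ID and user name.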
...roboticus
Update: And if I had read the entire thread, I would've noticed that ig had already given an example of how to do this. Ah, well, it happens when you don't get enough sleep. I also added the <readmore> tags, as the post was a bit longish.
In reply to Re: Interlaced log parser by roboticus, in thread Interlaced log parser by tzen