If the format is as consistent as it appears to be, you could process the log one line at a time. This approach might be significantly faster, since it uses several simple patterns rather than one long, complicated one (I haven't benchmarked it), but IMO the real win is in how much easier it would be to maintain.
use strict;
use warnings;

my %playerdata;
while( my $line = <DATA> )
{
    if( $line =~ m/\[(.+?)] Player: (.+?) \(uid: (.+?)\)/ )
    {
        $playerdata{timestamp} = $1;
        $playerdata{player}    = $2;
        $playerdata{uid}       = $3;
    }

    foreach my $stat ( qw( Score Kills Deaths Suicides ), 'Team Kills' )
    {
        if( $line =~ m/] $stat: (\d+)/ )
        {
            $playerdata{$stat} = $1;
            last;
        }
    }

    if( $line =~ m/Objective: (\d+)/ )
    {
        $playerdata{Objective} = $1;

        # ...do something with log entry...

        %playerdata = ();
    }
}

__DATA__
[Thu Sep 21 17:48:38 2006]
[Thu Sep 21 17:48:38 2006] ------------------------------------------
[Thu Sep 21 17:48:38 2006] Server started.
[Thu Sep 21 18:37:22 2006] Client connected: Alpha
[Thu Sep 21 18:37:31 2006] Client connected: Bravo
[Thu Sep 21 18:38:20 2006] Client connected: Charlie
[Thu Sep 21 18:39:18 2006] Client connected: Delta
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] *** Results for Map: Worlds\ReleaseMultiplayer\Bypass
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Team: Team 1
[Thu Sep 21 18:53:36 2006] Score: 93
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Player: Alpha (uid: ad7023b7f46271acd31e1bd287613b6d)
[Thu Sep 21 18:53:36 2006] Score: 55
[Thu Sep 21 18:53:36 2006] Kills: 14
[Thu Sep 21 18:53:36 2006] Deaths: 15
[Thu Sep 21 18:53:36 2006] Team Kills: 0
[Thu Sep 21 18:53:36 2006] Suicides: 0
[Thu Sep 21 18:53:36 2006] Objective: 0
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Player: Bravo (uid: 5fdcc95043dc4dac9d7b4afb8469eb4f)
[Thu Sep 21 18:53:36 2006] Score: 38
[Thu Sep 21 18:53:36 2006] Kills: 11
[Thu Sep 21 18:53:36 2006] Deaths: 17
[Thu Sep 21 18:53:36 2006] Team Kills: 0
[Thu Sep 21 18:53:36 2006] Suicides: 0
[Thu Sep 21 18:53:36 2006] Objective: 0
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Team: Team 2
[Thu Sep 21 18:53:36 2006] Score: 135
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Player: Charlie (uid: e94839cae76debf1418ab9dfaa4c01e8)
[Thu Sep 21 18:53:36 2006] Score: 61
[Thu Sep 21 18:53:36 2006] Kills: 15
[Thu Sep 21 18:53:36 2006] Deaths: 14
[Thu Sep 21 18:53:36 2006] Team Kills: 0
[Thu Sep 21 18:53:36 2006] Suicides: 0
[Thu Sep 21 18:53:36 2006] Objective: 0
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006] Player: Delta (uid: b2ea959c1b3fa5c35ef6a6e576cdf2af)
[Thu Sep 21 18:53:36 2006] Score: 46
[Thu Sep 21 18:53:36 2006] Kills: 10
[Thu Sep 21 18:53:36 2006] Deaths: 4
[Thu Sep 21 18:53:36 2006] Team Kills: 0
[Thu Sep 21 18:53:36 2006] Suicides: 0
[Thu Sep 21 18:53:36 2006] Objective: 0
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:53:36 2006]
[Thu Sep 21 18:57:47 2006] Client disconnected: Delta
[Thu Sep 21 18:58:01 2006] Client disconnected: Alpha
[Thu Sep 21 18:58:17 2006] Client disconnected: Bravo
[Thu Sep 21 18:59:03 2006] Client disconnected: Charlie
You could add a number of optimizations to this (such as tightening the regexes by replacing (.+?) with more specific patterns), but this should give you the idea. You could also add code to grab the fields in between the player blocks (for the team stats), or split each line into <timestamp> and <the rest> and process the two parts independently. Finally, you could try paragraph mode (via $/) to read in a whole record at a time, but I'm not sure that would help WRT maintenance.
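To make the timestamp/rest split and the team stats idea a bit more concrete, here is a rough, untested-against-the-real-log sketch. The names split_line and parse_log are made up for illustration, and it assumes (as the code above does) that Objective is always the last line of a player block and that a Score line seen before any Player line belongs to the current team.

```perl
use strict;
use warnings;

# Split a log line into ( timestamp, rest ). Returns an empty list when
# the line does not begin with a bracketed timestamp.
sub split_line {
    my ($line) = @_;
    return unless $line =~ m/^\[([^\]]+)\]\s*(.*)$/;
    return ( $1, $2 );
}

# Walk the lines once, tracking the current team and the player block
# being filled in. Returns ( \@player_records, \%team_scores ).
sub parse_log {
    my @lines = @_;
    my ( @records, %team_score, $team, $player );

    for my $line (@lines) {
        my ( $timestamp, $rest ) = split_line($line) or next;

        if ( $rest =~ m/^Team: (.+)$/ ) {
            $team   = $1;
            $player = undef;
        }
        elsif ( $rest =~ m/^Player: (.+?) \(uid: ([^)]+)\)$/ ) {
            $player = {
                timestamp => $timestamp,
                team      => $team,
                player    => $1,
                uid       => $2,
            };
        }
        elsif ( $rest =~ m/^(Score|Kills|Deaths|Team Kills|Suicides|Objective): (\d+)$/ ) {
            my ( $stat, $value ) = ( $1, $2 );
            if ($player) {
                $player->{$stat} = $value;
                if ( $stat eq 'Objective' ) {    # assumed end of the block
                    push @records, $player;
                    $player = undef;
                }
            }
            elsif ( defined $team && $stat eq 'Score' ) {
                $team_score{$team} = $value;     # team-level score line
            }
        }
    }
    return ( \@records, \%team_score );
}

# Small usage example with a few lines taken from the sample data.
my ( $records, $team_score ) = parse_log(
    '[Thu Sep 21 18:53:36 2006] Team: Team 1',
    '[Thu Sep 21 18:53:36 2006] Score: 93',
    '[Thu Sep 21 18:53:36 2006]',
    '[Thu Sep 21 18:53:36 2006] Player: Alpha (uid: ad7023b7f46271acd31e1bd287613b6d)',
    '[Thu Sep 21 18:53:36 2006] Score: 55',
    '[Thu Sep 21 18:53:36 2006] Kills: 14',
    '[Thu Sep 21 18:53:36 2006] Objective: 0',
);
printf "%s (%s): score %d, kills %d\n", @{$_}{qw( player team Score Kills )}
    for @$records;
```

Anchoring the stat patterns to the start of <the rest> means the timestamp can never confuse them, and each pattern stays short enough to read at a glance. Whether that is actually faster is, again, something I haven't benchmarked.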
Update: looks like graff beat me to it. :-)
In reply to Re: regex issues with /gc in log analysis...
by bobf
in thread regex issues with /gc in log analysis...
by EvanK